Elements in Unicode: was odd question about elements 119+
Submitted by Anonymous on 11 March 2004 - 3:44am.
First off, it's been about 6 years since I last took any chemistry. Second, I don't know if this question makes any sense, but I need the information for a project so here goes:
1. Is there any expectation that elements above atomic number 118 are even physically possible within our universe?
2. If so, do we have an expectation of a set of 18 groups corresponding to a new orbital geometry not present in lighter elements, or do we expect only 32 elements in period 8? This is just based on my observation of a pattern of a set of 2 groups in period 1; sets of 2 and 6 in periods 2 and 3; 2, 6, 10 in 4 and 5; 2, 6, 10, 14 in 6 and 7; therefore 2, 6, 10, 14, and 18 in periods 8 and 9.
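The pattern in question 2 can be written out as arithmetic. This is only a sketch: each set holds 2(2l + 1) groups for l = 0, 1, 2, ..., and the function names are made up for illustration. It extrapolates the observed pattern and says nothing about whether such elements are physically possible.

```python
def subshell_capacities(period):
    """Group-set sizes (2, 6, 10, ...) for a period, following the observed pattern."""
    # One set in period 1, two in periods 2-3, three in 4-5, four in 6-7, five in 8-9.
    kinds = period // 2 + 1
    return [2 * (2 * l + 1) for l in range(kinds)]


def period_length(period):
    """Number of elements in a period if the pattern simply continues."""
    return sum(subshell_capacities(period))


last_atomic_number = 0
for period in range(1, 10):
    last_atomic_number += period_length(period)
    print(period, subshell_capacities(period), period_length(period), last_atomic_number)
```

Under this extrapolation period 7 ends at element 118, period 8 would hold 50 elements (not 32), and period 9 would end at element 218.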

This proposal includes allocation for elements through series 9, so there is more than enough wiggle room for the foreseeable future. I have also included two small allocations after my series 9 space. The first is 6 "control characters" for indicating atomic number, atomic mass, charge, and the number of an atom in a molecule (I don't know what to call that last one, so there's some nomenclature that needs to be cleaned up). The second is a 32-character "Extended" section for symbols like P (for Protium), D (Deuterium) and T (Tritium). This is the area where I would be afraid of running out of allocation room. At any rate, the whole thing uses 256 code points - a very good number when dealing with Unicode.
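A quick check of the 256 figure, as a sketch under two assumptions that the post does not state: "series 9" means period 9 with the 2, 6, 10, 14, 18 pattern from the original question (218 elements), and the two small allocations follow the elements in the order they are mentioned. The offsets are relative to the start of the block and are hypothetical; no actual code points have been proposed here.

```python
def period_length(period):
    """Elements in a period, assuming the 2, 6, 10, 14, 18 pattern continues."""
    kinds = period // 2 + 1
    return sum(2 * (2 * l + 1) for l in range(kinds))


# (name, size) for each part of the proposed allocation; order is an assumption.
blocks = [
    ("elements, periods 1-9", sum(period_length(p) for p in range(1, 10))),
    ("control characters", 6),
    ("extended symbols", 32),
]


def layout(blocks):
    """Return (name, first_offset, last_offset) rows and the total size."""
    offset = 0
    rows = []
    for name, size in blocks:
        rows.append((name, offset, offset + size - 1))
        offset += size
    return rows, offset


rows, total = layout(blocks)
for name, first, last in rows:
    print(f"{name:24s} +0x{first:02X}..+0x{last:02X}")
print("total code points:", total)
```

With those assumptions the three parts add up to 218 + 6 + 32 = 256, which matches the number given in the post.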
Ok, I am going to ask this as a complete question because apparently no one sorted through my last post to find it:
What do you call the subscript that tells how many of an atom are in a particular molecule (e.g., the 2 in H2O)?
I don't know if there is a formal name for that. It might just be a matter of notation.
Anyone can talk about element names - but only one organisation/entity can approve a new name - otherwise anarchy. See http://www.iupac.org/publications/pac/2002/7405/7405x0787.html
Probably this is a silly question, but why does it make sense to think of chemical symbols at the character level? I tend to think of them as "words". If you want to add "meaning" to a document, why not use something like XML? For example,
The symbol for hydrogen is <elementSymbol>H</elementSymbol>
I think this approach gives you more flexibility and you don't have to worry about having the latest Unicode font installed. Of course you still have to worry about standardizing an XML vocabulary.
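As a rough illustration of what the markup buys you, here is a sketch that pulls tagged symbols back out of a document with Python's standard xml.etree.ElementTree module. The <elementSymbol> tag is the one from the example above; the surrounding <p> wrapper and the function name are assumptions added only so the snippet is well-formed XML and runnable.

```python
import xml.etree.ElementTree as ET

# The <p> wrapper is hypothetical; XML needs a single root element.
snippet = "<p>The symbol for hydrogen is <elementSymbol>H</elementSymbol>.</p>"


def element_symbols(xml_text):
    """Return the text of every <elementSymbol> tag in the document."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.iter("elementSymbol")]


print(element_symbols(snippet))
```

The plain letter "H" stays an ordinary Latin character in the file; only the tag says it is meant as an element symbol.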
What worries me is that if we start thinking of element symbols as characters, then we'll want to think of element names as characters, and then colors will be characters and so on... Do we need a Unicode character for "red"?
It is because the symbol "H" has a distinct meaning from the Latin alphabetical character "H", sorts differently from it, and is used in hundreds of languages that do not use the Latin alphabet. A markup like XML is platform specific. Unicode is necessarily universal. When I don't have a Greek font available (like when I sent a final to my Greek prof), I use a Latinization scheme called Beta code. Does that work? Sure, for my context. Could I just put a tag like <Greek> and then type all my text in Latin characters? Absolutely, as long as the person I am writing to knows how it works. In that context, we both knew what we were doing. For other contexts, e.g. sorting those words for inclusion in a dictionary, it would have been woefully inadequate, placing entire sections of words in the wrong place. Even if the chemical symbols were in Unicode, there would be nothing that says that you cannot use an XML markup to indicate the symbol for Hydrogen. There is nothing that says you cannot just put "H" and be done with it. Unicode is a tool that allows people to standardize, and enshrine in the computing world, exactly how their discipline records information. If your XML markup is standardized, then upgrading a document to or from Unicode would be incredibly simple.
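To make the sorting point concrete, here is a small sketch comparing how a list of symbols sorts as ordinary Latin text with how it sorts as elements. It assumes the intended element order is atomic number, which the post does not actually specify, and the lookup table is a hand-typed sample rather than a complete list.

```python
# Sample lookup only; a real table would cover every element.
ATOMIC_NUMBER = {"H": 1, "He": 2, "Li": 3, "Be": 4, "B": 5, "C": 6, "N": 7, "O": 8}

symbols = ["O", "He", "C", "H", "Li", "B"]

# Sorted as plain Latin strings (code point order).
as_text = sorted(symbols)

# Sorted as elements, assuming atomic number is the intended order.
as_elements = sorted(symbols, key=ATOMIC_NUMBER.__getitem__)

print(as_text)
print(as_elements)
```

As text, boron and carbon come before hydrogen; as elements, hydrogen, helium and lithium come first.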
Is there a unique character for red that obeys rules that are separate from defined rules of, for example, the Latin alphabetization of the sequence "red"? If that were the case, "red" would arguably qualify as a distinct glyph. Do you think that some guy studying chemistry in China has any idea why the symbol "H" stands for the element Hydrogen? Of course not, it is just another symbol that stands for a concept. The symbol "H" has a specific ideographic meaning that acts differently from the Latin character H in several distinct contexts. The element names, in addition to being language dependent, still function like any other word. They obey the rules of normal text in every language, and therefore are not eligible for inclusion in Unicode. The symbols "H", "He", "Li", etc. do not follow the rules of normal text, and thus should be included in Unicode. The fact is, you can think of the element symbols any way you want. I personally look at a symbol like OH- and that has a distinct ideographic meaning to me - a hydroxide ion - but it won't make it into Unicode even if the element symbols do, because it obeys all the rules and acts as its components would dictate. It is not a unique glyph; it is, in fact, a sequence of already existing glyphs.
Perhaps there are other pseudoelement symbols that might usefully be included:
Me - methyl
Et - ethyl
Pr - propyl (distinct from the element symbol Pr - praseodymium)
Bu - butyl
Ph - phenyl
Bz - benzyl
and others