<FONT FACE> considered harmful
The problem
Text is normally transferred on the Web - as well as in other Internet applications - as a sequence of coded characters. That is, each code value corresponds by convention to a single character, which a receiving application can interpret and display. There are a number of such character codes, each covering a given character repertoire, normally corresponding to a script.
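To make the idea of a character code concrete, here is a minimal sketch in Python. It relies on the ISO 8859-1 and ISO 8859-7 codecs that ship with Python's standard library; the helper name `interpret` is made up for this illustration. It shows that one and the same code value stands for different characters depending on which character code the receiving application assumes.

```python
import unicodedata


def interpret(code_value: int, character_code: str) -> str:
    """Return the character that `character_code` assigns to one byte value.

    `interpret` is a hypothetical helper written for this illustration only.
    """
    return bytes([code_value]).decode(character_code)


# The same code value, read under two different character codes.
latin = interpret(0xE1, "iso8859_1")  # Latin repertoire
greek = interpret(0xE1, "iso8859_7")  # Greek repertoire

# Print the standard character names rather than the characters themselves,
# so the result does not depend on any font being installed.
print(unicodedata.name(latin))
print(unicodedata.name(greek))
```

What identifies the character here is the declared character code, not the font used to draw it.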
The point is that if you use
Why is it bad?
There are a number of problems with the above approach. The most evident is that bad things happen if the user looking at your page does not have exactly the font that you specified: he will see the text in his browser's default font, which will not be Greek (unless he is Greek, of course!), whereas he may have a perfectly good Greek font on his system, which could have been used if you had coded the text properly.
This brings to the forefront the problem of font proliferation: the characters (actually
glyphs) in a font are
numbered, the set of glyph-number associations forming what is
known as the coding of the font. But there are a large number of these, even for a given
language or script. If you use simplistic font mapping (which is what
And the Webspace has become fragmented, with mutually incomprehensible parts (unless you have all the right fonts), not exactly what the Web was intended for! Just think of the mess if there existed 5 flavours of ASCII, all incompatible, and if users constantly had to convert from one to the other, after guessing which it is.
Searching, sorting, text processing, etc.
And what about searching for this Greek (or other) text, using your favorite search engine? Not a chance! You have to know what font the author used even before you look for the text, in order for you to provide the search engine with the "correct" false characters. Even using your browser's search function to look for a word within a page is not likely to give the expected results.
Similarly, a list of Greek (or other, again) words coded by font mapping is not likely to
sort correctly, if you ever want to do that. In fact, any kind of text processing is next
to impossible using that technique. If you are authoring Web pages - and you are if you
use
Mail, news and other Internet applications
The Internet is about communication. The Web uses HTML as a common document format, but what
about all the other means of communication? HTML is generally not an option, and you will
have to forget about using the HTMLish
Complex scripts
The situation gets even hairier when a complex script is involved. What is a complex script?
Well, "simple" scripts are those where there is a one-to-one relationship between characters
and glyphs; the others are complex. Examples of simple scripts are Latin, Cyrillic and Chinese.
Chinese is simple? Characters and glyphs do map one-to-one in Chinese, but for computers it is
complex because there are so many characters; one byte is not enough to encode all of them (by
far!), two or more are required and
In some complex scripts the glyph changes according to the position of the character within a word (initial, medial, final or isolated), as in Arabic. Or there exist compulsory ligatures where two or more characters turn into a single glyph, as in Devanagari (the script used to write Hindi). Or one character is displayed as two glyphs that straddle the glyph of another character, as in Tamil. Additionally:
Conclusion
The conclusion is very simple: do not use
If you think you are doing some language community a service by making up fonts and using them as described above to publish on the Web, please think again. Consider instead keeping your bytes, characters and glyphs as separate things:
The Tango Multilingual Browser will properly display all of Babel's languages.
1996, Alis Technologies inc.