Dict's web pronunciations (Shallow Thoughts)

Akkana's Musings on Open Source, Science, and Nature.

Wed, 18 Aug 2004

Dict's web pronunciations

I went to dict this afternoon to find out whether "cerebral" was best pronounced ser EE brul or SER e bral.

The Collaborative International Dictionary of English v.0.44 [gcide] gives two definitions, both with the same pronunciation: /Cer"e*bral/

Great! What does that mean? It's not the standard phonetic markings like dictionaries use (lucky for me, since if it used accent marks and such, I wouldn't be able to display it in my terminal font).

Jutta helped me out with the investigation, and with some combined googling and README-reading, she eventually found gcide's pronunc.web file.

That holds the key to the stresses: the double quote (") is a heavy stress (a light stress, not indicated here, would be a backquote). The asterisk (*) is simply a hyphen to separate syllables (why they don't just use a dash, or even a space, I'm not sure).
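For anyone who'd rather let a script do the squinting, here's a minimal sketch of that reading in Python. The function names are made up for illustration, and it leans on an assumption that isn't spelled out above: that a stress mark comes right after the syllable it marks and also ends that syllable.

```python
import re

# Marks as described above: double quote = heavy stress, backquote = light stress.
# The asterisk only separates syllables, so it has no entry here.
STRESS = {'"': "heavy", "`": "light"}

def split_pronunciation(pron):
    """Split a gcide-style pronunciation into (syllable, stress) pairs.

    Assumption: a stress mark follows the syllable it applies to.
    """
    body = pron.strip("/")
    # Split on the three delimiters but keep them, so syllables and
    # delimiters alternate in the resulting list.
    parts = re.split(r'(["`*])', body)
    pairs = []
    for i in range(0, len(parts), 2):
        syllable = parts[i]
        if not syllable:
            continue
        delim = parts[i + 1] if i + 1 < len(parts) else ""
        pairs.append((syllable, STRESS.get(delim, "none")))
    return pairs

def render(pairs):
    """Show heavy-stressed syllables in capitals, joined with dashes."""
    return "-".join(s.upper() if stress == "heavy" else s for s, stress in pairs)

print(render(split_pronunciation('/Cer"e*bral/')))  # prints CER-e-bral
```

If that assumption about mark placement is right, gcide's entry is the SER e bral reading, but the sketch says nothing about how the vowels themselves sound.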

That's progress, but what do those vowels mean? And the C? (Okay, I know it's pronounced as an ess. I even knew by now that both pronunciations are acceptable, since I'd looked it up in a dead-tree dictionary, and so had about four other people on the channel, while I was trying to track down a dict pronunciation guide.) The pronunc.web file talks about a long list of special characters that are supposed to correspond to web fonts (haha). But dict doesn't actually use those: try "dict free", which gives the pronunciation /free/, while pronunc.web says it should show up as /fr<emac// (which, you have to admit, would be pretty confusing, what with the close slash for the special character followed by the close slash for the pronunciation, even aside from the question of who can read strings like /fr<emac//. Ick).
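Just to show how those special characters would have to be untangled, here's a rough Python sketch. The table and function name are hypothetical, and the table holds only the one code mentioned above; mapping "emac" to an e with a macron is my guess from the name, not something I've checked against pronunc.web.

```python
import re

# Hypothetical, deliberately tiny table: only the code mentioned in the post.
# Assumption: "emac" stands for e with a macron (U+0113).
ENTITIES = {"emac": "\u0113"}

def decode_entities(text):
    """Replace <name/ codes with single characters; leave unknown codes alone."""
    def replace(match):
        return ENTITIES.get(match.group(1), match.group(0))
    return re.sub(r"<([A-Za-z0-9]+)/", replace, text)

decoded = decode_entities("/fr<emac//")
print(decoded == "/fr\u0113/")  # prints True
```

The codes have to be decoded before the outer slashes are stripped. Do it the other way around and the code's own closing slash gets eaten along with the pronunciation's, which is exactly the confusion I was complaining about.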

Some other googling mentioned web dictionaries, including gcide, using the pronunciation guide from the Jargon file. Ironically, this was very hard to read since it uses smartquote characters all over the place which not only don't appear in the font I was using in mozilla, but also don't get substituted properly in mozilla (moz is usually pretty good about that) so I just see boxes. It's possible that the font claims to have the characters, then shows boxes instead.

Jutta wondered why PRONUNC.JPG and PRONUNC.WEB weren't in the Debian package, since they're mentioned in /usr/share/doc/dict-gcide/README.dictionary.gz. I have mixed feelings: I think it's a bug that there's no file that describes the pronunciation system being used, but since neither of those files does describe it, not including them is probably not a bug.

At Jutta's suggestion, I filed a bug on dict-gcide (bug 266773).

[ 20:19 Aug 18, 2004 ]