README file for the nohyphb and nohyphbc hyphenation patterns.
--------------------------------------------------------------

These patterns are generated from a collection of 750000 Norwegian
words taken from both variants of the Norwegian language (bokmål and
nynorsk).  They should work on every word in the list they are
generated from.  This dictionary, together with scripts to make it
usable with ispell and scripts to generate these patterns, is
available from

    http://www.uio.no/~runekl/dictionary.html

The nohyphb patterns hyphenate Norwegian text as TeX used to, for
example `barne-hage-as-si-stent'.

The nohyphbc patterns hyphenate only between the parts of compound
words, e.g. `barnehage-assistent', `barne-hage'.  This is good for
ragged text or very long lines.

These patterns will not fail on common words, e.g. words in the
dictionary.  If they do, it is a bug.  The old Norwegian patterns
failed on many common words; proof is on the web page above.

But even if these patterns are of high quality, they might still fail
on Norwegian words not in the dictionary, so if you don't feel
particularly lucky you will have to do something about that too.
There are two strategies.

1. Mark the compound points in the compound word with "-, e.g.
   administrasjons"-sjef"-stillings"-søker.  If you have patched
   ispell with the patch from my web page, you can do this during
   spell-checking most of the time.  I dislike this approach.

2. Install the Norwegian dictionary from the web page above and use
   the script inorsk-hyphenmaybe to print every word in your document
   that is not in the dictionary (nynorsk and bokmål), hyphenated the
   way TeX hyphenates it.  Then you can easily browse through this
   list and put the badly hyphenated words in a \hyphenation command.
   The next time you run the script (and TeX) it should produce
   correct hyphenation.  For example, if inorsk-hyphenmaybe outputs
   `kon-flik-t-akse' and `kon-flik-t-ak-sen', you have to say

       \hyphenation{kon-flikt-akse kon-flikt-ak-sen}

   in your TeX document.

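
The two strategies can be sketched in a small LaTeX document.  This is
only an illustration: it assumes Babel's norsk option (for the "-
shorthand), a format that contains these patterns, and a file saved as
UTF-8.  The sample sentence is a placeholder, not taken from the
dictionary.

    \documentclass{article}
    \usepackage[T1]{fontenc}
    \usepackage[utf8]{inputenc}
    \usepackage[norsk]{babel}

    \begin{document}

    % Strategy 2: list corrected hyphenations for the words that
    % inorsk-hyphenmaybe showed to be badly hyphenated.
    \hyphenation{kon-flikt-akse kon-flikt-ak-sen}

    % Strategy 1: mark the compound points by hand with "-.
    Vi fant en administrasjons"-sjef"-stillings"-søker langs
    konfliktaksen.

    \end{document}

The \hyphenation command is placed after \begin{document} here so that
the exceptions are stored for the language that is active in the text.
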
Strategy 2 can also help you if you have lots of foreign words in
your document and don't want to use \foreignlanguage from Babel.

There are two things TeX still can't do regarding hyphenation.

It can't hyphenate `villede' automatically as `vill-lede'; you need
to say `vi"llede' if you have Babel-3.7.  If you don't have Babel-3.7,
I recommend norsk.cfg, available from my web page.  However, the
patterns will not fail on this kind of word; they will only miss a
possible hyphen.  And by the way, the ispell dictionary from my web
page accepts the spelling vi"llede.

The second thing is multi-level hyphenation, i.e. different penalties
for different hyphen points.  The scripts available from my web page
make it possible to generate patterns for a TeX that can do this, and
the change files by Matthias Clasen that used to be available from

    ftp://peano.mathematik.uni-freiburg.de/pub/etex/

make it possible to build such a TeX.  If they aren't available from
there, look in

    http://www.uio.no/~runekl/clasen.tar.gz

If you want to install both sets of Norwegian patterns in the same
format, you have a TeX capacity problem.  The variable ssup_trie_size
needs to be bigger than 65535 and trie_op_size bigger than 1501.  I
use 262143 and 3501.  So you need to change tex.ch (and omega.ch) and
recompile TeX.  If you are using teTeX that should be quite easy.
Here is an unsupported patch:

*** tex.ch~	Fri Jan 21 23:13:24 2000
--- tex.ch	Mon Jul 10 18:46:15 2000
***************
*** 196 ****
! @d ssup_trie_size == 65535
--- 196 ----
! @d ssup_trie_size == 262143
***************
*** 215 ****
! @!trie_op_size=1501; {space for ``opcodes'' in the hyphenation patterns;
--- 215 ----
! @!trie_op_size=3501; {space for ``opcodes'' in the hyphenation patterns;
***************
*** 217 ****
! @!neg_trie_op_size=-1501; {for lower |trie_op_hash| array bound;
--- 217 ----
! @!neg_trie_op_size=-3501; {for lower |trie_op_hash| array bound;
*** omega.ch~	Thu Jul 13 11:37:08 2000
--- omega.ch	Sun Jul 23 20:38:03 2000
***************
*** 125,127 ****
  @d ssup_trie_opcode == 65535
! @d ssup_trie_size == 100000
  
--- 125,127 ----
  @d ssup_trie_opcode == 65535
! @d ssup_trie_size == 262143
  
***************
*** 139,143 ****
  {Use |hash_offset=0| for compilers which cannot decrement pointers.}
! @!trie_op_size=1501; {space for ``opcodes'' in the hyphenation patterns;
    best if relatively prime to 313, 361, and 1009.}
! @!neg_trie_op_size=-1501; {for lower |trie_op_hash| array bound;
    must be equal to |-trie_op_size|.}
--- 139,143 ----
  {Use |hash_offset=0| for compilers which cannot decrement pointers.}
! @!trie_op_size=3501; {space for ``opcodes'' in the hyphenation patterns;
    best if relatively prime to 313, 361, and 1009.}
! @!neg_trie_op_size=-3501; {for lower |trie_op_hash| array bound;
    must be equal to |-trie_op_size|.}

The easiest way to use both sets of patterns in the same document is
to define the macros

    \def\goodhyphens{\lefthyphenmin2\righthyphenmin2\language=\l@norskc}
    \def\allhyphens{\lefthyphenmin1\righthyphenmin2\language=\l@norsk}

and change hyphenation whenever you want to.  In this case the
language.dat file should contain the lines

    norsk  nohyphb.tex
    norskc nohyphbc.tex
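
The names \l@norsk and \l@norskc contain the character @, so in a
LaTeX document the two definitions have to be made between
\makeatletter and \makeatother (or in a package file).  A minimal
sketch, assuming a LaTeX format built with both language.dat entries
and Babel's norsk option; the test word is the example used earlier in
this file:

    \documentclass{article}
    \usepackage[norsk]{babel}

    \makeatletter  % allow @ in the names \l@norsk and \l@norskc
    \def\goodhyphens{\lefthyphenmin2\righthyphenmin2\language=\l@norskc}
    \def\allhyphens{\lefthyphenmin1\righthyphenmin2\language=\l@norsk}
    \makeatother

    \begin{document}

    % Every hyphenation point allowed by the nohyphb patterns.
    \allhyphens
    \showhyphens{barnehageassistent}

    % Only the points between the parts of the compound (nohyphbc).
    \goodhyphens
    \showhyphens{barnehageassistent}

    \end{document}

\showhyphens writes the hyphenation points it finds to the terminal
and the log file, which makes it easy to check that both sets of
patterns were loaded.  If \l@norskc is undefined, the format was built
without the norskc line in language.dat.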