author | Torstein Hernes Dybdahl <torstad@runbox.no> | 2005-04-17 22:54:18 +0000 |
---|---|---|
committer | Torstein Hernes Dybdahl <torstad@runbox.no> | 2005-04-17 22:54:18 +0000 |
commit | 0fa3e4a875575d84f46264670c8b2551be0e39f6 | |
tree | 87cb58d9df3b392ede338dcdacb5eb35f0d9be03 | |
parent | d587c667d87b1307e58416f1f449b5d0805f57ab | |
Added developer information about building the spellchecker wordlists.
Changed homepage to indicate that something is going on.
-rw-r--r-- | developer.html | 35 |
-rw-r--r-- | index.html | 342 |
-rw-r--r-- | index_old.html | 298 |
3 files changed, 374 insertions, 301 deletions
diff --git a/developer.html b/developer.html
new file mode 100644
index 0000000..6da195c
--- /dev/null
+++ b/developer.html
@@ -0,0 +1,35 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html>
+<head>
+ <meta content="text/html; charset=ISO-8859-1"
+ http-equiv="content-type">
+ <title>Developer information</title>
+</head>
+<body>
+Developer information:<br>
+<br>
+Short information for building the wordlist with cvs checkout.<br>
+<br>
+Steps for building wordlists.<br>
+<br>
+Download ispell.3.2.06<br>
+Untar ispell package<br>
+Patch with ispell patch from cvs<br>
+copy files for cvs into languages/norsk folder inside ispell folder.<br>
+build ispell. (ispell is a bit tricky to build some times, check out
+the Readme file included in the ispell-tarball)<br>
+go to languages/norsk folder<br>
+make -f Makefile.new myspell-dist<br>
+make -f Makefile.new aspell-dist<br>
+the aspell tar.bz2 files are aspell wordlists<br>
+and myspell files are there.<br>
+<br>
+To make the ispell files type<br>
+<br>
+make all<br>
+<br>
+this will make the nynorsk.hash and bokmal.hash ispell files.<br>
+<br>
+More information coming.<br>
+</body>
+</html>
@@ -1,311 +1,51 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
-
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
 <head>
-<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
-<title>The Norwegian ispell-dictionary page</title>
-<style type="text/css">
-<!--
- body { background-color: white }
- h1, h2, h3, b { font-family: sans-serif }
- .center { align: center }
- .red { color: red }
--->
-</style>
-
+ <meta content="text/html; charset=ISO-8859-1"
+ http-equiv="content-type">
+ <title>Spell-norwegian</title>
+</head>
 <body>
-<h1 class="center">The Norwegian ispell-dictionary home page</h1>
-
-
-<hr>
-
-<p>The most important file available here contains a list of 750 000
-Norwegian words.
Each word is marked with a number indicating the -commonness of that word. Compound words are hyphenated at their -compound points. Some words are marked as belonging to a specific -classes; mathematics, oil, conservative language, 'samnorsk' etc. -Words marked with a star are allowed in Nynorsk. - -<p>This file is usable to several things: -<ul> -<li>Making dictionaries for the Ispell program of different sizes, -choosing which words to include in a sensible way. - -<li>Making Norwegian dictionaries for word processors that doesn't -have one, again with a sensible subset of words. - -<li>Making new and better hyphenation patterns for TeX. - -<li>Text-recognition (OCR) programs. - -<li>Encourage the e-TeX team to implement multi-level hyphenation in -TeX. - -<li>Encourage people to use frequency-information when they write -programs making suggestions for replacements for misspelled words. - -</ul> - -<p>Routines for the first three items on the above list is included in -the Makefiles. The last three was too hard to implement in Make. - -<H3>Requirements</H3> - - -<ul> -<li><b>Ispell</b><br> -For the ispell-related stuff, you need the ispell program, and you can -get it from the -<A -HREF="http://ficus-www.cs.ucla.edu/ficus-members/geoff/ispell.html"> -ispell home-page</A>. You can also find dictionaries for a lot of -languages there. - -You also need the version of the look program in <a -href="ftp://ftp.win.tue.nl/pub/linux/utils/util-linux/">util-linux-2.9</a>. -Older versions have a bug which shows up when searching dictionaries -with non-English characters. Ispell uses look to complete words -(ispell-complete-word). If you don't plan to have a Norwegian words -file for lookup, you don't need to worry about the Look program. - -<li><b>Emacs</b><br> - -If you want to use ispell from Emacs, i recommend upgrading to the -latest version of <A -HREF="ftp://kdstevens.com/pub/stevens/ispell.el.gz"> ispell.el</A>. 
-This version supports Norwegian, and it has become clean to include -local dictionary definitions. It is almost like the version included -in in Emacs-20.4. There is also an add-on to ispell.el, -<A HREF="http://kaolin.unice.fr/~serrano/emacs/flyspell"> -flyspell.el</A>, written by <A href="http://kaolin.unice.fr/~serrano"> -Manuel Serrano</A> available, offering better `on-the-fly' -spell-checking. An old version is included in emacs-20.3, but you -would like to have the new version with important speed improvements. - -<li><b>(La)TeX</b><br> - -If you want to make your own hyphenation patterns for TeX (you -probably don't), you need a version of the patgen program with greater -capacities than standard versions, e.g. you have to compile patgen -with a different patgen.ch. See the patterns/Makefile for more -information. Almost every TeX distribution contains the patgen -program. I recommend <a -href="ftp://ftp.rrzn.uni-hannover.de/pub/local/misc/teTeX-beta/">teTeX</a>. -If you want both kinds of hyphenation in the same TeX format, you -probably need to recompile TeX due to capacity problems. Again, this -is easy with teTeX. - -</ul> - - -<h3>Distribution</h3> - -<p>The distribution <a -href="http://www.uio.no/~runekl/ispell-norsk-2.0.tar.gz">ispell-norsk-2.0.tar.gz</a> -(2204k) is free in the GPL sense and contains these files: - -<ul> -<li><b><a href="README">README</a></b><br> -How to make ispell and emacs work with these dictionaries. - -<li><b>words.norsk.sq</b><br> This file contains the Norwegian words and -the indication of their commonness compressed with the sq program. - -<li><b>norsk.aff.in</b><br> A template for the affix file for the -Norwegian language. This file is made for ispell with 64 maskbits -that understands HTML. Most pre-made versions of ispell supports only -32 maskbits and don't understand HTML. Use the patch and recompile -ispell, or delete the html-related stuff. 
- -<li><b>Ispell-3.1.20.no.patch</b><br>A patch for ispell-3.1 that adds -the amsmath and breqn environments to the skip-list and fixes a bug in -buildhash. It also makes ispell html-aware, and tries to fix `the -backslash bug'. In addition it makes ispell suggest "- as a compound -word mark when seeing an unknown compound word, but only in TeX mode -if the dictionary is named norsk. This is an ugly hack that works for -me. It also implement the -r flag which is like the -a flag, but the -suggestions are printed even if the word is found in the dictionary. - -<li><b>norsk.single.tex</b><br>This is a set of hyphenation patterns -for TeX that works well on non-compound words. It is used when making -the new hyphenation patterns for TeX. This file is basically made -from nohyph3.tex, a hyphenation file I released May 1998. But a lot of -errors have been removed by comparing its action on the single words -by the action of <a -href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph.tex">nohyph.tex</a> -(standard in teTeX), <a -href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph2.tex">nohyph2.tex</a>, -and the unreleased hyphenation patterns by Simen Gaure used at the <a -href="http://www.math.uio.no/index.html">Department of -Mathematics</a>, <a href="http://www.uio.no/index.html">University of -Oslo</a>. I have tried to follow the rules given in <a -href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph.tex">nohyph.tex</a>, -at least where I find it reasonable. Bear in mind that there is no -authoritative source for hyphenation in Norwegian. Please get in -touch if you want to help improving the Norwegian hyphenation -patterns. - -<li><b><a href="norsk.cfg">norsk.cfg</a></b><br> An -addition to Babel-3.6 for LaTeX that makes the character " active and -offers you many `different' hyphen signs. You can say o"ppussing in -LaTeX to get correct hyphenation opp-pussing! This functionality will -appear in Babel-3.7 for Norwegian. 
Danish and Swedish have had it for -several years. - -<li><b><a href="inorsk-compwordsmaybe">inorsk-compwordsmaybe</a></b><br> -Search for words in a file or from standard input that maybe should be -written in one word. Like `matematikk lærer' etc. - -<li><b><a href="inorsk-hyphenmaybe">inorsk-hyphenmaybe</a></b><br> -Search for words in a file from from standard input that the Norwegian -hyphenation patterns from this distribution might not hyphenate -properly. Incorrect hyphenation of words not printed is considered to -be a bug in the patterns. There is only a finite number of them. - -<li><b>Makefile</b><br> This file contains rules for making -dictionaries for ispell and lists of the most common words for dumb -word processors. There is also a Makefile in the patterns directory -for making hyphenation patterns. - -<li><b><a href="nohyphbc.tex">nohyphbc.tex</a></b>, <b><a -href="nohyphb.tex">nohyphb.tex</a></b><br> This -is the hyphenation patterns for TeX. The file nohyphbc.tex hyphenates -only at compound points. The nohyphb.tex hyphenates each component of -a word too, but avoiding to hyphenate 'near' compound points. I think -'bar-nepsykologen' looks really bad. Too bad TeX doesn't support -multi-level hyphenation yet.<br> - - -The naming of the files follow the paradigm in Babel; if a replacement -for a file foo.bar is offered, it is named foob.bar, where the b -stands for big. - -These new patterns easily outpreforms those available before, mostly -because of better compound word hyphenation. For reference I have -made lists of about 2000 compound word errors made by previous -patterns: <a href="err.nohyph">err.nokyph</a> and <a href="err.nohyph2"> -err.nokyph2</a>. 
+<h1><span style="font-weight: bold;">Spell-norwegian</span></h1> +<a href="#Information_"><br> +Information</a><br> +<a href="#Download">Download</a><br> +<a href="index_old.html">Oldpage</a><br> +<a href="links.html">Links</a><br> +<a href="developer.html">Developer page</a><br> +<a href="#Contact">Contact</a><br> <br> - -The size of the patterns can be argued over. The patterns are copied -into each format file, thus occupying some disk space. They also -limit the number of languages one can load hyphenation patterns for on -most TeX systems. But size considerations has become less important -recent years, so I prefer to focus on getting things right, not small. -It is also possible to recompile teTeX such that there is more room -for hyphenation patterns, but the patterns take up more memory then. -There is surely a lot of unnessesary structure within the hyphenation -patterns , but it is very time-consuming to remove. The file -patterns/Makefile can be configured, such that one can make smaller -sets of patterns, taking only the most common words into -concideration. Everyone is invited to play. - -<li><b>COPYING</b><br> The GNU general public license. - -</ul> - -<p> - - -<h3>Changes</h3> - -<p>There has been a lot of changes since version 1.1a. The quality has -improved a lot, and the structure of the distribution is completely -new. Therefore i choose not to make the previous versions available -from this site. - -<p>Here is a rough summary of the changes: - -<ul> - -<li>New distribution format - -<li>Support for Nynorsk - -<li>Commonness indicator for each word from Bokmål - -<li>Words are hyphenated at their compound points - -<li>A lot of common words added, especially compound words - -<li>Makefile completly rewritten. It is possible to configure the -size of the dictionary for ispell without beaking the munching. - -<li>Makefile to make hyphenation patterns for TeX added - -<li>The pregenerated TeX patterns are included in the distribution. 
- -<li>Controlled compoundwords support added. This includes affix file -updates. - -<li>Some uncommon and misspelled words removed - -<li>Affix file updated for html. This will only work if you use the patch. - -</ul> - -<h3>Todo list</h3> - +<h2><a name="Information"></a></h2> +<h2>Information</h2> +This will be the new information site for the bokmål and nynorsk +spellchecker data for aspell, ispell and myspell.<br> +This page will continue the good work done by Rune Kleveland.<br> +<br> +The plan is as follows:<br> <ul> - -<li>Remove/mark uncommon words that are close to common words. If you -type 're' you probably meant 'er', even if 're' is a valid word. - -<li>There are too many words with commonness 0. Split this group in -two. - -<li>Some words in the basic category belongs in special categories. -When making a small dictionary with all words from mathematics, many -such words are missing, since they are in the basic category. They -should be moved. - -<li>Make ispell sort the suggested replacements for misspelled words -by commonness of the suggested words. One (easy) way to do this is to -make an external file containing the most common words, and make -ispell look into that file each time it has more than one suggestion. -Or the file could be read into memory. (I don't think frequency -information is representable within the root/affix structure, since -one flag can represent multiple words.) This would slow ispell down a -little bit, but only when it makes suggestions. If you would like to -help with this, please get in touch. - + <li>Make release of aspell, myspell and ispell packages.</li> + <li>Make all the distributions use the same source</li> + <li>Improve the Nynorsk wordlist to contain more words, here we hope +to get help from the An Gramadoir project.</li> </ul> - -<p>Comments, suggestions and bug-reports to <a -href="mailto:runekl@opoint.com">runekl@opoint.com</a>. 
If you have -or want to make a correct dictionary from some field of knowledge, i -would like to include it in the next release. See the <a -href="README">README</a> file for some -suggestions about how to get started. All you need is a large amount -of Norwegian text from the field in question and some time to organize -the dictionary.<p> - -<h3>Related sites</h3> - +The plans for the future :<br> <ul> - - <li><a href="http://www.speling.org/">An overview of Open Source - word lists for spell checking</a></li> - - <li><a href="http://www.dokpro.uio.no/ordboksoek.html">Search - in norwegian dictionaries from "Dokumentasjonsprosjektet"</a></li> - + <li>Make spellchecker that can do grammar checking, this is a major +task. <br> + </li> + <li>Improve the wordlists for bokmål and nynorsk to be +up-to-date with regards of new words.</li> + <li>More to come.<br> + </li> </ul> - -<a target="_top" href="http://v.extreme-dm.com/?login=runekl"> -<img src="http://v1.extreme-dm.com/i.gif" height=38 -border=0 width=41 alt=""></a><script language="javascript"><!-- -an=navigator.appName;d=document;function -pr(){d.write("<img src=\"http://v0.extreme-dm.com", -"/0.gif?tag=runekl&j=y&srw="+srw+"&srb="+srb+"&", -"rs="+r+"&l="+escape(d.referrer)+"\" height=1 ", -"width=1>");}srb="na";srw="na";//--> -</script><script language="javascript1.2"><!-- -s=screen;srw=s.width;an!="Netscape"? 
-srb=s.colorDepth:srb=s.pixelDepth;//--> -</script><script language="javascript"><!-- -r=41;d.images?r=d.im.width:z=0;pr();//--> -</script><noscript><img height=1 width=1 alt="" -src="http://v0.extreme-dm.com/0.gif?tag=runekl&j=n"></noscript> +<h2><a name="Download"></a>Download</h2> +<br> +Aspell, myspell and ispell packages are comming soon.<br> +<br> +<br> +<h2><a name="Contact"></a>Contact</h2> +email torsted<æt>runbox dot no<br> +<br> +Sunday: 17/04/2005<br> </body> </html> diff --git a/index_old.html b/index_old.html new file mode 100644 index 0000000..7267096 --- /dev/null +++ b/index_old.html @@ -0,0 +1,298 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"> + +<html> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> +<title>The Norwegian ispell-dictionary page</title> +<style type="text/css"> +<!-- + body { background-color: white } + h1, h2, h3, b { font-family: sans-serif } + .center { align: center } + .red { color: red } +--> +</style> + +<body> +<h1 class="center">The Norwegian ispell-dictionary home page</h1> + + +<hr> + +<p>The most important file available here contains a list of 750 000 +Norwegian words. Each word is marked with a number indicating the +commonness of that word. Compound words are hyphenated at their +compound points. Some words are marked as belonging to a specific +classes; mathematics, oil, conservative language, 'samnorsk' etc. +Words marked with a star are allowed in Nynorsk. + +<p>This file is usable to several things: +<ul> +<li>Making dictionaries for the Ispell program of different sizes, +choosing which words to include in a sensible way. + +<li>Making Norwegian dictionaries for word processors that doesn't +have one, again with a sensible subset of words. + +<li>Making new and better hyphenation patterns for TeX. + +<li>Text-recognition (OCR) programs. + +<li>Encourage the e-TeX team to implement multi-level hyphenation in +TeX. 
+ +<li>Encourage people to use frequency-information when they write +programs making suggestions for replacements for misspelled words. + +</ul> + +<p>Routines for the first three items on the above list is included in +the Makefiles. The last three was too hard to implement in Make. + +<H3>Requirements</H3> + + +<ul> +<li><b>Ispell</b><br> +For the ispell-related stuff, you need the ispell program, and you can +get it from the +<A +HREF="http://ficus-www.cs.ucla.edu/ficus-members/geoff/ispell.html"> +ispell home-page</A>. You can also find dictionaries for a lot of +languages there. + +You also need the version of the look program in <a +href="ftp://ftp.win.tue.nl/pub/linux/utils/util-linux/">util-linux-2.9</a>. +Older versions have a bug which shows up when searching dictionaries +with non-English characters. Ispell uses look to complete words +(ispell-complete-word). If you don't plan to have a Norwegian words +file for lookup, you don't need to worry about the Look program. + +<li><b>Emacs</b><br> + +If you want to use ispell from Emacs, i recommend upgrading to the +latest version of <A +HREF="ftp://kdstevens.com/pub/stevens/ispell.el.gz"> ispell.el</A>. +This version supports Norwegian, and it has become clean to include +local dictionary definitions. It is almost like the version included +in in Emacs-20.4. There is also an add-on to ispell.el, +<A HREF="http://kaolin.unice.fr/~serrano/emacs/flyspell"> +flyspell.el</A>, written by <A href="http://kaolin.unice.fr/~serrano"> +Manuel Serrano</A> available, offering better `on-the-fly' +spell-checking. An old version is included in emacs-20.3, but you +would like to have the new version with important speed improvements. + +<li><b>(La)TeX</b><br> + +If you want to make your own hyphenation patterns for TeX (you +probably don't), you need a version of the patgen program with greater +capacities than standard versions, e.g. you have to compile patgen +with a different patgen.ch. 
See the patterns/Makefile for more +information. Almost every TeX distribution contains the patgen +program. I recommend <a +href="ftp://ftp.rrzn.uni-hannover.de/pub/local/misc/teTeX-beta/">teTeX</a>. +If you want both kinds of hyphenation in the same TeX format, you +probably need to recompile TeX due to capacity problems. Again, this +is easy with teTeX. + +</ul> + + +<h3>Distribution</h3> + +<p>The distribution <a +href="http://www.uio.no/~runekl/ispell-norsk-2.0.tar.gz">ispell-norsk-2.0.tar.gz</a> +(2204k) is free in the GPL sense and contains these files: + +<ul> +<li><b><a href="README">README</a></b><br> +How to make ispell and emacs work with these dictionaries. + +<li><b>words.norsk.sq</b><br> This file contains the Norwegian words and +the indication of their commonness compressed with the sq program. + +<li><b>norsk.aff.in</b><br> A template for the affix file for the +Norwegian language. This file is made for ispell with 64 maskbits +that understands HTML. Most pre-made versions of ispell supports only +32 maskbits and don't understand HTML. Use the patch and recompile +ispell, or delete the html-related stuff. + +<li><b>Ispell-3.1.20.no.patch</b><br>A patch for ispell-3.1 that adds +the amsmath and breqn environments to the skip-list and fixes a bug in +buildhash. It also makes ispell html-aware, and tries to fix `the +backslash bug'. In addition it makes ispell suggest "- as a compound +word mark when seeing an unknown compound word, but only in TeX mode +if the dictionary is named norsk. This is an ugly hack that works for +me. It also implement the -r flag which is like the -a flag, but the +suggestions are printed even if the word is found in the dictionary. + +<li><b>norsk.single.tex</b><br>This is a set of hyphenation patterns +for TeX that works well on non-compound words. It is used when making +the new hyphenation patterns for TeX. This file is basically made +from nohyph3.tex, a hyphenation file I released May 1998. 
But a lot of +errors have been removed by comparing its action on the single words +by the action of <a +href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph.tex">nohyph.tex</a> +(standard in teTeX), <a +href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph2.tex">nohyph2.tex</a>, +and the unreleased hyphenation patterns by Simen Gaure used at the <a +href="http://www.math.uio.no/index.html">Department of +Mathematics</a>, <a href="http://www.uio.no/index.html">University of +Oslo</a>. I have tried to follow the rules given in <a +href="ftp://ftp.dante.de/tex-archive/language/hyphenation/nohyph.tex">nohyph.tex</a>, +at least where I find it reasonable. Bear in mind that there is no +authoritative source for hyphenation in Norwegian. Please get in +touch if you want to help improving the Norwegian hyphenation +patterns. + +<li><b><a href="norsk.cfg">norsk.cfg</a></b><br> An +addition to Babel-3.6 for LaTeX that makes the character " active and +offers you many `different' hyphen signs. You can say o"ppussing in +LaTeX to get correct hyphenation opp-pussing! This functionality will +appear in Babel-3.7 for Norwegian. Danish and Swedish have had it for +several years. + +<li><b><a href="inorsk-compwordsmaybe">inorsk-compwordsmaybe</a></b><br> +Search for words in a file or from standard input that maybe should be +written in one word. Like `matematikk lærer' etc. + +<li><b><a href="inorsk-hyphenmaybe">inorsk-hyphenmaybe</a></b><br> +Search for words in a file from from standard input that the Norwegian +hyphenation patterns from this distribution might not hyphenate +properly. Incorrect hyphenation of words not printed is considered to +be a bug in the patterns. There is only a finite number of them. + +<li><b>Makefile</b><br> This file contains rules for making +dictionaries for ispell and lists of the most common words for dumb +word processors. There is also a Makefile in the patterns directory +for making hyphenation patterns. 
+ +<li><b><a href="nohyphbc.tex">nohyphbc.tex</a></b>, <b><a +href="nohyphb.tex">nohyphb.tex</a></b><br> This +is the hyphenation patterns for TeX. The file nohyphbc.tex hyphenates +only at compound points. The nohyphb.tex hyphenates each component of +a word too, but avoiding to hyphenate 'near' compound points. I think +'bar-nepsykologen' looks really bad. Too bad TeX doesn't support +multi-level hyphenation yet.<br> + + +The naming of the files follow the paradigm in Babel; if a replacement +for a file foo.bar is offered, it is named foob.bar, where the b +stands for big. + +These new patterns easily outpreforms those available before, mostly +because of better compound word hyphenation. For reference I have +made lists of about 2000 compound word errors made by previous +patterns: <a href="err.nohyph">err.nokyph</a> and <a href="err.nohyph2"> +err.nokyph2</a>. +<br> + +The size of the patterns can be argued over. The patterns are copied +into each format file, thus occupying some disk space. They also +limit the number of languages one can load hyphenation patterns for on +most TeX systems. But size considerations has become less important +recent years, so I prefer to focus on getting things right, not small. +It is also possible to recompile teTeX such that there is more room +for hyphenation patterns, but the patterns take up more memory then. +There is surely a lot of unnessesary structure within the hyphenation +patterns , but it is very time-consuming to remove. The file +patterns/Makefile can be configured, such that one can make smaller +sets of patterns, taking only the most common words into +concideration. Everyone is invited to play. + +<li><b>COPYING</b><br> The GNU general public license. + +</ul> + +<p> + + +<h3>Changes</h3> + +<p>There has been a lot of changes since version 1.1a. The quality has +improved a lot, and the structure of the distribution is completely +new. 
Therefore i choose not to make the previous versions available +from this site. + +<p>Here is a rough summary of the changes: + +<ul> + +<li>New distribution format + +<li>Support for Nynorsk + +<li>Commonness indicator for each word from Bokmål + +<li>Words are hyphenated at their compound points + +<li>A lot of common words added, especially compound words + +<li>Makefile completly rewritten. It is possible to configure the +size of the dictionary for ispell without beaking the munching. + +<li>Makefile to make hyphenation patterns for TeX added + +<li>The pregenerated TeX patterns are included in the distribution. + +<li>Controlled compoundwords support added. This includes affix file +updates. + +<li>Some uncommon and misspelled words removed + +<li>Affix file updated for html. This will only work if you use the patch. + +</ul> + +<h3>Todo list</h3> + +<ul> + +<li>Remove/mark uncommon words that are close to common words. If you +type 're' you probably meant 'er', even if 're' is a valid word. + +<li>There are too many words with commonness 0. Split this group in +two. + +<li>Some words in the basic category belongs in special categories. +When making a small dictionary with all words from mathematics, many +such words are missing, since they are in the basic category. They +should be moved. + +<li>Make ispell sort the suggested replacements for misspelled words +by commonness of the suggested words. One (easy) way to do this is to +make an external file containing the most common words, and make +ispell look into that file each time it has more than one suggestion. +Or the file could be read into memory. (I don't think frequency +information is representable within the root/affix structure, since +one flag can represent multiple words.) This would slow ispell down a +little bit, but only when it makes suggestions. If you would like to +help with this, please get in touch. 
+ +</ul> + +<p>Comments, suggestions and bug-reports to <a +href="mailto:runekl@opoint.com">runekl@opoint.com</a>. If you have +or want to make a correct dictionary from some field of knowledge, i +would like to include it in the next release. See the <a +href="README">README</a> file for some +suggestions about how to get started. All you need is a large amount +of Norwegian text from the field in question and some time to organize +the dictionary.<p> + +<a target="_top" href="http://v.extreme-dm.com/?login=runekl"> +<img src="http://v1.extreme-dm.com/i.gif" height=38 +border=0 width=41 alt=""></a><script language="javascript"><!-- +an=navigator.appName;d=document;function +pr(){d.write("<img src=\"http://v0.extreme-dm.com", +"/0.gif?tag=runekl&j=y&srw="+srw+"&srb="+srb+"&", +"rs="+r+"&l="+escape(d.referrer)+"\" height=1 ", +"width=1>");}srb="na";srw="na";//--> +</script><script language="javascript1.2"><!-- +s=screen;srw=s.width;an!="Netscape"? +srb=s.colorDepth:srb=s.pixelDepth;//--> +</script><script language="javascript"><!-- +r=41;d.images?r=d.im.width:z=0;pr();//--> +</script><noscript><img height=1 width=1 alt="" +src="http://v0.extreme-dm.com/0.gif?tag=runekl&j=n"></noscript> +</html> |
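The build steps that this commit adds in developer.html can be sketched as a dry-run shell script. This is only an illustration under stated assumptions: the tarball name, unpack directory, CVS checkout path, patch file name and the `-p1` patch level are hypothetical placeholders that do not appear in the commit. Only the `make` targets and the `languages/norsk` folder come from the page itself. The script prints the commands instead of running them, so the order can be checked before anything is built.

```shell
#!/bin/sh
# Dry-run sketch of the wordlist build steps listed in developer.html.
# Every default below is a hypothetical placeholder, not a name taken
# from the commit; override them through the environment.
ISPELL_TARBALL="${ISPELL_TARBALL:-ispell-3.2.06.tar.gz}"   # assumed tarball name
ISPELL_DIR="${ISPELL_DIR:-ispell-3.2.06}"                 # assumed unpack directory
CVS_DIR="${CVS_DIR:-../spell-norwegian}"                  # assumed CVS checkout path
PATCH_FILE="${PATCH_FILE:-$CVS_DIR/ispell.patch}"         # assumed patch file name

# print_steps writes the commands in build order without executing them.
print_steps() {
    cat <<EOF
tar xzf $ISPELL_TARBALL
cd $ISPELL_DIR
patch -p1 < $PATCH_FILE
cp -r $CVS_DIR/. languages/norsk/
make
cd languages/norsk
make -f Makefile.new myspell-dist
make -f Makefile.new aspell-dist
make all
EOF
}

print_steps
```

Piping the output into `sh -e` would run the steps for real, but only after the placeholders have been replaced with the actual file names from the ispell download and the CVS checkout. As developer.html notes, building ispell itself can be tricky, so the plain `make` step may need the adjustments described in the Readme of the ispell tarball.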