Announcing Typo.js: Client-side JavaScript Spellchecking

When I first ported YouTube Comment Snob to Chrome, Chrome’s lack of a spellchecking API for extensions meant that I would be unable to implement Comment Snob’s most popular and distinguishing feature: the ability to filter out comments based on spelling mistakes. That, my friend, is about to change.

I’ve finished work on the first version of a client-side spellchecker written entirely in JavaScript, and I’m calling it Typo.js. Its express purpose is to allow Chrome extensions to perform spellchecking, although there’s no reason it wouldn’t work in other JavaScript environments. (Don’t use it for Firefox extensions though; use Firefox’s native spellchecking API.)

How does it work?

Typo.js uses Hunspell-style dictionaries – the same ones used in the spellcheckers of OpenOffice.org and Firefox. (Typo.js ships with the latest American English dictionary, but you could add any number of other dictionary files to it yourself.) You initialize a Typo.js instance in one of two ways:

Method #1

var dictionary = new Typo("en_US");

This tells Typo.js to load the dictionary represented by two files in the dictionaries/en_US/ directory: en_US.aff and en_US.dic. The .aff file is an affix file: a list of rules for creating multiple forms of a word by adding prefixes and suffixes. The .dic file is the dictionary file: a list of root words and the affix rules that apply to them. Typo parses these files and generates a complete dictionary by applying the applicable affix rules to the list of root words.
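To make the expansion step concrete, here is a minimal sketch of applying Hunspell-style suffix rules to a root word. It is not Typo.js's actual parser: the rule objects, the `expandWord` function, and the two sample flags are simplified assumptions, loosely modeled on the kind of rules an English .aff file contains.

```javascript
// Simplified, hypothetical suffix rules keyed by flag. In an .aff file a rule
// line has the shape "SFX D y ied [^aeiou]y": strip "y", add "ied", but only
// when the word ends in a consonant followed by "y".
const suffixRules = {
  D: [
    { strip: "", add: "d", match: /e$/ },
    { strip: "y", add: "ied", match: /[^aeiou]y$/ },
    { strip: "", add: "ed", match: /[^ey]$/ },
    { strip: "", add: "ed", match: /[aeiou]y$/ },
  ],
  S: [
    { strip: "y", add: "ies", match: /[^aeiou]y$/ },
    { strip: "", add: "s", match: /[aeiou]y$/ },
    { strip: "", add: "es", match: /[sxzh]$/ },
    { strip: "", add: "s", match: /[^sxzhy]$/ },
  ],
};

// Expand one .dic entry such as "walk/DS" into every form its flags allow.
function expandWord(dicLine, rules) {
  const [root, flags = ""] = dicLine.split("/");
  const forms = new Set([root]);
  for (const flag of flags) {
    for (const rule of rules[flag] || []) {
      // Skip rules whose condition does not match the root word.
      if (!rule.match.test(root)) continue;
      const stem = root.slice(0, root.length - rule.strip.length);
      forms.add(stem + rule.add);
    }
  }
  return [...forms];
}
```

Calling `expandWord("try/DS", suffixRules)` yields the root plus "tried" and "tries". Prefix rules and rules that combine with each other are left out of this sketch.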

Method #2

var dictionary = new Typo("en_US", affData, dicData);

With this initialization method, you supply the data from the affix and dictionary files. This method is preferable if you wish to change the location of the affix and dictionary files or if you are using Typo.js in an environment other than a Chrome extension, such as in a webpage or in a server-side JavaScript environment.
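As a rough illustration of this second method, the sketch below gathers the three constructor arguments using whatever file reader the environment provides. The `typoArgs` helper, the directory layout, and the paths in the usage comment are assumptions for illustration, not part of Typo.js.

```javascript
// Hypothetical helper: reads both dictionary files as text through a
// caller-supplied reader and returns the arguments for `new Typo(...)`.
function typoArgs(lang, dir, readText) {
  const base = `${dir}/${lang}/${lang}`;
  return [lang, readText(`${base}.aff`), readText(`${base}.dic`)];
}

// Possible use in Node.js (assumes typo.js and a dictionaries/ directory
// exist at these paths):
//
//   const fs = require("fs");
//   const Typo = require("./typo/typo.js");
//   const dictionary = new Typo(...typoArgs("en_US", "./dictionaries",
//     (path) => fs.readFileSync(path, "utf8")));
```

Passing the reader in as a function keeps the same helper usable in a webpage or a server-side environment, where the way to read a file differs.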

Once you’ve initialized a Typo instance, you can use it to check whether a word is misspelled:

var is_correct_spelling = dictionary.check("mispelled");
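For instance, counting the spelling mistakes in a block of text could look something like the sketch below. The `countMisspellings` function and its ASCII-only word pattern are assumptions for illustration; this is not how Comment Snob itself is implemented. The only thing it needs from Typo.js is an object with a `check(word)` method.

```javascript
// Counts the words in `text` that the given checker reports as misspelled.
// `checker` is any object with a check(word) method, such as a Typo instance.
function countMisspellings(text, checker) {
  // Simplified tokenizer: runs of ASCII letters, allowing inner apostrophes.
  const words = text.match(/[A-Za-z]+(?:'[A-Za-z]+)*/g) || [];
  return words.filter((word) => !checker.check(word)).length;
}

// Possible use with a Typo instance:
//
//   var mistakes = countMisspellings("I mispelled thsi word", dictionary);
```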

Customization

Depending on your needs, you can configure Typo.js to perform word lookups in one of two ways:

  1. hash: Stores the dictionary words as the keys of a hash and does a key existence check to determine whether a word is spelled correctly. Lookups are very fast, but this method uses more memory.
  2. binary search: Concatenates dictionary words of identical length into sets of long strings and uses binary search in these strings to check whether a word exists in the dictionary. It uses less memory than the hash implementation, but lookups are slower. Note: this method has since been abandoned, as it became impractical to implement for some features.
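The two lookup strategies described above can be sketched as follows. This is an illustration of the general techniques only, not the Typo.js source; all four function names are made up for the example.

```javascript
// Hash lookup: words become keys of a prototype-less object, and checking a
// word is a key-existence test.
function buildHash(words) {
  const table = Object.create(null);
  for (const word of words) table[word] = true;
  return table;
}

function hashHas(table, word) {
  return word in table;
}

// Binary search: words of the same length are sorted and concatenated into
// one long string per length, so no per-word object is needed.
function buildPacked(words) {
  const byLength = {};
  for (const word of [...new Set(words)].sort()) {
    byLength[word.length] = (byLength[word.length] || "") + word;
  }
  return byLength;
}

function packedHas(byLength, word) {
  const packed = byLength[word.length];
  if (!packed) return false;
  let low = 0;
  let high = packed.length / word.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    // Each slot is exactly word.length characters wide.
    const start = mid * word.length;
    const candidate = packed.slice(start, start + word.length);
    if (candidate === word) return true;
    if (candidate < word) low = mid + 1;
    else high = mid - 1;
  }
  return false;
}
```

The hash version answers in a single property lookup, while the packed version does a logarithmic number of string comparisons in exchange for storing only a handful of long strings.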

See this blog post by John Resig for a more detailed exploration of possible dictionary representations in JavaScript.

Practice vs. Theory

Typo.js is already in use in my Comment Snob extension. You can install it today to experience Typo.js in action, filtering comments on YouTube based on the number of spelling mistakes in each one.

What’s next for Typo.js?

The next step is adding support for returning spelling suggestions; right now, all Typo.js can do is tell you whether a word is spelled correctly or not. It also needs to support Hunspell’s compound word rules. These are the rules that a spellchecker uses to determine whether words like “100th”, “101st”, “102th” are correct spellings (yes, yes, and no, for those of you keeping track) since it would be impossible to precompute a list of all possible words of these forms.
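To show why these forms call for a rule rather than a word list, here is a small sketch that decides whether an ordinal like "101st" is well formed. It uses a regular expression and a bit of arithmetic as a stand-in; it does not use Hunspell's compound rule syntax, and `isValidOrdinal` is a hypothetical name.

```javascript
// Returns true when `word` is digits followed by the correct English ordinal
// suffix (1st, 2nd, 3rd, 4th, 11th, 12th, 13th, 21st, ...).
function isValidOrdinal(word) {
  const match = /^(\d+)(st|nd|rd|th)$/.exec(word);
  if (!match) return false;
  const [, digits, suffix] = match;
  // Only the last two digits decide the suffix, so huge numbers are fine.
  const lastTwo = Number(digits.slice(-2));
  const lastOne = lastTwo % 10;
  let expected = "th";
  // 11, 12, and 13 always take "th"; otherwise the final digit decides.
  if (lastTwo < 11 || lastTwo > 13) {
    if (lastOne === 1) expected = "st";
    else if (lastOne === 2) expected = "nd";
    else if (lastOne === 3) expected = "rd";
  }
  return suffix === expected;
}
```

With this check, "100th" and "101st" are accepted and "102th" is rejected, matching the answers given above.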

The Typo.js code is available on GitHub. I welcome any and all suggestions or code contributions.

22 comments on “Announcing Typo.js: Client-side JavaScript Spellchecking”

  1. DB says:

    I see the licensing information for the en_US dictionary files (LGPL and BSD). Does “typo.js” carry any special licensing?

  2. Is the default setup using hash or binary word lookups? Also how can this be configured? It is mentioned in this post but I don’t see instructions on how to do so anywhere.

  3. Do you have an example for Method #2?

    var dictionary = new Typo("en_US", affData, dicData);

    Like what affData and dicData look like?

    I tried creating two large strings based on what you had in your en_US.aff and en_US.dic files, but it doesn’t seem to be working for me. I might be doing something wrong; I’ve only been playing with it for half an hour or so.

    • affData and dicData should be strings containing the contents of en_US.aff and en_US.dic, respectively. Is it possible that your strings do not use \n as a newline, but rather \r or \r\n? That might be problematic. If you can find out what this.rules equals after line 67, that would help with debugging the problem.

      • Thanks! I had the string like this:

        var en_US_dic = "a\
        as\
        about";

        But that didn’t work, not quite sure why. I replaced the \ with \n and got rid of the carriage returns in the file and it’s working now!

        Looks like a great tool. Thanks for making this!

  4. Am says:

    Really good, but I found a problem with the suggested words: it only suggests words with the same number of letters. For example, if I type:

    “mispelle”, the suggested words are “misspell, misspells, ispell”; it doesn’t suggest “misspelled”. Same thing with “appl”: it doesn’t suggest “apple”.

  5. Godfrey says:

    Try to load the dictionary in this way:

    $.get( '/scripts/typo/dictionaries/en_US/en_US.aff', function ( affData ) {
    	$( '#loading-progress' ).append( 'Loading English dictionary (this takes a few seconds)...' ).append( $( '' ) );
            
    	$.get( '/scripts/typo/dictionaries/en_US/en_US.dic', function ( wordsData ) {
    		$( '#loading-progress' ).append( 'Initializing Typo...' );
    
    		dictionary = new Typo( "en_US", affData, wordsData );
    
    		checkWord( 'mispelled' );
    	} );
    } );
    

    I am getting:
    jquery.min.js:4 GET http://localhost:60989/scripts/typo/dictionaries/en_US/en_US.aff 404 (Not Found)

    Any help would be appreciated.

  6. Godfrey says:

    In case someone has the same problem as above. This solved it for me:

    In the web.config in the system.webServer section add:

    <staticContent>
      <remove fileExtension=".aff" />
      <mimeMap fileExtension=".aff" mimeType="application/text" />
      <remove fileExtension=".dic" />
      <mimeMap fileExtension=".dic" mimeType="application/text" />
    </staticContent>
    
  7. Good library. To use this with React in the front end, I have written a simple loader for webpack.

    Add this to your webpack config:

    {
        test: /\.(aff|dic)$/,
        use: {
            loader: './src/main/webapp/loaders/dictionaryLoader.js'
        }
    }
    

    and create dictionaryLoader.js:

    module.exports = function (data) {
        return `export default ${ JSON.stringify({data}) }`;
    };
    

    then in your react library just import as below:

    import aff from 'typo-js/dictionaries/en_US/en_US.aff'
    import englishDic from 'typo-js/dictionaries/en_US/en_US.dic'
    import myCustomDic from './dictionary/custom.dic'
    
    const dictionary = new Typo('en_US', aff.data, englishDic.data + ' ' + myCustomDic.data)
    
  8. Andrei says:

    Hello,
    I am struggling with the library. I tried several times to use it, but it does not work, and I am not quite sure what I am doing wrong. I am trying to use it in my React Native JavaScript application. Is it enough to just install it, or do I need to do something else to get it working?
    Thank you!

    • The README covers the typical use case: https://github.com/cfinke/Typo.js/blob/master/typo/README.md

      I’m not familiar with React Native development, but there are two things you should need to do:

      1. Make sure that typo.js is included in your project in a way that you can call `new Typo()`. In a regular webpage, this would mean adding a script tag like <script src="typo/typo.js"></script>

      2. Make sure that Typo can read the dictionary files. It already has methods for reading them when it’s bundled in a Chrome extension, webpage, or Node.js. See the `_readFile()` method for this.

      If you run into issues, let me know and I can help work it out.

  9. Gil says:

    Hi, this looks like a great tool. It might be what we are looking for.
    We have a Laravel (a PHP framework) environment running on Linux.
    Can I just follow the same instructions for this?

  10. Gil says:

    Where can I find documentation on how to include a custom dictionary so users can save special words to a common dictionary?
    How do we save words to a custom dictionary?
    How do we get the code to check the custom dictionary when it doesn’t find a word in regular dictionary?

  11. Gil says:

    Hey Chris,
    I figured out how to have spellcheck look into personal dictionary, it was embarrassingly simple.
    And I found your answer to adding words to it.

    Thanks again for a great tool!

  12. GIL says:

    Can anyone help with the following:

    I have this working nice in a Laravel environment.

    I have created a custom_dictionary which I can save to, and that works.

    But for some reason even though the custom dictionary file is updated, the spellcheck doesn’t pick up the new word. I tried clearing cache, I tried closing browser and it will not find the new word. But if I redeploy the entire app via Azure it will then copy the file over and then the typo.check will find the word in the custom dictionary.

    What’s going on?

  13. Doug says:

    I got this to work in the Node.js environment. Now I’m not sure how to get around the fact that require gives me an error when using a VS Code environment with an app.js, index.html, style.css setup. I’m pretty new to this. I did try adding this script tag to my HTML: <script src="typo/typo.js"></script>
