The Stripey Site

Spell-Checking Files

Vim can be configured to make it easy to check the spelling in text files, including highlighting misspelt words in colour, which can even be done in HTML source code. Various techniques for spell-checking are outlined here, along with an easy way of adding words to a custom dictionary.

All the settings below are included in the sample .vimrc configuration file.

Basics

Integrating Vim with a spell-checker obviously relies on there being a suitable spell-checker available. I use Ispell on Linux, but it should be possible to tweak the code below to work with other tools. Some of what follows also relies on a couple of Vim variables being defined like this:
let IspellLang = 'british'
let PersonalDict = '~/.ispell_' . IspellLang
The first specifies the name of the Ispell dictionary language, so using these settings with a different language should only involve altering that one variable. The second is the personal dictionary file; in most cases it is computed correctly from the language.
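
As a small illustration, and assuming an Ispell dictionary named 'american' is installed (an assumption, not something the settings above require), switching language changes only the first line:

```vim
" hypothetical alternative language; the dictionary name is an assumption
let IspellLang = 'american'
let PersonalDict = '~/.ispell_' . IspellLang
" PersonalDict is now '~/.ispell_american'
```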

Inserting Words from a Dictionary

One of the easiest ways to avoid spelling mistakes is to spell words correctly in the first place. Vim can help with this if there’s a suitable plain-text dictionary available. These settings use Linux’s handy /usr/dict/words and the Ispell personal dictionary (the main Ispell dictionaries are not plain text):
execute 'set dictionary+=' . PersonalDict
set dictionary+=/usr/dict/words
set complete=.,w,k
set infercase
Then at any point you can type the first few letters of a word then press <Ctrl>+N and Vim will type the rest of the first word in the dictionary that begins with those letters. Repeated pressings of <Ctrl>+N will cycle through replacing it in turn with other matching words, and <Ctrl>+P cycles backwards. For example, with the text “elep” typed, pressing <Ctrl>+N replaces this with “elephant” and pressing it again changes that to “elephants”; this is very useful for discovering the spellings of long words.
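
The system word list is not in the same place everywhere; many current systems keep it at /usr/share/dict/words rather than /usr/dict/words. A minimal sketch, assuming one of those two paths exists and that Vim 7 or later is in use (for lists and :for), adds whichever is readable. The variable name s:WordList is mine:

```vim
" add the first readable system word list to 'dictionary';
" the candidate paths are assumptions, so adjust them for your system
for s:WordList in ['/usr/share/dict/words', '/usr/dict/words']
  if filereadable(s:WordList)
    execute 'set dictionary+=' . s:WordList
    break
  endif
endfor
```

If neither file exists the option is left alone, and completion still draws on the personal dictionary and the open buffers.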

Correcting Common Typos

Common (unambiguous) typos can be corrected automatically by defining them as ‘abbreviations’:
abbreviate teh the
abbreviate spolier spoiler
abbreviate Comny Conmy
abbreviate atmoic atomic
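
Note that :abbreviate defines each correction for both Insert mode and Command-line mode. If the corrections should only fire while typing text, :iabbrev restricts them to Insert mode. A sketch reusing two of the typos above:

```vim
" Insert-mode-only corrections, so that typing 'teh' on the : command
" line (for example in a search for the typo itself) is left alone
iabbrev teh the
iabbrev atmoic atomic
```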

Interactive Spell Checking

Ispell is an interactive spell-checker, so it is easy to create the \si (“spelling interactive”) mapping to save the current document, have Ispell check (and correct) it, and load the corrected version back into Vim:
execute 'nnoremap \si :w<CR>:!ispell -x -d ' . IspellLang . ' %<CR>:e<CR><CR>'

Listing Spelling Errors

Interactive in-place spell-checking is irritating with things like news and e-mail messages, since these contain headers largely consisting of things that aren’t in the dictionary but definitely don’t want ‘correcting’. Filtering these out prevents the spell-checker seeing them, but also loses them permanently from the file! An alternative is to use a non-interactive checker, which simply lists the errors, like this \sl (“spelling list”) mapping:
execute 'nnoremap \sl :w ! grep -v "^>" <Bar> grep -E -v "^[[:alpha:]-]+: " ' .
  \ '<Bar> ispell -l -d ' . IspellLang . ' <Bar> sort <Bar> uniq<CR>'
This uses ispell with the -l flag to get a list of misspellings; this is similar to the traditional Unix spell command. Header lines are filtered out with grep, as are lines beginning with a > to denote quoted text (so other people’s mistakes are ignored). The sort and uniq commands are used so that just one instance of each misspelling is listed.
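
For buffers that are not mail or news messages there is nothing to filter out, so a simpler variant may be enough. This is only a sketch: the \sL mapping name is hypothetical, and it assumes IspellLang is defined as above and that sort accepts the -u flag (which does the work of sort followed by uniq):

```vim
" list each misspelling in the whole buffer once; \sL is an assumed name
execute 'nnoremap \sL :w ! ispell -l -d ' . IspellLang . ' <Bar> sort -u<CR>'
```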

Highlighting Spelling Errors

Often it is convenient to have misspelt words highlighted in the text, rather than just listed. The function below does this, first getting a list of misspelt words in the same way as the previous mapping (but only filtering out headers and quoted text in mail and news articles). Then it dynamically defines syntax highlighting expressions to mark the misspellings in red.

The function treats HTML files as a special case, spell-checking (with the help of Lynx) the rendered version of the webpage, not the HTML source, then highlighting the errors in the source. This avoids HTML mark-up being treated as misspellings, while ensuring that text (even including alt text) gets checked. The function also turns off regular HTML highlighting to prevent it being distracting.

English possessives are also treated specially (in all file types), with no word ending in “’s” being marked as an error if the base word is in the dictionary (so “Lizzy’s” is not highlighted if “Lizzy” is in the dictionary). This could be annoying if used in languages in which “’s” has a different meaning.

The following mappings are used: in normal mode, <F9> or \sh (“spelling highlight”) highlights spelling mistakes, and <F10> or \sc (“spelling clear”) clears them (and, in the case of HTML files, restores normal syntax highlighting):

nnoremap \sh :call HighlightSpellingErrors()<CR><CR>
nmap <F9> \sh
nnoremap \sc :if &ft == 'html' <Bar> sy on <Bar>
  \ else <Bar> :sy clear SpellError <Bar> endif<CR>
nmap <F10> \sc
Here is the function definition:
function! HighlightSpellingErrors()
" highlights spelling errors in the current window; used for the \sh operation
" defined above;
" requires the ispell, sort, and uniq commands to be in the path;
" requires the global variable IspellLang to be defined above, and to contain
" the preferred `Ispell' language;
" for mail/news messages, requires the grep command to be in the path;
" for HTML documents, saves the file to disk and requires the lynx command to
" be in the path
"
" by Smylers  http://www.stripey.com/vim/
" (inspired by Krishna Gadepalli and Neil Schemenauer's vimspell.sh)
" 
" 2000 Jun 1: for `Vim' 5.6

  " for HTML files, remove all current syntax highlighting (so that
  " misspellings show up clearly), and note it's HTML for future reference:
  if &filetype == 'html'
    let HTML = 1
    syntax clear

  " for everything else, simply remove any previously-identified spelling
  " errors (and corrections):
  else
    let HTML = 0
    if hlexists('SpellError')
      syntax clear SpellError
    endif
    if hlexists('Normal')
      syntax clear Normal
    endif
  endif

  " form a command that has the text to be checked piping through standard
  " output; for HTML files this involves saving the current file and processing
  " it with `Lynx'; for everything else, use all the buffer except quoted text
  " and mail/news headers:
  if HTML
    write
    let PipeCmd = '! lynx --dump --nolist % |'
  else
    let PipeCmd = 'write !'
    if &filetype == 'mail'
      let PipeCmd = PipeCmd . ' grep -v "^> " | grep -E -v "^[[:alpha:]-]+:" |'
    endif
  endif

  " execute that command, then generate a unique list of misspelt words and
  " store it in a temporary file:
  let ErrorsFile = tempname()
  execute PipeCmd . ' ispell -l -d '. g:IspellLang .
    \ ' | sort | uniq > ' . ErrorsFile

  " open that list of words in another window:
  execute 'split ' . ErrorsFile

  " for every word in that list ending with "'s", check if the root form
  " without the "'s" is in the dictionary, and if so remove the word from the
  " list:
  global /'s$/ execute 'read ! echo ' . expand('<cword>') .
    \ ' | ispell -l -d ' . g:IspellLang | delete
  " (If the root form is in the dictionary, ispell -l will have no output so
  " nothing will be read in, the cursor will remain in the same place and the
  " :delete will delete the word from the list.  If the root form is not in the
  " dictionary, then ispell -l will output it and it will be read on to a new
  " line; the delete command will then remove that misspelt root form, leaving
  " the original possessive form in the list!)

  " only do anything if there are some misspellings:
  if strlen(getline('.')) > 0

    " if (previously noted as) HTML, replace each non-alphanum char with a
    " regexp that matches either that char or a &...; entity:
    if HTML
      % substitute /\W/\\(&\\|\&\\(#\\d\\{2,4}\\|\w\\{2,8}\\);\\)/e
    endif

    " turn each mistake into a `Vim' command to place it in the SpellError
    " syntax highlighting group:
    % substitute /^/syntax match SpellError !\\</
    % substitute /$/\\>!/
  endif

  " save and close that file (so switch back to the one being checked):
  exit

  " make syntax highlighting case-sensitive, then execute all the match
  " commands that have just been set up in that temporary file, delete it, and
  " highlight all those words in red:
  syntax case match
  execute 'source ' . ErrorsFile
  call delete(ErrorsFile)
  highlight SpellError term=reverse ctermfg=DarkRed guifg=Red

  " with HTML, don't mark any errors in e-mail addresses or URLs, and ignore
  " anything marked in a fixed-width font (as being computer code):
  if HTML
    syntax case ignore
    syntax match Normal !\<[[:alnum:]._-]\+@[[:alnum:]._-]\+\.\a\+\>!
    syntax match Normal
      \ !\<\(ht\|f\)tp://[-[:alnum:].]\+\a\(/[-_.[:alnum:]/#&=,]*\)\=\>!
    syntax region Normal start=!<Pre>! end=!</Pre>!
    syntax region Normal start=!<Code>! end=!</Code>!
    syntax region Normal start=!<Kbd>! end=!</Kbd>!
  endif

endfunction " HighlightSpellingErrors()
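
The possessive check inside the function relies on :global, :read and :delete working together in the temporary buffer. The same test can be sketched as a standalone helper. This is an alternative formulation, not the method used above: IsRootKnown is a hypothetical name, and the two-argument form of system() and shellescape() assume Vim 7 or later rather than the Vim 5.6 the function was written for:

```vim
" returns 1 if the word, with any trailing 's removed, is accepted by
" ispell, otherwise 0; IsRootKnown is a hypothetical helper name
function! IsRootKnown(word)
  let l:root = substitute(a:word, "'s$", '', '')
  " ispell -l reads standard input and prints only the words it does not
  " recognise, so output with no non-blank character means the root is known
  let l:unknown = system('ispell -l -d ' . shellescape(g:IspellLang),
    \ l:root . "\n")
  return l:unknown !~ '\S'
endfunction
```

With “Lizzy” in the dictionary, :echo IsRootKnown("Lizzy's") should then report 1, matching the behaviour described for possessives above.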

Adding Words to a Custom Dictionary

Some words highlighted as misspellings will be spelt correctly, but just not in the dictionary. Adding these to the personal dictionary can prevent these from being marked erroneously in future, and can be done with the function below and these mappings — <F8> or \sa (“spelling add”) adds the word currently under the cursor (and ceases highlighting it as a spelling error):
nnoremap \sa :call AddWordToDictionary()<CR><CR>
nmap <F8> \sa
Again possessives are given special treatment, with the non-possessive form being added to the dictionary. Here’s the function definition:
function! AddWordToDictionary()
" adds the word under the cursor to the personal dictionary; used for the \sa
" operation defined above;
" requires the global variable PersonalDict to be defined above, and to contain
" the `Ispell' personal dictionary;
"
" by Smylers  http://www.stripey.com/vim/
" 
" 2000 Apr 30: for `Vim' 5.6

  " get the word under the cursor, including the apostrophe as a word character
  " to allow for words like "won't", but then ignoring any apostrophes at the
  " start or end of the word:
  set iskeyword+='
  let Word = substitute(expand('<cword>'), "^'\\+", '', '')
  let Word = substitute(Word, "'\\+$", '', '')
  set iskeyword-='

  " override any SpellError highlighting that might exist for this word,
  " `highlighting' it as normal text:
  execute 'syntax match Normal #\<' . Word . '\>#'

  " remove any final "'s" so that possessive forms don't end up in the
  " dictionary, then add the word to the dictionary:
  let Word = substitute(Word, "'s$", '', '')
  execute '!echo "' . Word . '" >> ' . g:PersonalDict

endfunction " AddWordToDictionary()


Feedback is welcome via vim-www@stripey.com. © Copyright 2000 — see copying information for details.