Most of 4Spell's functionality is the result of writing a (TeX) parser. To make it easier for others to write their own parser, and for those who are simply curious about how it works, we explain the parsing algorithms here.
The parser reads words until the end of the file is reached. A pointer is placed at the beginning of the file, and the procedure READWORD is started from that position.
The result of 1. and 2. is a word.
The READWORD procedure is repeated until the end of the file is reached. Each word needs a number of checks before it can be spell-checked, since a word as defined above can contain (TeX) commands, etc. Note also that (within TeX) the EndOfWord characters are defined as: a space, a hyphen, a tilde, a Carriage-Return, a Line-Feed, and an End-Of-File character.
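The reading loop can be sketched as follows. This is only a minimal illustration, not 4Spell's actual code: the names `read_word`, `read_all_words`, and `END_OF_WORD` are hypothetical, and the sketch assumes that a word is simply the run of characters up to the next EndOfWord character.

```python
import io

# EndOfWord characters as listed in the text. End-Of-File is detected
# separately: Python's read(1) returns "" at the end of a stream.
END_OF_WORD = {" ", "-", "~", "\r", "\n"}

def read_word(stream):
    """Read characters up to the next EndOfWord character.

    Returns (word, at_eof). The word may be empty, for example between
    a Carriage-Return and the Line-Feed that follows it.
    """
    chars = []
    while True:
        ch = stream.read(1)
        if ch == "":                      # End-Of-File
            return "".join(chars), True
        if ch in END_OF_WORD:
            return "".join(chars), False
        chars.append(ch)

def read_all_words(stream):
    """Repeat read_word until the end of the file, dropping empty words."""
    words = []
    at_eof = False
    while not at_eof:
        word, at_eof = read_word(stream)
        if word:
            words.append(word)
    return words

sample = io.StringIO("well-known~fact\r\nnext \\textbf{word}")
print(read_all_words(sample))
```

Note that the last word still contains a (TeX) command, which is why further checks are needed before spell-checking.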
If one of the above is true you keep on reading words until:
This seems easy, but there is a complication: while we are looking for the end of, say, an environment, the same environment can start again. In that case we must not stop at the first end-environment part, but at the second (or an even later) one. The following example illustrates the problem:
\begin{skipping} To explain the spell problem see this example \begin{skipping} This won't work if you do not count the number of begin environments \end{skipping} You understand the example? \end{skipping}
What the spell-checker should do is skip the complete example above. It should not stop skipping at the first \end{skipping} command.
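One way to get this behaviour is to count nesting depth: every \begin of the environment increases a counter, every \end decreases it, and skipping stops when the counter returns to zero. The sketch below illustrates that idea under stated assumptions. The function name `skip_environment` is hypothetical, and skipping to the end of the text when no matching \end is found is a choice made here, not something the description above specifies.

```python
import re

def skip_environment(text, start, env):
    """Return the index just past the \\end{env} that matches the
    \\begin{env} found at position `start` in `text`."""
    token = re.compile(r"\\(begin|end)\{" + re.escape(env) + r"\}")
    depth = 0
    for match in token.finditer(text, start):
        if match.group(1) == "begin":
            depth += 1                    # the same environment opened again
        else:
            depth -= 1
            if depth == 0:                # this \end closes the outermost \begin
                return match.end()
    return len(text)                      # assumption: unterminated, skip the rest

text = (r"before \begin{skipping} one \begin{skipping} two "
        r"\end{skipping} three \end{skipping} after")
start = text.index(r"\begin{skipping}")
end = skip_environment(text, start, "skipping")
print(repr(text[:start]), repr(text[end:]))
```

Only the text before the outer \begin{skipping} and after the outer \end{skipping} remains to be checked.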
When performing the actions above, we were looking for parts of the words. This means that after these actions we will have found (part of) a word preceding the action and (part of) a word after ending the action. With these two words (which may be empty) we proceed as with a word that doesn't trigger one of the actions described above.
,;:.!@#$&*?"%(){}[]-+=0123456789\`~^*_/|'
An example clarifies the meaning of subwords. Suppose we have the word
\def\hello{\textbf{Hello}}
This will be divided into four subwords:
\def \hello \textbf Hello
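The split shown in this example can be reproduced with a short sketch. It rests on an assumption that the text does not state: a subword is taken to be either a control sequence (a backslash followed by ASCII letters) or a plain run of ASCII letters, and every other character acts as a separator. The name `split_subwords` is hypothetical, and accented letters and apostrophes are not handled.

```python
import re

# Assumption: a subword is a control sequence (backslash plus letters)
# or a run of letters; all other characters only separate subwords.
SUBWORD = re.compile(r"\\[A-Za-z]+|[A-Za-z]+")

def split_subwords(word):
    """Split a word into its subwords, in order of appearance."""
    return SUBWORD.findall(word)

print(split_subwords(r"\def\hello{\textbf{Hello}}"))
```

Keeping the backslash attached to the command name makes it easy to tell commands such as \textbf apart from ordinary text such as Hello in later checks.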
If the subword doesn't belong to any of the six categories above, we spell-check the subword (Alex's routines do the job quickly and easily). If the subword is correct, we skip it. If it is not a correct word, we search for alternatives for this (sub)word. 4Spell then asks the user what to do: select one of the alternatives, enter replacement text, ignore the word, or add it to the user dictionary.
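The decision flow for a single subword might look like the sketch below. It is not 4Spell's implementation: the spell-checking routines mentioned above are replaced here by Python's standard `difflib.get_close_matches`, the dictionaries are plain sets of lower-case words, and `check_subword` and the `ask_user` callback with its action names are hypothetical.

```python
import difflib

def check_subword(subword, dictionary, user_dictionary, ask_user):
    """Return the text that should stand in place of `subword`.

    `ask_user(subword, alternatives)` returns a pair (action, value) with
    action in {"select", "replace", "ignore", "add"} (hypothetical protocol).
    """
    known = dictionary | user_dictionary
    key = subword.lower()
    if key in known:
        return subword                          # correct word: skip it
    # Stand-in for the real search for alternatives.
    alternatives = difflib.get_close_matches(key, sorted(known), n=5)
    action, value = ask_user(subword, alternatives)
    if action == "select":
        return alternatives[value]              # value: index of the chosen alternative
    if action == "replace":
        return value                            # value: text entered by the user
    if action == "add":
        user_dictionary.add(key)                # remember the word for later checks
    return subword                              # "ignore" and "add" keep the word

dictionary = {"hello", "world"}
user_dictionary = set()
pick_first = lambda word, alternatives: ("select", 0)
print(check_subword("helo", dictionary, user_dictionary, pick_first))
```

Passing the user interaction in as a callback keeps the checking logic independent of how the prompt is displayed.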
It seems easy, but be aware that when building a parser you will need to do a lot of bookkeeping, and you will need some more advanced programming techniques (e.g., all word actions and subword actions are recursive procedures).