How to spell English compounds

How does the CompSpell compound spelling algorithm work?

There is no official correct way of spelling English compound words, but some spellings are more common and acceptable than others. I analysed the spelling of over ten thousand compounds and concentrated on those that are spelled the same way in all the dictionaries I used. Then I analysed what these central compounds have in common and used my results to work out the most likely spelling of new compound words.

I found that the spelling of English compounds depends on an overwhelming number of factors, such as how common the compound's parts are or which part receives more emphasis. That is what makes spelling English compounds so confusing. But on a general level, it all boils down to a few very simple principles. These work for the majority of compounds with two parts and are also used by the program on this website:

  • Use a hyphen in compound verbs, adjectives and adverbs.
  • Use a space in noun compounds with three or more syllables.
  • Use a hyphen in noun compounds with two syllables whose second part has two letters.
  • Spell noun compounds with two syllables whose second part has more than two letters as a single word.
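
As a rough illustration, the four rules of thumb above can be sketched in Python. This is not the CompSpell program itself: the function name spell_compound, its parameters and the word-class labels are assumptions made for this sketch, and the syllable count of the whole compound is passed in rather than computed.

```python
# Word classes that take a hyphen according to the first rule.
# The label strings are an assumption of this sketch.
HYPHENATED_CLASSES = {"verb", "adjective", "adverb"}


def spell_compound(first, second, word_class, syllables):
    """Join two compound parts according to the four rules of thumb.

    `syllables` is the syllable count of the whole compound.
    """
    if word_class in HYPHENATED_CLASSES:
        return f"{first}-{second}"          # rule 1: hyphen
    if word_class != "noun":
        raise ValueError(f"word class not covered by the rules: {word_class!r}")
    if syllables >= 3:
        return f"{first} {second}"          # rule 2: space
    if syllables == 2 and len(second) == 2:
        return f"{first}-{second}"          # rule 3: hyphen
    if syllables == 2 and len(second) > 2:
        return first + second               # rule 4: single word
    raise ValueError("case not covered by the four rules")


print(spell_compound("deep", "fry", "verb", 2))       # deep-fry
print(spell_compound("swimming", "pool", "noun", 3))  # swimming pool
print(spell_compound("make", "up", "noun", 2))        # make-up
print(spell_compound("bed", "room", "noun", 2))       # bedroom
```

Cases the four rules do not mention, such as a one-letter second part in a two-syllable noun, raise an error here instead of guessing.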

My strategy can be this simple because word class and length are connected to many of the other factors that play a role in English compound spelling.

If you are interested in the details of the linguistic research underlying my compound spelling hack, take a look at my book English Compounds and their Spelling. There you will find information on the principles underlying the CompSpell compound spelling algorithm, such as the syllabification principle used for counting syllables (on page 115), which is also used by the CompSpell program on this website:

  • The single letters <a>, <e>, <i>, <o> and <u> count as one syllable each.
  • Two vowels in a row correspond either to a long vowel (root) or to a diphthong (loud) and thus to a single syllable.
  • The sequence <eau> (beauty) counts as one vowel and thus one syllable.
  • The letter <y> can represent either a vowel or a consonant. Before a consonant (byte, boys) or at the end of a compound part (my, prey), it counts as a vowel, regardless of the preceding letter.
  • Before a vowel, <y> is considered a consonant (you), except if the <y> is preceded by yet another vowel (eye, soya). In this context <y> counts as a vowel. However, when <y> is preceded by a vowel and followed by <i> (crop+spraying), the <i> usually belongs to the suffix -ing (which is realised as a separate syllable), so that <y> counts as a consonant in this context.
  • At the end of a compound part, <e> is usually silent (snake+bite). It therefore only counts as a vowel if there is no other vowel (the) or if the <e> comes after <l> (subtle).
  • At the end of a word, <ed> is usually a suffix and pronounced /d/ or /t/ as part of the preceding syllable, depending on the voicing of the context (loved, packed). Constituent-final <e> followed by <d> therefore only counts as a syllabic vowel if preceded by <t> (sharp+witted) or <d> (simple+minded) and in constituents that contain no other vowel (bed).
  • At the end of a word, <es> is often a plural marker (sales) pronounced as part of the preceding syllable. It only counts as a syllabic vowel if preceded by <s>, <z>, <sh>, <ch> or <dg> (e.g. glasses, breeches) and in compound parts that contain no other vowel (yes).
  • At the end of a compound part, <ier> counts as two syllables, because the <er> usually corresponds to the comparative suffix -er, which is realised as a separate syllable (holier+than+thou).
  • If a compound part contains no vowel, it still counts as one syllable, because a vowel is inserted when the part is read aloud, as in x+axis or f+word (possibly because compounds necessarily comprise at least two constituents and thus two syllables).
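
The syllabification rules above can be sketched as a small Python function. This is a literal reading of the bullet points, not the code of the CompSpell program: the names count_syllables and count_compound_syllables are made up for this sketch, and it assumes that a run of three or more vowel letters is split into pairs (so that soya gets two syllables), which the list does not state explicitly.

```python
VOWELS = frozenset("aeiou")
SIBILANT_ENDINGS = ("s", "z", "sh", "ch", "dg")


def _y_is_vowel(part, i):
    """Decide whether the <y> at index i counts as a vowel."""
    prev = part[i - 1] if i > 0 else ""
    nxt = part[i + 1] if i + 1 < len(part) else ""
    if nxt not in VOWELS:                  # before a consonant or at the end
        return True
    return prev in VOWELS and nxt != "i"   # eye, soya vs. spraying


def count_syllables(part):
    """Count the syllables of one compound part, following the rules above."""
    part = part.lower()
    n = len(part)
    flags = [c in VOWELS or (c == "y" and _y_is_vowel(part, i))
             for i, c in enumerate(part)]

    def other_vowel(i):
        return any(f for j, f in enumerate(flags) if j != i)

    # Silent <e> in final <e>, <ed> and <es>.
    if part.endswith("e"):
        if other_vowel(n - 1) and not part.endswith("le"):
            flags[n - 1] = False
    elif part.endswith("ed"):
        if other_vowel(n - 2) and part[-3:-2] not in ("t", "d"):
            flags[n - 2] = False
    elif part.endswith("es"):
        if other_vowel(n - 2) and not part[:-2].endswith(SIBILANT_ENDINGS):
            flags[n - 2] = False

    total, run, i = 0, 0, 0
    while i < n:
        if part.endswith("ier") and i == n - 2:
            total += (run + 1) // 2        # final <ier>: -er is its own syllable
            run = 0
        if part.startswith("eau", i):
            run += 1                       # <eau> counts as one vowel
            i += 3
        elif flags[i]:
            run += 1
            i += 1
        else:
            total += (run + 1) // 2        # assumption: vowel runs split into pairs
            run = 0
            i += 1
    total += (run + 1) // 2
    return max(total, 1)                   # a vowelless part still counts as one


def count_compound_syllables(compound):
    """Sum the syllables of all parts, written with '+' as on this page."""
    return sum(count_syllables(p) for p in compound.split("+"))


print(count_compound_syllables("snake+bite"))        # 2
print(count_compound_syllables("holier+than+thou"))  # 5
print(count_compound_syllables("x+axis"))            # 3
```

Because the sketch follows the rules literally, it also inherits their simplifications; a word like sale, for example, gets two syllables because its final <e> comes after <l>.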