Measures of word prevalence for 61,800 English words

At long last, we have found the time to make the English word prevalence measures available.

Word prevalence indicates how many people know a word. Because the percentage of people who know a word has an uninteresting distribution, word prevalence is calculated on the basis of a probit transformation, which turns that percentage into a z-score of the standard normal distribution. The following are useful landmarks:

  • negative prevalence values: words known by fewer than 50% of people; only of interest for word-learning studies
  • prevalence = 0.0 : 50% of the people know this word
  • prevalence = 1.0 : 84% know the word
  • prevalence = 1.5 : 93% know the word
  • prevalence = 2.0 : 98% know the word
  • prevalence = 2.5 : nearly everyone knows the word
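
As a rough illustration of how these landmarks follow from the probit transformation, here is a minimal Python sketch. It assumes the plain textbook mapping (prevalence as the standard normal quantile of the proportion of people who know the word) and is not the exact procedure used to build the norms; the function names are made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def proportion_to_prevalence(p_known: float) -> float:
    """Probit transform: z-score whose standard normal CDF equals p_known.

    p_known must lie strictly between 0 and 1; inv_cdf raises
    statistics.StatisticsError otherwise.
    """
    return _STD_NORMAL.inv_cdf(p_known)


def prevalence_to_proportion(prevalence: float) -> float:
    """Inverse of the probit: proportion of people expected to know the word."""
    return _STD_NORMAL.cdf(prevalence)


if __name__ == "__main__":
    # Print the proportion known for each landmark prevalence value.
    for z in (0.0, 1.0, 1.5, 2.0, 2.5):
        print(f"prevalence {z:.1f} -> {prevalence_to_proportion(z):.1%} known")
```

Run as a script, the loop prints values that round to the percentages in the list above. Proportions of exactly 0 or 1 have no finite probit, so they would need separate handling in practice.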

You can find all the information in:

  • Brysbaert, M., Mandera, P., McCormick, S.F., & Keuleers, E. (in press). Word prevalence norms for 62,000 English lemmas. Behavior Research Methods. pdf

You can find an Excel file with the word prevalence measure for English here.

If you want more information about the use of word prevalence, have a look at our findings in Dutch.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.