Our paper on measuring vocabulary size and word prevalence is now in press

Our paper “Word knowledge in the crowd: measuring vocabulary size and word prevalence using massive online experiments” is now in press in The Quarterly Journal of Experimental Psychology.

The word prevalence values for 54,319 Dutch words in Belgium and the Netherlands used in this paper can be found on this page.

In this paper, we have analyzed part of the data from our online vocabulary test (http://woordentest.ugent.be) in which hundreds of thousands of people from Belgium and the Netherlands participated.

Important results from this paper:

  • Word prevalence, the proportion of people who know a word, appears to be the most important variable in predicting visual word recognition times in the lexical decision task. We conjecture that this is because word prevalence estimates the true occurrence of words better than word frequency does in the low-frequency range.
  • A person’s vocabulary accumulates throughout life in a predictable way: the number of words known increases logarithmically with age.
  • This result mirrors the way the number of unique words encountered grows with the length of a text (known as Herdan’s law in quantitative linguistics). This is the first time the pattern has been demonstrated for human language acquisition.
  • Knowing more foreign languages increases rather than decreases vocabulary in your first language. This is probably a result of the shared vocabulary between languages and the faster growth in new types when acquiring a new language.
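
To make the definition of word prevalence in the first result above concrete, here is a minimal Python sketch that computes it as the plain proportion of tested participants who indicated that they know a word. The function name, the response format, and the sample data are all invented for illustration; this is not the estimation procedure used in the paper, which may differ from a simple proportion.

```python
from collections import defaultdict


def word_prevalence(responses):
    """Return {word: proportion of tested participants who know it}.

    `responses` is an iterable of (participant_id, word, knows_word)
    tuples. This format is an assumption made for this sketch only.
    """
    known = defaultdict(int)  # number of "I know this word" answers per word
    seen = defaultdict(int)   # number of participants who saw each word
    for _participant, word, knows in responses:
        seen[word] += 1
        if knows:
            known[word] += 1
    return {word: known[word] / seen[word] for word in seen}


# Invented toy data: three participants saw "fiets", four saw "kanunnik".
responses = [
    ("p1", "fiets", True),
    ("p2", "fiets", True),
    ("p3", "fiets", True),
    ("p1", "kanunnik", True),
    ("p2", "kanunnik", False),
    ("p3", "kanunnik", False),
    ("p4", "kanunnik", False),
]

prevalence = word_prevalence(responses)
for word, value in sorted(prevalence.items()):
    print(f"{word}: {value:.2f}")
```

In this toy data every participant knows "fiets", so its prevalence is 1.00, while only one of four participants knows "kanunnik", giving 0.25. Two words can both be rare in a text corpus and still differ this much in how many people know them.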


Word prevalence has been used for the analysis of the data from the Dutch Lexicon Project 2.
