The Zipf-scale: A better standardized measure of word frequency
A problem with word frequency counts is that they depend on the size of the corpus. As a result, absolute numbers are difficult to interpret. For instance, the frequency count of apple in HAL is 65,844. In SUBTLEX-US it is 1,207.
To make word frequency norms comparable, researchers use a standardized measure, that is, a measure that is independent of the corpus size. The standardized measure used thus far has been frequency per million words (fpmw). So, the standardized SUBTLEX-US frequency of apple is 23.67 pmw (as the corpus includes 51 million words). The fpmw measure of HAL is more difficult to calculate because no one knows how large the HAL corpus is. It has been claimed to be 130 million words or 160 million words, but in all likelihood it is larger than 400 million words (if you simply add up the frequencies of all the words in the ELP lexicon, you already get this figure).
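The standardization described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the authors: the function name `fpmw` is hypothetical, and the corpus size of 51 million words is the rounded figure quoted in the text.

```python
def fpmw(count, corpus_size):
    """Frequency per million words: raw count scaled to a 1-million-word corpus."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size * 1_000_000


# Raw SUBTLEX-US count of "apple" and the rounded corpus size quoted in the text.
apple_pmw = fpmw(1207, 51_000_000)
print(round(apple_pmw, 2))
```

The same call cannot be made reliably for HAL, because the corpus size that goes in the denominator is not known.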
Increasingly, however, we have felt uneasy about this standardized measure, because it leads to a wrong intuitive understanding of the word frequency effect. Here are two problems with the fpmw measure:
- Intuitively, people associate a measure of 1 with the lowest value. However, more than half of the words in a frequency list have frequencies lower than 1 pmw. For a long time 1 pmw seemed like a good starting point for the scale because all word frequency research was based on the Kucera & Francis (1967) word frequency list, which used a corpus of only 1 million words. So, a frequency count of 1 was indeed the lowest value. However, now that corpora easily include 100 million or even 100 billion words, we see that very many word types have frequencies below 1 pmw.
- The frequency effect does not stop below 1 pmw. As a matter of fact, as can be seen below and as we have reported a few times before, nearly half of the word frequency effect is situated below 1 pmw. In addition, because the word frequency effect is a logarithmic effect, the difference between .1 fpmw and .7 fpmw equals the difference between 5 fpmw and 35 fpmw. Again, this is very difficult to explain to psycholinguistic researchers. It leads to particularly bad results when authors are “matching” conditions on word frequency. So, you’d read that one condition has a mean frequency of .5 pmw and the other has a mean frequency of 3 pmw. This means that the average frequency in the former condition is six times lower than that in the latter (which no one would accept if the frequencies were 10 and 60). However, because the raw frequency norms are used for the analysis (instead of the logarithmic values), the difference between the conditions usually is not significant (p > .05!) and, hence, is not noticed by the authors and the readers.
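The logarithmic argument can be checked numerically. The sketch below is an illustration only; the helper name `log_distance` is an assumption introduced here. It shows that the two frequency pairs mentioned above are equally far apart on a log10 scale, and that the "matched" conditions of .5 and 3 pmw differ by the same log distance as 10 and 60 pmw.

```python
import math


def log_distance(low, high):
    """Distance between two frequencies on a log10 scale (hypothetical helper)."""
    return math.log10(high / low)


# .1 vs .7 pmw and 5 vs 35 pmw: both pairs differ by a factor of 7.
print(math.isclose(log_distance(0.1, 0.7), log_distance(5, 35)))

# .5 vs 3 pmw looks close in raw units (a difference of 2.5 pmw),
# but on the log scale it is as large as the gap between 10 and 60 pmw.
print(round(log_distance(0.5, 3), 3), round(log_distance(10, 60), 3))
```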
We have been thinking long and hard about what a standardized word frequency scale should look like in order to lead to an intuitively correct understanding. These are the elements we saw as necessary:
- It should be a logarithmic scale (e.g., like the decibel scale of sound loudness).
- It should look like a typical Likert rating scale (e.g., from 1 to 7), so that the values are easy to interpret.
- The middle of the scale should separate the low-frequency words from the high-frequency words.
- The scale should have a straightforward unit.
Once you know what you are looking for, it is not so difficult to come up with a scale that fulfills all requirements. Simply taking log10(frequency per billion words) already meets the first three requirements. In such a scale, words with a frequency of .1 pmw get a value of 2, words with a frequency of 1 pmw get a value of 3, and words with a frequency of 10 pmw get a value of 4. The word apple gets a SUBTLEX Zipf value of 4.37.
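As a worked illustration of these values, write f_pmw for the frequency per million words (a notation introduced here for convenience, not taken from the article). Since one billion is 1000 million, the definition unfolds as:

```latex
\begin{align*}
\text{Zipf} &= \log_{10}(\text{frequency per billion words})
             = \log_{10}(1000 \cdot f_{\text{pmw}})
             = \log_{10}(f_{\text{pmw}}) + 3 \\
f_{\text{pmw}} = 0.1 &\;\Rightarrow\; \text{Zipf} = -1 + 3 = 2 \\
f_{\text{pmw}} = 1   &\;\Rightarrow\; \text{Zipf} = 0 + 3 = 3 \\
f_{\text{pmw}} = 10  &\;\Rightarrow\; \text{Zipf} = 1 + 3 = 4 \\
f_{\text{pmw}} = 23.67 \ (\textit{apple}) &\;\Rightarrow\; \text{Zipf} \approx 1.37 + 3 = 4.37
\end{align*}
```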
To meet the fourth requirement of our list, we propose to call the new scale the Zipf scale, after the American linguist George Kingsley Zipf (1902–1950) who first thoroughly analyzed the regularities of the word frequency distribution and formulated a law that was named after him (Zipf, 1949). The unit then becomes the Zipf.
We presented the Zipf scale for the first time in a 2014 article on word frequency measures for British English (Van Heuven, Mandera, Keuleers, & Brysbaert, 2014; please refer to it when you are using the Zipf scale). In that article we also give examples of words with various Zipf values. Here they are:
To see how the word frequency effect translates to the Zipf values, in the figure below we plot the lexical decision RTs to the known words (accuracy > .67) in the British Lexicon Project (N = 19,487). As can be seen, the word frequency effect is now nicely centered on the word frequency scale, with values of 1-3 representing low-frequency words and values of 4-7 representing high-frequency words.
A criticism often raised against frequency values lower than 1 pmw is that these words are not known to the participants. Again, we can have a look at the British Lexicon Project. If we only take the words that were answered positively by more than two thirds of the participants, we get the following distribution as a function of Zipf values:
Again, the distribution centers nicely on the scale. Below we give some examples of BLP words in the various bins (all BLP words were monosyllabic or disyllabic words).
In our future publications we will make the Zipf norms available as the primary word frequency variable, because we think this will help researchers and lay people to much better understand what the word frequency effect is and how it should be studied and controlled for. We hope many of you will join us! The Zipf values are easy to calculate from fpmw values. Simply take log10(fpmw)+3 or log10(fpmw*1000).
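The conversion can be sketched in Python as follows. The function names are hypothetical and the code is an illustration under the formulas stated above, not the authors' own implementation. Frequencies of zero have no logarithm, so the sketch rejects them rather than guessing how unseen words should be handled.

```python
import math


def zipf_from_fpmw(fpmw):
    """Zipf value from a frequency per million words: log10(fpmw) + 3."""
    if fpmw <= 0:
        raise ValueError("fpmw must be positive; log10 is undefined otherwise")
    return math.log10(fpmw) + 3


def zipf_from_count(count, corpus_size):
    """Zipf value from a raw count: log10 of the frequency per billion words."""
    if count <= 0 or corpus_size <= 0:
        raise ValueError("count and corpus_size must be positive")
    return math.log10(count / corpus_size * 1_000_000_000)


# The two formulas from the text give the same value.
fpmw = 23.67
print(math.isclose(zipf_from_fpmw(fpmw), math.log10(fpmw * 1000)))

# "apple" in SUBTLEX-US, using the count and rounded corpus size quoted earlier.
print(round(zipf_from_count(1207, 51_000_000), 2))
```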
Here you find a zipped Excel file of the SUBTLEX-US frequencies with the Zipf values added.
Here you can look up the UK Zipf frequencies for thousands of words.
References
Van Heuven, W.J.B., Mandera, P., Keuleers, E., & Brysbaert, M. (2014). SUBTLEX-UK: A new and improved word frequency database for British English. Quarterly Journal of Experimental Psychology, 67, 1176-1190.
Zipf, G. (1949). Human Behaviour and the Principle of Least Effort. Reading MA: Addison-Wesley.