General Information

SUBTLEX-NL is a database of Dutch word frequencies based on 44 million words from film and television subtitles.

The word frequency measure was validated in a lexical decision study involving fourteen thousand monosyllabic and disyllabic Dutch words (Keuleers, Brysbaert, & New, 2010). The SUBTLEX-NL word frequencies explain up to 10% more variance in lexical decision accuracy and reaction times (RT) than the CELEX word frequencies, previously the gold standard for Dutch word frequencies.

SUBTLEX-NL also includes a measure of contextual diversity, which accounts for slightly more variance in accuracy and RT than the raw frequency counts.
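
To illustrate the difference between the two measures, here is a minimal sketch that counts both on a made-up corpus. It assumes that contextual diversity is taken as the number of documents (here, subtitle files) in which a word occurs at least once, while raw frequency counts every occurrence. The function name and the toy data are hypothetical and are not part of SUBTLEX-NL.

```python
from collections import Counter


def frequency_and_diversity(documents):
    """Return (frequency, diversity) counters for a list of tokenized documents."""
    frequency = Counter()
    diversity = Counter()
    for tokens in documents:
        frequency.update(tokens)       # every occurrence counts
        diversity.update(set(tokens))  # each document counts at most once per word
    return frequency, diversity


# Toy, made-up "subtitle files" (illustration only, not real data)
toy_documents = [
    ["de", "hond", "blaft", "de", "hond", "rent"],
    ["de", "kat", "slaapt"],
    ["de", "hond", "slaapt"],
]

frequency, diversity = frequency_and_diversity(toy_documents)
print(frequency["hond"], diversity["hond"])
```

In this toy corpus "hond" occurs three times but in only two documents, so a word that is repeated often within a single film gains raw frequency without gaining contextual diversity.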

SUBTLEX-NL further includes frequencies for different parts of speech (POS), which were obtained through automatic memory-based POS tagging using the TADPOLE software, developed by our colleagues in the ILK group at the University of Tilburg. This post contains more details on how to interpret the POS information in SUBTLEX-NL.

You can download the database as a text file or Excel spreadsheet. If you only need the frequencies of a few words, you can also do a direct search in the database.
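
If you work with the downloaded text file, a few words can also be looked up with a short script. The sketch below assumes that the file is tab-delimited with a header row, and that the word and frequency-count columns are named `Word` and `FREQcount`. These names and the sample values are assumptions for illustration only, so check the header of the file you actually downloaded and adjust them.

```python
import csv
import io


def lookup_frequencies(lines, words, word_column="Word", count_column="FREQcount"):
    """Return {word: count} for the requested words found in a tab-delimited table.

    `lines` is any iterable of text lines, for example an open file.
    The column names are assumptions; adapt them to the real file header.
    """
    wanted = set(words)
    found = {}
    for row in csv.DictReader(lines, delimiter="\t"):
        if row[word_column] in wanted:
            found[row[word_column]] = int(row[count_column])
    return found


# Made-up sample standing in for the downloaded file (not real SUBTLEX-NL values)
sample = io.StringIO("Word\tFREQcount\nhond\t120\nkat\t80\nhuis\t300\n")
result = lookup_frequencies(sample, ["hond", "huis", "fiets"])
print(result)
```

Words that are absent from the table, such as "fiets" in the sample, are simply left out of the result. For a real file, pass an open file handle instead of the `io.StringIO` object, for example one opened with `open(path, newline="", encoding="utf-8")`, where the path and the encoding are likewise assumptions.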
