Frequently Asked Questions

How should I refer to SUBTLEX-NL when I use it in my work?

Keuleers, E., Brysbaert, M. & New, B. (2010). SUBTLEX-NL: A new frequency measure for Dutch words based on film subtitles. Behavior Research Methods, 42(3), 643-650.

The paper can be found here.

How was the SUBTLEX-NL database constructed?

Between March 10 and March 19, 2009, a computer program written specifically for this purpose processed a large number of Dutch subtitles found on a website that collects subtitles contributed by individual internet users.

Is the corpus available?

No. We do not hold the copyright to the actual subtitles and therefore cannot make them available. SUBTLEX-NL is a collection of statistical measures of how frequently words occur in subtitles.

What was the size of the corpus?

Disregarding duplicates, we processed 43,729,424 words coming from 8,443 subtitles.
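The corpus size matters mainly when a raw count has to be expressed as a frequency per million words. The following Python lines are a minimal sketch of that arithmetic. The function name per_million is hypothetical, and it is an assumption that the 43,729,424-word total is the right denominator for the counts you are working with; check this against the documentation of the file you use.

```python
# Total number of words processed, as stated in the answer above.
CORPUS_SIZE = 43_729_424


def per_million(raw_count, corpus_size=CORPUS_SIZE):
    """Convert a raw occurrence count to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size


# Illustrative count only, not taken from the database.
print(round(per_million(1000), 2))
```

Per-million values make counts comparable across corpora of different sizes, which raw counts are not.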

What kind of subtitles were used for the SUBTLEX-NL database?

The majority (5,966 out of 8,443) were translated subtitles of American films and television series. The remainder were subtitles translated from languages other than English and captions from Dutch films and television series.

How can I look up SUBTLEX frequencies for particular words using Excel?

We have prepared a document that teaches you how to do this.
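For readers who would rather use a script than a spreadsheet, the same kind of lookup can be sketched in Python. This is only an illustration under stated assumptions: it assumes the word list is a tab-delimited text file with a header row, the column names Word and FREQcount are assumptions to be checked against your own copy, the function names are hypothetical, and the sample counts are invented placeholders rather than real SUBTLEX-NL values.

```python
import csv
import io

# Tiny stand-in for the downloaded word list. Column names and counts
# are made up for illustration; they are NOT real SUBTLEX-NL values.
SAMPLE = "Word\tFREQcount\nhuis\t100\nfiets\t25\n"


def load_frequencies(handle, word_col="Word", freq_col="FREQcount"):
    """Read a tab-delimited word list into a {word: count} dictionary."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[word_col]: int(row[freq_col]) for row in reader}


def look_up(words, table):
    """Return the count for each word, or None if the word is not listed."""
    return {word: table.get(word) for word in words}


table = load_frequencies(io.StringIO(SAMPLE))
result = look_up(["huis", "fiets", "xyzzy"], table)
print(result)
```

To run this on a real file, replace io.StringIO(SAMPLE) with an open file handle, for example open(path, encoding="utf-8", newline=""), where path points to your copy of the word list and the encoding matches that file.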

Where can I find information on how to interpret the PoS and lemma frequencies?

We have written a document called “Understanding the Part of Speech (PoS) information in SUBTLEX-NL”.
