SUBTLEX-PL

SUBTLEX-PL is a database of Polish word frequencies based on 101 million words from film and television subtitles. It was introduced in the following paper, which should be cited as:

Mandera, P., Keuleers, E., Wodniecka, Z., & Brysbaert, M. (2014). SUBTLEX-PL: Subtitle-based word frequency estimates for Polish. Behavior Research Methods.

Some information in Polish can be found here.

You can download the frequencies in several formats:

  • word frequencies – all words : CSV, R
  • word frequencies – observed in at least 3 subtitle files: Excel, CSV, R
  • information about lemmas : CSV, R
  • information about word bigrams : CSV

Here you can find some information about how to get frequencies for your stimuli using Excel.
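
If you prefer a script to Excel, the same lookup can be sketched in Python using only the standard library. This is a minimal illustration, not the authors' procedure: the column names `word` and `count`, the comma delimiter, and the helper names `load_frequencies` and `lookup` are assumptions made for the example, so check the header of the CSV file you downloaded and adjust them. The counts in the sample table are invented placeholders, not SUBTLEX-PL values.

```python
import csv
import io


def load_frequencies(handle, word_col="word", freq_col="count", delimiter=","):
    """Read a frequency table into a dict mapping lowercased word -> count.

    The column names are hypothetical; pass the ones used in the real file.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {row[word_col].lower(): int(row[freq_col]) for row in reader}


def lookup(stimuli, frequencies):
    """Return (stimulus, count) pairs; count is None if the word is absent."""
    return [(word, frequencies.get(word.lower())) for word in stimuli]


# Tiny in-memory stand-in for the downloaded CSV.
# The counts below are invented placeholders, not real SUBTLEX-PL data.
sample = io.StringIO("word,count\ndom,120\nkot,45\n")

freqs = load_frequencies(sample)
results = lookup(["Dom", "kot", "xyzzy"], freqs)

for word, count in results:
    print(word, count)
```

To run this on the real data, replace the in-memory sample with the downloaded file opened via `open(path, newline="", encoding=...)`, using whatever encoding and column names the file actually has. Stimuli that do not occur in the table come back as `None`, so they are easy to spot before analysis.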

If you have any questions about these frequencies, you can contact Paweł Mandera.