Software & Data

This page lists the data we collected and the software we developed in the course of our research.

Word ratings

  • Word ratings of age of acquisition, concreteness, valence and arousal for thousands of Dutch and English words.

Lexicon projects


  • Wuggy, our pseudoword generator.
  • Vwr, an R package with utility functions for visual word recognition research.
  • A small library for fast computation of average Levenshtein distances (e.g., OLD20) in Python.
  • Duometer, a tool for detecting near-duplicate documents in text corpora.
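
For readers unfamiliar with the measure mentioned above: OLD20 is the mean Levenshtein distance between a word and its 20 closest neighbours in a lexicon. The following is a minimal, unoptimized sketch in plain Python, meant only to illustrate the definition. It is not the code or API of the library listed above, and the function names levenshtein and old_n are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def old_n(word: str, lexicon: list[str], n: int = 20) -> float:
    """Mean Levenshtein distance from `word` to its n closest lexicon entries.

    Assumption: the target word itself is excluded from its own neighbours.
    If the lexicon holds fewer than n other words, all of them are used.
    """
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    if not distances:
        raise ValueError("lexicon contains no words other than the target")
    nearest = distances[:n]
    return sum(nearest) / len(nearest)
```

Calling old_n(word, lexicon) with the default n=20 gives the OLD20 value for that word. This naive version compares the word against every lexicon entry, so it becomes slow for large lexicons, which is where a dedicated fast implementation is useful.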

Word Frequencies

  • Databases containing word frequency norms for several languages, based on film subtitles.

Word Prevalence

  • Word prevalence is the percentage of people who know a word. It is largely complementary to word frequency. Here you can find the measure for Dutch words.

Vocabulary tests (language proficiency)

Pictures of tools (laterality)

Outside Resources

  • LexicALL contains many useful data-sets and other resources for psycholinguistic research.
  • The supplemental data archive of the Psychonomic Society contains a lot of useful material for researchers, in particular the many norming studies published in Behavior Research Methods.
  • The Dutch Word Association Database contains word associations to thousands of Dutch words. Very useful if you are interested in the meaning of words.
  • Colleagues from Northwestern University have used our databases to collect Dutch, English, French, German and Spanish phonological and orthographic cross-language neighborhood densities. Go and have a look at their Clearpond page.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
