Software & Data
This page lists the data we collected and the software we developed in the course of our research.
Word ratings
- Word ratings of age of acquisition, concreteness, valence and arousal for thousands of Dutch and English words
Our Lexicon projects
Software
- Wuggy, our pseudoword generator.
- Vwr, an R package with utility functions for visual word recognition research.
- A small library for fast computation of average Levenshtein distances (e.g., OLD20) in Python.
- Duometer, a tool for detecting near-duplicate documents in text corpora.
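For readers unfamiliar with OLD20, mentioned in the list above: it is the mean Levenshtein (edit) distance from a word to its 20 closest neighbors in a lexicon. The sketch below only illustrates that definition in plain Python. It is not the code or the API of the library listed above, and the function names `levenshtein` and `old_n` are made up for this example.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def old_n(word: str, lexicon: Iterable[str], n: int = 20) -> float:
    """Mean Levenshtein distance from `word` to its n closest words in `lexicon`.

    The word itself is excluded from the comparison set (an assumption of this
    sketch). If the lexicon holds fewer than n other words, all of them are used.
    """
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    nearest = distances[:n]
    if not nearest:
        raise ValueError("lexicon contains no words other than the target")
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    toy_lexicon = ["cat", "bat", "hat", "cart", "dog"]  # toy data, not real norms
    print(old_n("cat", toy_lexicon, n=4))
```

This brute-force version compares the target with every word in the lexicon, which becomes slow for large word lists; that cost is the reason a dedicated fast implementation is useful.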
Word Frequencies
- Databases of word frequency norms for several languages, based on film subtitles
Word Prevalence
- Word prevalence refers to the percentage of people who know the word. It is largely complementary to word frequency.
- Here you can find the measure for Dutch words.
- Here you can find the measure for English words.
Vocabulary tests (language proficiency)
- Lemhöfer & Broersma’s LEXTALE tests for English, German, and Dutch
- Our LEXTALE_FR test for French
- Our LEXTALE_Esp test for Spanish
- Our LexITA test for Italian
- Dutch one minute reading test for students (Een Minuut Test voor Studenten)
- Dutch vocabulary test with multiple choice answers (Nederlandse woordenschattest met meerkeuzevragen)
- Flamingo test (a reading-aloud test for students)
- The Dutch Auditory & Image Vocabulary test (DAIVT), which follows the format of the Peabody Picture Vocabulary Test.
Spelling tests
- Spelling dictation tests for English and Dutch
The Dutch Author Recognition Test
- The Dutch Author Recognition Test to estimate the number of fiction books individuals have read.
Picture stimuli
- The Verma & Brysbaert pictures of tools with matched objects and non-objects
- The Multipic database with colored pictures of 750 objects
Semantic Vectors
- Semantic vectors for English and Dutch
- Semantic vectors for Italian
List of megastudies with links to data (if available)
- Have a look here
Outside Resources
- Language goldmine has links to several hundred datasets and other resources for psycholinguistic research.
- Jack Taylor has made a Shiny app that allows you to select stimulus materials based on some 60 variables (or matched on those variables).
- Jamie Reilly has a website summarizing the main resources for English words.
- Geoff Hollis and Chris Westbury calculated measures of valence, arousal, dominance, AoA, and concreteness for 80 thousand words on the basis of our ratings and semantic vectors.
- LexicALL also contains many useful datasets and other resources for psycholinguistic research.
- The Word Association Database contains word associations to thousands of Dutch and English words. Very useful if you are interested in the meaning of words.
- Colleagues from Northwestern University have used our databases to collect Dutch, English, French, German and Spanish phonological and orthographic cross-language neighborhood densities. Go and have a look at their Clearpond page.
- Erin Buchanan and her group have created a bibliography of resources and a database of data they collected themselves. You can find more information about the bibliography here.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.