In this study, we present word prevalence data (i.e., the number of people who know a given word) for 40,777 Catalan words. The data come from a massive online visual lexical decision task completed by more than 200,000 native speakers of Catalan. We examined which characteristics of the participants and of the words most influence word knowledge. Regarding the participants, age was the main factor influencing vocabulary size, followed by educational level and by other variables such as the number of languages spoken and proficiency in Catalan. Regarding the words, lexical frequency was by far the most decisive factor, with word length and orthographic neighborhood size each having a minor influence. These findings largely agree with those reported for the other languages in which the same variables have been analyzed (Dutch, English, and Spanish, thus far). Catalan is thus added to this list; because it is used in an essentially bilingual context, it is of special interest to researchers working on bilingualism and second language acquisition.
More information: Prevalence norms for 40,777 Catalan words: An online megastudy of vocabulary size | SpringerLink