A Trilingual Learner Corpus illustrating European Reference Levels

  • Andrea Abel, EURAC, Bolzano
Keywords: Common European Framework of Reference for Languages (CEFR), learner corpus, learner language, language teaching, language certification


Since its publication in 2001, the Common European Framework of Reference for Languages (CEFR) has gained a leading role as an instrument of reference for language teaching and certification and for the development of curricula. Nonetheless, there is a growing concern about CEFR reference levels being insufficiently illustrated in terms of authentic learner data, leaving practitioners without comprehensive empirical characterizations of the relevant distinctions. This is particularly the case for languages other than English (cf. e.g. Hulstijn 2007, North 2000).

The MERLIN project addresses this demand to illustrate and validate the CEFR levels for Czech, German and Italian by developing a didactically motivated online platform that enables CEFR users to explore authentic written learner productions. The core of the multilingual online platform is a trilingual learner corpus composed of roughly 200 learner texts per CEFR level, produced in standardized language certifications validly related to the CEFR, covering the levels A1-C1.

This paper presents the MERLIN project, the motivation behind it, and its corpus, and discusses the project's current state.


[CEFR 2001] Council of Europe (2001), The Common European framework of reference for languages: Learning, teaching, assessment, Cambridge, Cambridge University Press.

Alderson J.C. (2007), The CEFR and the need for more research, in “The Modern Language Journal” 91: 658-662.

Alderson J.C. et al. (2006), Analysing Tests of Reading and Listening in Relation to the Common European Framework of Reference: The Experience of the Dutch CEFR Construct Project, in “Language Assessment Quarterly” 3(1): 3-30.

Alderson J.C. (1991), Bands and scores, in J.C. Alderson, B. North (eds.), Language testing in the 1990s, London, British Council/Macmillan: 71-86.

Arnaud P.J.L. (1984), The lexical richness of L2 written productions and the validity of vocabulary tests, in T. Culhane, C. Klein-Braley, D.K. Stevenson (eds.), Practice and Problems in Language, University of Essex Occasional Papers. Colchester, University of Essex: 113-148.

Arras U. (2010), Subjektive Theorien als Faktor bei der Beurteilung fremdsprachlicher Kompetenzen, in A. Berndt, K. Kleppin, (eds.), Sprachlehrforschung: Theorie und Empirie – Festschrift für Rüdiger Grotjahn, Frankfurt, Lang: 169-179.

Bachman L.F. (2004), Statistical analyses for language assessment, Cambridge, Cambridge University Press.

Bachmann T. (2002), Kohäsion und Kohärenz: Indikatoren für Schreibentwicklung: Zum Aufbau kohärenzstiftender Strukturen in instruktiven Texten von Kindern und Jugendlichen, Innsbruck, Studienverlag.

Bausch K.-R. et al. (eds.) (2003), Der Gemeinsame Europäische Referenzrahmen für Sprachen in der Diskussion. Arbeitspapiere der 15. Frühjahrskonferenz zur Erforschung des Fremdsprachenunterrichts, Tübingen, Narr.

Bardovi-Harlig K. (2009), Conventional Expressions as a Pragmalinguistic Resource: Recognition and Productions of Conventional Expressions in L2 Pragmatics, in “Language Learning” 59 (4): 755-795.

Bestgen Y., Granger S. (2011), Categorising spelling errors to assess L2 writing, in “International Journal of Continuing Engineering Education and Life Long Learning” 21 (2): 235–252.

Bond T.G., Fox C.M. (2007), Applying the Rasch model: Fundamental measurement in human sciences, Mahwah (NJ), Lawrence Erlbaum.

Bulté B., Housen A. (2012), Defining and operationalising L2 complexity, in A. Housen, F. Kuiken, I. Vedder (eds.), Dimensions of L2 Performance and Proficiency: Complexity, Accuracy and Fluency in SLA, Amsterdam, Benjamins: 21-46.

Burger H. (2007), Phraseologie. Eine Einführung am Beispiel des Deutschen. (3. Aufl.), Berlin, Erich Schmidt Verlag.

Carlsen C. (ed.) (2013), Norsk Profil. Det felles europeiske rammeverket spesifisert for norsk. Et første steg, Oslo, Novus.

Carlsen C. (2010), Discourse connectives across CEFR levels: A corpus-based study, in I. Bartning, M. Martin, I. Vedder (eds.), Communicative Proficiency and Linguistic Development: intersections between SLA and language testing research (Eurosla): 191-210. purl.org/net/Carlsen-10.pdf

Christ O. (1994), A modular and flexible architecture for an integrated corpus query system, in Proceedings of COMPLEX’94: 3rd Conference on Computational Lexicography and Text Research, Budapest: 23-32.

Corder S.P. (1993 [1973]), Introducing Applied Linguistics, Harmondsworth, Pelican.

Dallapiazza R.M., von Jan E., Schönherr T. (eds.) (1998), Tangram: Deutsch als Fremdsprache. Kurs- und Arbeitsbuch 1 A, Munich, Hueber.

Daller H., van Hout R., Treffers-Daller J. (2003), Lexical richness in spontaneous speech of bilinguals, in “Applied Linguistics” 24: 197-222.

Dewaele J.-M. (2004), Individual differences in the use of colloquial vocabulary. The effects of sociobiographical and psychological factors, in P. Bogaards, B. Laufer (eds.), Vocabulary in a second language, Amsterdam, John Benjamins: 127-154.

Díaz-Negrillo A., Fernández-Domínguez J. (2006), Error-coding systems for learner corpora, in “RESLA” 19: 83-102.

Eckes T. (2008), Rater types in writing performance assessments: A classification approach to rater variability, in “Language Testing” 25 (2): 155-185.

Eckes T. (2009), Reference Supplement to the Manual for Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment, Section H: Many-Facet Rasch Measurement (http://www.coe.int/t/dg4/linguistic/manuel1_en.asp, January 2014.)

Eisenberg P. (2007), Sprachliches Wissen im Wörterbuch der Zweifelsfälle. Über die Rekonstruktion einer Gebrauchsnorm, in “Aptum. Zeitschrift für Sprachkritik und Sprachkultur” 3/2007: 209-228.

Ellis R. (1994), The study of Second Language Acquisition, Oxford, Oxford University Press.

Fulcher G. (2004), Deluded by Artifices? The Common European Framework and Harmonization, in “Language Assessment Quarterly” 1 (4): 253-266.

Fulcher G., Davidson F. (2007), Language Testing and Assessment. London/New York, Routledge.

Gould S.J. (1996), The mismeasure of man, London, Penguin.

Glaznieks A. et al. (2012), Establishing a Standardised Procedure for Building Learner Corpora, in “Apples – Journal of Applied Language Studies”. Special Issue: Proceedings of LLLC2012.

Granger S. (2003), Error-tagged learner corpora and CALL: a promising synergy, in “CALICO Journal” 20 (3). Special issues on error analysis and error correction in computer-assisted language learning: 465-480.

Granger S. (2008), Learner corpora, in A. Lüdeling, M. Kytö (eds.), Corpus linguistics: an international handbook (Handbooks of linguistics and communication science; 29.1-29.2), Berlin/New York, de Gruyter: 259-275.

Granger S. (2002), A Bird’s-eye view of learner corpus research, in S. Granger, J. Hung, St. Petch-Tyson (eds.), Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching, Amsterdam, John Benjamins: 3–33.

Halliday M.A.K., Hasan R. (1989), Language, context and text: a social semiotic perspective, Oxford, Oxford University Press.

Hancke J., Meurers D., Vajjala S. (2012), Readability Classification for German using lexical, syntactic, and morphological features, in Proceedings of the 24th International Conference on Computational Linguistics (COLING): 1063-1080.

Hancke J. (2013), Automatic Prediction of CEFR Proficiency Levels Based on Linguistic Features of Learner Language, Master’s thesis, University of Tübingen.

Hasil J., Hájková E., Hasilová H. (2007), Brána jazyka českého otevřená, Prague, Karolinum.

Housen A., Kuiken F. (2009), Complexity, Accuracy, and Fluency in Second Language Acquisition, in “Applied Linguistics” 30 (4): 461–473.

Hulstijn J.H. (2007), The shaky ground beneath the CEFR: Quantitative and qualitative dimensions of language proficiency, in “The Modern Language Journal” 91: 663–667.

Hulstijn J.H., Alderson C., Schoonen R. (2010), Developmental stages in second-language acquisition and levels of second-language proficiency: Are there links between them?, in I. Bartning, M. Martin, I. Vedder (eds.), Communicative Proficiency and Linguistic Development: intersections between SLA and language testing research, Eurosla Monograph Series (http://eurosla.org/monographs/EM01/EM01home.html)

Laufer B., Nation P. (1995), Vocabulary size and use: lexical richness in L2 written production, in “Applied Linguistics” 16: 307-322.

Little D. (2007), The Common European Framework of Reference for Languages: Perspectives on the Making of Supranational Languages Education Policy, in “The Modern Language Journal” 91: 645-655.

Lu X. (2011), A corpus-based evaluation of syntactic complexity measures as indices of College-level ESL writers’ language development, in “TESOL Quarterly” 45 (1): 36-62.

Lu X. (2010), Automatic analysis of syntactic complexity in second language writing, in “International Journal of Corpus Linguistics” 15 (4): 474–496.

Lüdeling A. (2008), Mehrdeutigkeiten und Kategorisierung: Probleme bei der Annotation von Lernerkorpora, in M. Walter, P. Grommes (eds.), Fortgeschrittene Lernervarietäten: Korpuslinguistik und Zweitsprachenerwerbsforschung, Tübingen, Niemeyer: 119-140.

Lüdeling A. et al. (2005), Multi-level Error Annotation in Learner Corpora, in S. Hunston, P. Danielsson (eds.), Proceedings from the Corpus Linguistics Conference Series (Corpus Linguistics 2005, Birmingham, 14-15 July 2005) (http://www.corpus.bham.ac.uk/PCLC)

Malvern D. et al. (2008), Lexical Diversity and Language Development. Quantification and Assessment, New York, Palgrave Macmillan.

Mellor A. (2011), Essay Length, Lexical Diversity and Automatic Essay Scoring, in “Memoirs of the Osaka Institute of Technology”, Series B, Vol. 55, No. 2: 1–14.

Meurers D. (2012), Natural Language Processing and Language Learning, in Encyclopedia of Applied Linguistics, Blackwell, purl.org/dm/papers/meurers-11.html

Mezzadri M. (2000), Rete! Book 1, Perugia, Guerra Edizioni.

Müller Ch., Strube M. (2006), Multi-Level Annotation of Linguistic Data with MMAX2, in S. Braun, K. Kohn, J. Mukherjee (eds.), Corpus Technology and Language Pedagogy. New Resources, New Tools, New Methods, Frankfurt, Peter Lang: 197–214.

Nation P. (2001), Learning vocabulary in another language, Cambridge, Cambridge University Press.

Nation P. (2007), Fundamental issues in modelling and assessing vocabulary knowledge, in H. Daller, J. Milton, J. Treffers-Daller, (eds.), Modelling and Assessing Vocabulary Knowledge, Cambridge, Cambridge University Press.

Nesselhauf N. (2005), Collocations in a Learner Corpus, Amsterdam, John Benjamins.

North B. (2000), The Development of a Common Framework Scale of Language Proficiency, Oxford, Peter Lang.

O’Loughlin K. (1995), Lexical density in candidate output on direct and semi-direct versions of an oral proficiency test, in “Language Testing” 12 (2): 217-237.

Ortega L. (2003), Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing, in “Applied Linguistics” 24 (4): 492–518.

Paquot M., Granger S. (2012), Formulaic language in Learner Corpora, in “Annual Review of Applied Linguistics” 32: 130-149.

Pollitt A., Murray N.L. (1996), What raters really pay attention to, in M. Milanovic, N. Saville (eds.), Performance testing, cognition and assessment: Selected papers from the 15th Language Testing Research Colloquium, Cambridge, Cambridge University Press: 74-91.

Read J., Nation P. (2004), Measurement of formulaic sequences, in N. Schmitt (ed.), Formulaic sequences: Acquisition, processing and use, Amsterdam, John Benjamins: 23-35.

Read J. (2000), Assessing vocabulary, Cambridge, Cambridge University Press.

Reznicek M. et al. (2012), Das Falko-Handbuch. Korpusaufbau und Annotationen. Version 2.01, HU Berlin (http://www.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/forschung/falko/)


Reznicek M., Lüdeling A., Hirschmann H. (in print), Competing Target Hypotheses in the Falko Corpus. A Flexible Multi-Layer Corpus Architecture, in A. Díaz-Negrillo, N. Ballier, P. Thompson (eds.), Automatic Treatment and Analysis of Learner Corpus Data, Amsterdam, John Benjamins (Series Studies in Corpus Linguistics).


Rimrott A., Heift T. (2008), Evaluating automatic detection of misspellings in German, in “Language Learning & Technology” 11 (3): 73-92.

Schmitt N., Carter R. (2004), Formulaic sequences in action: An Introduction, in N. Schmitt (ed.), Formulaic sequences: Acquisition, processing and use, Amsterdam, John Benjamins: 1-21.

Schneider J.G. (2013), Sprachliche ‚Fehler’ aus sprachwissenschaftlicher Sicht, in “Sprachreport” 1-2/2013: 30-37.

Spinelli B., Parizzi F. (eds.) (2010), Profilo della lingua italiana, Firenze, La Nuova Italia.

Stede M. (2007), Korpusgestützte Textanalyse. Grundzüge der Ebenen-orientierten Textlinguistik, Tübingen, Narr.

Trosborg A. (1995), Interlanguage Requests and Apologies, Berlin, de Gruyter.

Vajjala S., Meurers D. (2012), On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition, in J. Tetreault, J. Burstein, C. Leacock (eds.), Proceedings of the 7th Workshop on Innovative Use of NLP for Building Educational Applications (BEA7) at NAACL-HLT, Montreal, Canada, Association for Computational Linguistics: 163–173.

Vaughan C. (1991), Holistic assessment: What goes on in the rater’s mind?, in L. Hamp-Lyons (ed.), Assessing Second Language Writing in Academic Contexts, Norwood, Ablex: 111-125.

Wisniewski K. (2013), The empirical validity of the CEFR fluency scale: the A2 level description, in E.D. Galaczi, C.J. Weir (eds.), Exploring Language Frameworks: Proceedings of the ALTE Krakow Conference, Cambridge, Cambridge University Press: 253-272. Studies in Language Testing.

Wisniewski K. (2014), Die Validität der Skalen des Gemeinsamen europäischen Referenzrahmens für Sprachen. Eine empirische Untersuchung der Flüssigkeits- und Wortschatzskalen des GeRS am Beispiel des Italienischen und des Deutschen, Frankfurt, Peter Lang. Language Testing and Evaluation Series, 33.

Wisniewski K. et al. (2013), MERLIN: An online trilingual learner corpus empirically grounding the European Reference Levels in authentic learner data, in ICT for Language Learning, Conference Proceedings 2013, Libreriauniversitaria.it Edizioni (http://conference.pixel-online.net/ICT4LL2013)


Wisniewski K., Abel A. (2012), Die Sprachkompetenzerhebung: Theorie, Methoden, Qualitätssicherung, in A. Abel, C. Vettori, K. Wisniewski (eds.), Gli studenti altoatesini e la seconda lingua: indagine linguistica e psicosociale. / Die Südtiroler SchülerInnen und die Zweitsprache: eine linguistische und sozialpsychologische Untersuchung, I.1, Bolzano-Bozen, Eurac: 13-64 (http://www.eurac.edu/en/research/publications/PublicationDetails.aspx?pubId=0100156&type=Q)

Wolfe-Quintero K., Inagaki S., Kim H.-Y. (1998), Second Language Development in Writing: Measures of Fluency, Accuracy & Complexity, Honolulu, Second Language Teaching & Curriculum Center, University of Hawaii at Manoa.

Yang W., Sun Y. (2012), The use of cohesive devices in argumentative writing by Chinese EFL learners at different proficiency levels, in “Linguistics and Education” 23 (1): 31-48.

Wray A. (2002), Formulaic Language and the Lexicon, Cambridge, Cambridge University Press.

Zeldes A. et al. (2009), Annis: A search tool for multi-layer annotated corpora, in Proceedings of Corpus Linguistics, July 20-23, Liverpool (http://ucrel.lancs.ac.uk/publications/cl2009/)

Zipser F. et al. (2010), A model oriented approach to the mapping of annotation formats using standards, in Workshop on Language Resource and Language Technology Standards, LREC 2010.