The BATMAT Corpus

The BATMAT corpus (‘BA Theses MA Theses’) is a new addition to advanced EFL learner corpora. The compilation of the corpus was initiated in 2012 and directed until 2017 by Dr Signe-Anita Lindgrén, who served as H.W. Donner Lecturer in Applied Linguistics within the project “Advances in Applied Linguistics” (AAL, 2012-2017, a project directed by Professor Tuija Virtanen-Ulfhielm). The five-year AAL project was financed by the H.W. Donner Fund of the Åbo Akademi Foundation.

The BATMAT corpus comprises some 3 million words from 120 BA and MA theses written by students of English Language and Literature at Åbo Akademi University, Finland. The writers are predominantly L1 Finland-Swedish speakers; the L1 of some of the students is Finnish, or occasionally another language, and all are fluent speakers of Swedish. The BA theses included in the corpus were completed between 2002 and 2016, while the MA theses date from the period 1972-2016. The corpus consists of theses from three topic areas: linguistics, literature, and society. Written consent to include each thesis in the corpus was collected from the writers. Enquiries concerning the use of the corpus for research purposes should be addressed to Professor Brita Wårvik.


Studies based on the BATMAT corpus:

Lindgrén, Signe-Anita. 2015. ‘Academic Vocabulary and Readability in EFL Theses’. In: Päivi Pietilä, Katalin Doró & Renata Pípalová (eds.) Lexical Issues in L2 Writing. Newcastle upon Tyne: Cambridge Scholars Publishing: 155-174.

Larsson, Tove. 2017. ‘The importance of, it is important that or importantly? The use of morphologically related stance markers in learner and expert writing’. International Journal of Corpus Linguistics 22(1): 57-84.

Stormbom, Charlotte. 2019. ‘Language change in L2 academic writing: The case of epicene pronouns’. Journal of English for Academic Purposes 38: 95-105.

Unpublished Master’s Theses based on the BATMAT corpus:

Heinonen, Lauri. 2013. Documenting the Emergence of the BATMAT Corpus. Supervisor: Signe-Anita Lindgrén.

Lindkvist, Sofia. 2014. Readability in Academic Texts Written by EFL University Students. Supervisor: Signe-Anita Lindgrén.

Lundström, Emelie. 2014. Shell Nouns in EFL Theses. Supervisor: Signe-Anita Lindgrén.


Student work based on two other learner English corpora:

Within the framework of the AAL project, Dr Signe-Anita Lindgrén also initiated and directed collections of learner English for studies of EFL lexical knowledge and reading fluency (LONGLEX) as well as spoken academic discourse (Talkabulary). These two corpora are no longer available for use. The following BA and MA theses use data from the two corpora.

Unpublished Bachelor’s and Master’s Theses based on the LONGLEX corpus, supervised by Signe-Anita Lindgrén:

Hurme, Ida. 2013. Receptive and Productive EFL Lexical Knowledge of Finnish 8th Graders – A Study Comparing Swedish-Finnish Bilinguals with Finnish and Swedish Monolinguals. MA Thesis.

Lassus, Fanny. 2014. Receptive and Productive Vocabulary Outside the Curriculum: A Study of EFL Finnish-speaking and Swedish-speaking 8th Grade Learners. MA Thesis.

Linderborg, Ellen. 2014. An analysis of errors made in EFL by Finnish-speaking and Swedish-speaking students in lower secondary school. BA Thesis.

Järnström, Felicia. 2015. Out-of-School English Online and Its Effect on the Vocabulary Knowledge of Finnish 8th Graders. BA Thesis.

Räty, Reetta. 2016. English Vocabulary Knowledge Development of Finnish Secondary School Students. MA Thesis.

Lindberg, Erica. 2017. Spelling errors in EFL vocabulary in Finnish and Finland-Swedish 8th and 9th grade. Supervisors: Signe-Anita Lindgrén and Tuija Virtanen-Ulfhielm. MA Thesis.

Unpublished Master’s Theses based on the Talkabulary corpus:

Holmgård, Sebastian. 2016. “Oh, Well, You Know” Discourse Marker Use Among Finland-Swedish Advanced Students of English in Academic Conversation. Supervisor: Signe-Anita Lindgrén.

Koskilahti, Emilia. 2018. Linking Adverbials in EFL academic presentations. Supervisor: Brita Wårvik.

Korpinen, Kaisa. 2018. Fluency in novice and advanced EFL students’ academic presentations: two case studies of formulaic sequences. Supervisors: Signe-Anita Lindgrén and Brita Wårvik.

Updated 4.4.2024