ICLE: The International Corpus of Learner English

ICLE is a computerized corpus of argumentative essays on various topics written by advanced learners of English (university students of English, mainly in their second or third year). The ICLE project was launched in 1990 by Professor Sylviane Granger, University of Louvain-la-Neuve, Belgium. In 2002 the corpus was released on CD-ROM, accompanied by a handbook that describes its structure and the status of English in the learners’ countries of origin.

An expanded version, ICLEv2, featuring a built-in concordancer, was published in 2009, and a third version, ICLEv3, was released in 2020. The third version has its own web-based interface and comprises 5 million words of written English. The corpus is made up of subcorpora representing as many as 26 different language backgrounds. Users can search the corpus using 21 variables concerning the writers’ background and the task type. There is also a smaller corpus of British and American undergraduate essays, LOCNESS (the Louvain Corpus of Native English Essays).

The Finnish subcorpus consists of essays written by Finnish-speaking and Swedish-speaking Finns. The essays were collected from several universities. The Finnish coordinators of the ICLE project were Håkan Ringbom and Tuija Virtanen, and Signe-Anita Lindgrén served as project researcher (English Department at Åbo Akademi University). Many other people kindly offered their time and help; these collaborators included R. Goldblatt, P. Hirvonen, C. Rohlich and G. Watson from the University of Joensuu; A. Mauranen from Savonlinna School of Translation Studies; R. Alanen, S. Leppänen, A. Pitkänen-Huhta and K. Sajavaara from the University of Jyväskylä; A. Chesterman and M. Hatakka from the University of Helsinki; B. Pettersson and O. Pickering from the University of Turku; and K. Timlin from the University of Oulu.

A corpus of advanced learner English makes possible a new, more concrete approach to the features of learner English. Opinions about how learner language differs from native speaker language are common, but in the past they could seldom be substantiated by evidence from larger collections of texts. The present corpus can be used for many different purposes. For instance, it is now possible to investigate to what extent there is a general ‘advanced learner language’ that shows consistent differences from equivalent native speaker language, and to what extent the influence of the learners’ different first languages (language transfer) is manifested. There is a list of publications based on data from learner corpora such as ICLE.


Studies in the department making use of ICLE data:

Unpublished Master’s theses based on ICLE data:

