English Corpora: Over 200 billion words in context, drawn from different registers and a range of distinct sources, including contemporary and historical American English, the World Wide Web, Wikipedia, Coronavirus, TV, movies, soap operas, the Supreme Court, TIME Magazine, early and contemporary British and Canadian corpora, plus Google Books comparative n-grams.
International Corpus of Learner English: ICLE is a computerized corpus of argumentative essays on different topics written by advanced learners of English (mainly university students of English in their second or third year). The ICLE project was launched in 1990 by Sylviane Granger, University of Louvain-la-Neuve, Belgium, and in 2002 the corpus was released in CD-ROM format, accompanied by a handbook that describes its structure and the status of English in the learners' countries of origin. The corpus is made up of a number of subcorpora representing the following language backgrounds: Bulgarian, Czech, Dutch, Finnish, French, German, Italian, Polish, Russian, Spanish, and Swedish. There is also a smaller comparable corpus of British and American undergraduate essays. The essays vary in length between 500 and 1000 words. (As described by Åbo Akademi, Finland.)
Corpus de la littérature médiévale: The text database of the Corpus de la littérature médiévale is a literary collection of more than 800 works in langue d'oïl, from the origins to the end of the 15th century, managed by the BABEL software, which is dedicated to content analysis. The Corpus de la littérature médiévale is distinguished by its editorial consistency and by access to the original text combined with search functions.