The Downloadable Corpus
The downloadable WRICLE corpus contains only a plain text file for each essay. The metadata for each text (learner profile, essay profile) is provided in a separate file.
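As a rough illustration of working with this layout, the sketch below pairs each essay file with its metadata record. The format of the metadata file is not described here, so the sketch assumes a CSV file with one row per essay and a "filename" column. Those details, like the function name load_corpus, are hypothetical and would need to be adapted to the actual download.

```python
import csv
from pathlib import Path


def load_corpus(text_dir, metadata_path, id_column="filename"):
    """Pair each plain-text essay with its metadata row.

    Assumption: the metadata file is a CSV whose `id_column` holds the
    name of the essay file. The real WRICLE metadata format may differ.
    """
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        metadata = {row[id_column]: row for row in csv.DictReader(handle)}

    corpus = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        # Essays without a metadata row still get a record with their text.
        record = dict(metadata.get(path.name, {}))
        record["filename"] = path.name
        record["text"] = path.read_text(encoding="utf-8")
        corpus.append(record)
    return corpus
```

Each returned record is a dictionary holding the essay text together with whatever learner and essay fields the metadata file supplies.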
The Searchable Corpus
An annotated version of the corpus is accessible online for search only (see the "Search Interface" tab on the left). This version has been annotated at multiple layers: whole document (with the learner and essay metadata), sentence, clause and NP.
The corpus was annotated using the UAM CorpusTool, software for multi-layer text annotation. The development of this software was partially funded by the WOSLAC project.
At the clause level, basic syntactic categories such as Voice, Modality, Polarity and Finiteness have been annotated. These were tagged automatically by CorpusTool using lexical patterns, and false matches were then eliminated manually.
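To give a sense of what tagging by lexical pattern involves, and why manual checking is needed afterwards, here is a minimal sketch of a passive-voice pattern. It is not the pattern CorpusTool uses. The regular expression and the function name candidate_passives are assumptions made purely for illustration.

```python
import re

# Hypothetical pattern: a form of "be", optionally one -ly adverb,
# then a word ending in -ed or -en (a rough stand-in for a participle).
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
PASSIVE_PATTERN = re.compile(
    rf"\b{BE_FORMS}\b(?:\s+\w+ly)?\s+(\w+(?:ed|en))\b",
    re.IGNORECASE,
)


def candidate_passives(clause):
    """Return every stretch of text that matches the passive pattern."""
    return [match.group(0) for match in PASSIVE_PATTERN.finditer(clause)]


print(candidate_passives("The essay was written by a student."))
# ['was written']
print(candidate_passives("He was often late."))
# ['was often']  (a false match: "often" merely ends in -en)
```

The second example shows the kind of false match that a purely lexical pattern produces and that has to be removed by hand.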
At the NP level, segments have been tagged by form (proper, common, pronominal), by semantic class (human, organisation, location, etc.), and by syntactic function (subject, object, etc.).
The Search interface lets you search using these categories. For instance, you can locate human Subject units within passive clauses in essays by 3rd year students.
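The logic of such a cross-layer query can be sketched as a nested filter over documents, clauses and NPs. The data model below is hypothetical: the class names, the field names and the sample essay are invented for illustration and do not reflect how the search interface or CorpusTool stores annotations.

```python
from dataclasses import dataclass, field


@dataclass
class NP:
    text: str
    form: str            # e.g. "proper", "common", "pronominal"
    semantic_class: str  # e.g. "human", "organisation", "location"
    function: str        # e.g. "subject", "object"


@dataclass
class Clause:
    voice: str           # "active" or "passive"
    nps: list = field(default_factory=list)


@dataclass
class Essay:
    learner_year: int
    clauses: list = field(default_factory=list)


def find_nps(essays, year, voice, semantic_class, function):
    """Collect NP texts that satisfy a constraint at every layer."""
    return [
        np.text
        for essay in essays if essay.learner_year == year
        for clause in essay.clauses if clause.voice == voice
        for np in clause.nps
        if np.semantic_class == semantic_class and np.function == function
    ]


# Invented sample data, for illustration only.
essays = [
    Essay(3, [
        Clause("passive", [NP("the students", "common", "human", "subject")]),
        Clause("active", [NP("teachers", "common", "human", "subject")]),
    ]),
    Essay(1, [
        Clause("passive", [NP("Maria", "proper", "human", "subject")]),
    ]),
]

print(find_nps(essays, year=3, voice="passive",
               semantic_class="human", function="subject"))
# ['the students']
```

Only the first NP is returned, because it is the only one that meets the document-level, clause-level and NP-level conditions at once.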
Full WOSLAC Corpus
We are currently extending the annotation by tagging instances of clauses with marked word order. For more detail, see the WOSLAC link on the left.