The corpus contains data for the following languages and forms:
Synchronic
German irgend series
Italian (uno) qualunque
Spanish cualquiera
Dutch wh* dan ook
Czech kterykoli
English any and some
Diachronic
German irgend series
Spanish cualquiera
Dutch wh* dan ook
The indefinites have been annotated with the functions in an extended version of Haspelmath’s (1997) semantic map proposed by Aguilar-Guevara et al. (2011). A description of the functions and the annotation procedure can be found in the Annotation Guidelines. Aloni et al. (2012) report results on inter-annotator agreement.
The corpus is searchable through an online web interface and is also available as raw data.
Full documentation describing the organization of the database and the search functionality, as well as highlights of key results, is available here.
The following publications are based on data included in the database:
Aloni, M, Cranenburgh, A van, Fernandez, R, and Sznajder, M. 2012. “Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions.” In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12). European Language Resources Association (ELRA).
Natural languages possess a wealth of indefinite forms that typically differ in distribution and interpretation. Although formal semanticists have strived to develop precise meaning representations for different indefinite functions, to date there has hardly been any corpus work on the topic. In this paper, we present the results of a small corpus study where English indefinite forms any and some were labelled with fine-grained semantic functions well-motivated by typological studies. We developed annotation guidelines that could be used by non-expert annotators and calculated inter-annotator agreement amongst several coders. The results show that the annotation task is hard, with agreement scores ranging from 52% to 62% depending on the number of functions considered, but also that each of the independent annotations is in accordance with theoretical predictions regarding the possible distributions of indefinite functions. The resulting annotated corpus is available upon request and can be accessed through a searchable online database.
@inproceedings{AloniEtAl2012,
author = {Aloni, Maria and van Cranenburgh, Andreas and Fernandez, Raquel and Sznajder, Marta},
title = {Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions},
booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
year = {2012},
publisher = {European Language Resources Association (ELRA)}
}
External References
M. Haspelmath. 1997. Indefinite Pronouns. Oxford University Press.