Studies in Linguistics and Linguistic Data Science (SLLDS)
Our series of linguistic publications presents work done at the lab and makes it available to the public. You can find all editions here:
SLLDS

Here you will find resources produced by members of the lab in the course of their studies and projects. All publications, including those related to projects, can be found here.
GerEO is a set of syntactic and semantic annotations on German sentences containing an experiencer-object (EO) verb. EO verbs are psychological predicates whose Experiencer argument is mapped onto the object. The literature claims that they are syntactically special.
GerEO
PrepSensNZZ is a collection of over 19,000 sentences containing ambiguous prepositions. The sentences have been automatically annotated for part of speech and syntactic dependency structure (following the TiGer guidelines). The heads of the NPs embedded by the prepositions are additionally annotated for morphological structure and lexical information.
The BECL comprises valuable data, which we gladly share with other researchers. The project’s website describes the project and offers the BECL for download:
BECL
PUNKT is based on Kiss, Tibor & Jan Strunk (2006). Unsupervised Multilingual Sentence Boundary Detection. Computational Linguistics. 485-525 (see here). It has been implemented for the NLTK project and is integrated into the NLTK package. On the project’s website you can find an introduction to the PUNKT sentence tokenizer and the package’s source code.
PUNKT in NLTK