How to use TEI for the annotation of CMC and social media resources: a practical introduction
October 4th, 2017, 14:30–18:00, Eurac Research, Italy
The goal of the event is to give a practical introduction to the annotation of language data from genres of computer-mediated communication (CMC) and social media using the formats of the Text Encoding Initiative (TEI). In an introductory section, participants will learn about the general architecture of TEI encoding schemas and about the rules for creating so-called customizations, which make it possible to extend TEI to textual genres and domains that are not yet covered by the current version of the TEI guidelines. Examples of TEI customizations are the representation schemas for CMC/social media genres developed in the TEI special interest group “computer-mediated communication”.
In a hands-on session, participants will learn how to use these customizations to create a basic TEI representation of their own CMC/social media data. For this purpose, participants may bring samples from their own data/corpora or select a sample from collections of Wikipedia talk pages in several languages prepared by the instructors. Format specifications for participants’ own data will be announced in advance. For the hands-on session, participants will be asked to bring a laptop computer with WLAN and a full or trial license of the oXygen XML editor.
The tutorial is funded as a CLARIN User Involvement Event and will be held in association with the 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), held on October 3rd and 4th at Eurac Research, Italy.
Registration
There is no registration fee for the workshop. Participants who also want to attend the main conference can register for the workshop together with their conference registration.
For participants interested only in the CLARIN UI Event (without participation in the CMC-corpora 2017 conference), registration is mandatory. Please write an e-mail to linguistics@eurac.edu.
The tutorial is held by:
- Harald Lüngen (Institute for the German Language, Mannheim, Germany)
- Michael Beißwenger (University of Duisburg-Essen, Germany)
- Laura Herzberg (University of Mannheim, Germany)
The workshop is organized by:
- Michael Beißwenger (University of Duisburg-Essen, Germany)
- Ciara R. Wigham (Université Clermont Auvergne, France)
- Egon W. Stemle (Eurac Research, Italy)