How to use TEI for the annotation of CMC and social media resources: a practical introduction

October 4th, 2017, 14:30–18:00, Eurac Research, Italy

The goal of the event is to give a practical introduction to the annotation of language data from genres of computer-mediated communication (CMC) and social media using the formats of the Text Encoding Initiative (TEI). In an introductory section, participants will learn about the general architecture of TEI encoding schemas and about the rules for creating so-called customizations, which extend TEI to textual genres and domains not yet covered by the current version of the TEI guidelines. Examples of TEI customizations are the representation schemas for CMC/social media genres developed in the TEI special interest group “computer-mediated communication”.

In a hands-on session, participants will learn how to use these customizations to create a basic TEI representation for their own CMC/social media data. For this purpose, participants may bring samples from their own data/corpora or select a sample from collections of Wikipedia talk pages in several languages prepared by the instructors. Format specifications for participants’ own data will be announced in advance. For the hands-on session, participants will be asked to bring a laptop computer with WLAN and a full or trial license of the oXygen XML editor.

The tutorial is funded as a CLARIN User Involvement Event and will be held in association with the 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), held on October 3rd and 4th at Eurac Research, Italy.


There is no registration fee for the workshop. Participants who also want to attend the main conference can register for the workshop together with their conference registration.

For participants only interested in the CLARIN UI Event (without participation in the CMC-corpora 2017 conference), registration is mandatory. Please write an e-mail to

The tutorial is held by:

  • Harald Lüngen (Institute for the German Language, Mannheim, Germany)
  • Michael Beißwenger (University of Duisburg-Essen, Germany)
  • Laura Herzberg (University of Mannheim, Germany)

The workshop is organized by:

  • Michael Beißwenger (University of Duisburg-Essen, Germany)
  • Ciara R. Wigham (Université Clermont Auvergne, France)
  • Egon W. Stemle (Eurac Research, Italy)

Provisional Agenda