5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17)
The 5th Conference on CMC and Social Media Corpora for the Humanities will be held in Bolzano/Bozen, Italy on 3-4 October 2017 and will focus on the collection, analysis and processing of mono- and multimodal, synchronous and asynchronous communications. The focus will encompass different CMC genres. These include, but are not limited to, discussion forums, blogs, newsgroups, emails, SMS and WhatsApp, text chats, wiki discussions, social network exchanges (such as Facebook, Twitter, LinkedIn), and discussions in multimodal and/or 3D environments (virtual worlds, gaming worlds).
The conference will bring together researchers who are interested in the collection, organization, processing, analysis and sharing of CMC data for research purposes. We invite submissions on corpus analysis of various types of CMC data for linguistic or applied linguistic purposes and Natural Language Processing.
The conference is hosted by Eurac Research and will include a post-conference workshop on using the TEI for annotating CMC and social media resources (5 October). It will be followed by the 4th Learner Corpus Research Conference, which will be held at the same venue from 5 to 7 October.
Topics of interest
- Development of CMC corpora
  - Building CMC corpora: from data collection to publication
  - Open data for research on CMC: questions of ethics and rights
  - Annotation of CMC genres: representation of CMC genres, annotation of linguistic phenomena, metadata
  - Multimodal corpora
- Analysis of CMC corpora
  - Sociolinguistic studies of CMC
  - Discourse analysis of CMC
  - Linguistic characteristics of CMC
  - Multimodal aspects of CMC
  - Language in contact and code-switching in CMC
  - CMC in language learning & teaching
- Natural Language Processing of CMC
  - PoS Tagging
  - Syntactic parsing
  - Named-entity recognition
We invite submissions for papers, posters and software/corpus demonstrations on any topic relevant to the above list of themes. For this conference, we are requesting extended abstracts (2-4 pages) in English. All abstracts will be peer-reviewed by the scientific committee. All submissions should follow the template which you can download here:
- MSWord Template (Example document)
- Libre-/Openoffice Template (Example document)
- LaTeX Package (Example included)
Please submit your paper via the online conference system (EasyChair).
Paper presentations will consist of a 20-minute talk followed by 10 minutes for questions and discussion. The poster presentation and software/corpus demonstration session will open with each presenter/demonstrator giving a one-minute ‘teaser talk.’ Accepted papers will be published in online proceedings before the conference. After the conference, authors of the best-reviewed papers will be invited to submit extended versions of their papers for publication in an edited monograph to appear in 2018.
| ||Submission deadline|
| ||Notification of acceptance|
| ||Submission of camera-ready version|
|3rd & 4th October||Conference|
|5th October||Post-conference workshop (CLARIN Workshop)|
CLARIN Tutorial: How to use TEI for the annotation of CMC and social media resources: a practical introduction
The goal of the event is to give a practical introduction to the annotation of language data from genres of computer-mediated communication (CMC) and social media using the formats of the Text Encoding Initiative (TEI). In an introductory section, participants will learn about the general architecture of TEI encoding schemas and about the rules for creating so-called customizations, which extend the TEI to textual genres and domains not yet covered by the current version of the TEI guidelines. Examples of TEI customizations are the representation schemas for CMC/social media genres developed in the TEI special interest group “computer-mediated communication”.
In a hands-on session, participants will learn how to use these customizations to create a basic TEI representation for their own CMC/social media data. For this purpose participants may bring samples from their own data/corpora or select a sample from collections of Wikipedia talk pages in several languages prepared by the instructors. Format specifications for participants’ own data will be announced in advance. For the hands-on session, participants will be asked to bring a laptop computer with WLAN and a full or trial license of the oXygen XML editor.
The tutorial is funded as a CLARIN User Involvement Event and will be held in association with the 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), held on 3rd & 4th October at Eurac Research, Italy.
There will be no registration fee for the workshop. Registration will be possible as of June 30, 2017 via https://cmc-corpora2017.eurac.edu/registration/. Participants who also want to attend the main conference can register for the workshop together with their conference registration. The registration fee for the main conference is 75 EUR and includes conference materials, coffee breaks, and lunch.
The tutorial is held by:
- Harald Lüngen (Institute for the German Language, Mannheim, Germany)
- Michael Beißwenger (University of Duisburg-Essen, Germany)
- Laura Herzberg (University of Mannheim, Germany)
The workshop is organized by:
- Michael Beißwenger (University of Duisburg-Essen, Germany)
- Ciara R. Wigham (Université Clermont Auvergne, France)
- Egon W. Stemle (Eurac Research, Italy)