December 14, 2018

2019 Edition with a focus on: Crowd-sourcing Web Corpora of Nigerian languages
1-3 July, 2019

A special edition co-organised by the Africa Regional Centre for Information Science (ARCIS), University of Ibadan, Nigeria; the Institut Français de Recherche en Afrique (IFRA), Ibadan, Nigeria; and NaijaSynCor {Centre National de la Recherche Scientifique (CNRS), Paris, France: Langage, Langues et Cultures d'Afrique Noire (LLACAN), Villejuif, France; Modèles, Dynamiques, Corpus (MODYCO), Nanterre, France}

Programme Description
The 2019 edition of the Summer School in Natural Language Processing (NLP) at the Africa Regional Centre for Information Science of the University of Ibadan will give participants hands-on access to methods and tools for collecting written language data in various languages from the Web.

The goal is for the participants to “class-source” a comprehensive collection of textual data for many different languages spoken in Nigeria. Each participant will collect three types of data: (i) public web data from websites, (ii) discussion groups from social media, and (iii) extensive text message exchanges. Once the raw data have been collected, the files will undergo all the steps necessary for public release: cleaning and homogenization of the data, anonymization of private data, and creation of standardized metadata, including free licensing rules.

Simple statistical error-mining procedures will ensure the compatibility of the resulting multilingual corpora and will give a first glimpse of possible ways to analyze the corpus statistically for lexical and variational studies. The NLP Summer School comprises four lecture sessions as well as hours of practical sessions.

Target Participants
Target participants are final-year and/or postgraduate students undertaking research in the area of NLP. Participants are required to bring their own laptops and to take part in all sessions, both theoretical and practical. A certificate of participation will be given at the end of the summer school.

The NLP Summer School will take place at the Africa Regional Centre for Information Science, No. 6, Benue Road, University of Ibadan, Ibadan, Nigeria.

Submission of Research Plan and Application
To enroll in the school, all applicants are required to submit a detailed CV clearly indicating their institutional affiliation (to a university or other institution) and previous work done in NLP, as well as a succinct research plan covering the following 6 points, one sentence per answer:

1. Name of the language of the data to be collected.
2. The interest and competence of the candidate in the said language.
3. A very brief description of a corpus-based study the candidate wishes to conduct on the data.
4. The URLs of the websites to be collected with a brief description of the websites’ content including a rough estimate of the number of pages and number of words contained in the websites.
5. The URLs and a brief description of the social media discussion groups to be collected.
6. A description of the private discussion group, to be collected from a textual message program that allows exportation of the complete data.

During the summer school, the cost of lecture materials (soft copies), internet connectivity, tea breaks and lunch for participants will be covered by the organisers, while the participants (and/or their institutions) should make provision for their own accommodation, travel expenses and meals outside of the summer school. No registration fee (or any other fee) will be charged.

The deadline for submission of the research plan and application is 22 February, 2019.
The application and research plan should be sent to:
ARCIS NLP SUMMER SCHOOL [email protected]
Selected candidates will be notified on or before 22 March, 2019.

The NLP Summer School is funded by the Africa Regional Centre for Information Science, University of Ibadan; the French Embassy, Nigeria; and IFRA.

