SADiLaR Language Resource Repository

Autshumato TMX Integrator

Author: Martin Schlemmer, Wildrich Fourie
Publication venue: Centre for Text Technology (CTexT)
Publication date: 20/06/2013

Utility to merge multiple translation memories over a network using Subversion.

Lwazi Telephony Platform

Author: Mixo Shiburi, Louis Joubert, Richard Carlson, Tshepo Moganedi
Publication venue: Meraka Institute, CSIR
Publication date: 15/07/2013

Lwazi is a robust telephony platform that aims to speed up the development of experimental applications without sacrificing power. It combines Asterisk with the MobilIVR Python interface, bundled into a single build with a unified control interface.

Multilingual Fulfulde-Bambara Children First Language Acquisition (Babbling & First Words)

Author: CISSE Ibrahima Abdoul Hayou
Publication venue: Ibrahima Abdoul Hayou CISSE
Publication date: 01/01/2010

Dataset contains videos of children interacting with caregivers. Languages: Fulfulde/Fula/Pulaar/Pular/Fulani; Bambara/Bamanakan/Dioula/Mande; Songhay; Soninke; Tamasheq; Hassany

Bambara Monolingual Children First Language Acquisition (Babbling & First Words)

Author: CISSE Ibrahima Abdoul Hayou
Publication venue: Ibrahima Abdoul Hayou CISSE
Publication date: 01/01/2010

Dataset contains videos of children interacting with caregivers. Languages included: Bambara/Bamanakan/Dioula/Mande

Fulfulde Monolingual Children First Language Acquisition (Babbling & First Words)

Author: CISSE Ibrahima Abdoul Hayou
Publication venue: Ibrahima Abdoul Hayou CISSE
Publication date: 01/01/2010

Dataset contains videos of children interacting with caregivers. Languages included: Fulfulde/Fula/Pulaar/Pular/Fulani

Afrikaans text unit identification data

Author: Puttkammer Martin
Publication venue: Centre for Text Technology, North-West University
Publication date: 01/01/2006

This dataset was developed during a master's degree and used in the development of a text unit identifier capable of tagging sentences, named entities, words, abbreviations and punctuation in Afrikaans text. The dataset consists of 39,762 tokens, containing 3,294 named entities in 1,581 sentences. The data was manually annotated by the author and verified by an independent linguist according to the tagset developed during the same study. Details on the annotation and tagset used are available in the publication mentioned above in (2). The data is also presented in CoNLL-2002 format (Sang, E. F., & De Meulder, F. (2003). Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. Available at: https://www.aclweb.org/anthology/W02-2024).

8 full texts, 536 metadata records