SADiLaR Language Resource Repository
536 research outputs found
Autshumato TMX Integrator
Utility to merge multiple translation memories over a network using Subversion
Lwazi Telephony Platform
Lwazi is a robust telephony platform that aims to speed up the development of experimental applications without sacrificing power. It combines Asterisk with the MobilIVR Python interface in a single build with a unified control interface.
Multilingual Fulfulde-Bambara Children First Language Acquisition (Babbling & First Words)
Dataset contains videos of children interacting with caregivers.
Languages: Fulfulde/Fula/Pulaar/Pular/Fulani; Bambara/Bamanakan/Dioula/Mande; Songhay; Soninke; Tamasheq; Hassany
Bambara Monolingual Children First Language Acquisition (Babbling & First Words)
Dataset contains videos of children interacting with caregivers.
Languages included: Bambara/Bamanakan/Dioula/Mande
Fulfulde Monolingual Children First Language Acquisition (Babbling & First Words)
Dataset contains videos of children interacting with caregivers.
Languages included: Fulfulde/Fula/Pulaar/Pular/Fulani
Afrikaans text unit identification data
This dataset was developed during a master's degree and used in the development of a text unit identifier capable of tagging sentences, named entities, words, abbreviations and punctuation in Afrikaans text.
The dataset consists of 39,762 tokens, containing 3,294 named entities in 1,581 sentences. The data was manually annotated by the author and verified by an independent linguist according to the tagset developed during the same study. Details on the annotation and tagset used are available in the publication mentioned above in (2). The data is also presented in CoNLL-2002 format (Sang, E. F., & De Meulder, F. (2003). Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. Available at: https://www.aclweb.org/anthology/W02-2024).
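For readers who want to check the token and sentence counts against a download, the following is a minimal sketch of reading CoNLL-2002-style data in Python. It assumes the usual layout of one token per line, whitespace-separated columns with the tag in the last column, and a blank line between sentences. The function name `read_conll`, the sample text, and the tag labels (`B-PER`, `B-LOC`, `O`) are illustrative assumptions only; the dataset's own tagset is described in its publication and may differ.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# A sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group CoNLL-style lines into sentences (hypothetical helper)."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        # First column is the token, last column is the tag.
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Made-up sample; the tag labels are placeholders, not the dataset's tagset.
sample = """Jan B-PER
woon O
in O
Kaapstad B-LOC
. O

Dit O
reën O
. O
"""

sentences = read_conll(sample.splitlines())
token_count = sum(len(sentence) for sentence in sentences)
tag_counts = Counter(tag for sentence in sentences for _, tag in sentence)

print(len(sentences), token_count, dict(tag_counts))
```

To run the same counts on a real file, pass an open file handle (for example `open(path, encoding="utf-8")`) to `read_conll` in place of `sample.splitlines()`.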