SADiLaR Language Resource Repository
536 research outputs found
NCHLT Siswati fastText-CBoW embeddings
Static word and subword embeddings for the continuous bag of words (CBoW) flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for Siswati text
NCHLT Sesotho GloVe embeddings
Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations for Sesotho text
NCHLT Siswati FLAIR-forward embeddings
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Siswati text
NCHLT Sesotho FLAIR-forward embeddings
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Sesotho text
Afrikaans lexical blends dataset
This is a dataset of Afrikaans blend constructions that have been collected and analysed using the Levenshtein distance metric. The dataset serves as the basis for the paper titled "Analysing Afrikaans lexical blends using Levenshtein distances", presented at the 4th Afrikaans Grammar Workshop (AGW) in 2023.
*The dataset, R code and citation will be added as soon as the full paper has been published in the AGW conference proceedings.
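For readers unfamiliar with the metric named in this entry, the following is a minimal sketch of Levenshtein distance in Python. It is not the authors' R code, which has not yet been released, and the function name `levenshtein` and the English example words are illustrative assumptions rather than items from the dataset. The distance counts the fewest single-character insertions, deletions or substitutions needed to turn one string into another.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the empty prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))

    for i, ch_a in enumerate(a, start=1):
        # Distance between the first i characters of `a` and the empty string.
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current

    return previous[-1]


# Illustrative only: an English blend and one of its source words,
# not an entry from the Afrikaans dataset.
print(levenshtein("brunch", "lunch"))
```

A blend that stays close to one of its source words yields a small distance to that word, which is the kind of comparison such an analysis can build on.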
NCHLT Xitsonga FLAIR-backward embeddings
Contextual word/string embeddings for the backward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Xitsonga text
NCHLT Siswati fastText-Skipgram embeddings
Static word and subword embeddings for the Skipgram flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for Siswati text
NCHLT Afrikaans GloVe embeddings
Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations for Afrikaans text
NCHLT Tshivenḓa word2vec-CBOW embeddings
Static word embeddings for the continuous bag of words (CBoW) flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector representations for Tshivenḓa text
NCHLT isiNdebele fastText-CBoW embeddings
Static word and subword embeddings for the continuous bag of words (CBoW) flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for isiNdebele text