SADiLaR Language Resource Repository
536 research outputs found
NCHLT Setswana FLAIR-forward embeddings
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Setswana text
NCHLT Afrikaans FLAIR-forward embeddings
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Afrikaans text
NCHLT isiZulu FLAIR-forward embeddings
Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for isiZulu text
NCHLT isiNdebele RoBERTa language model
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and not fine-tuned for any downstream process. The model can be used either as a masked LM or as an embedding model to provide real-valued vectorised representations of words or string sequences for isiNdebele text
NCHLT Sesotho RoBERTa language model
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and not fine-tuned for any downstream process. The model can be used either as a masked LM or as an embedding model to provide real-valued vectorised representations of words or string sequences for Sesotho text
NCHLT Sepedi RoBERTa language model
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and not fine-tuned for any downstream process. The model can be used either as a masked LM or as an embedding model to provide real-valued vectorised representations of words or string sequences for Sepedi text
NCHLT Siswati word2vec-Skipgram embeddings
Static word embeddings for the Skipgram flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector representations for Siswati text
NCHLT isiNdebele GloVe embeddings
Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations for isiNdebele text
NCHLT isiZulu fastText-Skipgram embeddings
Static word and subword embeddings for the Skipgram flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for isiZulu text
NCHLT Siswati word2vec-CBOW embeddings
Static word embeddings for the continuous bag of words (CBoW) flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector representations for Siswati text
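Several entries above distinguish the Skipgram and CBOW flavours of word2vec. As a rough illustration of the difference, the sketch below builds the training examples each flavour would see for a tokenised sentence: Skipgram predicts each context word from the centre word, whereas CBOW predicts the centre word from its surrounding context. The function names, the window size, and the placeholder tokens are assumptions made for illustration; this is not the training code used to build the NCHLT resources.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (centre, context) pairs: one pair per context word."""
    pairs = []
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Return (context_words, centre) examples: one example per centre word."""
    examples = []
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:  # skip single-token sentences with no context
            examples.append((context, centre))
    return examples


if __name__ == "__main__":
    # Placeholder tokens; any tokenised sentence would do.
    sentence = ["w1", "w2", "w3", "w4"]
    print(skipgram_pairs(sentence, window=1))
    print(cbow_examples(sentence, window=1))
```

With the same window, Skipgram yields more (and smaller) training examples than CBOW, which is one reason the two flavours tend to behave differently on rare words.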