SADiLaR Language Resource Repository
    536 research outputs found

    NCHLT Siswati fastText-CBoW embeddings

    Static word and subword embeddings for the continuous bag of words (CBoW) flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for Siswati text

    NCHLT Sesotho GloVe embeddings

    Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations for Sesotho text

    NCHLT Siswati FLAIR-forward embeddings

    Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Siswati text

    NCHLT Sesotho FLAIR-forward embeddings

    Contextual word/string embeddings for the forward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Sesotho text

    Afrikaans lexical blends dataset

    This is a dataset of Afrikaans blend constructions, collected and analysed using the Levenshtein distance metric. The dataset serves as the basis for the paper "Analysing Afrikaans lexical blends using Levenshtein distances", presented at the 4th Afrikaans Grammar Workshop (AGW) in 2023. Note: the dataset, R code and citation will be added as soon as the full paper has been published in the AGW conference proceedings.

    NCHLT Xitsonga FLAIR-backward embeddings

    Contextual word/string embeddings for the backward flavour of the FLAIR architecture (Akbik et al., 2018). The embedding provides real-valued vector representations for Xitsonga text

    NCHLT Siswati fastText-Skipgram embeddings

    Static word and subword embeddings for the Skipgram flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for Siswati text

    NCHLT Afrikaans GloVe embeddings

    Static word embedding model based on the Global Vectors architecture (Pennington et al., 2014). The embeddings provide real-valued vector representations for Afrikaans text

    NCHLT Tshivenḓa word2vec-CBOW embeddings

    Static word embeddings for the continuous bag of words (CBoW) flavour of the word2vec (w2v) architecture (Mikolov et al., 2013). The embedding provides real-valued vector representations for Tshivenḓa text

    NCHLT isiNdebele fastText-CBoW embeddings

    Static word and subword embeddings for the continuous bag of words (CBoW) flavour of the fastText architecture (Bojanowski et al., 2017). The embedding provides real-valued vector representations for isiNdebele text

    8 full texts, 536 metadata records. Updated in last 30 days.