California Polytechnic State University

DigitalCommons@CalPoly
    41530 research outputs found

    The Boy with the Switch in his Head

    Funeral Clothes

    Boogey Monster

    The Alchemists

    Monster Face

    Forever With My Yellow

    Picking Up the Pieces

    Delusory Dream

    Frequent Itemset Mining with tidyclust in R

    Unsupervised learning is closely associated with clustering, but other methods, such as data mining, also fall under this umbrella. In R, the tidyclust package provides a unified interface for clustering models but lacks support for data mining. This thesis addresses this gap by introducing the Apriori and ECLAT algorithms into tidyclust, with a focus on frequent itemset mining. Unlike traditional clustering models, frequent itemset mining produces groupings of column variables rather than cluster labels or partitions of observations. To address this, a novel clustering approach is proposed: items (columns) are grouped based on their "dominant" frequent itemset. A key contribution is a new prediction method, modeled as a recommender system, to predict missing items. This implementation extends tidyclust to support column-based clustering, with applications in market basket analysis and recommender systems.
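
As an illustration of the frequent itemset mining the abstract refers to, the following is a minimal Python sketch of the Apriori idea: grow candidate itemsets one item at a time and keep only those whose support meets a threshold. It is not the thesis's tidyclust implementation; the function name `apriori`, the `min_support` parameter, and the toy baskets are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Illustrative sketch only; names and interface are hypothetical.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items.
    items = {i for b in baskets for i in b}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Toy market-basket data (made up for this example).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Note that the result is a set of item (column) groupings with their supports, not a cluster label per observation, which is the mismatch with conventional clustering output that the abstract describes.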

    Density-Based and Model-Based Clustering with Tidyclust in R

    Clustering is a fundamental technique in unsupervised learning that can be used to find hidden patterns and structures within unlabeled data. The tidyclust package in R provides a unified interface for applying various clustering techniques to data. This paper outlines the addition of density-based clustering with DBSCAN and model-based clustering with Gaussian mixture models (GMMs) to the tidyclust package. DBSCAN can be performed using the db_clust() function, which uses the dbscan package implementation as its engine. GMMs can be fit using the gm_clust() function, which uses the mclust package implementation. This paper highlights the changes made to these underlying implementations in the process of bringing these methods into tidyclust. These include changes to the model argument names, how the model is fit to data, and how the model is used to predict on future data.
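
To make the density-based method concrete, here is a minimal Python sketch of the general DBSCAN procedure: a point is a core point if enough points lie within a radius of it, clusters grow outward from core points, and points reachable from no core point are noise. It does not reproduce db_clust() or the dbscan package; the names `dbscan`, `eps`, and `min_pts`, the choice to count a point as its own neighbor, and the use of -1 for noise are assumptions of this sketch.

```python
from math import dist

NOISE = -1  # label used for noise points in this sketch (an assumption)


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ...) or NOISE.

    Illustrative sketch only; names and conventions are hypothetical.
    """
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # Point i is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is also a core point, keep growing
    return labels


# Toy 2-D data (made up): two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # → [0, 0, 0, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not supplied up front; it follows from `eps` and `min_pts`, and some points may be assigned to no cluster at all.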

    40,274 full texts
    41,530 metadata records
    Updated in last 30 days.