IT University of Copenhagen

The IT University of Copenhagen's Repository
    9607 research outputs found

    Chaudhry, Annam Bashir

    No full text

    Wagenen, Noah van

    No full text

    Rosenkvist, Sebastian

    No full text

    Nampeera, Christine

    No full text

    Flexible I/O for Database Management Systems with xNVMe

    No full text
    Today, NVMe SSDs cover a diverse family of devices (e.g., Zoned Namespaces, Flexible Data Placement, and Key-Value SSDs) and offer high performance (microsecond-scale latency). To leverage the capabilities of these devices, a variety of I/O paths are available (e.g., libaio, io_uring, and SPDK). On the other hand, to avoid the challenges and unpredictability that come with writing code to target such diversity, most data systems today still rely on the conventional filesystem APIs (POSIX) and synchronous I/O. While this choice may increase programmer productivity, it leads to sub-optimal utilization of modern NVMe storage. To unify the diverse storage I/O paths and make them accessible to a wider range of programmers, Samsung built xNVMe, which exposes a single message-passing API with minimal overhead. This paper takes the next step and integrates xNVMe into a state-of-the-art database system, DuckDB, by creating a new filesystem extension, nvmefs, that interacts with blocks on disk instead of files. We demonstrate that xNVMe integration allows DuckDB to utilize IO Passthru, SPDK, and Flexible Data Placement. Using these modern I/O methods, compared to DuckDB’s default sync I/O, nvmefs achieves either comparable performance for non-I/O-intensive cases or up to 50% lower query times on I/O-intensive queries.

    Polarization in an Evolving Social Media Landscape: Multimodal Insights from the Climate Debate

    Full text link
    Polarization of opinions in contemporary societies, particularly on social media, has become a central concern in academic and institutional settings, especially because of its association with misinformation, communicative breakdowns, and toxic exchanges. Although this phenomenon has been widely examined, two aspects of polarization require further attention. The first aspect concerns the fact that polarization in public debates is often shaped by underlying social cleavages that bring into confrontation groups that differ in their socioeconomic backgrounds, belief systems, moral frameworks, and even psychological orientations. These substantial divergences can in turn shape the way in which each side communicates and presents its views. As these communicative differences manifest, they can exacerbate hostility and undermine opportunities for constructive dialogue. This makes it essential to systematically analyse these communicative divergences in order to understand how they shape both the articulation of positions and the audience reactions they give rise to. A second aspect involves the shifting architecture of contemporary platforms. Social media environments are increasingly characterized by the proliferation of visual formats, which now play a central role in the production, circulation, and reception of political messages. This visual turn reconfigures both the affordances available to users and the dynamics through which political meaning is produced and contested, with inevitable implications for the ways polarized conflicts emerge and persist online. Under these conditions, in order to keep pace with the current evolution of digital platforms, we need analytical strategies that allow us to capture the role of visual formats in shaping online debates. To examine these two dimensions in a concrete research setting, the dissertation takes the climate change debate as its empirical focus. This issue represents a salient contemporary cleavage, marked by sustained public attention, pronounced stakeholder divisions, and extensive use of multimodal communication. Within this context, the dissertation advances a twofold contribution. Methodologically, it introduces computational tools for analysing multimodal communication in social-media settings, including automated ideological detection from textual content and semantic categorization of visual material. Empirically, it shows that the two sides of the climate debate diverge in both their textual communication styles and the visual themes they propagate, and that these discursive differences are reflected in systematic variations in audience engagement.

    Scalable Approximate Nearest Neighbour Algorithms for machine learning and data mining

    Full text link
    In this thesis we explore different uses of techniques for approximate nearest neighbor search (ANN search). Finding a set of close points is necessary for many diverse tasks such as clustering and outlier detection, but finding the exact closest points with guarantees can be expensive, often with quadratic running time. Considering the increasing size of data available, this is a problem that will only get more relevant as time goes on. ANN search can be used to speed up the process of finding near neighbors at the cost of accuracy by introducing a degree of approximation. We introduce implementations of algorithms using Locality Sensitive Hashing (LSH) and Hierarchical Navigable Small Worlds (HNSW) for ANN search for different tasks and explore the tradeoffs they provide between accuracy and running time. The main contributions of the thesis are:
    • We introduce an implementation of an LSH-based algorithm for approximate DBSCAN clustering. Fast algorithms for DBSCAN clustering exist in the literature, but they often lack theoretical guarantees or rely on the dataset being low-dimensional to work optimally. The proposed algorithm was compared to other state-of-the-art DBSCAN algorithms. We show that while other algorithms failed in some settings for different reasons, ours was consistently either the fastest implementation or performing competitively.
    • We explore the benefits and limitations of HNSW graphs for outlier detection. We evaluate direct implementations that compute the kNN and LOF scores from approximate neighborhoods as well as white-box methods. The white-box methods compute outlier scores directly from the underlying graph and serve as an example that breaks new ground toward more task-aware tools based on ANN search techniques.
    • We present three algorithms for approximate Single-Linkage clustering. As with DBSCAN, the issue lies in the quadratic scaling of the problem. All three utilize HNSW graphs to varying degrees of approximation. We explore their respective tradeoffs in accuracy vs. running time and compare how they perform against exact implementations. One of the algorithms in particular explores recent methods using heap-of-searchers for incremental ANN search.

    Helsted, Emilia

    No full text

    Kordes, Tom Jonas

    No full text

    Kastrup, Søren

    No full text

    4,472 full texts
    9,607 metadata records
    Updated in last 30 days.