Recommended from our members
AI-Generated Summaries for Course Selection
Many university students use course evaluation guides to select courses. However, these guides
do not present course feedback in a way conducive to the course selection process; many offer lists of written
comments without providing tools for students to easily analyze these data. A simple improvement to such
guides would be the inclusion of AI-generated summaries of the comments. This paper implements a summarization
tool for the Harvard Course Evaluation Guide which efficiently summarizes feedback comments
through few-shot prompting of ChatGPT with a focus on capturing the overall quality, instructor quality,
and workload of each course. Using summaries generated in this way, ChatGPT is better able to rate the
qualities of a course than random sampling, using summaries generated through zero-shot prompting, and
using the verbatim first five feedback comments. A user study investigating the difference between course
selection using feedback comments and using summaries of the comments generated by the summarization
tool did not find statistically significant differences. However, summaries might improve
understanding of a course’s workload, and qualitative feedback suggested AI-generated summaries offer distinct
advantages, especially in terms of cost. Therefore, AI-generated summaries cannot replace feedback
comments, but tangibly improve the course selection process.
Computer Science
Socrates Sim: A Dialog Simulation Framework to Support Task Completion Dialog Research
In this thesis, we propose an end-to-end dialog simulation framework, called Socrates Sim, to support task completion dialog research. The goal of the framework is to provide a set of tools that simulate conversations between a user simulator and a dialog agent in order to evaluate the performance of the dialog agent and generate annotated data. Specifically, the Socrates Sim framework allows the researcher to define custom dialog domains, build user simulators, and run multiple simulations with a provided dialog agent. To demonstrate the flexibility of the framework to generalize to new domains, we implement end-to-end simulations for the restaurant recommendation and movie booking use cases. The framework is implemented in Python 3.6 and made available on GitHub (https://github.com/dhairyadalal/socrates).
Software Engineering
Analyzing Easy Data Augmentation Techniques for Text Classification
In natural language processing, text classification is the task of assigning a category to a given text example. Text classification has a variety of applications ranging from automated processing of customer reviews to spam detection. Current state-of-the-art approaches for text classification tasks use neural language models. These models are resource-intensive, requiring large amounts of labeled training data. However, training data may not always be available in large quantities, especially for low-resource languages, and labeled data is often laborious to obtain. Consequently, it is desirable to understand the factors contributing to text classification models' performance. I address several questions about which factors contribute to the high performance achieved by the current state-of-the-art neural models. To do so, I analyze traditional and neural methods for a diverse range of text classification tasks. I study various properties such as model assumptions and word vector representations to determine the effect of each of these features on text classification performance. On the best-performing models identified through this analysis, I evaluate the existing data augmentation techniques for text classification proposed by Wei and Zou (2019), which perform simple text editing operations to generate new training examples. However, such existing data augmentation techniques require external datasets or knowledge about the semantic properties of words. To address this limitation, I propose and assess a novel length-based method that does not require external linguistic knowledge. This method replaces words with other words of similar length, as word length closely reflects the average information content and conceptual complexity of words in English (Piantadosi, Tily, and Gibson, 2011; Lewis and Frank, 2016). I demonstrate that this length-based technique yields consistent gains for several of the evaluated text classification tasks.
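The length-based replacement idea described in this abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the `replace_prob` and `tolerance` parameters, and the choice to draw replacements uniformly from a caller-supplied vocabulary are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_length_index(vocabulary):
    """Group vocabulary words by character length (hypothetical helper)."""
    index = defaultdict(list)
    for word in vocabulary:
        index[len(word)].append(word)
    return index


def length_based_augment(sentence, length_index, replace_prob=0.1,
                         tolerance=0, rng=None):
    """Return a copy of `sentence` in which some words are swapped for
    vocabulary words whose length differs by at most `tolerance` characters.

    Assumption: replacements are drawn uniformly at random; the abstract
    does not specify the sampling scheme.
    """
    rng = rng or random.Random()
    augmented = []
    for word in sentence.split():
        if rng.random() < replace_prob:
            # Collect every vocabulary word within the allowed length window.
            candidates = [
                w
                for delta in range(-tolerance, tolerance + 1)
                for w in length_index.get(len(word) + delta, [])
                if w != word
            ]
            if candidates:
                word = rng.choice(candidates)
        augmented.append(word)
    return " ".join(augmented)
```

A call such as `length_based_augment("the cat sat", build_length_index(["cat", "dog", "sun", "tree"]), replace_prob=1.0)` returns a sentence with the same number of tokens, each the same length as the word it replaced. No external lexical resource is consulted, which is the property the abstract emphasizes.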
Examining the Authenticity of Plato’s Epistle VII through Deep Learning
Plato’s Epistle VII, a text in which the famous Athenian philosopher describes his political involvement in the affairs of 4th-century B.C.E. Syracuse, has long been considered dubious by classical philologists. In particular, scholars have scrutinized two sections of the letter, in the first of which Plato gives political advice contrary to claims made in his other works, and in the second of which Plato digresses from his political narrative to discuss a philosophical doctrine known as the Theory of Forms. Specifically, some scholars have raised the possibility of textual interpolation, whereby inauthentic passages might have been added to an otherwise authentic text.
This paper applies computational methodology from deep learning to provide further insight into this long-standing problem in Platonic scholarship. To that end, I developed a bidirectional long short-term memory (LSTM) recurrent neural network (RNN) with trainable word embeddings to classify units of roughly 100 words of Ancient Greek text as belonging to Plato or one of six other Ancient Greek prose authors. Given Ancient Greek’s rich morphology, special care was taken to formulate an optimal pre-processing approach: of the four methods tested, namely plaintext, lemmatization, byte-pair encoding (BPE), and a lemmatization-BPE ensemble, the ensemble exhibited the highest test accuracy (89.28%), improving significantly upon a Naïve Bayes baseline model (70.93%). Applied to Epistle VII, this model reveals that the letter seems mostly authentic, except for two markedly more spurious sections, one of which corresponds nearly perfectly with the boundaries of the section consisting of political advice to the Sicilians. Such a result provides further support for the pre-existing claim that this section is an interpolation by a non-Platonic author within an otherwise Platonic text.
Causal Mediation Analysis Reveals Syntactic Agreement Mechanisms in Neural Language Models
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. Although this is well established, the mechanisms by which neural language models achieve syntactic agreement are still not well understood. As a remedy, this thesis applies causal mediation analysis to pre-trained neural language models to locate model components and discover mechanisms responsible for predicting correctly inflected verbs. In particular, we investigate the magnitude of models’ grammatical inflection preferences and compare which neurons process subject-verb agreement across sentences with different syntactic structures. In our results, we uncover both similarities and differences across architectures and model sizes, and get a glimpse of the within-model mechanisms that produce number agreement. Notably, we learn that larger models do not necessarily learn stronger preferences, we observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence, and we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.
Rich Linguistic Structure from Large-Scale Web Data
The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.
Engineering and Applied Sciences
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
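The two counting schemes compared in this abstract can be sketched as follows. This is an illustrative reading, not the study's procedure: the data layout (each citing paper as a list of cited works, each cited work as an ordered author list), the function name `cocitation_counts`, and the rule that a pair is counted once per citing paper are assumptions. Under this reading, co-authors of the same cited work also count as co-cited when more than one author is considered.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations across a collection of citing papers.

    citing_papers: list of reference lists; each reference is an ordered
        list of author names for one cited work (assumed layout).
    max_authors: how many leading authors of each cited work to consider.
        1 gives first-author counting; 5 gives the first-five variant.
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited in this citing paper under the chosen scheme.
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


papers = [
    [["A", "B"], ["C"]],   # citing paper 1 cites works by (A, B) and (C)
    [["A"], ["C", "D"]],   # citing paper 2 cites works by (A) and (C, D)
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With the toy data above, first-author counting records only the pair (A, C), whereas the first-five variant also credits B and D, which is why the two schemes can produce different author groupings.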
Interactive AI to Support Human-Human Communication
Important foundations of our society, such as healthcare, education, and productivity, typically rely on effective communication between humans.
Human-human communication in such settings is often challenging, as it requires advanced communication skills that are not available to everyone.
This dissertation argues that systems that leverage models or data about communication can be used to ultimately improve communication.
Through two main kinds of studies, this dissertation characterizes challenges when modeling communication from data, as well as when applying these approaches, and it formalizes the problem in such settings.
The dissertation introduces systems to model spoken and written communication.
It further defines recommendation systems that identify patterns in communication and provide suggestions to people on how to improve their communication.
The dissertation also presents designs, implementations and evaluations of systems based on the communication models in the domains of productivity, social media conversations, healthcare, and video broadcasting.
The results of experiments evaluating these mechanisms show that, compared to current practice, communication models generate new insights, and our AI-human interfaces lead to improved outcomes.
The main implication of this dissertation is that the design of AI algorithms and user interfaces impacts how people communicate with each other.
Importantly, technology makes teaching communication skills more accessible, democratizing skills that were only available to experts.
Engineering and Applied Sciences - Computer Science
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
