281 research outputs found
Evaluating LLM-Generated Legal Explanations for Regulatory Compliance in Social Media Influencer Marketing
The rise of influencer marketing has blurred boundaries between organic content and sponsored content, making the enforcement of legal rules relating to transparency challenging. Effective regulation requires applying legal knowledge with a clear purpose and reason, yet current detection methods of undisclosed sponsored content generally lack legal grounding or operate as opaque ``black boxes.'' Using 1,143 Instagram posts, we compare gpt-5-nano and gemini-2.5-flash-lite under three prompting strategies with controlled levels of legal knowledge provided. Both models perform strongly in classifying content as sponsored or not (F1 up to 0.93), with Gemini favouring recall (0.93) and GPT favouring precision (0.95), though performance drops by over 10 points on ambiguous cases. We further develop a taxonomy of reasoning errors, showing frequent citation omissions (28.57%), unclear references (20.71%), and hidden ads exhibiting the highest miscue rate (28.57%). While adding regulatory text to the prompt improves explanation quality, it does not consistently improve detection accuracy. The contribution of this paper is threefold. First, it makes a novel addition to regulatory compliance technology by providing a taxonomy of common errors in LLM-generated legal reasoning to evaluate whether automated moderation is not only accurate but also legally robust, thereby advancing the transparent detection of influencer marketing content. Second, it features an original dataset of LLM explanations annotated by two students who were trained in influencer marketing law. Third, it combines quantitative and qualitative evaluation strategies for LLM explanations and critically reflects on how these findings can support advertising regulatory bodies in automating moderation processes on a solid legal foundation
LegalLens Shared Task 2024:Legal Violation Identification in Unstructured Text
This paper presents the results of the Legal-Lens Shared Task, focusing on detecting legal violations within text in the wild across two sub-tasks: LegalLens-NER for identifying legal violation entities and LegalLens-NLI for associating these violations with relevant legal contexts and affected individuals. Using an enhanced LegalLens dataset covering labor, privacy, and consumer protection domains, 38 teams participated in the task. Our analysis reveals that while a mix of approaches was used, the top-performing teams in both tasks consistently relied on fine-tuning pretrained language models, outperforming legalspecific models and few-shot methods. The topperforming team achieved a 7.11% improvement in NER over the baseline, while NLI saw a more marginal improvement of 5.7%. Despite these gains, the complexity of legal texts leaves room for further advancements
LegalLens Shared Task 2024:Legal Violation Identification in Unstructured Text
This paper presents the results of the Legal-Lens Shared Task, focusing on detecting legal violations within text in the wild across two sub-tasks: LegalLens-NER for identifying legal violation entities and LegalLens-NLI for associating these violations with relevant legal contexts and affected individuals. Using an enhanced LegalLens dataset covering labor, privacy, and consumer protection domains, 38 teams participated in the task. Our analysis reveals that while a mix of approaches was used, the top-performing teams in both tasks consistently relied on fine-tuning pretrained language models, outperforming legalspecific models and few-shot methods. The topperforming team achieved a 7.11% improvement in NER over the baseline, while NLI saw a more marginal improvement of 5.7%. Despite these gains, the complexity of legal texts leaves room for further advancements
Utilizing Longitudinal Data to Build Decision Trees for Profile Building and Predicting Eating Behavior
AbstractIn this paper a framework for warning people when they are at risk of unhealthy eating is presented. Data is collected trough a mo- bile application called “ThinkSlim” which was developed for the purpose of studying eating behavior using Ecological Momentary Assessment (EMA) principles. Data is converted in order to allow early prediction of healthy and unhealthy eating events and a decision tree algorithm taking into account the longitudinal structure of the dataset is utilized to predict healthy versus unhealthy eating events. Rules that are derived from this decision tree are used to cluster users to groups based on the rule triggering frequen- cies. Groups created are used for providing users with semi-tailored feedback and are analyzed providing useful insights regarding the conditions that lead to unhealthy eating among different participants allowing for building different eating profiles
SeqAttack: On Adversarial Attacks for Named Entity Recognition
Named Entity Recognition is a fundamental task in information extraction and is an essential element for various Natural Language Processing pipelines. Adversarial attacks have been shown to greatly affect the performance of text classification systems but knowledge about their effectiveness against named entity recognition models is limited. This paper investigates the effectiveness and portability of adversarial attacks from text classification to named entity recognition and the ability of adversarial training to counteract these attacks. We find that character-level and word-level attacks are the most effective, but adversarial training can grant significant protection at little to no expense of standard performance. Alongside our results, we also release SeqAttack, a framework to conduct adversarial attacks against token classification models (used in this work for named entity recognition) and a companion web application to inspect and cherry pick adversarial examples
Massive Open Online Courses Temporal Profiling for Dropout Prediction
Massive Open Online Courses (MOOCs) are attracting the attention of people all over the world. Regardless the platform, numbers of registrants for online courses are impressive but in the same time, completion rates are disappointing. Understanding the mechanisms of dropping out based on the learner profile arises as a crucial task in MOOCs, since it will allow intervening at the right moment in order to assist the learner in completing the course. In this paper, the dropout behaviour of learners in a MOOC is thoroughly studied by first extracting features that describe the behavior of learners within the course and then by comparing three classifiers (Logistic Regression, Random Forest and AdaBoost) in two tasks: predicting which users will have dropped out by a certain week and predicting which users will drop out on a specific week. The former has showed to be considerably easier, with all three classifiers performing equally well. However, the accuracy for the second task is lower, and Logistic Regression tends to perform slightly better than the other two algorithms. We found that features that reflect an active attitude of the user towards the MOOC, such as submitting their assignment, posting on the Forum and filling their Profile, are strong indicators of persistence
Adapting End-to-End Speech Recognition for Readable Subtitles
Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investigate a cascaded system, where an unsupervised compression model is used to post-edit the transcribed speech. We then compare several methods of end-to-end speech recognition under output length constraints. The experiments show that with limited data far less than needed for training a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities. Furthermore, the best performance in terms of WER and ROUGE scores is achieved by explicitly modeling the length constraints within the end-to-end ASR system
Exploring the Context of Recurrent Neural Network based Conversational Agents
Conversational agents have begun to rise both in the academic (in terms of research) and commercial (in terms of applications) world. This paper investigates the task of building a non-goal driven conversational agent, using neural network generative models and analyzes how the conversation context is handled. It compares a simpler Encoder-Decoder with a Hierarchical Recurrent Encoder-Decoder architecture, which includes an additional module to model the context of the conversation using previous utterances information. We found that the hierarchical model was able to extract relevant context information and include them in the generation of the output. However, it performed worse (35-40%) than the simple Encoder-Decoder model regarding both grammatically correct output and meaningful response. Despite these results, experiments demonstrate how conversations about similar topics appear close to each other in the context space due to the increased frequency of specific topic-related words, thus leaving promising directions for future research and how the context of a conversation can be exploited.</p
"Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings
With the vast development and employment of artificial intelligence applications, research into the fairness of these algorithms has been increased. Specifically, in the natural language processing domain, it has been shown that social biases persist in word embeddings and are thus in danger of amplifying these biases when used. As an example of social bias, religious biases are shown to persist in word embeddings and the need for its removal is highlighted. This paper investigates the state-of-the-art multiclass debiasing techniques: Hard debiasing, SoftWEAT debiasing and Conceptor debiasing. It evaluates their performance when removing religious bias on a common basis by quantifying bias removal via the Word Embedding Association Test (WEAT), Mean Average Cosine Similarity (MAC) and the Relative Negative Sentiment Bias (RNSB). By investigating the religious bias removal on three widely used word embeddings, namely: Word2Vec, GloVe, and ConceptNet, it is shown that the preferred method is ConceptorDebiasing. Specifically, this technique manages to decrease the measured religious bias on average by 82,42%, 96,78% and 54,76% for the three word embedding sets respectively.</p
LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on Color
Designing a logo is a long, complicated, and expensive process for any designer. However, recent advancements in generative algorithms provide models that could offer a possible solution. Logos are multi-modal, have very few categorical properties, and do not have a continuous latent space. Yet, conditional generative adversarial networks can be used to generate logos that could help designers in their creative process. We propose LoGAN: an improved auxiliary classifier Wasserstein generative adversarial neural network (with gradient penalty) that is able to generate logos conditioned on twelve different colors. In 768 generated instances (12 classes and 64 logos per class), when looking at the most prominent color, the conditional generation part of the model has an overall precision and recall of 0.8 and 0.7 respectively. LoGAN''s results offer a first glance at how artificial intelligence can be used to assist designers in their creative process and open promising future directions, such as including more descriptive labels which will provide a more exhaustive and easy-to-use system
- …
