    A Multi-Language Comparison of Influences on Author Verification using Character N-Grams

    We create a new multi-language corpus for author verification based on Wikipedia talk pages, and evaluate the influence that differences in topic and time have on character n-gram author profiles. Topic alignment between two texts is found to increase author verification precision, and an author's writing style is found to change over time, but not significantly more after 3 years than after 1 year.
    Information Architecture; WIS; Electrical Engineering, Mathematics and Computer Science
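
As a rough illustration of what a character n-gram author profile can look like, the sketch below counts overlapping character trigrams and compares two profiles with cosine similarity. The function names, the choice of n = 3, the use of cosine similarity, and the 0.5 threshold are all assumptions made for this example; the abstract does not say which profile representation or distance measure the thesis uses.

```python
import math
from collections import Counter


def char_ngram_profile(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(p: Counter, q: Counter) -> float:
    """Cosine similarity between two n-gram count profiles."""
    # A Counter returns 0 for n-grams it has not seen.
    dot = sum(count * q[gram] for gram, count in p.items())
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def same_author(known: str, unknown: str, n: int = 3,
                threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: accept if the profiles are similar enough."""
    similarity = cosine_similarity(char_ngram_profile(known, n),
                                   char_ngram_profile(unknown, n))
    return similarity >= threshold
```

In a setup like this, the threshold would normally be tuned on held-out text pairs, and topic or time differences between the two texts would show up as shifts in the similarity scores.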

    The Design and Implementation of a Key Performance Indicator Dashboard for KE-chain

    KE-works is a six-year-old company that aims to optimise the product development process in industrial applications. To accomplish this, KE-works deploys a web application called KE-chain. KE-chain is an engineering workflow management system whose objective is to increase the efficiency of the product development process through better control and more efficient distribution, access and use of product-related information. Users can set up a project, manage the tasks belonging to this project, and control the workflow and information distribution. With KE-chain, users are able to create structure in the heap of information that composes their product and, when it is used correctly, improve their project development process. One of the key elements in optimising the product development process is monitoring the available data to give users insight into the status of the project. Currently it is difficult to get a good overview of a project within KE-chain, and it is not possible to see which tasks are critical at a given moment. A common way of showing the status or performance of systems is the use of Key Performance Indicators (KPIs). These indicators, for example in the form of a graph or a table, can quickly give information about the performance of a system. KE-works has decided that it wants to give its users an overview in the form of a project-specific dashboard with KPI widgets. The assignment is therefore to design and develop a KPI dashboard integrated into KE-chain. To design the KPI dashboard, which we named KE-board, we briefly researched the field of Performance Measurement to get an overview of the different approaches to the design of KPIs. As a basis for the design we adopted the Lean methodology [1], which has been used by KE-works in the past. In our research we connected the Lean wastes to measures in KE-chain. To do this, we chose a bottom-up approach: we started by identifying the available data, from which we extracted several groups of measures. We interviewed several clients of KE-works, the users of KE-chain. From these interviews we deduced which groups of measures were important for which user roles. To verify which measures are important for these dashboards, we surveyed and interviewed the consultants of KE-works. By combining the results of the interviews and the questionnaires we designed 7 KPI widgets. Finally, we created KE-board and integrated it into KE-chain in five weeks of implementation. After that we evaluated the complete dashboard by interviewing the consultants of KE-works. In addition, we sent them a questionnaire in which they rated the functionality of the widgets, to see whether the widgets serve their purpose and achieve the goals that we set for them. KE-board has been received well by the management and employees of KE-works, and based on the extensive evaluation we can state that it contributes to the optimisation of the product development process in KE-chain.
    Computer Science; Electrical Engineering, Mathematics and Computer Science

    Bursting Filter Bubbles With Serendipity

    When talking about personalization online, Google CEO Eric Schmidt recently said "it will be very hard for people to watch or consume something that has not in some sense been tailored for them." This level of personalized filtering of content has worried academics and activists. Many argue that users will be trapped in a so-called "Filter Bubble," limiting their exposure to challenging or new ideas they are not expected to like. In answer to these worries, serendipitous recommendation systems have been developed to help users make surprising and pleasant discoveries outside of their typical online content sphere. In this thesis, we investigate whether serendipitous recommendation systems do indeed help break the filter bubble effect. In contrast to previous work, we investigate this question in a large-scale live experiment. We find that serendipity partially mitigates the filter bubble effect, but that users are more responsible for their own filter bubbles than algorithms are. Further, we search for user characteristics that can be used to identify users more likely to experience the filter bubble effect. We find that users who spend more time on the site are less likely to experience the filter bubble effect.
    Electrical Engineering, Mathematics and Computer Science; Software Technology

    Tweet-Based Election Prediction

    Twitter is a microblogging service that handles more than 500 million messages a day. Scholars have been using Twitter to monitor people's reactions to political activities such as debates and campaigns. Some of them claim that an election outcome can be forecast in this way. Using data from the 2014 Indonesian presidential election, we calculate predictions with many different parameters. Our analysis of the prediction results shows the importance of a proper data collection method, spam removal, sentiment detection on the tweets, and data normalization using demographic information. Although the approach looks very promising, our results show that result prediction is not applicable to every election. When the data is divided into 33 provinces, it suggests that applying the methodology to provinces with a small dataset leads to inaccurate predictions.
    Information Architecture; Computer Science; Electrical Engineering, Mathematics and Computer Science

    Suggesting Queries using Query-Flow Graphs to find Dutch Content with Curated Tags

    One of the standard features of today's major Web search engines is query suggestion, which aids users in formulating their search queries. Over the years, a number of different approaches have been proposed that have commonly been evaluated in the standard Web search setting. In this thesis, we build a query suggestion pipeline based on query log data collected from a more constrained environment which, though also large-scale, differs considerably from standard Web search with respect to its users, indexing process and Web coverage. We implement a number of suggestion approaches based on query-flow and term-query graph models and investigate the extent to which we can replicate the results in the literature in this more constrained environment. In the process, we investigate the implementability and replicability of published Web-based query-log approaches and experiments. We find that it is possible to apply the query suggestion techniques to a constrained environment, but that a trade-off between suggestion usefulness and query coverage is introduced when considering suggestion effectiveness.
    Information Architecture; Web Information Systems; Electrical Engineering, Mathematics and Computer Science
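
To make the idea of a query-flow graph more concrete, here is a minimal sketch that builds one from session logs and suggests the most frequent follow-up queries. The session format (a list of query strings per session), the function names, and the plain transition-count weighting are assumptions for this example; the abstract does not describe the actual pipeline or weighting scheme.

```python
from collections import Counter, defaultdict


def build_query_flow_graph(sessions):
    """Map each query to a Counter of queries that directly followed it."""
    graph = defaultdict(Counter)
    for session in sessions:
        # Pair every query with the one issued right after it in the session.
        for current, following in zip(session, session[1:]):
            if current != following:  # skip immediate repeats of the same query
                graph[current][following] += 1
    return graph


def suggest(graph, query, k=3):
    """Return up to k follow-up queries, most frequent first."""
    if query not in graph:
        return []  # query never seen with a follow-up: no coverage
    return [candidate for candidate, _ in graph[query].most_common(k)]
```

The empty result for unseen queries hints at the coverage side of the trade-off the abstract mentions: a graph built from a smaller, constrained log can only make suggestions for queries it has already observed.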

    Streaming Fraud Detection on Session Based Data

    Within the banking world, financial fraud is a major source of expenses. Improving the timely detection of fraud is an ongoing cat-and-mouse game between financial institutions and criminal organisations. To stay at the forefront of fraud detection, new approaches must constantly be developed. This research taps into the large amount of data produced by the systems surrounding the actual financial transactions and attempts to use this data to react to events as quickly as possible. This means that the system developed is able to receive an incoming stream of data, find relations in this data, classify instances as either fraudulent or not, and return this information. The proposed algorithm ranks second when compared with three techniques from peer research on a publicly available, annotated intrusion detection dataset that is widely used in research on this topic. The comparable cost of our algorithm is 0.141, whereas the costs for our peers are 0.058, 0.254 and 0.376. On 20,000 real-world cases, 0.16% (32) were flagged as anomalies; expert review identified six of these as suspicious, three of which were to be investigated. These results show the viability of our research for the problem at hand: the timely detection of financial fraud in an ongoing stream of events.
    Web Information Systems; Software Technology; Electrical Engineering, Mathematics and Computer Science

    Search Functionality for StamboomNederland (Zoekfunctionaliteit StamboomNederland)

    Commissioned by the company 42, a new search functionality has been developed for the StamboomNederland system, which 42 developed on behalf of the Centraal Bureau voor Genealogie (CBG). The system contains genealogical data for maintaining family trees. The current search functionality does not work properly, hence the assignment for a new one; for this new search functionality it was decided to develop a system based on query segmentation. Query segmentation means that the user enters a search query in an ordinary search bar, after which the query is split into segments, or parts, which are then assigned labels by a purpose-built algorithm. The algorithm was designed on the basis of research into existing projects and applied to the existing StamboomNederland system, for which further research was carried out using query logs and user questionnaires. This was done to see how users would carry out searches given the query segmentation search system. The final system was evaluated in a user test, which gave positive results compared with the old search functionality.
    Software technology; Computer Science; Electrical Engineering, Mathematics and Computer Science

    The Learning Tracker: A Learner Dashboard that Encourages Self-regulation in MOOC Learners

    Massive Open Online Courses (MOOCs) have the potential to make quality education affordable and available to the masses and reduce the gap between the most privileged and the most disadvantaged learners worldwide. However, this potential is overshadowed by low completion rates, often below 15%. Because learning with a MOOC requires a high level of autonomy, the literature identifies limited self-regulated learning skills as one of the causes of early dropout in MOOCs. Moreover, existing tools designed to aid learners in the online learning environment fail to provide the support needed for the development of such skills. The aim of the present work is to bridge this gap by investigating how self-regulated learning skills can be enhanced by encouraging metacognition and reflection in MOOC learners by means of social comparison. To this end, following an iterative process, we have developed the Learning Tracker, an interactive widget which allows learners to visualise their learning behaviour and compare it to that of previous graduates of the same MOOC. Each iteration was extensively evaluated in live TU Delft MOOCs running on the edX platform, engaging over 20,000 MOOC learners over the whole duration of each MOOC. Our results show that learners who have access to the Learning Tracker are more likely to graduate from the MOOC. Moreover, we have observed that the widget has a positive impact on learners' engagement and reduces procrastination. However, we have little evidence that learners improved their self-regulated learning skills by the end of the MOOCs. Based on our results, we argue that the mere fact of receiving feedback on a limited number of learning habits could trigger self-reflection in learners and lead to improved learner performance. This work underlines the powerful effect that feedback and self-reflection on one's behaviour have on learning performance. We recommend that future research investigate learners' feedback literacy and devise effective ways of presenting learners with personalised feedback based on their goals, learning skill level and cultural background.
    Electrical Engineering, Mathematics and Computer Science; Software Technology

    Using Github Profiles in Software Developer Recruitment and Hiring

    Social coding platforms can provide an initial understanding of the skills exhibited by the developers on these platforms. In contexts where candidates' social profile information is useful for recruiting software developers, recruiters with some software knowledge can leverage the information about developers on these platforms. However, recruiters have to put considerable effort into inferring a developer's skills from social coding platforms. In this thesis, we investigate how to provide recruiters with relevant information about software developer capabilities on a social coding platform. We used GitHub as our social coding platform for this purpose. We explored which attributes to use to indicate the skills exhibited by a developer on GitHub. We also investigated GitHub as a source of potential software developer candidates by recommending GitHub developer profiles based solely on the skill set requirements of job advertisements. Our results indicate that the generated developer skill profiles have a set of attributes that, when combined, validly indicate the three skills concerned as exhibited by a software developer on GitHub. However, the generated profile was only slightly preferred by the technical recruiters, because the profile was complex to understand and incomplete. In the investigation of recommending developer profiles to suit job advertisement requirements, our recommendation strategy could only achieve a precision of 0.39 on average and a Normalized Distance Based Performance Measure (NDPM) ranking accuracy value of 0.43 on average.
    Software Technology; Computer Science; Electrical Engineering, Mathematics and Computer Science
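
For readers unfamiliar with NDPM, the sketch below computes it under the commonly cited definition: over all item pairs where the reference ranking has a strict preference, count the pairs the system orders the opposite way (contradictory) and the pairs the system leaves tied, then return (2 * contradictory + tied) / (2 * preferred pairs). A value of 0 means the system ranking agrees fully with the reference and 1 means it is fully reversed. The score dictionaries and the function name are assumptions for this example; the abstract does not state how the thesis derived its reference rankings.

```python
from itertools import combinations


def ndpm(reference_scores: dict, system_scores: dict) -> float:
    """NDPM between a reference ranking and a system ranking (higher score = better)."""
    contradictory = 0   # system orders the pair opposite to the reference
    tied_by_system = 0  # reference has a preference, system does not
    preferred = 0       # pairs where the reference has a strict preference
    for a, b in combinations(reference_scores, 2):
        ref_diff = reference_scores[a] - reference_scores[b]
        if ref_diff == 0:
            continue  # pairs tied in the reference are ignored
        preferred += 1
        sys_diff = system_scores[a] - system_scores[b]
        if sys_diff == 0:
            tied_by_system += 1
        elif (ref_diff > 0) != (sys_diff > 0):
            contradictory += 1
    if preferred == 0:
        raise ValueError("reference ranking contains no strict preferences")
    return (2 * contradictory + tied_by_system) / (2 * preferred)
```

Under this definition a system that ties every pair scores 0.5, so the reported average of 0.43 sits slightly on the better side of that midpoint.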