Journal of Digital Information (Texas Digital Library - TDL E-Journals)

252 research outputs found

Building a DDC-annotated Corpus from OAI Metadata

Author: Lösch Mathias
Waltinger Ulli
Horstmann Wolfram
Mehler Alexander
Publication venue: Texas Digital Library
Publication date: 29/04/2011

Document servers complying with the standards of the Open Archives Initiative (OAI) are a rich, yet seldom exploited source of textual primary data for research fields in text mining, natural language processing or computational linguistics. We present a bilingual (English and German) text corpus consisting of bibliographic OAI records and the associated full texts. A particular added value is that we annotated each record with at least one Dewey Decimal Classification (DDC) number, inducing a subject-based categorization of the corpus. By this means, it can be used as training data for machine learning-based text categorization tasks in digital libraries, but also as a primary data source for linguistic research on academic language use related to specific disciplines. We describe the construction of the corpus using data from the Bielefeld Academic Search Engine (BASE), as well as its characteristics

Diversity and Interoperability of Repositories in a Grid Curation Environment

Author: Aschenbrenner Andreas
Enke Harry
Fischer Thomas
Ludwig Jens
Publication venue: Texas Digital Library
Publication date: 29/04/2011

IT-based research environments with an integrated repository component are increasingly important in research. While grid technologies and their relatives used to draw most attention, the e-Infrastructure community is now often looking to the repository and preservation communities to learn from their experiences. After all, trustworthy data-management and concepts to foster the agenda for data-intensive research are among the key requirements of researchers from a great variety of disciplines. The WissGrid project aims to provide cross-disciplinary data curation tools for a grid environment by adapting repository concepts and technologies to the existing D-Grid e-Infrastructure. To achieve this, it combines existing systems including Fedora, iRODS, DCache, JHove, and others. WissGrid respects diversity of systems, and aims to improve interoperability of the interfaces between those systems

Institutional Repositories, Long Term Preservation and the changing nature of Scholarly Publications

Author: Doorenbosch Paul
Sierman Barbara
Publication venue: Texas Digital Library
Publication date: 24/05/2011

In Europe over 2.5 million publications of universities and research institutions are stored in institutional repositories. Although institutional repositories make these publications accessible over time, a repository does not have the task to preserve the content for the long term. Some countries have developed an infrastructure dedicated to sustainability. The Netherlands is one of those countries. The Dutch situation could be regarded as a successful example of how long term preservation of scholarly publications is organised through an open access environment. In this article it will be explained how this infrastructure is structured, and some preservation issues related to it will be discussed. This contribution is based on the long term preservation studies into Enhanced Publications, performed in the FP7 project DRIVER II (2007-2009). The overall conclusion of the DRIVER studies about long term preservation is that the issues are rather of an organisational nature than of a technical one. The nature of publications in scholarly communication is changing. Enhanced Publications and Collaborative Research Environments are new phenomena in scholarly communication using the wide range of possibilities of the digital environment in which researchers and their audience act. This rapidly changing digital environment also affects long term preservation archives. Raising awareness of long term preservation in the research community is important because researchers are responsible for public dissemination of their research output and need to understand their role in the life cycle of the digital object. Researchers should be aware that constant curation and preservation actions must be undertaken to keep the research results fit for verification, reuse, learning and history over time

Recruiting Content for the Institutional Repository: The Barriers Exceed the Benefits

Author: Troll Covey Denise
Publication venue: Texas Digital Library
Publication date: 18/04/2011

Focus groups conducted at Carnegie Mellon reveal that what motivates many faculty to self-archive on a website or disciplinary repository will not motivate them to deposit their work in the institutional repository. Recruiting a critical mass of content for the institutional repository is contingent on increasing awareness, aligning deposit with existing workflows, and providing value-added services that meet needs not currently being met by other tools. Faculty share concerns about quality and the payoff for time invested in publishing and disseminating their work, but disagree about metrics for assessing quality, the merit of disseminating work prior to peer review, and the importance of complying with publisher policies on open access. Bridging the differences among disciplinary cultures and belief systems presents a significant challenge to marketing the institutional repository and developing coherent guidelines for deposit

Preserving repository content: practical tools for repository managers

Author: Pickton Miggie
Morris Debra
Meece Stephanie
Coles Simon
Hitchcock Steve
Publication venue: Texas Digital Library
Publication date: 29/04/2011

The stated aim of many repositories is to provide permanent open access to their content. However, relatively few repositories have implemented practical action plans towards permanence. Repository managers often lack time and confidence to tackle the important but scary problem of preservation. Written by, and aimed at, repository managers, this paper describes how the JISC-funded KeepIt project has been bringing together existing preservation tools and services with appropriate training and advice to enable repository managers to formulate practical and achievable preservation plans. Three elements of the KeepIt project are described: 1. The initial, exploratory phase in which repository managers and a preservation specialist established the current status of each repository and its preservation objectives; 2. The repository-specific KeepIt preservation training course which covered the organisational and financial framework of repository preservation; metadata; the new preservation tools; and issues of trust between repository, users and services; 3. The application of tools and lessons learned from the training course to four exemplar repositories and the impact that this has made. The paper concludes by recommending practical steps that all repository managers may take to ensure their repositories are preservation-ready

Rich Internet Publications: "Show What You Tell"

Author: Breure Leen
Voorbij Hans
Hoogerwerf Maarten
Publication venue: Texas Digital Library
Publication date: 10/03/2011

The journal article is still the basis of scholarly communication. This genre, however, largely adheres to the rules of the printed publication and does not meet the requirements of this age of digital Web publishing. Today we do not need to restrict ourselves any longer to communicating the results of the research process only. We can also allow readers to inspect the underlying data online, to publish their own comments and, using a variety of multimedia content, to be witness to intermediary stages of the scientific discovery process. This development has stimulated the transformation of the conventional article: when published in a digital format, it is more and more enhanced with data sets, photos, videos, interactive maps and animations; these enhancements affect its structure and layout. A variety of new publication formats is appearing, some of which can no longer be adequately described as simply "enhanced" publications. They are rather to be conceived as a new genre, for which we propose the term Rich Internet Publication (RIP), analogous to the well-known concept of Rich Internet Application. Both share features of information integration, visualization and exploration (i.e. non-linear reading), typical for hypermedia products. RIPs do not constitute a sharply delimited category, but are part of a broad spectrum, which starts with regular enhanced publications closely resembling their printed counterparts, and ends with high-quality multimedia presentations having more in common with Web applications than with the conventional journal article. We distinguish two subcategories: RIP type I is primarily based on a linear text, but fully integrated with multimedia content and tools to access and analyze data, while RIP type II is more image-driven, has a user interface with more graphic elements and encourages explorative, non-linear reading. The production of enhanced publications and RIPs is not yet a straightforward process.
It requires extra effort from the author, which is currently insufficiently rewarded. This may change when funding agencies get more interested in research products that go beyond the level of textual publications. Dedicated tools for construction of RIPs are equally important, which requires consensus on architecture and infrastructure. Development of these tools could fit in with the recently started research line of adding semantic metadata to object-based enhanced publications. Moreover, the creation of a RIP will rely on the author's basic competencies of e-scholarship. When authors start creating RIPs on a larger scale, the process of exchanging and preserving them has to be supported. Usually, a RIP is not a single static file, which can be downloaded and attached to an email, but a set of related components. Preserving the content's integrity will be a major concern

Intermediary schemas for complex XML applications: an example from research information management

Author: Gartner Richard
Publication venue: Texas Digital Library
Publication date: 03/06/2011

The complexity and flexibility of some XML schemas can make their implementation difficult in working environments. This is particularly true of CERIF, a standard for the interchange of research management information, which consists of 192 interlinked XML schemas. This article examines a possible approach of using 'intermediary' XML schemas, and associated XSLT stylesheets, to make such applications easier to employ. It specifically examines the use of an intermediary schema, CERIF4REF, which was designed to allow UK Higher Education institutions to submit data for a national periodic research assessment exercise in CERIF. The wider applicability of this methodology, particularly in relation to the METS standard, is also discussed

Generic Adaptation Framework: a Process-Oriented Perspective

Author: Knutov Evgeny
De Bra Paul
Pechenizkiy Mykola
Publication venue: Texas Digital Library
Publication date: 10/03/2011

Adaptive Hypermedia Systems (AHS) have long been mainly represented by domain- or application-specific systems. Few reference models exist and they provide only a brief overview of how to describe and organize the 'adaptation process' in a generic way. In this paper we consider the process aspects of AHS from the very first classical 'user modelling-adaptation' loop to a generic detailed flowchart of the adaptation in AHS. We introduce a Generic Adaptation Process and by aligning it with a layered (data-oriented) AHS architecture we show that it can serve as the process part of a new reference model for AHS

Curation Micro-Services: A Pipeline Metaphor for Repositories

Author: Abrams Stephen
Cruse Patricia
Kunze John
Minor David
Publication venue: Texas Digital Library
Publication date: 29/04/2011

The effective long-term curation of digital content requires expert analysis, policy setting, and decision making, and a robust technical infrastructure that can effect and enforce curation policies and implement appropriate curation activities. Since the number, size, and diversity of content under curation management will undoubtedly continue to grow over time, and the state of curation understanding and best practices relative to that content will undergo a similar constant evolution, one of the overarching design goals of a sustainable curation infrastructure is flexibility. In order to provide the necessary flexibility of deployment and configuration in the face of potentially disruptive changes in technology, institutional mission, and user expectation, a useful design metaphor is provided by the Unix pipeline, in which complex behavior is an emergent property of the coordinated action of a number of simple independent components. The decomposition of repository function into a highly granular and orthogonal set of independent but interoperable micro-services is consistent with the principles of prudent engineering practice. Since each micro-service is small and self-contained, they are individually more robust and collectively easier to implement and maintain. By being freely interoperable in various strategic combinations, any number of micro-services-based repositories can be easily constructed to meet specific administrative or technical needs. Importantly, since these repositories are purposefully built from policy neutral and protocol and platform independent components to provide the function minimally necessary for a specific context, they are not constrained to conform to an infrastructural monoculture of prepackaged repository solutions. 
The University of California Curation Center has developed an open source micro-services infrastructure that is being used to manage the diverse digital collections of the ten-campus University system and a number of non-university content partners. This paper provides a review of the conceptual design and technical implementation of this micro-services environment, a case study of initial deployment, and a look at ongoing micro-services developments

Digital Whistleblowing in Restricted Environments

Author: Bell Graeme Baxter
Publication venue: Texas Digital Library
Publication date: 03/06/2011

The exposure of an organisation’s illegal or unethical practices is often known as whistleblowing. It is currently a high-profile activity as a consequence of whistleblowing websites such as Wikileaks. However, modern digital fingerprinting technologies allow the identification of the human users associated with a particular copy of a leaked digital file. Fear of such discovery may discourage the public from exposing illegal or unethical practices. This paper therefore introduces the novel whistleblower-defending problem, a unique variant of the existing document-marking and traitor-tracing problems. It is addressed here by outlining practical steps that real-world whistleblowers can take to improve their safety, using only standard desktop OS features. ZIP compression is found to be useful for indirect file comparison, in cases where direct file comparison or use of checksums is impossible, inconvenient or easily traceable. The methods of this paper are experimentally evaluated and found to be effective

0 full texts

252 metadata records

Updated in last 30 days.
