Journal of Digital Information (Texas Digital Library - TDL E-Journals)
252 research outputs found
Studying Social Tagging and Folksonomy: A Review and Framework
This paper reviews research into social tagging and folksonomy (as reflected in about 180 sources published through December 2007). Methods of researching the contribution of social tagging and folksonomy are described, and outstanding research questions are presented. This is a new area of research, where theoretical perspectives and relevant research methods are only now being defined. This paper provides a framework for the study of folksonomy, tagging and social tagging systems. Three broad approaches are identified, focusing first on the folksonomy itself (and the role of tags in indexing and retrieval); second, on tagging (and the behaviour of users); and third, on the nature of social tagging systems (as socio-technical frameworks)
Advanced Information Access to Parliamentary Debates
Parliamentary debates are highly structured transcripts of meetings of politicians in parliament. These debates are an important part of the cultural heritage of many countries; they are often free of copyright; citizens often have a legal right to inspect them; and several countries make great efforts to digitize their entire historical collections and make them available to the general public. This provides many opportunities for the Information Retrieval community. In this paper, we analyze the structure of parliamentary proceedings and sketch a widely applicable DTD. We show how proceedings in PDF format can be transformed into deeply nested XML. Having the proceedings in XML makes a wide range of applications possible. We elaborate on five applications: entry point retrieval; advanced content and structure search; automatic creation of tables of contents and hyperlinked navigation menus; graphical result aggregation; and large savings on storage space and bandwidth for scanned documents
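As a rough illustration of why deeply nested XML helps with entry point retrieval, the sketch below builds a small nested tree and selects individual speeches as retrieval units. The element and attribute names (`proceedings`, `topic`, `speech`, `speaker`) and both helper functions are hypothetical; they are not taken from the DTD the paper describes.

```python
import xml.etree.ElementTree as ET


def build_debate(date, topics):
    """Build a nested tree: proceedings > topic > speech > p.

    `topics` is a list of (title, speeches) pairs, where `speeches`
    is a list of (speaker, text) pairs. All element names here are
    illustrative assumptions, not the paper's actual DTD.
    """
    root = ET.Element("proceedings", {"date": date})
    for title, speeches in topics:
        topic = ET.SubElement(root, "topic", {"title": title})
        for speaker, text in speeches:
            speech = ET.SubElement(topic, "speech", {"speaker": speaker})
            ET.SubElement(speech, "p").text = text
    return root


def entry_points(root, speaker):
    """Return every speech element by the given speaker.

    Each returned element is a candidate entry point: a search result
    can link to the speech itself rather than to the whole document.
    """
    return root.findall(f".//speech[@speaker='{speaker}']")


debate = build_debate(
    "2008-01-15",
    [
        ("Budget", [("Smith", "Opening remarks."), ("Jones", "A reply.")]),
        ("Education", [("Smith", "A second speech.")]),
    ],
)
smith_speeches = entry_points(debate, "Smith")
```

Because each speech is its own element, the same tree could also drive a generated table of contents by walking the `topic` elements in order.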
Extending Domain-Specific Resources to Enable Semantic Access to Cultural Heritage Data
Cultural heritage material often contains rich semantic information, which can be utilised for alternative forms of information access beyond keyword searching and browsing by subject categories. In order to provide such functionality it is desirable to annotate all the material in a collection with named entities and their relationships so that all the collection is available for semantic search. In this paper, we examine issues involved with automatic semantic annotation of information about artists from Tate Online using a pre-existing domain-specific structured resource (ULAN). In particular, we focus on extending ULAN's coverage of artists and their associated semantic properties (e.g. birth/death date, birth/death location) by applying focused crawling and automatic information extraction techniques to exploit semi-structured sources of information. This enables the cross-referencing of collections against a range of information sources, thereby improving visibility and end-user information access
The ALOCOM Framework: Towards Scalable Content Reuse
This paper presents a framework that enables flexible content reuse. Unlike the usual practice where document components, such as images, definitions, text fragments, tables or diagrams, are assembled manually through copy-and-paste, the framework enables on-the-fly access and reuse. Retrieval of relevant components is enabled by automatic decomposition of legacy documents and storage of individual components, enriched with metadata. Furthermore, the automatic assembly of these components in mainstream authoring tools is supported. The paper describes the framework and its current support for re-assembling PowerPoint, Wikipedia and SCORM components in authoring tools. In addition, an evaluation is presented that aims to assess the effectiveness and efficiency of such content reuse for presentations
Digital (Library Services) and (Digital Library) Services
This paper is an exploration of digital library services, in both possible senses: services provided digitally by physical libraries, and services provided by digital libraries. Services, regardless of the environment in which they are provided, break down into services performed on materials (technical services) and services provided to individual users and communities of users (public services). Both traditional and new services are discussed as a means for exploring the question of what a library service is. Value is proposed as the concept unifying all library services. Libraries are called upon to experiment with providing new services, and to study users’ perceptions of value and methods of value creation
Developing and Sustaining the Northwest Digital Archives
The Northwest Digital Archives (NWDA) is a union database of Encoded Archival Description (EAD) finding aids from institutions in Washington, Oregon, Idaho, Alaska, and Montana. The purpose of the NWDA is to make information about collections of primary sources widely accessible to researchers over the internet. This article will explore the selection and development of the NWDA database, the creation of tools to associate digital objects with EAD/XML documents, and the methods employed by the technical staff at the Washington State University Libraries to expose metadata contained in the NWDA to Google (and aggregators) with Google Sitemaps and the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Through the analysis of web site use statistics, we realized that most users are locating individual NWDA finding aids directly from Google and other search engines. Although this was not entirely surprising, its implications, combined with three rounds of usability testing, led to major revisions of the NWDA website and finding aid stylesheet. The article concludes with a discussion of the model developed to sustain the NWDA after the end of National Endowment for the Humanities funding
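For readers unfamiliar with OAI-PMH, harvesting begins with a plain HTTP request whose query string names a protocol verb and a metadata format. The sketch below only constructs such a request URL. The base URL is a placeholder and the helper function is hypothetical; neither comes from the NWDA article. `ListRecords` and `oai_dc` are standard OAI-PMH values.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb="ListRecords",
                      metadata_prefix="oai_dc", set_spec=None):
    """Return an OAI-PMH request URL for the given repository.

    `base_url` is the repository's OAI-PMH endpoint (a placeholder
    here). `set_spec` optionally restricts harvesting to one set.
    """
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, for illustration only.
url = build_oai_request("https://example.org/oai")
```

An aggregator would fetch this URL and parse the XML response; that step is omitted here because it depends on the repository.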
Tagging tagging. Analysing user keywords in scientific bibliography management systems
In this paper, an empirical study of tagging behaviour in web-based bibliographic annotation systems is presented. Starting from an initial category-finding phase, in which tags attributed to selected articles from Connotea were classified, we set up a category model for linguistic and functional aspects of tag usage as well as for the relationship between tags and document full text. In a second phase, this model is applied to approximately 500 tagged articles from the information and computer technology domain, randomly selected from Connotea. Our findings show significant differences from other tagging research, which was primarily conducted using popular (non-scientific) tagging platforms such as Flickr or Delicious. We observe a great overlap of tag material and document text and rather few non-content-related tags. The comparison of user tags with author keywords shows that users tend to use fewer and more general tags. Finally, system functionality seems to play a role in users’ tagging behaviour
Nine questions to guide you in choosing a metadata schema
This article is a guide for collection developers at the point of considering a metadata schema for their digital collection. The nine questions asked in this article will assist developers in clarifying how they want the collection to be organized, described, and used. This article uses examples to illustrate how these questions guided the development of a digital collection built at the University of Southern California
A Review of Organizational Structures of Personal Information Management
Personal information management (PIM) covers a large area of research fragmented into separate sub-areas such as file management, web bookmark organization, and email management. Consequently, it is hard to obtain a unified view of the various approaches to PIM developed in these different sub-areas. In this article, we synthesize and classify existing research on PIM based on the approach used to organize information items. We classify the organizational structures into five categories: hierarchical, flat, linear, spatial, and network. We discuss the strengths and weaknesses of each structure along with examples showing how to deal with the weaknesses. Finally, we provide design recommendations and a framework for researchers to experiment with various ideas for developing novel PIM tools