Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI)
455 research outputs found
National Diet Library Data for Open Knowledge and Community Empowerment
The National Diet Library (NDL) has been promoting the use of the data it creates and provides on the Internet since it established its "Policy of providing databases created by the National Diet Library." The NDL provides bulk download of open datasets and takes part in public events related to open data and civic technology, which has increased the visibility of NDL data in communities throughout Japan. The NDL also organizes ideathons and hackathons to promote its data and services. These outreach activities have resulted in a number of interesting and potentially useful initiatives. This presentation will demonstrate the NDL's efforts and achievements in promoting the use of its data, while showcasing some of the best civic-driven applications and visualizations of library data.
Why Build Custom Categorizers Using Boolean Queries Instead of Machine Learning? Robert Wood Johnson Foundation Case Study
This presentation will cover a case study for using Boolean queries to scope custom categories, provide a Boolean query syntax primer, and then present a step-by-step process for building a Boolean query categorizer. The Robert Wood Johnson Foundation (RWJF) is the largest philanthropy dedicated solely to health in the United States. Taxonomy Strategies has been working with RWJF to develop an enterprise metadata framework and taxonomy to support needs across areas including program management, research and evaluation, communications, finance, etc. We have also been working with RWJF on methods to apply automation to support taxonomy development and implementation within their various information management applications. Machine learning has become a popular and hyped method promoted by large information management application vendors including Microsoft, IBM, Salesforce and others. The benefit is that you don’t need to do any preparation; content just gets processed. The problems are that machine learning is opaque and that the categories are generic, may be irrelevant, can be biased, and are difficult to change or tune. Pre-defined categories (e.g., a controlled vocabulary or taxonomy) plus Boolean queries to scope the context for categories are much more transparent. The benefit is relevant categories. The problem is that pre-defined categories require work to set up, and specialized skills. But how hard is it to do this?
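As a rough illustration of what a Boolean query categorizer can look like, the following Python sketch evaluates AND/OR/NOT rules against the words of a text. The category names, the rules, and the nested-tuple rule format are hypothetical and invented for this example; they are not taken from the RWJF taxonomy or from any particular product's query syntax.

```python
import re

# Hypothetical categories and rules, invented for illustration only.
# A rule is either a plain term or a tuple: ("AND", ...), ("OR", ...), ("NOT", rule).
RULES = {
    "Childhood Obesity": ("AND", ("OR", "child", "children", "youth"), "obesity"),
    "Health Coverage": ("AND", "insurance", ("NOT", "auto")),
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(rule, tokens):
    """Recursively evaluate a Boolean rule against a set of tokens."""
    if isinstance(rule, str):
        return rule in tokens
    op, *args = rule
    if op == "AND":
        return all(matches(arg, tokens) for arg in args)
    if op == "OR":
        return any(matches(arg, tokens) for arg in args)
    if op == "NOT":
        return not matches(args[0], tokens)
    raise ValueError(f"unknown operator: {op}")


def categorize(text, rules=RULES):
    """Return the sorted names of every category whose rule matches the text."""
    tokens = tokenize(text)
    return sorted(name for name, rule in rules.items() if matches(rule, tokens))


print(categorize("Rates of obesity among children are rising"))
print(categorize("Auto insurance premiums went up"))
```

Because each category is tied to an explicit rule, it is possible to see exactly why a document was or was not assigned to it, which is the transparency argument made above. A real categorizer would likely also need phrase matching, stemming, and proximity operators, none of which this sketch attempts.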
Visualizing Library Metadata for Discovery
The benefits of visualization have been discussed widely and it is already implemented in library services. However, use cases for visualization have been mostly focused on collection analysis to improve collection development policies and budget management, not on discovery services that take full advantage of the rich information contained in library catalog records. One of the challenges of working with library catalog records for visualization is the sheer volume of elements (such as control field, data field, subfield, and indicators) and information included in the MAchine-Readable Cataloging (MARC) format records. As is well-known, there are more than 1,900 fields in MARC 21, which is just too many to use for effective visualizations (Moen and Benardino, 2003). In addition, some fields are used for recording the same information; for example, the control field 008 positions 7 to 14 and the subfield $c of the data field 264 are both used for production-related date information. Instead of showing a clear relationship between resources, the large number of elements and duplicated information included in the catalog record may muddle those relationships in any visualization. The question then is which information, in which fields of the MARC 21 format catalog records, should be considered essential for inclusion in library catalog data visualizations for discovery. This paper explores ways to improve discovery service by visualizing selected library data.
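The duplication between field 008 and field 264 $c can be made concrete with a small Python sketch. It assumes the 008 field is available as a plain string and that positions are counted from zero, so positions 7 to 10 hold the first date and positions 11 to 14 the second. The sample values and the function names are hypothetical, and the rule of preferring the 008 date is only one possible choice, not the method proposed in the paper.

```python
import re

# Hypothetical sample values, assembled piece by piece so the positions are visible.
FIELD_008 = (
    "180312"   # positions 00-05: date entered on file
    + "s"      # position 06: type of date
    + "2017"   # positions 07-10: date 1
    + " " * 4  # positions 11-14: date 2 (blank here)
    + "nyu"    # positions 15-17: place of publication
)
SUBFIELD_264C = "[2017]"


def dates_from_008(field_008):
    """Return (date1, date2) from positions 07-14; a blank date becomes None."""
    date1 = field_008[7:11].strip()
    date2 = field_008[11:15].strip()
    return date1 or None, date2 or None


def year_from_264c(subfield_c):
    """Return the first four-digit year found in a 264 $c value, or None."""
    match = re.search(r"\d{4}", subfield_c)
    return match.group(0) if match else None


def production_year(field_008, subfield_c=None):
    """Collapse the two sources into one value, preferring the coded 008 date."""
    date1, _ = dates_from_008(field_008)
    if date1 and date1.isdigit():
        return date1
    return year_from_264c(subfield_c) if subfield_c else None


print(dates_from_008(FIELD_008))
print(production_year(FIELD_008, SUBFIELD_264C))
```

Collapsing the two fields into a single value before plotting is one way to avoid showing the same date twice in a visualization.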
Linking knowledge organization systems via Wikidata
Wikidata is a large collaboratively curated knowledge base, which connects all of the roughly 300 Wikipedia projects in different languages and provides common data for them. Its items also link to more than 1500 different sources of authority information. Wikidata can therefore serve as a linking hub for the authorities and knowledge organization systems represented by these “external identifiers”. In the past, this approach has been applied successfully to rather straightforward cases such as personal name authorities. Knowledge organization systems with more abstract concepts are more challenging due to, e.g., partial overlaps in meaning and different granularities of concepts.
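The linking-hub idea can be sketched in a few lines of Python: if two authority identifiers are attached to the same Wikidata item, the item links them. The property numbers P227 (GND ID) and P244 (Library of Congress authority ID) are real Wikidata properties, but the item IDs and identifier values below are placeholders, and the in-memory dictionary stands in for data that would in practice come from Wikidata itself.

```python
# Placeholder data: item IDs and identifier values are invented for illustration.
ITEMS = {
    "Q-example-1": {"P227": "gnd-0001", "P244": "lc-0001"},
    "Q-example-2": {"P227": "gnd-0002"},  # no P244 value, so no link
    "Q-example-3": {"P227": "gnd-0003", "P244": "lc-0003"},
}


def crosswalk(items, source_prop, target_prop):
    """Map each source identifier to (target identifier, hub item ID).

    Only items that carry both properties contribute a link.
    """
    links = {}
    for item_id, identifiers in items.items():
        if source_prop in identifiers and target_prop in identifiers:
            links[identifiers[source_prop]] = (identifiers[target_prop], item_id)
    return links


for source_id, (target_id, item_id) in crosswalk(ITEMS, "P227", "P244").items():
    print(source_id, "->", target_id, "via", item_id)
```

This sketch only covers the one-to-one case. The partial overlaps and differing granularities mentioned above would need a richer model, for example links that carry a mapping type rather than a bare pair of identifiers.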
Data-Driven Development of the Dewey Decimal Classification
Changes involved in maintaining the Dewey Decimal
Classification (DDC), a general classification system, have derived in the past
from many distinct sources. Without disregarding these sources, the DDC
editorial team is also considering data-driven methods of (1) identifying
existing areas of the DDC warranting further development or (2) identifying
topics with sufficient literary warrant to justify explicit inclusion in the
DDC. The use of two sources of data is under investigation. The first data
source reflects the assignment of recently created Library of Congress Subject
Headings (LCSHs) to resources described in WorldCat records (i.e., LCSHs added
within the past 5 years). The second data source reflects the assignment of
numbers from the current full edition of Dewey to WorldCat records. The topics
and schedule areas identified through these means require investigation to
ascertain if they are viable candidates for further development. Preliminary
work with these data sources reveals that the strategies hold
promise.
Aggregating Metadata from Heterogeneous Pop Culture Resources on the Web
Japanese pop culture resources, such as manga, anime, and
video games, have recently experienced an increase in both their consumption,
and appreciation for their cultural significance. Traditionally seen as solely
recreational resources, the level of bibliographic description by cultural
heritage institutions has not kept up with the needs of users. In seeking to
remedy this, we propose the aggregation of institutional data, and rich hobbyist
data sourced from the web. Focusing on manga, a form of Japanese comic, this
paper discusses classification and aggregation, with the goal of improving
bibliographic description through the use of fan created data. Bibliographic
metadata for manga was collected from the Japanese Agency for Cultural Affairs
media arts database, along with several English-language manga fan websites. The
data was organized into classes to enable property matching across data providers,
and then tested with existing ontologies and aggregation models, namely Europeana
and the Open Archives Initiative’s Object Reuse and Exchange, to determine their
suitability in working with these unique resources. The results show that
existing ontologies may be suitable for use with pop culture materials, but that
new vocabulary terms may need to be created if there is an abundance of granular
data that existing ontologies fail to properly describe. In addition, the
OAI-ORE aggregation method proved to be more promising than EDM when examining
the aggregation of related pop culture resources. The paper discusses these
issues, as well as recommendations for addressing them moving
forward.
The Dutch Art & Architecture Thesaurus Put into Practice: The Example of Anet, Antwerp
Anet is a network of scientific libraries located in
Antwerp, Belgium. Among the connected institutions are research, higher
education and museum libraries. In 2014 it was decided to adopt a new subject
heading system for cataloging the library collections with an art or heritage
scope. The Art & Architecture Thesaurus® (maintained by the Getty Research
Institute) was eventually selected, under the express condition that it can be
used in a flexible way by the libraries. The local subject heading systems
(terminologies) were converted to the new authority environment (Anet-AAT).
Manual mapping was performed because of the different application as subject
heading system and the opportunity to acquaint the librarians with AAT. Future
challenges for the Anet-AAT vocabulary consist of staying updated with changes
that occur in the ‘Mother AAT’ (Getty Vocabularies) and adding to its content to
create more library specific subjects - AAT is presently quite focused on the
description of (museum) objects. The content of AAT is quite well suited for
indexing the special libraries. Nevertheless, the usage by the network did bring
to light issues in the structure of AAT, particularly some concerning the Dutch
translation. The necessity to address these issues has resulted in regular
contacts between Anet and the RKD-Netherlands Institute for Art History that
manages the Dutch translation of the AAT. The adoption of AAT by Anet has
proven to be a promising showcase for the potential of this ‘museum thesaurus’
as a subject heading system for libraries as well.
National Diet Library Dublin Core Metadata Description (DC-NDL): Describing Japanese Metadata and Connecting Pieces of Data
The National Diet Library (NDL) is the sole national
library in Japan. This poster mainly presents the National Diet Library Dublin
Core Metadata Description (DC-NDL), which is a descriptive metadata standard
utilized primarily for converting catalog records of publications held by the
NDL into metadata based on the Dublin Core Metadata Element Set (DCMES) and the
DCMI Metadata Terms. The key functions of the DC-NDL are as follows: (1)
Representing the yomi (pronunciation), one of the characteristics of the
Japanese language, (2) Connectivity with Linked Data especially for URI, and (3)
Compatibility with digitized materials. Furthermore, we describe an example of
implementing DC-NDL for use with NDL Search. To conclude, we point out issues
for future research on the DC-NDL.
Cognitive and Contextual Computing - Laying a Global Data Foundation
A search of the current computing and technology zeitgeist
will not have to look far before stumbling upon references to Cognitive
Computing, Contextual Computing, Conversational Search, the Internet of Things,
and other such buzz-words and phrases. The marketeers are having a great time
coming up with futuristic visions supporting the view of computing becoming all
pervasive and ‘intelligent’. From IBM’s Watson beating human quiz show
contestants, to the arms race between the leading voice-controlled virtual
assistants – Siri, Cortana, Google Now, Amazon Alexa. All exciting and
interesting, but what relevance has this for DCMI, metadata standards, and the
resources we describe using them? In a word, “context”. No matter how
intelligent and human-like a computer is, its capabilities are only as good as
the information it has to work with. If that information is constrained by
domain, industry specialised vocabularies, or a lack of references to external
sources, it is unlikely the results will be generally useful. In the DCMI
community we have expertise in sharing information within our organisations and
on the web. Dublin Core being one of the first widely adopted generic
vocabularies. A path that Schema.org is following and in its breadth of adoption
is now exceeding. Schema.org has been a significant success. Used by over 12
million domains, on over a quarter of sampled pages. It is enabling a quiet
revolution of preparing and sharing data to be harvested into search engine
Knowledge Graphs. Knowledge Graphs that power Rich Snippets, Knowledge Panels,
Answer Boxes, and other search engine enhancements. Whilst delivering on one
revolution, it is helping to lay the foundations of another. Building a global
web of interconnected entities, for intelligent agents to navigate, these
Knowledge Graphs fed by the information we are starting to share generically,
are providing the context that will enable Cognitive, Contextual and associated
technologies to scale globally. Ushering in yet another new technology
era.
Remixing Archival Metadata Project (RAMP) 2.0: Recent Developments and Analysis of Wikipedia Referrals
This presentation covers an analysis of referrals from all
Wikipedia pages created using the Remixing Archival Metadata Project (RAMP)
tool. It will also feature a demo of the tool, and will highlight some of the
recent developments, which include a major overhaul of the interface, more
secure Wikipedia log in, easy upload capabilities, and an effective and
convenient installation process. With this recent development, we are providing
the library community with a tool that is easy to use and install and that
offers a convenient way to share data with other communities on a global
scale.