Monday 21 February 2011

"Ownership" of MARC 21 records - please comment

The case for open bibliographic data has been well made. The Open Knowledge Foundation's Bibliographic Working Group has established a set of Open Bibliographic Principles. The JISC Resource Discovery Taskforce has itself produced a comprehensive Open Bibliographic Data Guide, examining reasons for publishing through use cases and the wider context of open data elsewhere within the UK.

With such principles and use cases firmly established, one remaining barrier to publishing open data lies in establishing the 'ownership' of a record, so that, as far as a library is aware, no existing license agreements with record vendors are breached.

COMET's initial document on "Ownership" of MARC 21 records is designed to help identify where MARC 21 encoded metadata originates and to assist in establishing its provenance.

The documentation and underlying investigation were carried out by Hugh Taylor, Head of Collection Description and Development at Cambridge University Library. Hugh is as familiar as anyone with the vast and varied dataset at the University Library. Given the size and scope of our data, the issues and examples raised will hopefully be of use to anyone else considering publishing open data.

This guide is something of a work in progress, which we will revisit as COMET progresses. Next up is a brief summary of relevant licenses, aiming to provide an overview of what is and is not permitted with the array of data we have.

We would welcome feedback in the comments below.

Tuesday 15 February 2011

Welcome

Welcome to the COMET (Cambridge Open Metadata) project blog. COMET is a collaboration between Cambridge University Library and CARET, University of Cambridge, funded under the JISC Infrastructure for Resource Discovery programme.

COMET will release a large subset of bibliographic data from Cambridge University Library catalogues as open metadata under a Public Domain Dedication License. It will also explore and test a number of technologies and methodologies for publishing XML/RDF.

COMET aims to build upon the successes of previous work in this area.

The library has previously contributed a dataset of 132,130 bibliographic records to the JISC-funded Open Bibliography project led by the Unilever Centre for Molecular Science Informatics at the University of Cambridge, in partnership with the Open Knowledge Foundation and the International Union of Crystallography.

This collaboration began to develop our understanding of the intellectual property and technical issues relating to the exposure of bibliographic data, and of the potential value in linking that data.

COMET will have a particular focus on bibliographic data derived from library catalogues, aiming to provide the University of Cambridge and the wider academic community with a readily accessible RDF store for bibliographic data. The development and installation work behind this will be documented in such a way as to be repeatable by others.

We will also investigate and document the availability of metadata for the library’s collections which can be released openly in machine-readable formats and the barriers which prevent other data from being exposed in this way.

The project will also explore the value of a linked approach to enrichment of records using services provided by OCLC to assign FAST (Faceted Application of Subject Terminology) and VIAF (Virtual International Authority File) headings to the metadata, allowing the development of innovative services for information retrieval and resource discovery.

You can find detailed information regarding COMET on our about page, including full aims, objectives and expected outputs, as well as a project plan.