Saturday, February 4, 2012

(Controlled Vocabularies + Authority Files)*Software=Interoperability

All of this week’s readings demonstrated the apparent need for authority control as part of cataloguing best practices. The work of creating and maintaining authority files using controlled vocabularies or a thesaurus is, as far as Gorman is concerned, imperative for achieving 100% precision and 100% recall in information retrieval within large databases. However, not everyone is interested in achieving perfection, and so the Dublin Core metadata terms, even though they offer only the less-than-satisfactory results Gorman describes for free-text searching of the Web, are sufficient for many purposes.
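(A quick refresher on the jargon Gorman leans on: precision is the fraction of retrieved records that are actually relevant, and recall is the fraction of relevant records in the collection that get retrieved:

    precision = relevant retrieved / total retrieved
    recall    = relevant retrieved / total relevant in the collection

His 100%/100% ideal means every relevant record is found and nothing irrelevant comes back, which is precisely what uncontrolled free text cannot promise.)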

It was also apparent that each type of community, be it a library, archive or museum, has its own cataloguing requirements for bibliographic records, shaped by the needs of its own users. Salo’s piece on the quality of metadata produced by harvesting tools in institutional repositories brought into focus the difficulty that uncontrolled name forms create when trying to collocate the scholarly articles of a chosen author. Salo lays the blame at two very different doorsteps: the lack of standardization in institutional repository metadata, which rarely applies authority control mechanisms, and repository software whose design does nothing to help resolve authority problems.
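To make the collocation problem concrete, here is a minimal sketch (in Python, with an invented authority file and invented records, not anything from Salo’s article) of how mapping uncontrolled name strings to one authorized heading lets an author’s articles file together:

    # A toy authority file: each authorized heading lists the
    # uncontrolled variants seen in harvested repository metadata.
    AUTHORITY_FILE = {
        "Salo, Dorothea": ["D. Salo", "Salo, D.", "Dorothea Salo"],
    }

    # Invert it for lookup: variant form -> authorized heading.
    VARIANT_TO_HEADING = {
        variant: heading
        for heading, variants in AUTHORITY_FILE.items()
        for variant in variants
    }

    def collocate(records):
        """Group harvested records under authorized headings."""
        grouped = {}
        for record in records:
            name = record["creator"]
            # Fall back to the raw string when no authority record exists.
            heading = VARIANT_TO_HEADING.get(name, name)
            grouped.setdefault(heading, []).append(record["title"])
        return grouped

    harvested = [
        {"creator": "D. Salo", "title": "Article one"},
        {"creator": "Dorothea Salo", "title": "Article two"},
    ]
    print(collocate(harvested))
    # Both articles now file under "Salo, Dorothea"; without the
    # authority file they would scatter across two name strings.

The lookup is the trivial part; compiling and maintaining the authority file is the real labour, which is why Salo points a finger at both the metadata practices and the software.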

I could not find the reading listed in the syllabus (Lanzi, E. (1998). Standards: What role do they play? What, why and how of vocabularies. In Introduction to Vocabularies: Enhancing Access to Cultural Heritage Information, ed. E. Lanzi. Los Angeles, CA: Getty Information Institute, pp. 8-27), but I did find another piece on controlled vocabularies from the Getty Information Institute that made a strong argument for using community-oriented vocabulary in conjunction with authority files for better retrieval. Tillett mentioned one of the Getty’s thesauri, the Union List of Artist Names (ULAN), which the museum uses to control the name variations of an entity. At the Getty, precision and recall for the non-expert searcher are improved by a software interface that draws on the controlled lists to suggest terms for the individual to use. The same sort of program is at work in a Google search.
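As a rough illustration of that kind of interface (my own guess at the mechanism, sketched in Python with an invented slice of the vocabulary, not the Getty’s actual software), a suggestion feature can be as simple as prefix-matching the searcher’s keystrokes against the controlled list:

    # A tiny, invented slice of a controlled list of artist names;
    # ULAN itself is far richer, with IDs, variants, and biographies.
    CONTROLLED_LIST = [
        "Caravaggio, Michelangelo Merisi da",
        "Cranach, Lucas, the elder",
        "Cranach, Lucas, the younger",
    ]

    def suggest(partial, vocabulary, limit=5):
        """Offer controlled terms that start with what the user typed."""
        partial = partial.lower()
        matches = [term for term in vocabulary
                   if term.lower().startswith(partial)]
        return matches[:limit]

    print(suggest("cran", CONTROLLED_LIST))
    # ['Cranach, Lucas, the elder', 'Cranach, Lucas, the younger']

The searcher picks an authorized term instead of guessing at a spelling, which raises both precision and recall without requiring any expertise.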

Authority control is only part of the solution. Software design that increases interoperability is another part. Together they will decrease the reality behind the quip “Garbage In, Garbage Out”.

Libraries-Metadata-Interoperability


I enjoyed reading Robert Darnton’s The Library in the New Age; his emphasis on the one continuity in the nature of information, namely its instability, was an interesting perspective. We cannot, however, divert our attention from the fact that the technology used to digitally organize and preserve this unstable text is itself rapidly changing. The flux of technology is the crux of the problem. This was in fact the fourth point of Darnton’s argument, where he observed that Google may disappear or be overshadowed by another company or technology, rendering the digital data inaccessible. It did the heart of this self-taught bookbinder good to hear Darnton say, “The best preservation system ever invented was the old-fashioned, pre-modern book.” But as they say about a lot of modern stuff, “They just don’t make them like they used to.”

The article on metadata sharing across the information disciplines by Elings and Waibel (Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums) reinforced the need for standards when creating data structures and organizing data content, which is what allows data sharing and aggregation across libraries, archives and museums. I was reminded of the different view museums take of information in their vision of engaging the public in the discovery of cultural materials through their collections. As Elings and Waibel point out, museums do not just describe materials for search and retrieval; they organize interpretations of objects that lend to the authenticity of the objects in their collections, and then reach out to the general public, ultimately to bring patrons into the museum.

A few years ago, while gathering information for an undergraduate research project on museum conservation practices, I experienced the fruits of database interoperability first-hand when I found an online project co-sponsored by the Getty’s conservation department and the Courtauld Gallery in the UK. The Getty was doing conservation work on a couple of Lucas Cranach the Elder paintings for the Courtauld Gallery. One museum’s database held the conservation information; the other held the historical information. Together they developed an amazing inside look into the mysteries of art conservation and introduced possibly thousands of people to cultural treasures they may never be able to view in person: information from two different databases brought together by a fascinating interface that allowed self-directed investigation. I'm all for information professionals striving to bring continuity to information organization and retrieval.
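That Getty/Courtauld project is exactly the kind of thing interoperability buys. As a back-of-the-envelope sketch (in Python, with made-up field names and records, not either museum’s actual schema), joining two databases depends on nothing more exotic than both sides agreeing on a shared key, here a common object identifier:

    # Records from two hypothetical databases describing the same objects.
    conservation_db = {
        "obj-001": {"treatment": "varnish removal and retouching",
                    "year": 2007},
    }
    curatorial_db = {
        "obj-001": {"artist": "Lucas Cranach the Elder",
                    "title": "Adam and Eve"},
    }

    def merge_records(*databases):
        """Aggregate per-object records that share an identifier."""
        merged = {}
        for db in databases:
            for obj_id, fields in db.items():
                merged.setdefault(obj_id, {}).update(fields)
        return merged

    print(merge_records(conservation_db, curatorial_db))
    # One combined record per object: an interface like the museums'
    # can then present conservation and historical views side by side.

Agreeing on that key, and on what the fields mean, is where the standards Elings and Waibel describe come in; by comparison, the interface work is the easy part.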