Minutes of the OCLC Enhance and Expert Community Sharing Session
ALA Annual Conference
Friday, 2015 June 26
10:30 a.m.-12:00 p.m.
Moscone Convention Center, San Francisco
The ALA Annual 2015 edition of Breaking Through: What’s New and Next from OCLC and the compilation of News From OCLC were distributed.
We work on improving the matching algorithms continually. We recently made a major fix to the algorithms for electronic resources, and it is making a big difference: we now find and eliminate many more duplicates among e-resources. Staff meet several times a week to discuss further tweaks and to check whether recent changes have actually improved the process and, if not, why not.
Many Encoding Level 3 records from vendors are so sparse that they are difficult to match. We do have special matching routines for sparse records that try to account for the risks of matching on too little information. OCLC can also use macros to match Encoding Level 3 records by ISBN. Batchloaded records carrying library symbols with notorious reputations can be targeted for special matching attention. If you see library symbols with consistently sparse or bad data, you may report them to AskQC@oclc.org or bibchange@oclc.org.
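The kind of conservative matching described above can be sketched roughly as follows. This is an illustration only: the field names, the title-overlap threshold, and the logic are assumptions for the sketch, not OCLC’s actual matching algorithm.

```python
# Hypothetical sketch of cautious ISBN-based matching for sparse (ELvl 3)
# records. Thresholds and field names are illustrative assumptions.

def normalize_isbn(isbn: str) -> str:
    """Strip hyphens and qualifiers such as '(pbk.)' from an 020 $a value."""
    return "".join(ch for ch in isbn if ch.isdigit() or ch in "xX").upper()

def title_tokens(title: str) -> set:
    """Lowercase word set from a 245 $a, ignoring common punctuation."""
    return {w.strip(".,:;/").lower() for w in title.split() if w.strip(".,:;/")}

def sparse_match(candidate: dict, existing: dict) -> bool:
    """Match only when ISBNs agree AND titles overlap enough to be safe."""
    if normalize_isbn(candidate["isbn"]) != normalize_isbn(existing["isbn"]):
        return False
    a, b = title_tokens(candidate["title"]), title_tokens(existing["title"])
    if not a or not b:
        return False
    overlap = len(a & b) / min(len(a), len(b))
    return overlap >= 0.8  # conservative threshold for sparse data

sparse = {"isbn": "978-0-12-345678-6 (pbk.)", "title": "Advanced cataloging :"}
full = {"isbn": "9780123456786", "title": "Advanced cataloging : a guide"}
print(sparse_match(sparse, full))  # → True
```

The point of the extra title check is the one made above: with too little information, an ISBN alone is risky, so a sparse-record routine demands corroboration before declaring a match.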
We urge you not to create duplicates on purpose (or by accident, for that matter). Rather than create a duplicate, catalogers should use a macro to wipe out the bad content and redo it properly. That is exactly what Enhance and the Expert Community are for. The macro can be found in the Connexion client under Tools/Macros/Manage/OCLC/ClearELvl3Workform; it “Clears candidate fields from an Encoding Level 3 bibliographic record and replaces them with workform prompts.” You are also encouraged to convert the record to RDA.
Two years ago OCLC created a pilot project in which catalogers at four institutions were trained to merge duplicate WorldCat records. Merging of records had previously been done only by OCLC staff and OCLC processes. Although the participants learned to do this very well, the process was much more labor-intensive than expected, both for the OCLC trainers and for the institutional trainees. One institution is still extremely active merging records, but in evaluating the results of the pilot project, OCLC wonders whether they have been worth the effort. Adolfo Tarango (University of California, San Diego) made a plea from the floor that the project not be abandoned but rather expanded, and asked that many more institutions get involved in this work. In his opinion it is well worth doing; although it may cost much time and effort now, it will save much more later on.
In the Connexion client, you can report errors and duplicates directly from the bibliographic record. Use the Action menu and go to Report Error. Just tell us that OCLC Number X is a duplicate record and hit “Report Error.” It’s as simple as that. You could make it even simpler by adding the “ActionReportError” button to your toolbar using Tools/Toolbar Editor.
The definitions of the Encoding Levels are in Bibliographic Formats and Standards at http://www.oclc.org/bibformats/en/fixedfield/elvl.html. There are other details in Chapter 2.4 on “Full, Core, Minimal, and Abbreviated-Level Cataloging” (http://www.oclc.org/bibformats/en/onlinecataloging.html#BCGGBAFC), including a chart comparing the standards for the various levels of cataloging. In BFAS Chapter 5.3 (http://www.oclc.org/bibformats/en/quality.html#databaseenrichment) is a chart of which fields may be added and/or changed under Database Enrichment (and hence, the Expert Community), prefaced with the explanation of which records you may replace using a Full-level authorization, including many PCC records.
Yes, the issue of transactional credits is settled with the move to flat-rate credits. But interestingly, from Fiscal Year 2014 to Fiscal Year 2015, the number of bibliographic records replaced by OCLC member institutions actually rose, in the absence of transactional credits, from slightly over a million replaces to just short of 1.2 million replaces. This includes all member replaces under the Expert Community, Database Enrichment, Minimal-Level Upgrade, Enhance, and CONSER. This is a heartening affirmation of the cooperative spirit of OCLC that has built WorldCat into the unique resource it has been for decades.
At this time, there are no plans to re-evaluate the flat-rate credits.
Institution Records (IRs) are definitely going away; that decision has been made. In the near-decade since the merger of the Research Libraries Group (RLG) and OCLC, which precipitated OCLC’s introduction of IRs to accommodate certain practices from the RLG Union Catalog, the electronic catalogs of many institutions have been made available online. That means that many more IR-equivalent records are freely available for examination on the Web than was the case back then. Local Bibliographic Data (LBD) records are in their relative infancy in terms of WorldCat, and if they don’t currently do what you need them to do, you are encouraged to let OCLC know how those LBD capabilities can be expanded and made more visible. Such requests from OCLC members could have a real impact. Please send your comments and suggestions to IRInfo@oclc.org.
The transition to Resource Description and Access (RDA), the development of the Bibliographic Framework Initiative (BIBFRAME), the availability of WorldCat data as Schema.org Linked Data via WorldCat.org, and the support of the Virtual International Authority File (VIAF) are just a few of the things OCLC is involved in that make bibliographic and authority data more visible on the Web. Visit the OCLC “Data Strategy and Linked Data” page (http://oc.lc/data) to learn more.
Batchloaded records go through extensive preprocessing that attempts to clean them up and correct as many problems as can be identified and fixed. There are certain categories of records on incorrect bibliographic formats that we are able to fix, but obviously, not all such errors can be caught. If you come across records on the wrong format, please report them to bibchange@oclc.org if you are not able to correct them yourself. BFAS 5.1 “Type and BLvl changes” (http://www.oclc.org/bibformats/en/quality.html#CIAFAHJB) outlines which Type and Bibliographic Level changes you should be able to do with your specific authorization level.
You should be able to use that ClearELvl3Workform macro to delete the bad data on the vendor record. There are user-created macros available from the Connexion Client macros page that may help you copy and paste data from a better non-English Language of Cataloging record into the vendor record.
If the textbook itself does not include an explicit edition statement that would distinguish it from a similar but different textbook, an effective option is to apply RDA 2.5.1.4 (or AACR2 1.2B4, the parallel instructions in the other chapters in Part 1, and the associated LCRIs) and supply a bracketed edition statement suitable to the situation.
As of January 2015, about 61.3% of the bibliographic records in WorldCat represented non-English-language materials, with about 38.7% English-language materials. Regarding the Language of Cataloging (field 040 subfield $b), 50.7% of bibliographic records in WorldCat are cataloged in English and 49.3% in languages of cataloging other than English. Non-Latin scripts are represented on about 9.75% (roughly 33.1 million) of the bibliographic records in WorldCat as of June 2015. As far as I’m aware there is no easy way to differentiate records created by institutions in the United States from those created by institutions outside of the United States, if that was the literal intent of the question.
Yes, it does, but remember that records for the same resource cataloged in different languages (specified in field 040 subfield $b) are considered parallel records (see BFAS 3.10, http://www.oclc.org/bibformats/en/specialcataloging.html#BCGBAEHC) rather than duplicates of records cataloged in English. Only records cataloged in the same language can be considered duplicates. Stephen Hearn (University of Minnesota) noted that he corrects tagging errors in UK and other records and asked if this helps with de-duplication; yes, it may.
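The 040 $b test that separates parallel records from duplicate candidates can be sketched as follows. The flat-string field representation and the helper names are assumptions for illustration; the rule itself (records are duplicate candidates only when their language of cataloging matches, and a missing 040 $b is read as English) follows the practice described above.

```python
# Illustrative sketch, not OCLC's implementation: compare the language of
# cataloging (040 $b) before treating two records as duplicate candidates.

def language_of_cataloging(field_040: str) -> str:
    """Extract $b from a flat 040 string like '$a XXX $b eng $c XXX'.
    A missing 040 $b is conventionally read as English ('eng')."""
    for piece in field_040.split("$"):
        if piece.startswith("b"):
            return piece[1:].strip()
    return "eng"

def duplicate_candidates(rec1: dict, rec2: dict) -> bool:
    """Parallel records (different languages of cataloging) are never duplicates."""
    return language_of_cataloging(rec1["040"]) == language_of_cataloging(rec2["040"])

eng = {"040": "$a ABC $b eng $c ABC"}
ger = {"040": "$a GBV $b ger $c GBV"}
print(duplicate_candidates(eng, ger))  # → False (parallel records, per BFAS 3.10)
```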
Currently, bibliographic record access points that are controlled to the LC/NACO Authority File display as hot links that take you to the specific authority record when clicked in the Connexion client. Part of what is actually going on behind the scenes in Connexion is the presence of subfield $0 with that authority record identifier. The goal, at some still-undefined point in the future, is for subfield $0 for the LC/NACO Authority File to be implemented in Record Manager. As other authority files are added to Record Manager, they will use the subfield $0 implementation. This is currently true for the Dutch Names Authority File (NTA Personal Names). When working on a MARC 21 record in Record Manager, you can apply a heading from the NTA Personal Names file. This feature is available only for records that have “dut” in the 040 subfield $b. After searching for a Dutch authority record, you can copy the link data from the authority field 100 and insert it into bibliographic fields 100 and/or 700. You may find additional information on these functions in Record Manager Help at http://www.oclc.org/support/help/recordmanager/Default.htm#MARC_21/Wrkng_auth_recs/dtch_auths/NTA_auth_recs.htm?Highlight=nta.
As stated in the “OCLC RDA Policy Statement,” the plan is to begin removing GMDs in field 245 subfield $h after March 2016, which is three years after RDA “Day One.” We have no idea how long it will take to accomplish this. As you may have noticed, OCLC WorldCat Quality staff have been using macros to go through WorldCat to make numerous RDA-related and other changes to bibliographic records in the Books, Scores, and Cartographic Materials formats, including the addition of 33X fields, to the extent that we can safely do so. Some of the other changes we’re making are noted in the “OCLC RDA Policy Statement.” In the process, we are also fixing up other things, such as controlling headings when possible. As part of the process of removing the GMDs, we will also begin adding the appropriate 33X (and other RDA-related) fields to those records.
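The kind of macro-driven change described above can be sketched as follows. The flat record representation and the one-entry GMD-to-33X mapping are illustrative assumptions, not OCLC’s actual macro or conversion table, though the 336/337/338 terms shown are standard RDA content/media/carrier vocabulary.

```python
import re

# Hedged sketch: remove the GMD from 245 $h and add RDA 33X fields.
# The mapping below covers a single GMD and is an illustrative assumption.

GMD_TO_33X = {
    "videorecording": {
        "336": "two-dimensional moving image $b tdi $2 rdacontent",
        "337": "video $b v $2 rdamedia",
        "338": "videodisc $b vd $2 rdacarrier",
    },
}

def remove_gmd_add_33x(record: dict) -> dict:
    """Return a copy of a flat record dict with 245 $h removed and 33X added."""
    rec = dict(record)
    title = rec.get("245", "")
    match = re.search(r"\$h\s*\[([^\]]*)\]", title)
    if match:
        # Drop the '$h [gmd]' segment, keeping surrounding ISBD punctuation.
        rec["245"] = re.sub(r"\s*\$h\s*\[[^\]]*\]\s*", " ", title).strip()
        for tag, value in GMD_TO_33X.get(match.group(1), {}).items():
            rec.setdefault(tag, value)  # do not clobber existing 33X fields
    return rec

rec = {"245": "Casablanca $h [videorecording] / $c Warner Bros."}
fixed = remove_gmd_add_33x(rec)
print(fixed["245"])  # → Casablanca / $c Warner Bros.
print(fixed["338"])  # → videodisc $b vd $2 rdacarrier
```

The `setdefault` call reflects the caution mentioned above about making such changes only "to the extent that we can safely do so": a record that already carries 33X fields is left alone.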
That’s a dilemma facing almost every institution, including OCLC, where we’ve seen the retirements of so many vital coworkers and the resulting loss of so much institutional memory. When we look around this room and in other cataloging sessions here at ALA, however, we can’t help but be encouraged by the many new faces among the familiar ones. Adolfo Tarango also reminded us that we need to make sure that administrators fully understand the value of good-quality metadata, and that we have to educate those coming into the profession about the continuing importance of clean data, authority control, and the rest of our traditional concerns.
Respectfully submitted by
Doris Seely
University of Minnesota
2015 July 6
With edits by Becky Dean, Janet Hawk, Sandi Jones, Marty Loveless, Cynthia Whitacre, and Jay Weitz.
OCLC
2015 August 12