
Encoding Standards: ALA Annual Report 2022

MARC Advisory Committee (MAC) Annual Meetings, June 28-29, 2022

Report by Karen Peters (Library of Congress), Chair, Encoding Standards Subcommittee

(Recordings of the meetings are available here; note that timestamps are provided for each paper)

 

MAC Meeting No. 1, June 28, 2022, 10:30-12:35 EDT

MAC Chair Cate Gerhart opened the meeting and explained the protocols for this set of virtual meetings, which were held via WebEx. After the MAC members introduced themselves, the minutes from the January 2022 meetings were approved.

 

It was noted that two fast track proposals have been approved since the January meetings: 

  • No. 2022-FT01, from the PCC Standing Committee on Standards, was converted from Discussion Paper No. 2022-DP04 at those meetings and approved. The proposal added subfields $i (Relationship information) and $4 (Relationship) to Field 373 (Associated Group) in the Authority Format.
  • No. 2022-FT02, from the Canadian Committee on Metadata Exchange and Library and Archives Canada in consultation with OCLC, added subfield $5 (Institution to which field applies) to Field 788 (Equivalent Description in Another Language) in the Bibliographic Format.

 

Proposal No. 2022-07, “Modernization of Field 856 Second Indicator and Subfield $3 in the MARC 21 Formats,” was introduced by Jay Weitz on behalf of its author, OCLC. The proposal, which sought to continue the modernization of Field 856 (Electronic Location and Access) begun recently via two discussion papers (Nos. 2020-DP01 and 2022-DP01) and an approved proposal (No. 2020-03), focused on clarifying the field’s second indicator (Relationship) and its relationship to subfield $3 (Materials specified). The proposal was approved with a number of modifications involving changes in terminology and the moving of some exemplary material to the Guidelines for the use of Field 856.

 

Proposal No. 2022-08, “Recording Persistent Identifiers and File Formats in Field 856 of the MARC 21 Formats,” the follow-up to Discussion Paper No. 2022-DP02, was introduced by Juha Hakala on behalf of the proposal’s authors, the ISSN International Centre, Paris and the National Library of Finland. The proposal sought to permit the recording of Persistent Identifiers (PIDs) separately from Uniform Resource Identifiers (URIs) by (re)defining two obsolete subfields ($g for PIDs, and $h for non-functioning URIs) for this use. It also sought to support a more accurate specification of file formats and their versions by expanding the definition of subfield $q (Electronic format type) and making the subfield repeatable. The proposal was approved, with changes making subfield $g repeatable, adding pointers to the Guidelines for the use of Field 856 for subfields $g and $u (Uniform Resource Identifier), and a terminological change in the revised definition of subfield $q.
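As a rough illustration of what the approved changes permit, the following Python sketch models a Field 856 that carries a working URI in $u, two PIDs in the now repeatable $g, a non-functioning URI in $h, and two $q values. The Field856 class, the build_example function, and every URL, identifier, and format value shown are hypothetical and invented for illustration; they are not taken from the proposal or from any MARC library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Field856:
    """Hypothetical stand-in for MARC 21 Field 856 (Electronic Location and Access)."""
    ind1: str
    ind2: str
    # Ordered (code, value) pairs; a repeatable subfield simply appears more than once.
    subfields: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, code: str, value: str) -> "Field856":
        self.subfields.append((code, value))
        return self

    def values(self, code: str) -> List[str]:
        """Return every value recorded under one subfield code, in order."""
        return [v for c, v in self.subfields if c == code]

    def render(self) -> str:
        """Render the field in a common human-readable display form."""
        body = "".join(f"${c}{v}" for c, v in self.subfields)
        return f"856 {self.ind1}{self.ind2} {body}"


def build_example() -> Field856:
    # First indicator 4 (HTTP) and second indicator 0 (Resource).
    f = Field856(ind1="4", ind2="0")
    f.add("u", "https://example.org/reports/2022.pdf")     # functioning URI (made up)
    f.add("g", "urn:nbn:example-12345")                    # PID (made up)
    f.add("g", "doi:10.0000/example.2022")                 # second PID; $g is repeatable
    f.add("h", "http://old.example.org/reports/2022.pdf")  # non-functioning URI (made up)
    f.add("q", "application/pdf")                          # format type (illustrative)
    f.add("q", "PDF/A-1b")                                 # second $q; illustrative only
    return f


if __name__ == "__main__":
    print(build_example().render())
```

Keeping the subfields as an ordered list of pairs, rather than a dictionary keyed by code, is what lets $g and $q repeat without losing their order.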

 

Discussion Paper No. 2022-DP07 was introduced by Karen Peters on behalf of its authors, MLA and OLAC (Online Audiovisual Catalogers). With compilations in mind, the paper investigates the possibility of enabling clearer coding of language information in bibliographic records through the addition of subfield $3 (Materials specified) to Field 041 (Language Code), along with the supporting changes that would need to be made to MARC 008/35-37 (Language). The discussion paper will return as a proposal at the midwinter meetings in January 2023.

 

MAC Meeting No. 2, June 29, 2022, 10:30-1:25 EDT

Discussion Paper No. 2022-DP06, “Defining a New Field to Record Electronic Archive Location and Access in the MARC 21 Formats,” was introduced by Juha Hakala on behalf of the paper’s authors, the ISSN International Centre, Paris and the National Library of Finland. The paper explores the possibility of defining a new Field 857 for recording the persistent identifier or location of, and relevant information about, resources in digital or Web archives. The paper stems from the discussion during the 2022 MAC midwinter meetings of Discussion Paper No. 2022-DP02 (Enrichment of Web Archive Information in Field 856 of the MARC 21 Formats), during which MAC members indicated a strong preference (21 votes for, 2 against) for the definition of a new field for the encoding of this information. The discussion paper will return as a proposal.

 

Discussion Paper No. 2022-DP08 was introduced by Elizabeth Miraglia on behalf of its author, the PCC Standing Committee on Standards. Noting that in a linked data environment an entity may not have a preferred heading or label, and that the identifier or URI itself may instead identify the entity in question, the paper explores the possibility of adding subfields $0 (Authority record control number or standard number) and $1 (Real World Object URI) to fields 720 (Added Entry-Uncontrolled Name) and 653 (Index Term-Uncontrolled). Some MAC members expressed the opinion that $2 (Source of heading or term) should be added to these fields as well, and a straw poll taken on the question returned 12 votes in favor, with 7 against. The discussion paper will return as a proposal.

 

Discussion Paper No. 2022-DP09, “Defining a Field for Standardized Provenance Information in the MARC 21 Bibliographic, Holdings, and Authority Formats,” was introduced by Reinhold Heuvelmann on behalf of its authors, the D-A-CH Working Group on Provenance—Task Group on MARC, in cooperation with the German National Library and the Committee on Data Formats. The paper offers four options for accomplishing its goal, of which its authors’ preference is defining a new field in the 3XX range (361 is suggested) for encoding provenance information. MAC members, however, were not in agreement on which option was to be preferred; an additional concern was raised by Thurstan Young on behalf of a constituent who suggested that the encoding of custodial information might perpetuate the cultural (race/class/gender) bias already implicit in the entities chosen for representation by authority records. 

Discussion was cut short when it emerged that Kevin Ford, who would be presenting the last two discussion papers being considered, would not be available for the next day’s scheduled meeting. In spite of that, Reinhold indicated his belief that he had been given sufficient information to turn the discussion paper into a proposal.

 

Discussion Paper No. 2022-DP10, “Defining a New Subfield in Field 264 to Record an Unparsed Statement in the MARC 21 Bibliographic Format,” and Discussion Paper No. 2022-DP11, “Defining a New Subfield in Field 490 to Record an Unparsed Statement in the MARC 21 Bibliographic Format,” were each presented by Kevin Ford on behalf of their author, the Network Development and MARC Standards Office (NDMSO) of the Library of Congress. Neither paper received much support from the MAC membership. In the course of discussion of 2022-DP10, Kevin asserted that the motivation for this paper was related to, but not because of, BIBFRAME; members, however, indicated the desire for a stronger explanation of the author’s need for recording an unparsed publication statement in Field 264. Further discussion suggested that motivation for the paper stemmed from the issues encountered in dealing with both ISBD punctuation and MARC subfields during conversion between BIBFRAME and MARC. Nevertheless, a straw poll flatly rejected the suggestion that the paper return as a proposal (21 votes against); a second straw poll, however, suggested that its return as a second discussion paper would be acceptable (20 votes for, 3 against).

The situation with 2022-DP11 was similar, but with the issue here being an unparsed series statement. MAC members reiterated the need for the paper’s authors to provide a clear explanation, including examples, of what the proposed change would accomplish. Both 2022-DP10 and 2022-DP11 may return as discussion papers.

Consideration of the papers was followed by a Business Meeting/Library of Congress Report, during which Sally McCallum (NDMSO) referenced an email she sent to the MARC list the previous week giving a report on LC’s experimentation with, and subsequent conversion of, title (including name/title) authority records from the MARC Authority format to the MARC Bibliographic format, thus facilitating their use in BIBFRAME. While some discussion ensued, most of it dealing with the treatment of serial titles in this project, the discussion did not get very far, as we were by now well past the time (12:30 PM EDT) when the meeting was scheduled to have ended. The Chair indicated that discussion of the report could continue on the MARC list (note that as of this writing, there has been no further discussion), and closed the meeting, noting that a third meeting (scheduled for the next day) would not need to take place.

 

Library of Congress BIBFRAME Update Forum, Monday, June 27, 2022, 1:00-2:00 EDT

Link to presentation recording, agenda, and presentation PowerPoint slides/PDFs

After a welcome by Paul Frank (Policy, Training, and Cooperative Programs Division), Sally McCallum (Chief, Network Development and MARC Standards Office (NDMSO)) gave an introduction to the forum and outlined the program, noting that while the presentations usually focused on community activity, the focus today would be on LC, and specifically on BIBFRAME 100 and its effect on the users of LC data when most of LC’s cataloging staff (but not its music and sound recording catalogers as yet) will be creating descriptions in BIBFRAME only, with MARC records created via conversion from BIBFRAME.

Sally’s introduction was followed by opening remarks from Beacher Wiggins (Director for Acquisitions and Bibliographic Access). Beacher gave a brief survey of the BIBFRAME Pilot’s history, and noted that the Pilot participants (approximately 100 catalogers) are fully trained in creating BIBFRAME descriptions using the Marva editor. Recently, cataloging of non-Latin script materials has been a major focus. At this time, Pilot participants have to input descriptions twice: once in BIBFRAME/Marva, and once in MARC/Voyager. The aim is to cease this “double keying” by the end of the year (fiscal or calendar year not specified) and move the cataloging staff to BIBFRAME. Among other things, catalogers not participating in the Pilot will need to be trained by Pilot participants; and LC is working with OCLC to enable serials (CONSER) catalogers to make the move to BIBFRAME as well. Beacher’s presentation slides include some “useful” links to BIBFRAME sources.

Next followed three presentations on non-Latin script materials cataloging. The first, by Paul Frank, discussed past romanization of non-Latin scripts in MARC records, and outlined when and how non-Latin scripts and their romanization would be used in BIBFRAME descriptions, including how these policies were worked out. As there is much material that will be entered in non-Latin scripts only, Paul also discussed the issue of conversion of BIBFRAME descriptions to MARC in these cases, including a description of a romanization tool that is being developed for that purpose.

Paul’s presentation was followed by one from Jessalyn Zoom (Chief, Asian and Middle Eastern Division), in which she outlined the need for a certain (and limited) amount of romanization in bibliographic description, and discussed the ongoing review of ALA-LC Romanization tables, noting the implementation of revised procedural guidelines in May 2021. Jessalyn pointed out that there is not a single internationally accepted romanization standard, which can create confusion, particularly for those who are not familiar with the original non-Latin script; and that automated/machine transliteration, preferably reversible, should be enabled to the extent possible. She reported that testing was underway to expand Voyager script input beyond the MARC-8 character set and to add Armenian, Mongolian, and Thai scripts, with the hope that this will suggest how other scripts might be added and possibly facilitate the same in BIBFRAME.

The third presentation on the subject of non-Latin scripts was given by Matt Miller (NDMSO) and dealt further with transliteration issues. Matt noted that a number of transliteration tools that are currently in use have been developed both at LC and externally; these tools, however, are not always in line with ALA-LC romanization, and do not always present completely accurate results in any case. Nevertheless, LC will be working over the next 6-7 months to consolidate these tools into a single open-source tool that can be used in Marva/BIBFRAME and beyond.

Next came an OCLC BIBFRAME Update from Nathan Putnam (Director, Data Quality and Governance), who reported on OCLC’s collaboration with LC, Casalini Libri, Stanford University, and the PCC to advance the use and exchange of BIBFRAME data, including a 3-day summit that was held to discuss linked data topics. The result of the summit was the establishment of two working groups: one for use cases, which completed its work in February; and one for data exchange, the work of which is still in process. Nathan also spoke on OCLC’s work with LC on creating a system that will permit LC serials catalogers to work in BIBFRAME while serials catalogers in other libraries continue working in MARC.

BIBFRAME ingest into WorldCat is targeted for December 2022, and OCLC is working to incorporate the entire BIBFRAME lifecycle into WorldCat; Nathan noted that there will be a BIBFRAME editor in OCLC. In the end, OCLC users should be able to find everything they are searching for, regardless of input format.

The final presentation came from Kevin Ford (NDMSO), who gave a technical report on metadata distribution from id.loc.gov, including a summary of activity in 2021, and an outline of activity from 2022 forward. Kevin outlined the bulk downloads of BIBFRAME metadata (works, instances, and hubs) that are available, as well as how often they are updated. The schedule will be documented at id.loc.gov.

During the question and answer period that followed, Paul Frank made it clear that the romanization practices outlined in his presentation were a best practice, but not a requirement.