CAPC and CC:DA: ALA Midwinter Report 2020

Philadelphia, PA, January 24-27, 2020

Reports from:

OLAC Cataloging Policy Committee (CAPC)
RDA Pre-Conference
RDA Update Forum
RDA Linked Data Forum
PCC At Large Meeting
PCC Participants Meeting
ALCTS Committee on Cataloging: Description and Access (CC:DA)

Reported by: Mary Huismann (St. Olaf College), Chair, Content Standards Subcommittee

OLAC Cataloging Policy Committee (CAPC)

The OLAC CAPC meeting was held on Friday, January 24, 2020. The majority of the meeting was devoted to liaison and task force reports; since these reports will be published in the March issue of the OLAC Newsletter, only selected highlights appear in this report.

  • The 2020 OLAC Conference, celebrating OLAC’s 40th anniversary, will be held in Columbus, Ohio, October 14-17, 2020. The conference host is OCLC, and most meeting sessions will be held at OCLC.
  • CAPC is seeking candidates for full member (2 year) and intern (1 year) positions.
  • CAPC is also seeking a new chair because current chair Jessica Schomberg is stepping down due to a change in job duties.

ALCTS Committee on Cataloging: Description and Access (CC:DA) Liaison (Kelley McGrath)

  • CC:DA has not been as active in the absence of RDA revision proposals due to the ongoing freeze in RDA development.
  • The RSC is developing new workflows that do not align so cleanly with CC:DA’s twice-yearly meeting schedule.
  • The RSC is currently testing new procedures for reviewing proposals and is planning to move to a quarterly rather than annual review cycle.

MARC Advisory Committee (MAC) Liaison Report (Cate Gerhart)

  • There are seven discussion papers and two proposals to act on at Midwinter.
    • Of particular interest to OLAC is the discussion paper regarding aspect ratio (DP 2020-DP04 “Renaming Field 345 and Defining New Subfields for Aspect Ratio and Motion Technique in the MARC 21 Bibliographic Format”). The paper may be viewed at this link:
  • Cate is serving on the RDA/MARC Working Group to explore whether changes to MARC are required for the new RDA text.

LC Report (Janis Young)

  • The BIBFRAME Pilot Project has been expanded to include over one hundred staff members, including some from the overseas offices.
  • Work on the “Multiple Subdivisions” (SHM H 1090) project continues. Please note that the project is not tackling headings in alphabetical order!
  • A new ClassWeb interface was issued in August 2019 (but does not work well on the Internet Explorer browser)
  • The moratorium on LCDGT proposals continues.
  • LC will be implementing 670 $w in source citations (LCSH, LCGFT, LCMPT) sometime after April 2020. There will be a short presentation on this topic at the PCC At Large meeting.

OCLC Report (Jay Weitz)

  • The News from OCLC was distributed; of note is the new Mellon grant for linked data, addition of the Art and Architecture Thesaurus (AAT) to OCLC Record Manager, and release notes for various products and services (available at the OCLC website).

MOUG Liaison (Autumn Faulkner)

  • MOUG will meet in conjunction with Music Library Association in Norfolk, Virginia in 2020
  • Program session details can be found on the MOUG website.

Unified Best Practices Task Force (Jessica Schomberg for Marcia Barrett)

  • The planned switchover date for the beta Toolkit is December 15, 2020.
  • The initial publication of the Toolkit will not include policy statements from groups that do not already have policy statements in the current Toolkit (e.g., OLAC)

Objects Task Force (Jessica Schomberg for Julie Moore)

MLA/OLAC Task Force for Single-Title Audio Formats (Bruce Evans)

  • The group met briefly on Friday, January 24 to work out logistics for the task force work, and content work will be underway very soon.

Miscellaneous news

  • CAPC voted to change the term “CAPC Intern” to “CAPC Associate”
  • MARC 341 $2 accessibility practices: since there is no approved vocabulary for this field, there is no value for $2. CAPC is hoping to partner with the Canadian Committee on Metadata Exchange (CCM), who have developed a vocabulary for this purpose.
  • OLAC will have a representative on the CaMMS Subject Analysis Committee, beginning with ALA Annual. Interested candidates should notify OLAC President Thomas Whittaker by February 28, 2020.

RDA Pre-Conference

James Hennelly (Director, ALA Digital Reference) and members of NARDAC and the RDA Steering Committee (RSC) past and present hosted another hands-on RDA Toolkit workshop on Friday, January 24. This workshop used a new approach to RDA orientation using a series of short presentations followed by exercises based on the presentation.

Presentation slides and exercises (with answers) are available from the RDA Toolkit website; slides are also available from the RSC Presentations webpage.

  • Welcome and Background (Kathy Glennan, RSC Chair)
  • RDA Beta Toolkit Basics (Kate James, former RDA Toolkit Examples Editor)
  • A Quickstart Guide to RDA Terminology: Elements, SES, and VES (Dominique Bourassa, NARDAC)
  • Everything Old is New Again (Thomas Brenndorfer, NARDAC)
  • Whatever happened to… (Melanie Polutta, LC)
  • New Concepts in the Beta Toolkit (Honor Moody, RDA Toolkit Examples Editor)

RDA Update Forum

The RDA Update Forum was held on Saturday, January 25. The RDA Forum is co-presented by ALA and NARDAC and serves as the official information channel from NARDAC to the ALA community. The forum included presentations by Dominique Bourassa (ALA representative to NARDAC), James Hennelly (Director, ALA Digital Reference), Thomas Brenndorfer (NARDAC representative to the RSC), and Kathy Glennan (RSC Chair). Presentation slides are expected to be available from the RSC Presentations webpage and/or the NARDAC Presentations webpage.

News from NARDAC (Dominique Bourassa)

  • Bourassa reviewed NARDAC membership and organizational roles
    • The full roster may be viewed at
    • Stephen Hearn is now the back-up NARDAC representative to the RSC
  • Work on RDA proposal and discussion paper review has resumed
    • NARDAC solicited feedback on Expression Excerpts, RDA content elements, the RSC operations document, and string encoding schemes in RDA
    • A test proposal for curator at the work level was done; although the proposal was ultimately not sent to the RSC, it was a valuable learning experience
    • Challenges included timing around the holidays and generally short turnaround times
    • Still trying to figure out how community feedback can be handled more efficiently
  • Element labels
    • NARDAC was charged with developing a set of user-friendly element labels (a spreadsheet of over one thousand lines)
    • The CC:DA 3R Task Force took on the project at NARDAC’s request
    • The conclusion was that it is probably not feasible to have a single list of user-friendly labels, and that communities will likely come up with their own sets
  • Community document review
    • Reviewed an ORDAC document regarding conferences, an application profile compiled by EURIG, and a document from ARLIS-NA regarding the term “curator.”
  • Presentations
    • Presentations made by NARDAC members are posted to the NARDAC and/or RSC webpages
  • NARDAC major goals
    • Develop effective interaction with the communities NARDAC represents
    • Provide direction and support for the North American community
    • Note that NARDAC only consists of six individuals and everyone has their day jobs as well!

Hot Topics from the RSC (Thomas Brenndorfer)

  • Brenndorfer gave an update on the recent RSC meeting in Santiago, Chile (October 2019)
  • The accompanying colloquium was cancelled due to political unrest.
  • The RSC is using Basecamp and quarterly asynchronous meetings to accomplish their work
  • The meeting focused on several topics “hot out of the oven, or still baking:”
    • Internationalization/Western focus: increasing support for different languages and community standards with a goal of eliminating Western focus. Examples include re-evaluating the definition of “Conventional name – Local place of worship” to make it more inclusive, and moving string encoding scheme (SES) instructions to a separate place since these are largely community-based.
    • Collective agent discussions: where do conferences, ships, buildings fit in?
    • Aggregates and content elements: we don’t always catalog all of the “parts” (i.e., the “Swiss cheese” model), so shortcut elements are necessary.
  • There is much, much more work to do!

Overview of the RSC Action Plan 2020-2022 (Kathy Glennan)

Presentation slides are available from the RSC website:

  • The action plan is a rolling three-year work plan for the RSC, mandated by the 2018 RDA Agreement (however, no plans were created for the RSC during the 3R Project)
  • The action plan is tied to the Strategic Action Plan for RDA 2020-2022, a joint document between the RSC and the RDA Board (which also has its own three-year Action Plan)
  • Key points of the Action Plan:
    • Develop RDA as a responsive and dynamic standard
    • Increase the adoption of RDA
    • Provide relevant governance
  • Tasks include standing tasks (done every year) and tasks specific to each year; the latter will be updated yearly in consultation with the RDA Board to reflect new realities and priorities
  • Standing task highlights:
    • Direct RDA development to ensure continued alignment with the governing objectives
    • Ensure an international focus for RDA instructions and examples
    • Provide content updates for four releases of RDA Toolkit per year
    • Be responsive to user feedback
    • Continue editorial cleanup and development of guidance chapters
    • Review membership, tasks, and progress of all working groups
    • Provide expertise and support for RDA orientation and training
  • 2020 task highlights:
    • Begin publishing translations and policy statements
    • Stabilize Registry functionality and data
    • Implement decisions (e.g., issues with expression excerpts and RDA Content Elements, SES instructions)
    • Develop relationship matrix and visual browser
    • Receive and act on initial recommendations from the Application Profiles Working Group
    • Add a Latin America and the Caribbean representative
  • 2021 task highlights:
    • With the RDA Board, consider the date for starting the year-long countdown clock on the original RDA Toolkit site
    • Review Amalgamation instructions and initiate cleanup
    • Review Resources tab
    • Create working groups for extent, place/jurisdiction, collective agent, official language, and religious works issues
    • Begin BIBFRAME mapping
  • 2022 task highlights:
    • Further develop Nomen and Timespan instructions
    • Receive and act on final recommendations from the Application Profiles Working Group
    • Receive and act on recommendations from other task-and-finish working groups
    • Be alert to developments with the ISBN and ISSN standards and the impact on harmonizing with RDA
    • Continue BIBFRAME mapping
    • Create an Archives Working Group
  • The final presentation slide contains links to various documents mentioned in the presentation.

Update from ALA Publishing (James Hennelly)

  • 2020 Release schedule
    • Possibly a mini-release soon, within the next month: there will be minimal content change (only to fix typos, etc.); mostly fixes to beta site functionality (e.g., search, user-created documents, addition of sub-menus to the top navigation bar)
    • Full release in April/May 2020: likely to include changes related to string encoding scheme revision
    • Possible release in August/September 2020
    • Switchover of beta site to on December 15, 2020: the “original” Toolkit will then move to
    • The countdown clock will NOT start at this time; a delay is planned to allow translators and policy statement writers to complete their work
  • Policy statements
    • The policy statement group, which works with various national libraries to test and develop best practices, has expanded to include all policy statement writers
    • Analysis, planning, and writing of policy statements is underway
    • This work needs to be coordinated with application profile development
    • Policy statements are likely to be rolled out in focused batches (not in complete sets)
    • The British Library is serving as the test group, and the hope is to have some sample policy statements on the beta site in the spring.
  • Translations
    • Most teams are still working on RDA Reference translations
    • The Finnish team is already working on instructions, with the Norwegian team not far behind
    • Elements impacted by the string encoding scheme revision will be held back from translation until that revision is complete.
    • The target is to have a partial translation on the beta site in the spring.
  • Orientation
    • There was a pre-conference workshop at ALA Midwinter
    • The “New Concepts” webinar series returns in February 2020
    • A new online workshop series with Kate James is slated for this spring; the series will encompass six topics in five webinars (45 minutes each) with presentations and exercises.
    • Goal is to provide information in a variety of formats and price points.
  • Print products
    • New editions are planned for Introducing RDA, Maxwell’s Handbook, RDA Essentials
    • Glossary
    • An RDA workbook based on the Kate James webinar series

RDA Linked Data Forum

The RDA Linked Data Forum took place on Monday, January 27, 2020 and featured two presentations. Presentation slides are available from the RSC Presentations webpage.

The first presentation, “RDA Vocabulary Encoding Schemes,” was given by Kate James, former RDA Toolkit Examples Editor. She prefaced her presentation with the caveat that she would be speaking from a cataloger’s perspective, rather than a technical viewpoint.

RDA vocabulary encoding schemes (VES) are defined as [add definition from Glossary]. Examples of VES outside of RDA include the Library of Congress Subject Headings and the Getty Art & Architecture Thesaurus (AAT). RDA VES are intended to be used for the value of an RDA element, are limited to RDA elements for resource entities (work, expression, manifestation, item), and can be recorded as structured description, identifier, or IRI.

RDA VES are usually available in three places: an element page, the glossary, or a VES page. The term label and definition will be the same in all three places. The VES page provides “almost one-stop shopping” in that the preferred label, notation, IRI, definition, and alternate label are given. All recording methods are available for elements with RDA VESs; elements can utilize multiple recording methods. One can use the RDA VES for the element, or select another suitable VES. If the latter course is taken, the source of the VES term is recorded.

VES decisions also have linked data implications. Unstructured descriptions are not suitable for conversion to linked data. Structured descriptions and identifiers can be converted if the VES source is recorded and the term or notation has a one-to-one IRI equivalent. It is best, of course, to use IRIs if that option is available.
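The conversion conditions above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `RecordedValue` type, the `example-ves` source code, the lookup table, and the example.org IRIs are hypothetical and do not come from the presentation or from any real VES.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RecordedValue:
    """A value recorded for an element (hypothetical model for this sketch)."""
    method: str                      # "unstructured", "structured", "identifier", or "iri"
    value: str                       # the term, notation, free text, or IRI
    ves_source: Optional[str] = None # code for the VES the term or notation came from


# Hypothetical lookup: (VES source, term or notation) -> candidate IRIs.
IriTable = Dict[Tuple[str, str], List[str]]
HYPOTHETICAL_IRI_TABLE: IriTable = {
    ("example-ves", "audio disc"): ["http://example.org/ves/1001"],
    # An ambiguous term: two candidate IRIs, so not a one-to-one relationship.
    ("example-ves", "disc"): [
        "http://example.org/ves/1001",
        "http://example.org/ves/1002",
    ],
}


def to_iri(rv: RecordedValue, table: IriTable = HYPOTHETICAL_IRI_TABLE) -> Optional[str]:
    """Return an IRI for the value, or None when conversion is not safe."""
    if rv.method == "iri":
        return rv.value              # already linked data
    if rv.method == "unstructured":
        return None                  # free text is not suitable for conversion
    if rv.ves_source is None:
        return None                  # the VES source must have been recorded
    candidates = table.get((rv.ves_source, rv.value), [])
    # Convert only when exactly one IRI corresponds to the term or notation.
    return candidates[0] if len(candidates) == 1 else None
```

In this sketch a structured description such as `RecordedValue("structured", "audio disc", "example-ves")` converts, while the same term recorded without its source, or a term with more than one candidate IRI, does not.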

There are several factors to consider when choosing a non-RDA VES. Does the VES have a term, notation, and IRI available for the concept? Is the VES consistent with the scope of the RDA element for which it is to be used? Is the VES sufficiently granular to describe the resources? And finally, will the VES be around in ten years? Two non-RDA VESs were shown to be compatible with RDA when compared to these considerations: Getty AAT and Wikidata.

James Hennelly, RDA Toolkit Director (ALA Publishing), presented “Deliverance: A Journey Through the RDA Workflow.” In this presentation, Hennelly used a fictitious new element “banjo playing kid agent of work” to demonstrate the RDA publishing process. The publishing process uses many websites, including the staff registry, GitHub, the RDA script tools site, the content management system (CMS), the staging/development site, and the RDA Toolkit.

The first step of the process takes place in the staff registry, where a CSV file is downloaded. The formal element name, definition, scope notes, domain and range, alternate names, inverses, and related elements are added to the CSV file. The updated CSV file is imported back into the RDA registry as XML. The data can be exported as RDF/XML, N-Triples, or JSON-LD. From the staff registry, data is pushed out via GitHub to the Toolkit glossary, RIMMF, the RDA vocabulary server, and the RDA registry.
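To make the CSV-to-RDF idea concrete, here is a minimal Python sketch that turns one CSV row into N-Triples statements. It is not the registry's actual tooling: the column names, the example.org IRIs, the sample definition, and the choice of `rdfs:label`, `skos:definition`, `rdfs:domain`, and `rdfs:range` as properties are all assumptions made for illustration.

```python
import csv
import io
from typing import Dict, List

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical CSV export: the real registry columns are not described in the report.
SAMPLE_CSV = """iri,label,definition,domain,range
http://example.org/rda/banjoPlayingKidAgentOfWork,banjo playing kid agent of work,"An illustrative definition, invented for this sketch.",http://example.org/rda/Work,http://example.org/rda/Agent
"""


def nt_literal(text: str, lang: str = "en") -> str:
    """Serialize a language-tagged literal with basic N-Triples escaping."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}'


def row_to_ntriples(row: Dict[str, str]) -> List[str]:
    """Build one N-Triples line per property for a single element row."""
    subject = f"<{row['iri']}>"
    return [
        f"{subject} <{RDFS}label> {nt_literal(row['label'])} .",
        f"{subject} <{SKOS}definition> {nt_literal(row['definition'])} .",
        f"{subject} <{RDFS}domain> <{row['domain']}> .",
        f"{subject} <{RDFS}range> <{row['range']}> .",
    ]


def csv_to_ntriples(csv_text: str) -> List[str]:
    """Convert every row of the CSV text into N-Triples lines."""
    lines: List[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.extend(row_to_ntriples(row))
    return lines
```

Calling `csv_to_ntriples(SAMPLE_CSV)` yields four statements about the fictitious element, one per column after the IRI.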

Next, an operational script creates the file and the sections and headers within the file. The script also populates the definition and scope, element reference, and related elements. Another script uploads the updated files into the CMS.

Editing takes place in the CMS using the oXygen editor. Keys are created; these are identifiers for files, defined in a DITA map. Keys are used to create links through keyrefs and conkeyrefs, so file names can change without breaking links. (Conrefs and conkeyrefs are mechanisms that allow editors to re-use content from another place in a file.) Content is editable only in the originating file, but changes are reflected wherever the content is referenced. Boilerplate text (frequently appearing text) is also constructed from conkeyrefs. New elements must be added to the DITA map. These maps are used to organize files for hierarchy/browse, and hopefully in the future will allow a customized view of the Toolkit for specific communities.
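The reason key-based links survive file renames can be shown with a small Python sketch. It imitates only the indirection idea (link to a key, let a map say which file the key points to) and is not how a real DITA processor or the Toolkit's scripts work; the key name, file paths, and `resolve_keyrefs` helper are hypothetical.

```python
import re
from typing import Dict

# Matches keyref="some-key" attributes in a fragment of markup.
KEYREF_PATTERN = re.compile(r'keyref="([^"]+)"')


def resolve_keyrefs(markup: str, key_map: Dict[str, str]) -> str:
    """Replace each keyref with an href looked up in the key map."""
    def replace(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key not in key_map:
            return match.group(0)    # leave unknown keys untouched
        return f'href="{key_map[key]}"'
    return KEYREF_PATTERN.sub(replace, markup)


# The topic refers to the key, never to the file name (hypothetical names).
topic = '<xref keyref="banjo-element"/>'
key_map = {"banjo-element": "elements/banjo_v1.dita"}

before = resolve_keyrefs(topic, key_map)

# Renaming the file means updating one map entry; the topic is unchanged.
key_map["banjo-element"] = "elements/banjo_renamed.dita"
after = resolve_keyrefs(topic, key_map)
```

The topic text is identical in both calls; only the map entry changed, which is the property the keys are meant to provide.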

The final editorial steps include adding a release tag, comments characterizing the edits, and a publishing date to the file. The publishing date is only added for significant content change. The publish script prepares the file for the front end: it creates friendly links, processes the conrefs and conkeyrefs, assigns a citation number, and allows for PDF generation.

The last steps in the process generate PDFs, upload the new files to the FTP site, and run the loader to process the files from FTP to the staging site. After review on the staging site, the file is published to the live RDA Toolkit. Though not included in the above process outline, translators and policy statement writers are to be informed of content changes.

PCC At Large Meeting

The PCC At Large meeting took place on January 26, 2020 and featured updates on four initiatives.

Judith Cannan (Library of Congress) provided an overview of the beta RDA Toolkit LC-PCC Policy Statements development. A document describing the process is posted at the PCC website. Melanie Polutta is serving as the lead for this team project, drawing on other LC experts (particularly in special formats) as necessary.

The goal of the process is to retain the status quo for performing one’s work. Although the cutover date is scheduled for December 15, 2020, no decision has been made for an implementation date. In addition to the policy statements, an application profile and policy guidance will need to be completed. Once the work plan has been submitted to LC management it will be made accessible to the PCC. LC intends to finish policy statement work in September; however, some of the explanatory matter that will reside in the guidance chapters will likely not be completed until 2021.

Paul Frank (LC) gave an update on the PCC URI pilot project. A full description of the pilot project may be found at the PCC website. The PCC has been laying the groundwork for practical applications of linked data with a variety of task group and committee work. Linked data work is also included in the PCC strategic directions document (particularly Strategic Direction 3).

The purpose of this pilot project is to “engage metadata practitioners in formally applying techniques to further the PCC’s linked data transition.” The call for project participants yielded thirty-five responses, and all respondents were included in the project; PCC membership was not a requirement to participate. Workflow for the pilot will not be limited to authority records, but will include all types of linked data. The project hopes to deliver real world object URIs, use of traditional and non-traditional sources, guidance on recording alternative identifiers from multiple sources, vocabulary sources, use of MARC 024 in NACO authority records (which would end the current moratorium), and implications for local/shared practices. Most of the MARC groundwork is in place, with some dependencies for NACO with MARC $0 and $1. The PCC Standing Committee on Training (SCT) has developed training for real world objects with best practices for application to bibliographic and authority data. There will need to be a way to mark pilot records so MARC 024 fields are not deleted by non-project participants. Watch for announcements on PCCLIST for more information about the pilot project.

Janis Young spoke about the implementation of MARC 670 $w in proposals. In current practice, the Library of Congress Control Number (LCCN) appears in $a of “Work cat.” citations in authority records. Beginning in April, LCCNs will appear in $w. Affected vocabularies include LC subject headings, genre/form terms, medium of performance terms for music, and demographic group terms (for which there is currently a moratorium on proposals). Impacted citations include “Work cat.” citations in proposals for new headings and new “Work cat.” citations in proposals to revise existing records. Note that existing “Work cat.” citations should not be revised to append $w.

Control numbers can include LCCNs (LC proposals, SACO proposals made for CIPs cataloged in the CIP Partnership Program). Other bibliographic record control numbers may also be used (e.g., local control number, bibliographic utility control number). Only one $w will be permitted per citation, and it should appear as the final subfield of MARC 670. The control number is preceded by the MARC code for the agency to which the number applies, enclosed in parentheses (e.g., $w (DLC)2020123456).
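The $w pattern described above can be sketched in a few lines of Python. This is only an illustration of the stated rules (agency code in parentheses, then the control number; one $w per citation; $w last); the function names and the sample $a text are hypothetical, and the sketch does not validate agency codes or control numbers against any official list.

```python
from typing import List


def format_670_w(agency_code: str, control_number: str) -> str:
    """Build a $w subfield: agency code in parentheses, then the control number."""
    if not agency_code or not control_number:
        raise ValueError("both an agency code and a control number are required")
    return f"$w ({agency_code}){control_number}"


def append_w(subfields: List[str], agency_code: str, control_number: str) -> List[str]:
    """Return the citation's subfields with a single $w added as the final subfield."""
    if any(sf.startswith("$w") for sf in subfields):
        raise ValueError("only one $w is permitted per citation")
    return subfields + [format_670_w(agency_code, control_number)]


# Hypothetical citation text; only the $w value is taken from the report's example.
citation = append_w(["$a Work cat.: Example title, 2020"], "DLC", "2020123456")
field_670 = " ".join(citation)
```

Here `field_670` ends with `$w (DLC)2020123456`, matching the example given in the presentation, and a second call to `append_w` on the same citation raises an error.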

The timeline calls for documentation, proposal templates and use of $w to appear in February 2020 (PTCP will correct coding of LCCNs in proposals). No earlier than April 2020 will 670 $w appear in approved authority records; at this time miscoded proposals will be returned to catalogers for correction before proposals are scheduled for a tentative list.

Judith Cannan (LC) concluded the session with a brief report on the PCC options for minimally punctuated bibliographic records. LC does not intend to implement these options at the moment. Original cataloging will retain punctuation. Copy cataloging will use ISBD punctuation for non-PCC records and will not change punctuation options for PCC records. A quick straw poll of the audience revealed that few have implemented the options at present.

PCC Participants Meeting

PCC Chair Jennifer Baxmeyer provided opening remarks and a short list of PCC activities in the past year:

  • Minimal Punctuation Guidelines implementation
  • ISNI Pilot extension
  • URIs in MARC Pilot
  • LC-PCC Task Group on Aggregates in beta RDA Toolkit
  • LC-PCC Task Group on Data Provenance in beta RDA Toolkit
  • LC-PCC Task Group on Diachronic Works in beta RDA Toolkit
  • LC-PCC Task Group on Element Labels in beta RDA Toolkit

PCC future work includes linked data collaborations and much more RDA work!

Three presentations on the theme “The Relevance and Usefulness of the Romanization of Non-Latin Scripts: Now and in the Future” were given in the remainder of the meeting. The first presentation, “LD4P2 Non-Latin Scripts Affinity Group Survey Results” was given by Larisa Walsh (University of Chicago). Walsh noted how the need for romanization has changed over time – from necessity to potentially not even necessary in future catalogs.

A bit of background about the survey: the Linked Data for Production 2 (LD4P2) Non-Latin Script Materials Affinity Group aims to explore different models for dealing with native scripts in Sinopia (LD4P2 linked data editor) and to create a community of practice for cataloging non-Latin script materials in the linked data environment. LC’s practice for linked data cataloging is to reduce the amount of Romanized data, with romanization limited to access points (descriptive fields appear in native script only).

The survey was distributed during two weeks in fall 2019 to a target audience in the non-Latin script library and research community, mostly in North America and Europe. The survey contained eight questions:

  • Do you work in the library?
  • Do you work with non-Latin script materials?
  • In what capacity do you work with non-Latin script materials, either in native script or in romanization?
  • How necessary is it to your work that romanized data is provided for each of the following bibliographic elements, even if the native script is also provided?
  • How do you use romanized data?
  • Do you rely on romanized data for scripts you cannot read?
  • How much would a lack of romanized data in bibliographic records impact your work?
  • Please provide any additional comments you may have on romanization.

The group learned several things from the survey:

  • Romanization is an important aid in many library operations and research
  • Romanized data in library catalogs are used mostly for searching and sorting/indexing online library records
  • The largest impact of lacking romanization will be on catalogers/technical services staff
  • Titles and names of contributors always need to appear in romanized form
  • The human factor cannot be ignored, even if the technology can handle native scripts
  • Libraries still need to provide romanization for resources they collect.

The full survey results are still under analysis and will be posted to the LD4P wiki.

Iman Dagher (UCLA, Arabic NACO Funnel Coordinator) presented “Path to Discovery! : Romanization & Scripts for Non-Latin/Arabic Materials–Challenges & Potential.” The development of the Unicode standard led to hope that vernacular script could be used in catalogs. There are several PCC practices in play:

  • RDA instructions to transcribe data in the language and script found in the resource
  • LC-PCC PS 1.4 subsequently instructs one to apply the first alternative, which is to record elements in a transliterated form
  • Use the ALA-LC Romanization Tables
  • Use of MARC Model A for bibliographic records (vernacular and transliteration)
    • The original script fields are coded as 880 parallel fields in bibliographic records
    • In OCLC, parallel fields display as the same MARC tags as their linked Latin equivalents
  • Use of MARC 066 (Character sets present)
  • Use of MARC Model B for authority records (provides unlinked non-Latin script fields)
  • Adding scripts is optional but recommended.

Use of romanization offers several practical advantages since it supports many different library tasks (e.g., integrating materials into the regular workflow, serials check-in, circulation, interlibrary loan, etc.). Library staff are not required to have specialized language knowledge. Use of romanization allows one to find resources even if one does not know the language. Also, some library systems do not support all non-Latin scripts. However, romanization in practice presents many challenges, including lack of consistent romanization, complex ALA-LC romanization tables, patron confusion, training issues, and so on.

Dagher then reviewed some facts about Arabic script and language. Arabic is one of the most widely used scripts in the world: around 660 million individuals use Arabic script to communicate in several languages, including Urdu, Pashto, Arabic, Punjabi, Persian, Malaysian, and Kurdish. The Arabic language is a Semitic language with about 221 million speakers, and it is spoken in more than thirty-four countries. Modern Standard Arabic is the universal language of the Arabic-speaking world, and is understood by all Arabic speakers; there are also over thirty different varieties of colloquial Arabic.

Romanization of Arabic materials presents several challenges. The ALA-LC romanization tables are quite complex. The process of searching and locating records is quite time-consuming, since an ISBN is not always present or correct. Romanizing certain titles requires a familiarity with the culture. Finally, the Arabic language relies on tashkil (or tahrik), i.e., vocalization. Arabic texts are mostly written without tashkil, and fluent speakers are able to automatically fill in the missing diacritics themselves. Since Arabic is a highly inflected language, romanization requires good grammatical knowledge. Dagher ran through several examples of issues with romanizing foreign words, geographical names, and Arabic personal names.

Using scripts for Arabic materials does provide value: there is greater precision in discovery, and more efficient cataloging practice. The resultant metadata is more legible and understandable not only for the local library clientele but also on a global scale. There are several factors to consider when adding scripts. Using macros is very helpful, but additional review is necessary. Use of different scripts with different directionality in one field may affect the display. Not all scripts are available in the authority file (e.g., Armenian). In some scripts, certain diacritics cause display problems.

UCLA has undertaken a project to add scripts to legacy data. The project involves adding scripts in OCLC for monographs in Russian Cyrillic in bibliographic records held by UCLA. Russian was chosen since there is generally a one-to-one transliteration. The project includes about 54,000 records (excluding mixed script records). Using an in-house process working with a library IT programmer, the master record fields 245 ($a,b,c,p,n), 250, 26X ($a,b,c), and 490 $a are replaced in OCLC via batch loading. Sample records are reviewed by librarians with language expertise. The processed/replaced records in OCLC are identified by a MARC 588 field “UCLA Machine-derived non-Latin script bibliographic record project.” Future potential projects include Armenian, which would provide a smaller, more easily reviewed project.

The next speaker, Lia Contursi (Princeton, CJK BIBCO Funnel Coordinator) presented “The Importance of Romanization in CJK Records: Pros and Cons with Some Examples.” Contursi noted that the presentation represented her view, not that of the CJK funnel.

There are several reasons not to utilize romanization for Chinese, Japanese, and Korean (CJK) materials. The process is cumbersome and controversial as well as time consuming and uneconomical. Romanization is also prone to typos and errors.

However, there are also reasons to utilize romanization. The chief reason is digraphia: the different writing systems for CJK languages (e.g., traditional, simplified, and pinyin for Chinese; kanji, katakana, and romaji for Japanese; and Hangul, Hanja, and Romaja for Korean). An example using the word “sushi” illustrated various digraphia issues. In an OCLC search of each of the five ways of writing “sushi” plus the romanized form, the most results came from the romanized form. Thus, romanization can serve as a “bridge” language.

Use of romanization also improves sorting and indexing in search results, and adds value for non-native readers of CJK. Several screenshots illustrated representative searches in Chinese and Japanese catalogs; Contursi noted that the nice alphabetical arrangement of the search results was possible because of what happens behind the scenes with romanization.

Two NACO CJK projects were noted: the 2008 machine-derived project (CJK name records in the LC authority file were pre-populated with non-Latin references) and the 2019/2020 review project (a review of the pre-populated characters, which are not necessarily correct).

Romanization is not an all-or-nothing decision—one can compromise, for example, by only romanizing the non-phonetic scripts, or selectively choosing descriptive fields for romanization. There are some tools available to help with romanization. Princeton has a macro, “K-Romanizer,” for romanizing Korean. The accuracy rate is about 96%. There is an OCLC Connexion Pinyin conversion macro available. Unfortunately, there is no macro for romanization of Japanese, though there is some freeware available for romaji conversion.

The final speaker was Paul Frank (LC Policy, Training, and Cooperative Programs Division). His presentation was titled “Library of Congress Title: Romanization: What are we gaining? What are we losing?” Frank noted that six years ago, at ALA Midwinter 2014, we had the first glimpse of a simple BIBFRAME prototype. As BIBFRAME experimentation at the Library of Congress has progressed, it has prompted evaluation of existing cataloging practices, one of which is the practice of romanization.

Participants in the LC BIBFRAME pilot made several observations regarding romanization:

  • Difficulties in romanizing some scripts
  • Complex romanization tables
  • Other recognized romanization schemes
  • Lack of or inadequate automated romanization tools
  • Duplicated work with sometimes a concurrent loss of clarity.

One of the realities is that complete romanization is a more recent cataloging practice than might be imagined. BIBFRAME and MARC will have a long-term coexistence, with a need for BIBFRAME to MARC to BIBFRAME conversions. This coexistence means that romanization will remain necessary for the foreseeable future.

There are several compromises along these lines that can be made:

  • Limited romanization (it’s not “all or nothing”)
  • Tagging by language or script
  • Use of automated romanization tools for data
  • Database or record—discovery vs. “inventory”

In summary, there are often good reasons for doing things the way we do—and it is good to keep this in mind when pondering these parting questions:

  • Is romanization one of those entrenched cataloging practices due to cataloging “inertia?”
  • Is it time to reconsider the practice, even if something has to be given up in the process, or compromises have to be made?

And what about these:

  • MARC 008 values vs. MARC variable fields (e.g., additional content, genre/form, target audience)
  • Flat MARC vs. “ontological” MARC (MARC “work” descriptions)
  • ISBD punctuation
  • “Linky” MARC (i.e., addition of $0, $1 to MARC field)

ALCTS Committee on Cataloging: Description and Access (CC:DA)

CC:DA met on Saturday, January 25, 2020 (the Monday meeting was cancelled). The CC:DA blog contains the full agenda and links to various documents and reports.

Chair’s report of CC:DA motions and other actions since ALA Midwinter (Amanda Ros)

  • The full report may be viewed at
  • Two motions/votes were conducted: to authorize the ALA Representatives to NARDAC to send the Curator proposal to NARDAC with the proposed revisions, and to authorize our NARDAC representatives to share CC:DA’s responses with NARDAC, so they can formalize a response to the RSC.
  • Four task forces were active between July and December: the 3R Task Force, the Virtual Participation Task Force, the CC:DA Procedures Review Task Force, and the CC:DA RDA Beta Toolkit Training Investigation Task Force.

Report from the Library of Congress Representative (Melanie Polutta)

  • The full report covering all LC activities (including other cataloging-related topics) is available on the “LC at ALA” Website (
  • Staffing update for Policy, Training, and Cooperative Programs Division (PTCP): Kate James resigned her position after sixteen years of service. Melanie Polutta now serves as the LC Representative to CC:DA and to NARDAC
  • LC-PCC Policy Statements remain frozen as a result of the RDA Toolkit 3R Project. Since there is now stabilized text, work has begun on the development of policy statements and application profiles required for the revised RDA text. The desire is to maintain the working status quo as much as possible
  • Four joint LC-PCC task groups have been charged to make recommendations regarding these policy statements: Diachronic Works, Aggregate Works, Element Labels, and Data Provenance.
  • Pilot project for Copy Cataloging following minimal punctuation: The Library of Congress has decided that it will not follow the minimal punctuation alternatives approved by the PCC in January 2020. However, a pilot project to accept copy cataloging that follows the minimal punctuation guidelines started with their implementation in January. The pilot will study the impact of following the guidelines on cataloging workflows.
  • PTCP staff continue to work on BIBFRAME development and testing. The scope of BIBFRAME Pilot Phase Two has expanded to include more than 100 Library of Congress catalogers, including staff members working in four of the six overseas offices.
  • LC’s Voyager system has been updated to include all of the MARC updates. Hopefully an authority record update with the NACO nodes will happen in the next several months

Report of the ALA Representatives to the North American RDA Committee (NARDAC) (Dominique Bourassa, Stephen Hearn)

  • The full report may be viewed at
  • NARDAC has now completed its second year of existence representing the North American region on the RSC
    • ALA representatives include Dominique Bourassa (who also serves as NARDAC chair) and Stephen Hearn
    • LC representatives include Damian Iseminger (NARDAC Coordinator of Web Content) and Melanie Polutta (who succeeded Kate James in October 2019)
    • Thomas Brenndorfer (Canadian Committee on Cataloguing) serves as the NARDAC representative to the RSC
  • NARDAC holds regular virtual meetings and accomplishes its work using Basecamp and Google Drive.
  • NARDAC has been busy participating in the development of the beta Toolkit, fulfilling requests from the RSC and soliciting community feedback on various reports and discussion papers.
  • Outreach activities
    • NARDAC members presented at various meetings and venues (presentations are available on the RSC Presentations webpage)
    • NARDAC members served as table leaders and presenters at the ALA Annual RDA pre-conference and RDA Forum
    • Stephen Hearn participated in the PCC Standing Committee on Standards meeting in June 2019
    • NARDAC members have been involved in the development of local policies
      • Stephen Hearn chaired the LC-PCC Task Group on Data Provenance
      • Melanie Polutta co-chaired (with Dominique Bourassa as consultant) the LC-PCC Task Group on Element Labels
      • Melanie Polutta is leading the work on the LC-PCC policy statements with Damian Iseminger and four other LC staffers
  • RDA activities (RSC)
    • There have been some RSC membership and governance changes: Honor Moody succeeded Kate James as RDA Examples Editor; Damian Iseminger has been named to the Technical Working Group
    • There were several topics covered at the October 2019 RSC meeting that took place in Santiago, Chile:
      • Internationalization and removal of Anglo-American and Christian focus from the RDA text, likely requiring a working group to accomplish
      • The need for a more logical assignment of RDA content to designated content areas such as the Guidance and Resources tabs
      • Creation of a new Collective Agent entity for meetings, conferences, congresses, expeditions, fairs, festivals, etc. Any new entity would not overlap with Family or Corporate Body entities
      • Reviewed briefing papers on Work boundaries and RDA metadata implementation scenarios
      • Began review of briefing papers on Expression excerpts and Content elements related to aggregated expressions
    • Began using asynchronous online meetings to increase capacity between in-person meetings
    • Used the Curator element proposal initiated by ARLIS/NA as a test case for the post-3R RDA revision proposal process
    • Outreach opportunities included a series of Orientation webinars on various topics in the beta Toolkit; these webinars will be repeated in 2020
    • The next in-person RSC meeting will take place in Jerusalem in October 2020.

Report from the PCC liaison (Everett Allgood)

  • The full report may be viewed at
  • The PCC Guidelines for Minimally Punctuated MARC Bibliographic Records are effective as of January 2020. Documentation is available via the PCC website. An additional mechanism to assist with local policies and workflow development is under consideration. An FAQ document will be linked to the guidelines
  • Standing Committee on Applications (SCA): SCA members serve on the PCC Task Group on Metadata Application Profiles, the Task Group on URIs in MARC Pilot Project, and the Task Group on Language Codes. SCA also created and tested a regular expression document to be used in MarcEdit to remove punctuation from bibliographic records according to the new guidelines
  • Standing Committee on Standards (SCS): recent SCS work includes revision of explanatory text in the Provider Neutral Guidelines to provide more context, revisions to DCM Z1 for MARC authority field 672, and participation on three of the joint LC-PCC task groups for RDA development.
  • Standing Committee on Training (SCT): recent SCT work includes creation of a training curriculum for PCC members on minimal punctuation guidelines, development of a Sinopia training curriculum for current PCC LD4P participants, development of a training curriculum for the IFLA Library Reference Model, and training materials for the URI Training Task Group and Real World Objects. Future work includes an update to the NACO manual, formation of a group to develop training for using the beta RDA Toolkit, and formation of a joint task group with the Linked Data Advisory Committee for linked data training.
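
The regular-expression approach to punctuation removal mentioned in the SCA item above can be illustrated with a short Python sketch. These patterns are not the SCA's expressions, which are not reproduced in this report. They are simplified assumptions that strip a trailing ISBD separator from every subfield except the last and a single terminal period from the last subfield. The function name and the subfield layout are hypothetical, and the sketch does not handle cases such as a final period that belongs to an abbreviation.

```python
import re
from typing import List, Tuple

# A trailing ISBD separator at the end of a subfield: " /", " :", " ;", " =" or ",".
TRAILING_ISBD = re.compile(r"\s*[/:;=,]\s*$")

# A single terminal period at the end of the field; an ellipsis is left alone.
TERMINAL_PERIOD = re.compile(r"(?<!\.)\.\s*$")


def strip_punctuation(subfields: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the subfields of one field with separator punctuation removed."""
    cleaned = []
    last = len(subfields) - 1
    for i, (code, value) in enumerate(subfields):
        if i < last:
            # Punctuation that only introduces the next subfield.
            value = TRAILING_ISBD.sub("", value)
        else:
            # Terminal period at the end of the whole field.
            value = TERMINAL_PERIOD.sub("", value)
        cleaned.append((code, value))
    return cleaned


if __name__ == "__main__":
    field_245 = [("a", "War and peace /"), ("c", "Leo Tolstoy.")]
    print(strip_punctuation(field_245))
```

Anchoring both patterns at the end of the value means punctuation inside a subfield, such as a colon within a title, is left untouched.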

Report on the CC:DA 3R Task Group (Bob Maxwell)

  • The Task Group reviewed the list of RDA unconstrained elements in an effort to develop a set of user-friendly labels that could be used for public display (particularly of relationship designators)
  • Assisted CC:DA with preparation of a proposal for a change to RDA.
    • This was the first revision proposal since the beginning of the 3R Project
    • The proposal came jointly from the Art Libraries Society of North America (ARLIS/NA) and CC:DA and proposed adding an RDA element for curators who play a role at the work level (e.g., a curator who organizes an exhibition that publishes an exhibition catalog)
  • The Task Force also commented on a second change proposal, which was part of an RSC effort to clean up language regarding corporate bodies vs. the names of places. It dealt with religious bodies named after a place (e.g., the name of a church building), in part to eliminate Western-centric focus.

Proposal on Reviewing Procedural Guidelines for Proposed New or Revised Romanization Tables (Beacher Wiggins, LC)

The proposal may be viewed at

  • The goal of the proposal is to explore establishment of a review board to help facilitate the approval process
  • The proposal would change the review process in order to address romanization issues, manage the processes of creating and revising romanization tables, and explore the need for less romanization
  • LC will draw on expertise from their staff (including overseas offices) and will work collaboratively with external community stakeholders
  • LC will consider all stakeholder input and work with ALA committees (CC:DA and CC:AAM) to reach consensus before approving the proposal.

Report from ALA Publishing Services and Presentation on RDA Toolkit changes (James Hennelly)

  • Much of the report contained information presented at the RDA Update Forum (see above report)
  • Accessibility targets for the beta Toolkit have been met as of September 2019.

Discussion of Future CC:DA work

  • Most of CC:DA’s upcoming work will be focused on development of RDA proposal review procedures

Announcement of the next CC:DA Meeting

  • The next CC:DA meeting is scheduled to be held at ALA Annual in Chicago, IL on June 27, 2020.