Step 2.1

From AC Wiki

Step 2.1: Validation

Numeric Field Validation

MARC fields that are incorrectly formatted often cause user searches to fail and prevent items in the collection from being included in the system indexes. MARS 2.0 software can validate the structure of numeric data in the following fields:
  • 010: Library of Congress Control Number (LCCN)
  • 020: International Standard Book Number (ISBN)
  • 022: International Standard Serial Number (ISSN)
  • 034: Coded Cartographic Mathematical Data (CCMD)

Historical fact

LC changed the structure of the LCCN beginning on Jan. 1, 2001 in order to accommodate a four-digit year. The length of the control number remains 12 characters, as it was prior to the change. However, in the old LCCN structure (A), suffixes were occasionally used. Under the new LCCN structure (B), the location of elements is slightly altered to accommodate a four-digit year. Under both structures, the prefix, year and serial number are the basic elements required to make an LCCN unique.

Please indicate on Step 2.1 what kind of validation you would like performed on your 010, 020, 022, or 034 fields. Choosing "Yes, With these modifications" means that you would like the MARS 2.0 software to perform a modified validation (e.g., validate fields 020 and 022, but not fields 010 or 034).

Pre-2001 LCCN

LCCN Structure A (2000 and earlier) numbers are formatted according to the following 6 divisions (separated by hyphens):

	 1   2   3    4  5    6
	nb#-71-005810-#-/AC-/r86
  1. 3-character prefix with lowercase letters and/or blanks
  2. 2 digits, usually the last 2 digits of the year
  3. 6-digit serial number, with zeroes padded to the left to make 6 digits
  4. Blank space
  5. Optional variable length suffix and/or alphabetic identifier
  6. Optional revision date

Examples of LCCN Structure A (the character # represents a single space):

	###95156543#			Displayed as:	95-156543
	###94014580#/AC/r95		Displayed as:	94-14580/AC/r95
	###79310919#//r86		Displayed as:	79-310919//r86
	nb#71005810#			Displayed as:	nb71-5810

Post-2000 LCCN

LCCN Structure B (2001 and later) numbers are formatted according to the following 3 divisions (separated by hyphens):

	 1   2     3
	##-2005-256543
  1. 2-character prefix with lowercase letters and/or blanks
  2. 4-digit year
  3. 6-digit serial number, with zeroes padded to the left to make 6 digits

Examples of LCCN Structure B (the character # represents a single space):

	##2005256543			Displayed as:	2005-256543
	##2010014580			Displayed as:	2010-14580
	nb2005005810			Displayed as:	nb2005-5810

According to the Library of Congress, Structure A LCCNs will not be changed to Structure B. This minimizes the impact of the LCCN change for local systems. Since LCCN structures A and B will continue to exist in authority and bibliographic records, MARS 2.0 programs provide for validation of both old and new LCCN formats. No provision is necessary, therefore, for the conversion of Structure A to the new Structure B formats, or vice versa.
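
The two layouts described above can be told apart mechanically. The following is a minimal Python sketch, not the MARS 2.0 implementation: the names `lccn_structure` and `from_doc` and both regular expressions are illustrative assumptions derived from the divisions listed for each structure. It expects the stored form with real blanks rather than the # placeholder.

```python
import re
from typing import Optional

# Assumption: prefix letters are left-justified and padded with blanks,
# as the prefix correction tables in this step suggest.
# Structure A: 3-char prefix, 2-digit year, 6-digit serial, blank, optional suffix.
STRUCTURE_A = re.compile(r"(?:[a-z]{3}|[a-z]{2} |[a-z] {2}| {3})\d{2}\d{6} (?:/.*)?")
# Structure B: 2-char prefix, 4-digit year, 6-digit serial.
STRUCTURE_B = re.compile(r"(?:[a-z]{2}|[a-z] | {2})\d{4}\d{6}")


def lccn_structure(lccn: str) -> Optional[str]:
    """Return "A", "B", or None for an LCCN stored with real blanks."""
    if STRUCTURE_B.fullmatch(lccn):
        return "B"
    if STRUCTURE_A.fullmatch(lccn):
        return "A"
    return None


def from_doc(text: str) -> str:
    """Convert the # placeholder used in the examples to a real blank."""
    return text.replace("#", " ")


print(lccn_structure(from_doc("nb#71005810#")))   # A
print(lccn_structure(from_doc("##2005256543")))   # B
```

Because a Structure A number ends in a blank and a Structure B number ends in a digit, no stored value can match both patterns.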

LCCN structure A corrections

If the LCCN in the 010 $a is identified as a Structure A LCCN and does not have a valid structure, MARS 2.0 programs make the following format corrections (all changes are subsequently checked for validity):

  • If the first character of the LCCN is a number (no prefix is present), the programs insert 3 blanks (###) before the number:
	  Original:  95-156543			Corrected to:	###95156543#
  • If the first character of the LCCN is an alphabetic character and the second character is a number, MARS 2.0 programs insert 2 blanks (##) between the alphabetic character and the number to make a valid 3-character prefix:
	  Original:  n95-156543			Corrected to:	n##95156543#
  • If the first 2 characters of the LCCN are alphabetic and the third character is a number, MARS 2.0 programs insert 1 blank (#) between the alphabetic characters and the number to make a valid 3-character prefix:
	  Original:  nb95-156543			Corrected to:	nb#95156543#
  • If a hyphen appears in the 010 $a, MARS 2.0 programs count the number of digits before the hyphen. If one digit is before the hyphen, a 0 (zero) is inserted before the first digit in the LCCN (following the prefix). If 2 digits are before the hyphen, no zeroes are inserted at the beginning of the LCCN:
	Original:  nb#9-156543			Corrected to:	nb#09156543#
	Original:  nb#95-156543			Corrected to:	nb#95156543#
  • MARS 2.0 programs also count the number of digits following the hyphen. If there are fewer than 6 digits, zeroes are added following the first 2 digits (##-) of the LCCN to make 6 digits (for a total of 8 digits). The hyphen is deleted from the LCCN:
	Original:  nb#95-6543			Corrected to:	nb#95006543#
	Original:  nb#95-56543			Corrected to:	nb#95056543#
  • If the LCCN contains a suffix, the suffix is removed in accordance with the revised LC standard for Structure A LCCNs:
	  Original:  nb#95-156543//r86		Corrected to:	nb#95156543#
  • If the LCCN does not end with a blank, MARS 2.0 programs insert a blank following the last digit:
	  Original:  nb#95-156543			Corrected to:	nb#95156543#
  • If the 010 field data has been modified, the 010 field length is recalculated and the 010 directory entry is updated. The record length is recalculated and updated in the record leader.
  • If MARS 2.0 programs cannot correct the format of the LCCN in the 010 $a (e.g., there are 4 characters in the prefix or there are 9 digits), the 010 $a code is changed to $z and a report can be generated. See report R50 in Step 5 for more information about this report.
  • The following invalid LCCN prefixes are corrected to the valid format (# = blank):
	#a#		->		a##
	##a		->		a##
	#bc		->		bc#
	#		->		###
	##		->		###
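
The Structure A corrections listed above can be combined into a single normalization pass. This is a hedged sketch only: `correct_structure_a` is a hypothetical name, the parsing regular expression is an assumption, and the function returns `None` where the text says the 010 $a would be recoded to $z. Input and output use real blanks instead of the # placeholder.

```python
import re
from typing import Optional

# Optional prefix (letters and blanks), digits, then an optional hyphen + digits.
_PARTS_A = re.compile(r"([a-z ]*)(\d+)(?:-(\d+))?")


def correct_structure_a(raw: str) -> Optional[str]:
    """Normalize a Structure A LCCN, or return None if it cannot be corrected."""
    body = raw.split("/", 1)[0].rstrip()       # drop any suffix and trailing blanks
    match = _PARTS_A.fullmatch(body)
    if match is None:
        return None
    prefix, year, serial = match.groups()
    letters = prefix.replace(" ", "")          # left-justify the prefix letters
    if serial is None:                          # no hyphen: expect exactly 8 digits
        if len(year) != 8:
            return None
        year, serial = year[:2], year[2:]
    if len(letters) > 3 or len(year) > 2 or len(serial) > 6:
        return None
    # Pad prefix to 3, year to 2, serial to 6, then add the trailing blank.
    return letters.ljust(3) + year.zfill(2) + serial.zfill(6) + " "


print(repr(correct_structure_a("nb95-6543")))   # 'nb 95006543 '
```

Returning `None` rather than raising keeps the "cannot correct" case explicit, so a caller could decide separately how to recode the subfield and what to report.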

LCCN structure B corrections

If the LCCN in the 010 $a is identified as a Structure B LCCN and does not have a valid structure, MARS 2.0 programs attempt to correct it by making these conversions (all changes are subsequently checked for validity):

  • If the first character of the LCCN is a number (no prefix is present), the programs insert 2 blanks before the number:
	  Original:  2005-256543		Corrected to:	##2005256543
  • If the first character of the LCCN is an alphabetic character and the second character is a number, MARS 2.0 programs insert 1 blank (#) between the alphabetic character and the number to make a valid 2-character prefix. A 2-letter prefix is already complete, so no blank is inserted:
	Original:  n2005-256543		Corrected to:	n#2005256543
	Original:  nb2005-256543	Corrected to:	nb2005256543
  • If a hyphen or blank space appears in the 010 $a, MARS 2.0 programs count the number of digits following the hyphen. If there are fewer than 6 digits, zeroes are added following the first 4 digits (####-) of the LCCN to make 6 digits (for a total of 10 digits). The hyphen is deleted from the LCCN:
	  Original:  nb2005-6543		Corrected to:	nb2005006543
  • If the 010 field data has been modified, the 010 field length is recalculated and the 010 directory entry is updated. The record length is recalculated and updated in the record leader.
  • If MARS 2.0 programs cannot correct the format of the LCCN in the 010 $a (e.g., there are 3 characters in the prefix or there are 11 digits), the 010 $a code is changed to $z and a report can be generated. See report R50 in Step 5 for more information about this report.
  • The following invalid LCCN prefixes are corrected to the valid format (# = blank):
	#a		->		a#
	#bc		->		bc
	#		->		##
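
The Structure B corrections above follow the same pattern with a 2-character prefix and a 4-digit year. The sketch below is illustrative only: `correct_structure_b` is a hypothetical name, the regular expression is an assumption, and `None` stands in for the case where the 010 $a would be recoded to $z. Real blanks are used in place of the # placeholder.

```python
import re
from typing import Optional

# Optional prefix, digits, then an optional hyphen-or-blank separator + digits.
_PARTS_B = re.compile(r"([a-z ]*)(\d+)(?:[- ](\d+))?")


def correct_structure_b(raw: str) -> Optional[str]:
    """Normalize a Structure B LCCN, or return None if it cannot be corrected."""
    match = _PARTS_B.fullmatch(raw.rstrip())
    if match is None:
        return None
    prefix, year, serial = match.groups()
    letters = prefix.replace(" ", "")          # left-justify the prefix letters
    if serial is None:                          # no separator: expect exactly 10 digits
        if len(year) != 10:
            return None
        year, serial = year[:4], year[4:]
    if len(letters) > 2 or len(year) != 4 or len(serial) > 6:
        return None
    # Pad prefix to 2 and serial to 6; Structure B has no trailing blank.
    return letters.ljust(2) + year + serial.zfill(6)


print(repr(correct_structure_b("n2005-256543")))   # 'n 2005256543'
```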

020 Field

Some automated systems do not index an ISBN if the format is invalid. An ISBN in field 020 $a should be 10 digits or 13 digits. If the ISBN in 020 $a does not have the valid structure, MARS 2.0 programs attempt to correct the ISBN structure by performing the following conversions:

  • If there are 9 digits in the ISBN, a 0 (zero) is inserted before the first digit in the ISBN:
	  Original:  873671008		Corrected to:	0873671008
  • All hyphens are deleted:
	  Original:  1-873671-008		Corrected to:	1873671008
  • A lowercase x is converted to uppercase:
	  Original:  187367100x		Corrected to:	187367100X
  • If the ISBN is 13 digits, MARS 2.0 programs verify that the first 3 digits are 978.
  • As an optional service, MARS 2.0 programs will correct the order of paired ISBNs so that each ISBN-13 precedes its ISBN-10 (i.e., pairs of 13/10 and 13/10).
  • As an optional service, MARS 2.0 programs will convert ISBN-10 to ISBN-13 (includes check-sum value for both 10 and 13 length ISBNs):
	  Original:  1873671008		Corrected to:	9781873671009
  • If MARS 2.0 programs cannot correct the format of the ISBN in the 020 $a (e.g., there are 11 digits), the 020 $a code is changed to $z and a report is generated. See report R50 in Step 5 for more information about this report.
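
The ISBN conversions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MARS 2.0 code: the function names are hypothetical, `normalize_isbn` checks structure only (not the check digit), and `None` represents the case where the 020 $a would be recoded to $z. The ISBN-13 check digit uses the standard alternating 1 and 3 weights.

```python
import re
from typing import Optional


def normalize_isbn(raw: str) -> Optional[str]:
    """Apply the structural fixes; return None if the ISBN cannot be corrected."""
    isbn = raw.replace("-", "").strip().upper()   # delete hyphens, x -> X
    if re.fullmatch(r"\d{8}[\dX]", isbn):          # 9 characters: restore leading zero
        isbn = "0" + isbn
    if re.fullmatch(r"\d{9}[\dX]", isbn):          # ISBN-10
        return isbn
    if re.fullmatch(r"978\d{10}", isbn):           # ISBN-13 must begin with 978
        return isbn
    return None


def isbn13_check_digit(first12: str) -> str:
    """Compute the ISBN-13 check digit from the first 12 digits."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978, drop the old check character, and append a new check digit."""
    first12 = "978" + isbn10[:9]
    return first12 + isbn13_check_digit(first12)


print(normalize_isbn("1-873671-008"))   # 1873671008
print(normalize_isbn("187367100x"))     # 187367100X
print(isbn10_to_isbn13("0306406152"))   # 9780306406157
```

The check digit has to be recomputed during conversion because ISBN-10 and ISBN-13 use different weighting schemes; the ISBN-10 check character cannot simply be carried over.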

Historical fact

The structure of the ISBN has changed over the past thirty years. Prior to 1977, the 020 field was not repeatable, and multiple ISBNs and related information were placed in repeated subfields. Older bibliographic records may still have multiple ISBNs in a single 020 field rather than in multiple 020 fields. The Library of Congress began accommodating ISBN-13 on October 1, 2004. Between 2005 and 2008, publishers were encouraged to supply both an ISBN-10 and an ISBN-13 for the same manifestation, based on guidelines issued by the International ISBN Agency (IIA). January 1, 2007 was the final date for fully adopting ISBN-13; from that date, publishers were expected to supply only ISBN-13.

Ordering 020 fields

LC will accept both an ISBN-13 and an ISBN-10 for the same manifestation. These numbers are shown by publishers according to guidelines issued by the IIA, which call for grouping the pairs of ISBNs by manifestation. In printed products the ISBN-13 appears first, and each number is preceded by a print constant as in the following example:

	ISBN-13:  978-1-873671-00-9
	ISBN-10:  1-873671-00-8

Repeating 020 subfields

MARS 2.0 Update processing validates an 020 field for correct subfield repeatability. If the 020 field contains multiple $a, each $a is placed in a separate 020 field:

	020 $a 11111111 $a 22222222
		Corrected to:
	020 $a 11111111
	020 $a 22222222
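
As a rough illustration of the split shown above, a field can be modeled as a list of (subfield code, value) pairs. The representation and the name `split_repeated_a` are assumptions made for this sketch; it starts a new 020 field at every $a and keeps any other subfields with the $a that precedes them.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]   # (subfield code, value)


def split_repeated_a(subfields: List[Subfield]) -> List[List[Subfield]]:
    """Split one 020 field into several so that each has at most one $a."""
    fields: List[List[Subfield]] = []
    for code, value in subfields:
        if code == "a" or not fields:
            fields.append([])            # every $a opens a new 020 field
        fields[-1].append((code, value))
    return fields


original = [("a", "11111111"), ("a", "22222222")]
for field in split_repeated_a(original):
    print("020 " + " ".join(f"${code} {value}" for code, value in field))
```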

Binding information in 020 fields

Prior to 1978, binding information was placed in a $b. Older bibliographic records may have binding information in a $b rather than as a parenthetical qualifier in the $a.

If the 020 field contains a $b and an 020 $a exists:

  • $b delimiter and subfield code are deleted
  • 020 $b data is enclosed in parentheses
  • A blank is inserted at the end of the immediately preceding 020 $a data
  • 020 $b data, enclosed in parentheses, is moved after the blank at the end of the 020 $a data
	  Original:  020 $a 1873671008 $b pbk.	Corrected to:  020 $a 1873671008 (pbk.)

020 with missing $a

If the 020 field contains a $b and no 020 $a exists, the $b code will be changed to $c:

	020 $b pbk.
		Corrected to:
	020 $c pbk.
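
The two $b rules above (fold the binding into the preceding $a, or recode it to $c when there is no $a) can be sketched as follows. The (code, value) representation and the name `relocate_binding` are assumptions for illustration. The sketch treats "no $a" as "no $a earlier in the field", which is one possible reading of the rule.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]   # (subfield code, value)


def relocate_binding(subfields: List[Subfield]) -> List[Subfield]:
    """Move $b binding data into the nearest preceding $a, else recode it to $c."""
    out: List[Subfield] = []
    for code, value in subfields:
        if code != "b":
            out.append((code, value))
            continue
        # Index of the closest $a already emitted, searching backwards.
        idx = next((i for i in range(len(out) - 1, -1, -1) if out[i][0] == "a"), None)
        if idx is None:
            out.append(("c", value))                         # no $a: $b becomes $c
        else:
            out[idx] = ("a", f"{out[idx][1]} ({value})")     # blank + parenthesized binding
    return out


print(relocate_binding([("a", "1873671008"), ("b", "pbk.")]))
print(relocate_binding([("b", "pbk.")]))
```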

020 with multiple $c

If the 020 field contains multiple $c, each $c is placed in a separate 020 field:

	020 $c 4.95 (lib. bdg.) $c 3.60 (pbk.)
		Corrected to:
	020 $c 4.95 (lib. bdg.)
	020 $c 3.60 (pbk.)

Please note that if 020 $c follows an existing $q in the same field, the 020 $c will not be moved to a separate 020 field.

020 with multiple $a and $c

MARS 2.0 programs correctly handle 020 fields with multiple $a and $c:

	020 $a 11111111 $c 4.95 $a 22222222 $c 3.60 $c 8.97 $b pbk.
		Corrected to:
	020 $a 11111111 $c 4.95
	020 $a 22222222 $c 3.60
	020 $c 8.97 (pbk.)

022 Field

MARS 2.0 Update processing can validate the format of the ISSN in field 022 $a. Some automated systems do not index an ISSN if the format is invalid. A valid ISSN in field 022 $a has the following structure: 4 digits, a hyphen, and 4 more digits (the last of which may be an X):

	1234-1234
	1234-123X

If the ISSN in field 022 $a does not have the valid structure, MARS 2.0 programs attempt to correct it by making these conversions:

  • If the ISSN has no hyphen, a hyphen is added between the fourth and fifth digits:
	  Original:  12345678			Corrected to:  1234-5678
  • A lowercase x is converted to uppercase:
	  Original:  1234-567x			Corrected to:  1234-567X
  • If MARS 2.0 programs cannot correct the format of the ISSN in the 022 $a (e.g., there are 9 digits), the 022 $a code is changed to $y and a report is generated. See report R50 in Step 5 for more information about this report.
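
The ISSN conversions above reduce to a short structural check. In this minimal sketch the name `normalize_issn` is hypothetical, only the structure is checked (not the check digit), and `None` represents the case where the 022 $a would be recoded to $y.

```python
import re
from typing import Optional


def normalize_issn(raw: str) -> Optional[str]:
    """Apply the structural fixes; return None if the ISSN cannot be corrected."""
    issn = raw.strip().upper()                     # x -> X
    if re.fullmatch(r"\d{7}[\dX]", issn):          # 8 characters, no hyphen
        issn = issn[:4] + "-" + issn[4:]
    return issn if re.fullmatch(r"\d{4}-\d{3}[\dX]", issn) else None


print(normalize_issn("12345678"))    # 1234-5678
print(normalize_issn("1234-567x"))   # 1234-567X
print(normalize_issn("123456789"))   # None
```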

034 Field

MARS 2.0 Update processing can validate field 034 (Coded Cartographic Mathematical Data) for correct format. If the 034 field first indicator has value 2 and the 034 field contains multiple $a, MARS 2.0 Update processing:

  • Places each $a in a separate 034 field
  • Changes each 034 field first indicator to value 1
	034 2_$a a $b 100000 $a a $b 120000
		Corrected to:
	034 1_$a a $b 100000
	034 1_$a a $b 120000
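
The 034 split above can be sketched in the same style. The (code, value) representation and the name `split_034` are assumptions for this illustration; fields that do not meet both conditions (first indicator 2 and more than one $a) are returned unchanged.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]            # (subfield code, value)
Field = Tuple[str, List[Subfield]]    # (first indicator, subfields)


def split_034(ind1: str, subfields: List[Subfield]) -> List[Field]:
    """Split an obsolete multi-scale 034 field into one field per $a."""
    count_a = sum(1 for code, _ in subfields if code == "a")
    if ind1 != "2" or count_a < 2:
        return [(ind1, list(subfields))]
    fields: List[Field] = []
    for code, value in subfields:
        if code == "a" or not fields:
            fields.append(("1", []))   # each new field gets first indicator 1
        fields[-1][1].append((code, value))
    return fields


original = [("a", "a"), ("b", "100000"), ("a", "a"), ("b", "120000")]
for ind1, subs in split_034("2", original):
    print(f"034 {ind1}_" + " ".join(f"${code} {value}" for code, value in subs))
```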

Historical fact

First indicator value 2 became obsolete when field 034 was made repeatable in 1982. Older bibliographic records may still have first indicator value 2.
