Latest revision as of 11:00, 2 April 2013
Dedupe 2.1: Numeric Field Hits - Group 1
Numeric Field Hits
LCCN
LCCN - 010 $a:
For the Library of Congress Control Number, $a is used by default. The field is normalized for searching only; the final record is not changed. Normalization removes extra spaces, punctuation, and extra data that usually belongs in a different subfield.
- Examples of LCCN normalization:
original fields:
    ###755262
    ###80020863 /AC/r86

updated fields:
    75005262
    80020863ACr86 and 80020863

Both will be searched to find a potential match.
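The LCCN examples above can be sketched in Python. This is only an illustration under stated assumptions, not the actual dedupe routine: the function name `lccn_search_keys` is hypothetical, `#` is assumed to stand for a blank, and the rule that pads a short all-digit number to a two-digit year plus a six-digit serial is inferred from the first example.

```python
import re


def lccn_search_keys(raw: str) -> list[str]:
    """Return candidate search keys for an 010 $a value (illustrative only)."""
    # Drop the '#' blank placeholders and any real whitespace.
    text = re.sub(r"[#\s]", "", raw)
    # Treat everything after the first slash as suffix data.
    base, _, suffix = text.partition("/")
    suffix = suffix.replace("/", "")
    # Assumption: an all-digit base shorter than 8 digits is an old-style
    # number (2-digit year + serial), so the serial is padded to 6 digits.
    if base.isdigit() and 2 < len(base) < 8:
        base = base[:2] + base[2:].zfill(6)
    # When a suffix exists, search both the suffixed and the bare form.
    keys = [base + suffix] if suffix else []
    keys.append(base)
    return keys


for field in ("###755262", "###80020863 /AC/r86"):
    print(field, "->", lccn_search_keys(field))
```

Returning a list keeps both forms of a suffixed number available, which mirrors the note that both are searched for a potential match.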
Historical fact: LC changed the structure of the LCCN beginning on Jan. 1, 2001 to accommodate a four-digit year. The length of the control number remains 12 characters, as it was prior to the change. However, in the old LCCN structure (A), suffixes were occasionally used. Under the new LCCN structure (B), the location of elements is slightly altered to accommodate the four-digit year. Under both structures, the prefix, year, and serial number are the basic elements required to make an LCCN unique.
ISBN
ISBN - 020 $a:
For the International Standard Book Number, $a is used by default. The field is normalized for searching only; the final record is not changed. Normalization removes extra spaces, punctuation, and extra data that usually belongs in a different subfield.
- Examples of ISBN normalization:
original fields:
    0937295124 : $12.95
    978-0-06-108096-8
    0688076815 (pbk.)

updated fields:
    0937295124
    9780061080968
    0688076815
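A minimal Python sketch of the ISBN normalization shown above follows. The function name `normalize_isbn` is hypothetical, and the approach is an assumption: keep the leading run of digits, hyphens, and the check character X, then drop the hyphens. It does not validate check digits and would not handle ISBNs whose groups are separated by spaces.

```python
import re


def normalize_isbn(raw: str) -> str:
    """Return the bare ISBN from an 020 $a value, or '' if none is found."""
    # Take the leading run of digits, hyphens and X; a price or a qualifier
    # such as "(pbk.)" comes after a space and is therefore left out.
    match = re.match(r"\s*([0-9Xx-]+)", raw)
    if match is None:
        return ""
    return match.group(1).replace("-", "").upper()


for field in ("0937295124 : $12.95", "978-0-06-108096-8", "0688076815 (pbk.)"):
    print(field, "->", normalize_isbn(field))
```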
Historical fact: The structure of the ISBN has changed over the past thirty years. Prior to 1977, the 020 field was not repeatable, so multiple ISBNs and related information were placed in repeated subfields. Older bibliographic records may still have multiple ISBNs in a single 020 field rather than in multiple 020 fields. The Library of Congress began accommodating ISBN-13 on October 1, 2004. Between 2005 and 2008, publishers were encouraged to supply both an ISBN-10 and an ISBN-13 for the same manifestation, based on guidelines issued by the International ISBN Agency (IIA). January 1, 2007 was the final date for fully adopting ISBN-13; from then on, publishers were expected to supply only ISBN-13.
ISSN
ISSN - 022 $a:
For the International Standard Serial Number, $a is used by default. The field is normalized for searching only; the final record is not changed. Normalization removes extra spaces, punctuation, and extra data that usually belongs in a different subfield.
- Examples of ISSN normalization:
original fields:
    0829-0784
    0009-5753 PERIODICAL

updated fields:
    08290784
    00095753
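The ISSN examples can be sketched the same way. The function name `normalize_issn` is hypothetical, and the pattern assumes the usual ISSN shape of four digits, an optional hyphen, three digits, and a final digit or X. Trailing text such as "PERIODICAL" is simply ignored.

```python
import re


def normalize_issn(raw: str) -> str:
    """Return the 8-character ISSN from an 022 $a value, or '' if none."""
    # Find the first NNNN-NNNC group (hyphen optional, C may be X).
    match = re.search(r"([0-9]{4})-?([0-9]{3}[0-9Xx])", raw)
    if match is None:
        return ""
    return (match.group(1) + match.group(2)).upper()


for field in ("0829-0784", "0009-5753 PERIODICAL"):
    print(field, "->", normalize_issn(field))
```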
Default
| Group 1 (010, 020, 022) |
|---|
| Normalize 010, 020, 022 $a and search entire contents of field |
The other options are as follows:
- Use the entire field data including subfields other than $a
- Don't use normalization
- Connect different tag numbers (see Step 4-3, Like Tags)