Difference between revisions of "Terminology-D"

From AC Wiki

Latest revision as of 10:07, 15 April 2009

Terminology… D

Death Date Fix

See Heading Tracker

Deblinding Cross References

LC authority records are constructed so that they are naturally "self-deblinding": a See reference (4XX) always points to the authorized heading in the same record. In addition to the See cross-reference, most local systems will also generate a See also display based on fields 550 and 150 in an authority record. If no authority record exists for the heading named in a See also reference, the reference is blind. Standard authority control practice does not allow a See also (5XX) reference unless the authorized version of that heading is also created. In automated authority control processing, however, the authorized version of the See also reference is not automatically delivered. The only way to deblind a See also cross-reference is to delete it from the authority record or to change the 5XX to a 4XX. The MARS 2.0 staff does not recommend removing or changing the 5XX tag. (mpg)
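
As a rough illustration of what "blind" means here, the following Python sketch flags See also references whose target heading has no authority record in a local file. The dictionary layout and the names find_blind_see_also, heading, see_from, and see_also are hypothetical and chosen only for this example; this is not how MARS 2.0 or any particular ILS performs the check.

```python
def find_blind_see_also(authority_records):
    """Return (record id, heading) pairs for See also references
    that have no matching authorized heading in the file."""
    # Every authorized heading (1XX) present in the local file.
    authorized = {rec["heading"] for rec in authority_records}
    blind = []
    for rec in authority_records:
        # See references (4XX) are skipped: they point back to the
        # record's own authorized heading, so they cannot be blind.
        for ref in rec.get("see_also", []):
            if ref not in authorized:
                blind.append((rec["id"], ref))
    return blind


# Hypothetical sample data: "Felidae" has no authority record of its own.
records = [
    {"id": "a1", "heading": "Cats",
     "see_from": ["Felis catus"], "see_also": ["Pets", "Felidae"]},
    {"id": "a2", "heading": "Pets", "see_from": [], "see_also": []},
]

blind_refs = find_blind_see_also(records)
print(blind_refs)
```

In this sample the reference to "Pets" resolves, while the reference to "Felidae" is reported as blind.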

Decomposed vs Composed Characters

This entry concerns decomposed characters and the process of translating between the UTF-8 and MARC-8 formats.

At the 2007 ALA Midwinter meeting, the Library of Congress announced the change to UTF-8 as the internal data exchange format for their database. The standard change came from LC’s migration to a Voyager-based server environment. As a Unicode format, UTF-8 holds an advantage over MARC-8 in allowing a broader range of languages and characters.

Backstage Library Works followed LC’s lead in developing MARS 2.0 by making UTF-8 our internal data exchange format. If your ILS and institutional policies allow, we recommend that you utilize the Unicode capabilities and switch your data over to UTF-8 for greater compatibility with the Library of Congress standard.

If it is not possible for your library to convert to a UTF-8 data exchange, MARS does have the capability to receive and deliver data in MARC-8. However, translating between UTF-8 and MARC-8 can be problematic because there are multiple ways to represent some characters, particularly in a character’s level of composition or decomposition.

Characters with diacritical marks can generally be represented either as a single, composed character or as a decomposed sequence of a base letter plus one or more non-spacing marks. For example, a Spanish ‘ñ’ can be a self-contained, composed character, separate from the English 26-letter alphabet, or it can be made up of two decomposed elements, a standard ‘n’ and a non-spacing tilde ‘~’, sharing the same display space.

In theory, both methods should display the same. But in practice, the appearance of composed and decomposed characters can vary depending upon what rendering engine and fonts are being used on the display end. The Library of Congress uses the decomposed sequence when creating a Unicode character.
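
The two representations can be inspected with Python's standard unicodedata module. This is only a sketch of the Unicode side of the issue; it does not show MARC-8 conversion or anything specific to MARS 2.0. Note that in Unicode the decomposed tilde is the combining character U+0303, not the ASCII tilde.

```python
import unicodedata

composed = "\u00f1"      # 'ñ' as one precomposed code point
decomposed = "n\u0303"   # 'n' followed by U+0303 COMBINING TILDE

# NFD decomposes; NFC composes wherever a precomposed form exists.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)

print(composed == decomposed)   # False: different code point sequences
print(nfd == decomposed)        # True
print(nfc == composed)          # True

# The two forms also differ in length once encoded as UTF-8.
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))  # 2 3
```

Because the two strings compare as unequal until they are normalized to the same form, heading matching generally needs both sides in one agreed form.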

In the example above, the ‘ñ’ character contains two elements and must be either entirely composed or completely decomposed. There is no in-between state. In Korean, a character may contain several elements, with multiple possibilities for combining those elements into composed subsets. We sometimes run into problems identifying the right level of composition/decomposition in the translation from MARC-8 to UTF-8 for authority matching and back to MARC-8 again for delivery to your system.
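
The Korean case can be sketched with a single Hangul syllable, again using only Python's unicodedata module. The syllable chosen is arbitrary, and the example illustrates the Unicode levels of composition only, not the MARC-8 mapping or the MARS 2.0 conversion logic.

```python
import unicodedata

syllable = "\ud55c"            # one precomposed Hangul syllable (U+D55C)
jamo = "\u1112\u1161\u11ab"    # fully decomposed: three conjoining jamo
partial = "\ud558\u11ab"       # in-between: two-part syllable + final jamo

forms = [syllable, jamo, partial]

# All three spellings are canonically equivalent, so each normalizes
# to the same fully composed (NFC) and fully decomposed (NFD) form.
nfc_forms = {unicodedata.normalize("NFC", s) for s in forms}
nfd_forms = {unicodedata.normalize("NFD", s) for s in forms}

print(nfc_forms == {syllable})   # True
print(nfd_forms == {jamo})       # True
print([len(s) for s in forms])   # [1, 3, 2]
```

Unlike the ‘ñ’ example, a partly composed form exists here, which is why a converter has to decide at which level to stop.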

Our programmers have completed an enhancement that will allow characters, as they convert, to stop at the correct level of translation for MARC-8 compatibility. You should no longer see blank fields in your authority records that have Korean representations. However, you may run into other diacritic representations with similar problems. Please contact your project manager to bring this to our attention when this occurs.

De-duplication

Also called authority record de-duplication. If any of the MARS 2.0 update special field conversions adds a field identical to a pre-existing field, the identical fields will be merged to one to ensure that headings in your bibliographic records will be unique. MARS 2.0 deduping compares heading text character by character. A 650 field with a second indicator of 0 and a 650 field with a second indicator of 2 are not considered duplicates. (mpg)
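
A minimal sketch of this kind of comparison follows, assuming each field is held as a dictionary with tag, ind1, ind2, and text keys. Those names, and the choice to include the first indicator in the comparison, are assumptions made for this illustration; the actual MARS 2.0 routine is not documented here.

```python
def dedupe_fields(fields):
    """Keep the first occurrence of each field; drop later fields whose
    tag, indicators, and heading text are all identical."""
    seen = set()
    unique = []
    for field in fields:
        # Text is compared exactly, character by character, so any
        # difference in spelling, spacing, or punctuation keeps both.
        key = (field["tag"], field["ind1"], field["ind2"], field["text"])
        if key not in seen:
            seen.add(key)
            unique.append(field)
    return unique


# Hypothetical sample: two identical 650 _0 fields and one 650 _2 field.
fields = [
    {"tag": "650", "ind1": " ", "ind2": "0", "text": "$aCats"},
    {"tag": "650", "ind1": " ", "ind2": "0", "text": "$aCats"},
    {"tag": "650", "ind1": " ", "ind2": "2", "text": "$aCats"},
]

result = dedupe_fields(fields)
print([(f["tag"], f["ind2"], f["text"]) for f in result])
```

The two fields with second indicator 0 collapse into one, while the field with second indicator 2 is kept as a separate heading.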

Deleted Authority Records

If one of the authority records (based on its control number) no longer exists in the national-level file, a delete has occurred (i.e., the national library has removed the record from the master copy of their authority file). The deleted authority records will have the Record Status (Leader byte 05) set to d. (mpg)
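
Checking the record status is a one-character test on the leader. The sketch below assumes the leader is available as a plain string; the function name, the sample identifiers, and the sample leaders are made up for illustration.

```python
def is_deleted(leader):
    """True when Leader byte 05 (Record Status) is 'd'."""
    # Byte positions in the leader are counted from 0.
    return len(leader) > 5 and leader[5] == "d"


# Hypothetical leaders keyed by made-up record identifiers.
leaders = {
    "rec1": "00456dz  a2200133n  4500",   # status 'd'
    "rec2": "00456cz  a2200133n  4500",   # status 'c'
}

deleted_ids = [rec_id for rec_id, leader in leaders.items() if is_deleted(leader)]
print(deleted_ids)
```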

Descriptive Summaries

Information about a book gathered from the book dust jacket to further enrich your bibliographic record. It is part of the TOC service offered through Backstage Library Works. Typically Descriptive Summaries are added to the 520 field. (mpg)

Diacritics

A diacritic, also called a diacritical mark, point, or sign, is a small sign added to a letter to alter pronunciation or to distinguish between similar words. A diacritical mark can appear above or below a letter or in some other position. Its main usage is to change the phonetic value of the letter to which it is added, but it may also be used to modify the pronunciation of a whole word or syllable. A letter which has been modified by a diacritic may be treated either as a new, distinct letter or as a letter-diacritic combination. (wp)

Differentiated

Differentiation is achieved by adding qualifiers to an original authority heading to make the heading unique. The need for differentiation arises most commonly with name headings. Often a birth and/or death date is added to make the heading unique. (ac)

Direct Geographic Subdivisions

A direct geographic subdivision puts the narrowest term in the first position of the subdivision. For example, a heading about Jefferson County, Kansas, would be constructed with Jefferson County in first position, in $a or $z of the heading. (ac)

Direct-to-indirect geographic conversion

MARS 2.0 Authority Cleanup uses a table to convert direct geographic subdivisions to the indirect form. Changes are made by the direct-to-indirect subfield conversion program only when the invalid form is the entire text of the subfield $z and there is only one subfield $z in the heading: (mpg)

Direct Subdivision          Changes to                    In Field / Subfield
$zParis                     $zFrance$zParis               LC 6XX fields
$zJefferson Co., Kan.       $zKansas$zJefferson County    LC 6XX fields
$zJefferson County, Kan.    $zKansas$zJefferson County    LC 6XX fields
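
The two conditions above (the table entry must match the entire $z text, and the heading must contain exactly one $z) can be sketched as follows. The table holds only the three rows listed in this entry. The representation of a heading as a list of (code, value) pairs and the name convert_geographic are assumptions for this example, and the check that the field is an LC 6XX field is omitted.

```python
# Direct form -> indirect $z parts, taken from the three rows above.
DIRECT_TO_INDIRECT = {
    "Paris": ["France", "Paris"],
    "Jefferson Co., Kan.": ["Kansas", "Jefferson County"],
    "Jefferson County, Kan.": ["Kansas", "Jefferson County"],
}


def convert_geographic(subfields):
    """Return a new subfield list with a lone direct $z made indirect."""
    z_values = [value for code, value in subfields if code == "z"]
    # Rule: convert only when the heading has exactly one subfield $z.
    if len(z_values) != 1:
        return list(subfields)
    converted = []
    for code, value in subfields:
        # Rule: the direct form must be the entire text of the $z.
        if code == "z" and value in DIRECT_TO_INDIRECT:
            converted.extend(("z", part) for part in DIRECT_TO_INDIRECT[value])
        else:
            converted.append((code, value))
    return converted


single = convert_geographic([("a", "Art"), ("z", "Paris")])
double = convert_geographic([("a", "Art"), ("z", "France"), ("z", "Paris")])
partial = convert_geographic([("a", "Art"), ("z", "Paris Region")])
print(single)
print(double)
print(partial)
```

Only the first heading changes; the second already has two $z subfields and the third does not match a table entry in full, so both are returned as they were.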

Disambiguation

The process of resolving conflicts that occur when a single term can be associated with more than one topic. (wp)

Downloading

Downloading describes the transfer of electronic data between two computers or similar systems. To download is to receive data from a remote or central system, such as a webserver, FTP server, mail server, or other similar systems. A download is any file that is offered for downloading or that has been downloaded. (wp)

Duplicate headings

Refers to the same heading being listed more than once. Duplicate headings can be found on a single bibliographic record after the automated process. The process will automatically de-duplicate them. (mpg)