Dedupe 5.1: Keep Which Record
When two records are compared, the program decides which one to keep based on the option chosen in this section.
The Base option keeps the first record in a file (or group of files). Any subsequent matching records are merged into that original record. When a record is read into the deduping process and does not match any previously read record, it becomes the base record into which all future matches are merged.
This option retains the record with the latest processing date according to the 005. If either record lacks an 005, the dates in the 008 (positions 0 through 5) are compared instead. If one of the records also lacks an 008, the record with an 005 is kept. If neither record has an 005, the record with an 008 is kept. If neither record has an 005 or an 008, the base record is kept.
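The fallback order described above can be sketched as follows. This is only an illustration, not the program's actual code: it assumes each record is a plain dict mapping a tag to a list of field values, that ties keep the base record, and that two-digit 008 years at or above a hypothetical pivot of 70 belong to the 1900s. The function names are invented for the example.

```python
def first_value(record, tag):
    """Return the first occurrence of a field, or None if it is absent."""
    values = record.get(tag) or []
    return values[0] if values else None


def date_from_008(value, pivot=70):
    """Expand the yymmdd date in 008/00-05 to yyyymmdd.

    The century pivot is an assumption made for this sketch.
    """
    yymmdd = value[:6]
    if len(yymmdd) < 6 or not yymmdd.isdigit():
        return ""  # unusable date sorts before any real date
    century = "19" if int(yymmdd[:2]) >= pivot else "20"
    return century + yymmdd


def keep_by_date(base, candidate):
    """Pick the record to keep, following the 005 then 008 fallbacks."""
    b005, c005 = first_value(base, "005"), first_value(candidate, "005")
    if b005 and c005:
        # 005 is yyyymmddhhmmss.f, so string comparison orders by time.
        return candidate if c005 > b005 else base

    b008, c008 = first_value(base, "008"), first_value(candidate, "008")
    if b008 and c008:
        return candidate if date_from_008(c008) > date_from_008(b008) else base

    if b005 or c005:
        return base if b005 else candidate  # the record with an 005
    if b008 or c008:
        return base if b008 else candidate  # the record with an 008
    return base  # neither field present in either record
```

For instance, a base record holding only an 005 is kept over a candidate holding only an 008, because the two 008 dates cannot be compared.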
This option is the opposite of the Base option. If you have a database of 1000 records being sent through the deduplication process and record 1 and record 200 are duplicates, then record 200 will be retained regardless of which record might be better.
Compares the original record with the potential match. The best match is determined to be the one with the highest count of a given field in the record, which may not necessarily represent the largest record.
This can be used if you want to retain records with the highest count of 650 fields, or 5XX fields.
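A minimal sketch of the field-count comparison, under the same assumptions as before: records are plain dicts mapping a tag to a list of field values, a tag pattern such as 5XX uses X as a wildcard digit, and ties keep the original record. None of these names come from the program itself.

```python
def tag_matches(tag, pattern):
    """True if a tag such as '520' matches a pattern such as '5XX'."""
    pattern = pattern.upper()
    return len(tag) == len(pattern) and all(
        p == "X" or p == t for t, p in zip(tag, pattern)
    )


def count_fields(record, pattern):
    """Count every occurrence of fields whose tag matches the pattern."""
    return sum(
        len(values) for tag, values in record.items() if tag_matches(tag, pattern)
    )


def keep_by_field_count(original, candidate, pattern):
    """Keep the record with more matching fields; ties keep the original."""
    if count_fields(candidate, pattern) > count_fields(original, pattern):
        return candidate
    return original
```

With this sketch, a record with two 650 fields wins over a record with one 650 field when the pattern is 650, even if the second record is larger overall.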
The Largest option compares the original record with the potential match. The best match is determined to be the one that has the most data, which may not necessarily be the one with the most fields.
Record A has a lot of data in a 5XX field, but has no 6XX fields.
Record B has limited data in the 5XX field along with 6XX fields.
Record A will be retained if it has more data in total number of characters/bytes.
Within the Largest option, one or more fields can be excluded from the size calculation. In the example above, if 5XX were chosen to be ignored, Record B would be retained, unless other fields tipped the balance. This allows the 9XX and holdings fields to be ignored.
- Keep Largest Record and ignore the 9XX fields in determining largest record
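The size comparison with ignored fields might look like the sketch below. It assumes records are plain dicts mapping a tag to a list of field values, measures size in characters rather than bytes, treats X in a tag pattern as a wildcard digit, and keeps the original record on a tie. The function names are hypothetical.

```python
def tag_matches(tag, pattern):
    """True if a tag such as '949' matches a pattern such as '9XX'."""
    pattern = pattern.upper()
    return len(tag) == len(pattern) and all(
        p == "X" or p == t for t, p in zip(tag, pattern)
    )


def record_size(record, ignore=()):
    """Total characters of field data, skipping tags that match `ignore`."""
    return sum(
        len(value)
        for tag, values in record.items()
        if not any(tag_matches(tag, pattern) for pattern in ignore)
        for value in values
    )


def keep_largest(original, candidate, ignore=()):
    """Keep the record with more data; ties keep the original."""
    if record_size(candidate, ignore) > record_size(original, ignore):
        return candidate
    return original


# Record A: a lot of 5XX data, no 6XX. Record B: little 5XX data, some 6XX.
record_a = {"520": ["x" * 200]}
record_b = {"520": ["short"], "650": ["Topic one", "Topic two"]}

kept_default = keep_largest(record_a, record_b)                      # Record A
kept_ignoring_5xx = keep_largest(record_a, record_b, ignore=("5XX",))  # Record B
```

Passing `ignore=("9XX",)` in the same way would leave local and holdings data out of the comparison.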