Notes on the prototype and testing environment
1. A diagram of the flow of transformations
2. An XML schema for MARC rules
The purpose of the XML schema for MARC is to provide a formalism for representing the MARC rules and the
human-oriented information currently held in MARC manuals. The schema allows not only HTML, PDF, and Windows Help
versions of the MARC manual to be generated automatically, but also the production of stylesheets that transform
records for validation or decoding purposes.
See file FORMAT.html
3. Generation of validation stylesheets
The RecordValidator.xsl stylesheet is automatically generated by the RecordValidatorGenerator.xslt
stylesheet. The generator uses the information contained in the sample subset of the UNIMARC manual files to build
the validation rules that make up the validation stylesheet.
The RecordValidator.xsl has the following features:
- It is automatically generated from an XML version of the UNIMARC manual.
- Handles positional fields.
- Checks for mandatory fields.
- Creates variables to hold the content of the current field and the value of the leader. These variables are used in
tests that decide which rules to apply.
- Handles ranges of legal values in positional field tests and in indicator values.
- Uses vocabularies of acceptable values in positional field testing.
- Adds templates for positional field validation:
  - validatePSubfield
  - checkPSubfieldValue
  - checkRangeCode
  - generatePSubfieldError
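The positional checks listed above can be pictured with a small sketch. It is written in Python rather than XSLT purely for illustration. The function names echo the template names above, but the signatures, the rule layout, and the sample values are assumptions, not the actual code of the generated stylesheet.

```python
def check_range_code(value, ranges):
    """Return True if value falls inside any inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def check_psubfield_value(value, vocabulary=(), ranges=()):
    """A value is legal if it is in the vocabulary or inside one of the ranges."""
    return value in vocabulary or check_range_code(value, ranges)


def validate_psubfield(content, start, length, vocabulary=(), ranges=()):
    """Validate one positional slice; start is 1-based, as in XSLT substring().

    Returns None when the slice is legal, otherwise a dict describing the error.
    """
    value = content[start - 1:start - 1 + length]
    if check_psubfield_value(value, vocabulary, ranges):
        return None
    return {"start": start, "length": length, "invalid": value}


# Hypothetical rule: positions 1-4 hold a year between 1000 and 2999.
print(validate_psubfield("1998abc", 1, 4, ranges=[("1000", "2999")]))
# Hypothetical rule: position 5 must be one of the codes a, b or c.
print(validate_psubfield("1998xbc", 5, 1, vocabulary={"a", "b", "c"}))
```

The first call prints None because the slice is legal; the second returns an error description for the code "x". Ranges are compared as strings here, which is only adequate for codes of equal length.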
4. Generation of decoding stylesheets
The HTMLFormater.xsl stylesheet is automatically generated by the HTMLFormaterGenerator.xslt stylesheet.
The generator uses the information contained in the sample subset of the MARC manual files to create the decoding
stylesheets.
5. HTML formatting of rules
The FORMATtoHTML.xsl stylesheet transforms the XML version of the sample subset of the MARC manual into an
HTML document for reference purposes.
6. How to use the sample set of stylesheets
Requires Java VM 1.3 or later.
At the DOS prompt or Linux/Unix terminal:
- Decompress the archive into a directory, referred to below as MARC_DIR.
- Change directory to MARC_DIR.
- Three sub-directories are created:
  - bin: Java libraries and two short scripts for testing, transform.sh (Unix) and transform.bat (Windows)
  - doc: documentation, in particular these notes and a representation of the schema in the file FORMAT.html
  - src
    - src/schemas: schema for the MARC manual
    - src/stylesheets
      - FORMATtoHTML.xsl: generates an HTML version of the UNIMARC manual
      - HTMLFormaterGenerator.xslt: generates a MARC-to-HTML stylesheet
      - RecordValidatorGenerator.xslt: generates a stylesheet for record validation
    - src/xml: sample records and the FORMATDescription.xml file, which describes the location of the sample UNIMARC manual files
    - src/sml/manual: sample subset of the UNIMARC manual
- Examples (run from MARC_DIR):
transform.sh and transform.bat are simple scripts that call the command-line
interface of the Saxon Java XSLT processor. They take three arguments: the XML document, the XSL stylesheet, and the
output file name. They apply the stylesheet to the XML document and save the result in the output file.
- Generate an HTML version of the MARC manual:
./bin/transform.sh src/manual/Unimarc0.xml src/stylesheets/FORMATtoHTML.xsl output/Unimarc0.html
- Generate a stylesheet for decoding a record into HTML:
./bin/transform.sh src/xml/FORMATDescription.xml src/stylesheets/HTMLFormaterGenerator.xslt output/HTMLFormater.xsl
- HTMLFormater.xsl is now able to render a record in HTML:
./bin/transform.sh src/xml/Record.xml output/HTMLFormater.xsl Record.html
- Generate a validation stylesheet from the MARC manual:
./bin/transform.sh src/xml/FORMATDescription.xml src/stylesheets/RecordValidatorGenerator.xslt output/RecordValidator.xsl
- RecordValidator.xsl is now capable of validating specific records:
./bin/transform.sh src/xml/Record.xml output/RecordValidator.xsl output/RecordErrors.xml
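The two-stage flow of the validation example (first generate a stylesheet, then apply it to a record) can be summarised in a short sketch. It is written in Python for illustration only and merely builds the command lines without running them. The helper names transform_command and validation_pipeline are hypothetical, and the Windows script path is an assumption; the other paths are the ones used in the examples above.

```python
def transform_command(xml_doc, stylesheet, output, windows=False):
    """Build the argument list for one call to the transform script."""
    script = r"bin\transform.bat" if windows else "./bin/transform.sh"
    return [script, xml_doc, stylesheet, output]


def validation_pipeline(record, windows=False):
    """Return the two commands: generate the validator, then apply it."""
    validator = "output/RecordValidator.xsl"
    return [
        # Stage 1: the generator stylesheet turns the manual description
        # into a validation stylesheet.
        transform_command("src/xml/FORMATDescription.xml",
                          "src/stylesheets/RecordValidatorGenerator.xslt",
                          validator, windows),
        # Stage 2: the generated stylesheet is applied to the record.
        transform_command(record, validator,
                          "output/RecordErrors.xml", windows),
    ]


for command in validation_pipeline("src/xml/Record.xml"):
    print(" ".join(command))
```

To execute the commands instead of printing them, each list could be passed to subprocess.run(command, check=True) from MARC_DIR. The order matters, because the second command reads the file the first one writes.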
7. Additional examples
This set of additional examples is intended to demonstrate the error-detection capabilities of
RecordValidator.xsl. To test these examples, insert the errors described below into the
Record.xml file, then run the command:
./bin/transform.sh src/xml/Record.xml output/RecordValidator.xsl output/RecordErrors.xml
- Test mandatory control fields and unknown fields.
Replace the 001 control field with:
<controlfield tag="999">450981</controlfield>
Errors:
<error type="MandatoryControlfield" tag="001"/>
<warning type="UnknownControlfieldTag">
  <controlfield xmlns="http://www.loc.gov/MARC21/slim" tag="999">450981</controlfield>
</warning>
- Test subfield codes.
Replace the 200 data field with:
<datafield tag="200" ind1="1" ind2=" ">
  <subfield code="a">UNIMARC</subfield>
  <subfield code="y">Manual de operações</subfield>
  <subfield code="f">Fernanda Maria Guedes de Campos, José Carlos Sottomayor</subfield>
</datafield>
Errors:
<error type="InvalidSubfieldCode" tagID="d0e49">
  <code>y</code>
</error>
- Leader contents.
Replace the leader content with:
<leader>00614kam 22002051 450 </leader>
Errors:
<error type="Leader" domain="RECORD-STATUS" start="6" length="1">
  <invalid>k</invalid>
  <content>00614kam 22002051 450 </content>
  <valid-options>
    <OPTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            name="Corrected Record" value="c"/>
    <OPTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            name="Deleted Record" value="d"/>
    <OPTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            name="New Record" value="n"/>
    <OPTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            name="Previously Issued Higher Level Record" value="o"/>
    <OPTION xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            name="Previously Issued as an Incomplete, Pre-publication Record" value="p"/>
  </valid-options>
</error>
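For readers who want to see the logic behind these three tests outside XSLT, the following is a minimal Python sketch, not the generated stylesheet. The rule tables for control fields and subfield codes are illustrative assumptions rather than the contents of the UNIMARC manual files, and the record root element name is also assumed. The record-status codes are the ones listed in the valid-options output above, and the error names follow the sample output.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Illustrative rule tables (assumptions, not taken from the manual files).
MANDATORY_CONTROLFIELDS = {"001"}
KNOWN_CONTROLFIELDS = {"001", "005"}
VALID_SUBFIELD_CODES = {"200": set("abcdefghivz5")}
# Record-status codes as listed in the valid-options output above.
RECORD_STATUS = set("cdnop")


def validate(record):
    """Return a list of (severity, type, detail) tuples for one record."""
    problems = []
    tags = [cf.get("tag") for cf in record.findall("m:controlfield", NS)]
    for tag in sorted(MANDATORY_CONTROLFIELDS - set(tags)):
        problems.append(("error", "MandatoryControlfield", tag))
    for tag in tags:
        if tag not in KNOWN_CONTROLFIELDS:
            problems.append(("warning", "UnknownControlfieldTag", tag))
    for df in record.findall("m:datafield", NS):
        allowed = VALID_SUBFIELD_CODES.get(df.get("tag"))
        if allowed is None:
            continue  # this sketch has no rule for the tag
        for sf in df.findall("m:subfield", NS):
            if sf.get("code") not in allowed:
                problems.append(("error", "InvalidSubfieldCode", sf.get("code")))
    leader = record.findtext("m:leader", default="", namespaces=NS)
    status = leader[5:6]  # 1-based position 6, i.e. start="6" length="1"
    if status not in RECORD_STATUS:
        problems.append(("error", "Leader", status))
    return problems


# A record combining the three errors introduced above (root element assumed).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00614kam 22002051 450 </leader>
  <controlfield tag="999">450981</controlfield>
  <datafield tag="200" ind1="1" ind2=" ">
    <subfield code="a">UNIMARC</subfield>
    <subfield code="y">Manual de operações</subfield>
  </datafield>
</record>"""

for problem in validate(ET.fromstring(SAMPLE)):
    print(problem)
```

The leader check reads the sixth character because the error output reports start="6", which matches the 1-based indexing of the XSLT substring() function.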