XML and XSLT¶

About¶

This style guide covers the use of both XML and XSLT.

XML¶

XML is the primary serialization format we use in DLTN for metadata exchange, because our workflows are built primarily around OAI-PMH.

XSLT¶

XSLT (Extensible Stylesheet Language Transformations) is the primary language we use for transforming XML documents to DLTN MODS, mainly because we use Repox as our aggregation platform.

Note: Repox’s XSL processor is built on Saxon 8.7. Therefore, all XSLT must be tested against Saxon 8.7 rather than the current Saxon release.
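One way to confirm which processor a test environment is actually running is to ask the processor itself. The stylesheet below is a minimal sketch, not part of the DLTN codebase. It relies on the standard system-property() function, and it assumes the processor reports the XSLT 2.0 properties xsl:product-name and xsl:product-version (a processor that does not know them returns an empty string).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <!-- Plain-text output so the result is a single readable line -->
    <xsl:output method="text"/>

    <!-- Run against any XML document; the input content is ignored -->
    <xsl:template match="/">
        <xsl:value-of select="system-property('xsl:vendor')"/>
        <xsl:text> | </xsl:text>
        <xsl:value-of select="system-property('xsl:product-name')"/>
        <xsl:text> | </xsl:text>
        <xsl:value-of select="system-property('xsl:product-version')"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>
```

If the reported version is not 8.7, results from that environment may not match what Repox produces.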

For help getting started see:

Reading and Writing DLTN XSL Transforms

MODS XML Order of Elements¶

Rule¶

When serializing a new MODS record, elements should follow the order defined below. This order is loosely based on how corresponding elements within MARC are positioned within a record, minus the emphasis on authorship (100 field).

identifier

[@type="local"]
[@type="issn/isbn"]
[@type="extension"]
[@type="filename"]
[@type="pid"]

titleInfo

title
[@supplied="yes"]:title
[@type="alternative"]:title

abstract

tableOfContents

name

namePart
role:roleTerm

originInfo

dateCreated
dateCreated[@encoding="edtf"]
dateIssued
dateIssued[@encoding="edtf"]
dateOther
publisher
place:placeTerm

physicalDescription

form
extent
internetMediaType
digitalOrigin

note

subject

topic
name
geographic

genre

language:languageTerm

typeOfResource

classification

relatedItem[@displayLabel="Project"][@type="host"]:titleInfo:title

relatedItem[@displayLabel="Collection"][@type="host"]:titleInfo:title

location:physicalLocation

recordInfo:recordContentSource

accessCondition
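To make the ordering concrete, here is a sketch of a record serialized in the order above. Every value is a placeholder invented for illustration, not real DLTN data, and optional elements such as tableOfContents and classification are omitted. The EDTF date is shown on the encoding attribute, which is where MODS date elements carry their encoding.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: all values are placeholders -->
<mods xmlns="http://www.loc.gov/mods/v3" version="3.5">
    <identifier type="local">example_0001</identifier>
    <titleInfo>
        <title>Example photograph</title>
    </titleInfo>
    <abstract>Placeholder description of the item.</abstract>
    <name>
        <namePart>Doe, Jane</namePart>
        <role>
            <roleTerm type="text">Photographer</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateCreated>circa 1950</dateCreated>
        <dateCreated encoding="edtf">1950~</dateCreated>
        <publisher>Example Library</publisher>
        <place>
            <placeTerm type="text">Example City</placeTerm>
        </place>
    </originInfo>
    <physicalDescription>
        <form>photographs</form>
        <extent>1 photograph</extent>
        <internetMediaType>image/jpeg</internetMediaType>
        <digitalOrigin>reformatted digital</digitalOrigin>
    </physicalDescription>
    <note>Placeholder note.</note>
    <subject>
        <topic>Example topic</topic>
    </subject>
    <genre>photographs</genre>
    <language>
        <languageTerm type="code" authority="iso639-2b">eng</languageTerm>
    </language>
    <typeOfResource>still image</typeOfResource>
    <relatedItem displayLabel="Collection" type="host">
        <titleInfo>
            <title>Example Collection</title>
        </titleInfo>
    </relatedItem>
    <location>
        <physicalLocation>Example Library</physicalLocation>
    </location>
    <recordInfo>
        <recordContentSource>Example Library</recordContentSource>
    </recordInfo>
    <accessCondition>Placeholder rights statement.</accessCondition>
</mods>
```

The MODS schema itself does not enforce this order for top-level elements, so the ordering is a readability convention rather than a validation requirement.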

Implicit vs. Explicit Processing¶

Rule¶

All new XSL transforms should be implicit and built around an identity transform. An identity transform is a generic template that copies the input XML document to the output unchanged. With that copy in place, we add more specific templates that override it wherever the output needs to differ, until the result is the final XML we want.

Justification¶

When we first started our service hub, we wrote explicit transforms. They were long, difficult to read, and fragile. Whenever we needed to modify a template, refactoring was expensive.

We find implicit transforms to be less verbose, easier to read, and much easier to modify and extend.

Example¶

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:lyncode="http://www.lyncode.com/xoai"
xmlns="http://www.loc.gov/mods/v3" exclude-result-prefixes="#all"
xpath-default-namespace="http://www.lyncode.com/xoai" version="2.0">

    <!-- copy all incoming metadata with an identity transform -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Match where the incoming records start -->
    <xsl:template match="lyncode:metadata">
        <!-- Start building our MODS -->
        <mods xmlns="http://www.loc.gov/mods/v3" version="3.5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-5.xsd">

            <!-- Serialize our title -->
            <xsl:apply-templates select="element[@name = 'dc']/element[@name = 'title']/element/field"/>

        </mods>
    </xsl:template>

    <!-- Our title template -->
    <xsl:template match="element[@name = 'dc']/element[@name = 'title']/element/field">
        <titleInfo>
            <title>
                <xsl:apply-templates/>
            </title>
        </titleInfo>
    </xsl:template>

</xsl:stylesheet>
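To see what the transform above does, it helps to trace one record through it. The input below is a hypothetical, stripped-down XOAI record; it assumes the metadata element is the document root, and the title value is invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: a bare XOAI metadata element with a single title -->
<metadata xmlns="http://www.lyncode.com/xoai">
    <element name="dc">
        <element name="title">
            <element name="none">
                <field name="value">Example title</field>
            </element>
        </element>
    </element>
</metadata>
```

The lyncode:metadata template starts the MODS record, the title template wraps the field, and the identity template copies the field's text node. Apart from whitespace, the result should look like this:

```xml
<mods xmlns="http://www.loc.gov/mods/v3"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    version="3.5"
    xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-5.xsd">
    <titleInfo>
        <title>Example title</title>
    </titleInfo>
</mods>
```

Adding another output element means adding one more apply-templates call and one more small template, without touching the existing ones.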

Namespacing¶

Rule¶

Stylesheets can have default namespaces for both XML and XPath.

xsl:param values should be namespaced to avoid collision with xpath-default-namespace.

Justification¶

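XSL is verbose. Verbosity makes things hard to read. Therefore, it’s okay to use default namespaces for your XML and XPath expressions.

This practice often causes collisions. An unprefixed element inside an xsl:param takes the stylesheet's default XML namespace, while an unprefixed step in an XPath expression is resolved against xpath-default-namespace, so the two do not match when those namespaces differ. Therefore, namespace things like xsl:param values so that things just work without a lot of deep thinking.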

Example¶

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:lyncode="http://www.lyncode.com/xoai"
    xmlns:dltn="https://github.com/digitallibraryoftennessee"
    xmlns="http://www.loc.gov/mods/v3"
    exclude-result-prefixes="#all"
    xpath-default-namespace="http://www.lyncode.com/xoai" version="2.0">

    <!-- output settings -->
    <xsl:output encoding="UTF-8" method="xml" omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- includes and imports -->

    <!--
    Collection/Set = Crossroads Friends and Family
    -->

    <!-- Types -->
    <xsl:param name="pType">
        <dltn:type string="moving image">Video</dltn:type>
        <dltn:type string="text">Text</dltn:type>
        <dltn:type string="sound recording">Sound</dltn:type>
    </xsl:param>

</xsl:stylesheet>
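The following sketch shows how a namespaced parameter like pType might be used as a lookup table. It is an illustration under stated assumptions rather than DLTN's actual template: the match pattern for a dc type field, the variable names vType and vMatch, and the reading that the element text is the incoming value and @string is the MODS term are all assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dltn="https://github.com/digitallibraryoftennessee"
    xmlns="http://www.loc.gov/mods/v3"
    exclude-result-prefixes="#all"
    xpath-default-namespace="http://www.lyncode.com/xoai" version="2.0">

    <!-- Lookup table: element text = incoming value, @string = MODS term (assumed direction) -->
    <xsl:param name="pType">
        <dltn:type string="moving image">Video</dltn:type>
        <dltn:type string="text">Text</dltn:type>
        <dltn:type string="sound recording">Sound</dltn:type>
    </xsl:param>

    <!-- Hypothetical template: map an incoming type value onto typeOfResource -->
    <xsl:template match="element[@name = 'dc']/element[@name = 'type']/element/field">
        <xsl:variable name="vType" select="normalize-space(.)"/>
        <!-- The dltn: prefix keeps this step out of xpath-default-namespace -->
        <xsl:variable name="vMatch" select="$pType/dltn:type[. = $vType]"/>
        <!-- Emit nothing when the incoming value is not in the table -->
        <xsl:if test="$vMatch">
            <typeOfResource>
                <xsl:value-of select="$vMatch[1]/@string"/>
            </typeOfResource>
        </xsl:if>
    </xsl:template>

</xsl:stylesheet>
```

Had the table entries been written as unprefixed type elements, they would sit in the MODS namespace while the step $pType/type would look in the XOAI namespace, and the lookup would silently return nothing.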