This series of XSLT posts is designed as a learning diary to support my Continuing Professional Development, which is a requirement for fellowship status in the ISTC. In this post, I cover something that caused me significant confusion at first: XML namespaces and how XSLT writers can use them.
To learn XSLT, I am using the Udemy course https://www.udemy.com/practical-transformation-using-xslt-and-xpath, by Ken Holman. It was recommended to me by members of the DITA Yahoo group.
I’m now through the introduction and have completed Module 1. So far I’ve learned:
- As an XSLT writer, you need to work out the ‘tree’ of the XML elements.
- The ‘tree’ is not the same as the schema. The schema sets the rules for what is permitted; the tree is the actual structure of the XML (and may be invalid).
- The ‘tree’ is built of nodes; elements are one type of node. Default values for some nodes may be declared in the XML (at the top, before the ‘tree’ elements, similar to defining parameters in other languages).
- When the ‘tree’ is transformed using the stylesheet, the transformed ‘tree’ is built depth-first (i.e. the processor works down the first branch to its end, then backs up and works down the next branch, and repeats until the tree is complete).
- Namespaces can be used to add identifiers to elements.
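To help my future self picture the depth-first order, here is a minimal sketch. The document is hypothetical, and the `n` attribute is something I have invented purely to number the elements in the order a depth-first walk would reach them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document: the n attribute is an invented marker that
     records the order in which a depth-first walk reaches each element. -->
<manual n="1">
  <chapter n="2">
    <title n="3">Setup</title>
    <para n="4">Unpack the unit.</para>
  </chapter>
  <chapter n="5">
    <title n="6">Use</title>
  </chapter>
</manual>
```

The first chapter is finished, children and all, before the second chapter is started. This is the same order in which the elements appear when you read the file from top to bottom.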
I wouldn’t say that I’m coasting through this. At various times I get caught out by similar-sounding terminology, or by concepts that are mentioned but not explained until later modules. But it’s not as hard as I feared. The only thing I have really got stuck on is namespaces, which apparently lots of XSLT writers find tricky.
What follows is a summary of Namespaces that I know I will need to refer back to at various times. If you find it useful, great. But I’ll be honest, this post is written with my future self in mind…I know I’m going to forget this!
What is a Namespace?
Namespaces are used with prefixes to act as a tag for elements in XML. They allow us to give identifiers to XML elements, so that we can differentiate between XML elements that have the same label name. For example, you could have an XML project that uses an element called <set>. Let’s say you then have to import some other XML files that also use <set>, but in a different context. How can we differentiate between the two? We can’t, unless we use namespaces and prefixes.
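As a rough illustration of the <set> clash, here is a hypothetical document that mixes two vocabularies. The prefixes `math` and `shop`, the element names, and both `example.com` namespace names are made up for this sketch; they are not taken from the course.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: both namespace names below are invented. -->
<catalogue xmlns:math="http://example.com/ns/maths"
           xmlns:shop="http://example.com/ns/shop">
  <!-- A mathematical set of values -->
  <math:set>
    <math:member>1</math:member>
    <math:member>2</math:member>
  </math:set>
  <!-- A boxed set of products -->
  <shop:set>
    <shop:item>Spanner</shop:item>
    <shop:item>Wrench</shop:item>
  </shop:set>
</catalogue>
```

Each `xmlns:prefix="..."` attribute ties a short prefix to a namespace name. Both elements are still called `set`, but the prefix in front of each one says which namespace it belongs to.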
The key points with namespaces are:
- Namespaces are used to differentiate between two elements that use the same label name.
- For an XSLT writer, namespaces matter when searching for content: they let you find only those elements that have a specific namespace label. A namespace is just a way of adding a custom label to elements that share a name, so that you can differentiate between them.
- There is a convention to use a link to the XML creator’s domain as the namespace, as this should not be used by other organisations. Sometimes, an actual URL is used that points to documentation about the XML, but the processor does not treat it as a link; it is there so that people can see it and paste it into a browser.
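The ‘searching’ point can be sketched as a small stylesheet. This is only a sketch under assumptions: I am assuming a source document whose mathematical <set> elements are in a made-up namespace, `http://example.com/ns/maths`, and the `m` prefix is my own choice. The only real namespace name here is the XSLT one, `http://www.w3.org/1999/XSL/Transform`, which is itself a good example of a link-shaped identifier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the m prefix and its namespace name are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://example.com/ns/maths">

  <xsl:output method="text"/>

  <!-- Matches only set elements in the maths namespace. -->
  <xsl:template match="m:set">
    <xsl:text>Found a maths set with </xsl:text>
    <xsl:value-of select="count(*)"/>
    <xsl:text> child elements&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress the text of everything that is not matched above. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

A <set> in any other namespace, or in no namespace at all, is not matched by `m:set`, so it is passed over by the first template.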
What is the Confusion with Namespaces?
The confusion with namespaces comes from the convention used for namespace names: they frequently look like links, but they are not working links, they are simply plain text. The ‘link’ text is just an identifier, and it allows XSLT writers to search for elements that have that identifier, which is vital if different types of element have the same name.
Why are links used as the plain text identifiers? There are two reasons:
- It is expected that if you are transforming XML, you have access to a domain. That domain is unique, and so won’t clash with the domains of other XML vocabularies.
- The link text used is often the address for a documentation page. XSLT writers can copy the link into a browser and access the documentation page for more information.
So that’s all there is to it. Think of namespaces as an extra ‘badge’ that you can add to elements to differentiate them from other elements with the same name. There is nothing else going on, even though, to the naked eye, it looks like the XML processor will follow the ‘links’ and do something. It doesn’t. To the processor, the namespaces are just text and are not followed in any way.
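A final sketch of the ‘just text’ point, again with invented names: none of these addresses needs to exist, and the prefixes `a`, `b` and `c` are arbitrary. Namespace names are compared as plain strings, character for character.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: none of these addresses needs to exist. -->
<root xmlns:a="http://example.com/ns/maths"
      xmlns:b="http://example.com/ns/maths"
      xmlns:c="http://example.com/ns/Maths">
  <!-- Same namespace: the prefixes differ but the text is identical. -->
  <a:set/>
  <b:set/>
  <!-- Different namespace: one capital letter changes the text. -->
  <c:set/>
</root>
```

The prefix is only shorthand; the badge is the namespace name it stands for. That is also why a stylesheet can use a different prefix from the source document and still match, as long as both prefixes are bound to exactly the same text.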