October 6th, 2008

VB XML Cookbook, Recipe 6: Writing an XSLT Transform in VB (Doug Rothaus)

Most XSLT programmers are familiar with this XSLT transform to copy an XML file.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

This stylesheet is the standard identity transform: it copies an entire XML document while “touching” every node and attribute. If you add a template that matches a particular node or attribute, only the matched items are transformed, in place. Unmatched nodes and attributes are simply copied.

We can do the same thing in Visual Basic with XML Literals (including LINQ to XML and XML Axis Properties). Our VB code will “touch” each node and attribute by recursively navigating through an XML document, based on the following pseudo-code:

Starting with the root element, perform the following whenever you encounter a node:

    If the node is an element
        If the element has attributes, transform or copy each attribute
        If the element has child nodes, transform or copy each node
    If the node is text, transform or copy the text
    If the node is CData, transform or copy the CData
    If the node is a comment, transform or copy the comment
    If the node is a processing instruction, transform or copy the processing instruction

 

For this cookbook entry, we’ll create an abstract (MustInherit) base class that performs this pseudo-coded recursive navigation of an XML document. We can then create a class that inherits from that base class to perform specific transforms. First, we’ll create the abstract class and the “starting point,” a function called Transform that takes the XML document (XDocument) to be transformed as input and returns the transformed document.

 

Public MustInherit Class VBXmlTransform

  Public Overridable Function Transform(ByVal xmlDoc As XDocument) As XDocument
    Return <?xml version="1.0" encoding="utf-8"?>
           <%= ProcessElement(xmlDoc.Root) %>
  End Function

End Class

 

Next, we add the logic that is called for each XML node (XNode) encountered. This includes elements, text, CData, and so on. Our code needs to determine the type of XML node and call the related function to transform or copy the node type and return the result, which is either a copied or transformed node. This method is called ProcessNode and is shown here.

 

  Public Overridable Function ProcessNode(ByVal xmlNode As XNode) As XNode
    ' This method ignores DTD (XDocumentType) content.

    Dim nodeType = xmlNode.GetType()

    ' Because XCData inherits from XText, check for the XCData type before checking
    ' for XText.
    If nodeType Is GetType(XCData) Then Return ProcessCData(xmlNode)
    If nodeType Is GetType(XText) Then Return ProcessText(xmlNode)
    If nodeType Is GetType(XElement) Then Return ProcessElement(xmlNode)
    If nodeType Is GetType(XComment) Then Return ProcessComment(xmlNode)
    If nodeType Is GetType(XProcessingInstruction) Then Return _
      ProcessProcessingInstruction(xmlNode)

    Return xmlNode
  End Function
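The comment about checking XCData before XText deserves a quick illustration. The following is a minimal stand-alone sketch (the module name TypeCheckDemo is made up for this example and is not part of the recipe). It shows that an exact GetType comparison, as used in ProcessNode, never reports a CDATA node as XText, whereas a TypeOf ... Is check follows inheritance and would. The ordering therefore becomes essential if you ever rewrite the checks with TypeOf.

```vb
Imports System
Imports System.Xml.Linq

' Hypothetical demo module; not part of the VBXmlTransform class.
Module TypeCheckDemo
  Sub Main()
    Dim cdata As XNode = New XCData("raw <markup>")

    ' Exact-type comparison: a CDATA node is XCData, not XText.
    Console.WriteLine(cdata.GetType() Is GetType(XText))   ' False
    Console.WriteLine(cdata.GetType() Is GetType(XCData))  ' True

    ' TypeOf ... Is follows inheritance, so XCData also matches XText.
    Console.WriteLine(TypeOf cdata Is XText)               ' True
  End Sub
End Module
```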

 

Next, we can add the strongly-typed functions that process each of the node types as well as attributes. The function to process an element is a special case, so we’ll leave it out for now and cover it next. The functions to process the other node types and attributes are rather simple. Because the default behavior of the base class is to simply copy a document, each function just returns its input value. These functions exist so that an inheriting class can override them with specific behavior. Here are the strongly-typed functions (without the ProcessElement function).

 

  Public Overridable Function ProcessAttribute(ByVal xmlAttribute As XAttribute) As XAttribute
    Return xmlAttribute
  End Function

  Public Overridable Function ProcessCData(ByVal xmlCData As XCData) As XCData
    Return xmlCData
  End Function

  Public Overridable Function ProcessText(ByVal xmlText As XText) As XText
    Return xmlText
  End Function

  Protected Overridable Function ProcessComment(ByVal xmlComment As XComment) As XComment
    Return xmlComment
  End Function

  Public Overridable Function ProcessProcessingInstruction( _
    ByVal pi As XProcessingInstruction) As XProcessingInstruction

    Return pi
  End Function

 

Now let’s look at the ProcessElement function. Processing elements is a special case because elements can have both attributes and child nodes. Those attributes and child nodes need to be transformed or copied as well, so we must provide code that calls the ProcessAttribute function for each attribute and the ProcessNode function for each child node. We’ll encapsulate this code in a function called CopyElement. The ProcessElement function will look like the other strongly-typed functions, except that it will return a call to the CopyElement function instead of just the input element. The CopyElement function uses XML Literals, embedded expressions, and LINQ to XML to create the copy of the XML element as shown here.

 

  Public Overridable Function ProcessElement(ByVal xmlElement As XElement) As XElement
    Return CopyElement(xmlElement)
  End Function
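To make the description of CopyElement concrete, here is one way it might look. This is a sketch under stated assumptions, not necessarily the recipe's actual code. The CopyElement body is reconstructed from the description above, and the classes CopySketch and RenameIdSketch and the module CopySketchDemo are hypothetical names. CopySketch is a condensed stand-in for the base class so that the listing compiles on its own. The sketch builds the copy with an XML Literal whose name, attributes, and child nodes all come from embedded expressions, and the attribute and node queries route every item through ProcessAttribute and ProcessNode.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

' Condensed, hypothetical stand-in for the base class: only what this sketch needs.
Public MustInherit Class CopySketch
  Public Overridable Function ProcessNode(ByVal node As XNode) As XNode
    ' Only elements need recursion here; other node types are copied as-is.
    If TypeOf node Is XElement Then Return ProcessElement(DirectCast(node, XElement))
    Return node
  End Function

  Public Overridable Function ProcessAttribute(ByVal attr As XAttribute) As XAttribute
    Return attr
  End Function

  Public Overridable Function ProcessElement(ByVal element As XElement) As XElement
    Return CopyElement(element)
  End Function

  ' Assumed implementation: same element name, with each attribute and
  ' child node passed through the overridable Process* functions.
  Public Function CopyElement(ByVal element As XElement) As XElement
    Return <<%= element.Name %>>
             <%= From attr In element.Attributes() Select ProcessAttribute(attr) %>
             <%= From child In element.Nodes() Select ProcessNode(child) %>
           </>
  End Function
End Class

' Hypothetical transform: renames every "id" attribute to "key".
Public Class RenameIdSketch
  Inherits CopySketch

  Public Overrides Function ProcessAttribute(ByVal attr As XAttribute) As XAttribute
    If attr.Name.LocalName = "id" Then Return New XAttribute("key", attr.Value)
    Return attr
  End Function
End Class

Module CopySketchDemo
  Sub Main()
    Dim source As XElement = <order id="7"><item id="a">Widget</item><!-- note --></order>
    Dim result As XElement = New RenameIdSketch().ProcessElement(source)

    ' Expected: <order key="7"><item key="a">Widget</item><!-- note --></order>
    Console.WriteLine(result.ToString(SaveOptions.DisableFormatting))
  End Sub
End Module
```

Returning the original attribute or node from a Process function is safe in this sketch because LINQ to XML clones content that already has a parent when it is added to a new element, so the source document is left unchanged.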
