DOM
Tree
DOM, the Document Object Model, is in this context the name for how XML is read into memory as a tree (a hierarchy) and manipulated by a script or program. The same happens when an HTML page is loaded: the browser builds an internal tree representation of the page, and certain properties and methods are available to read or modify parts of it. Strictly speaking, those properties and methods are the application programming interface (API) to the model. In practice, most people call the whole of what's available for use the DOM. So, it's the DOM.
The 'branches' and 'leaves' of the 'tree' are called nodes. The same is true if XHTML is used, and the same is true if XML is read in directly, say just as some raw data. A tree is still a tree. Once the internal representation is created, Visual BASIC, javascript, or whatever language is to hand can be used to build or modify the XML tree, node by node. Entire new pieces can be loaded from outside with 'AJAX' requests, which rest on the XMLHttpRequest object that Microsoft introduced (long before even they knew what to do with it). One can even create a blank tree and build it up node by node. With a Microsoft Office document, Visual BASIC for Applications (VBA) can be used to modify the tree and then save the modified XML. So the XML document can be cut and spliced once it has been read in and made into this 'tree' according to a particular DOM. Parts of the tree can be programmatically located and changed with built-in routines. And that's the DOM, why the DOM, and what you can do with it. You can completely change the page around using script, as seen on this very website.
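To make the 'tree of nodes' idea concrete, here is a minimal sketch in Python, using the standard xml.dom.minidom module as a stand-in for whichever DOM library is actually in use. The sample XML and the helper name walk are made up for illustration. It reads a small document into memory and lists every node with its depth in the tree.

```python
from xml.dom import minidom, Node

# A small made-up document, just for illustration.
SAMPLE = "<page><title>Home</title><body><p>Hello</p></body></page>"

def walk(node, depth=0, out=None):
    """Collect (depth, description) for `node` and every node beneath it."""
    if out is None:
        out = []
    if node.nodeType == Node.ELEMENT_NODE:
        out.append((depth, "element " + node.tagName))   # a 'branch'
    elif node.nodeType == Node.TEXT_NODE:
        out.append((depth, "text " + repr(node.data)))   # a 'leaf'
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out

doc = minidom.parseString(SAMPLE)        # XML text -> in-memory tree
nodes = walk(doc.documentElement)        # start at the root element
for depth, label in nodes:
    print("  " * depth + label)
```

Note that the text inside an element is itself a node, one level below the element that holds it.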
Differences
It was said there is a DOM, not the DOM. The W3C has attempted to standardize one, to make The DOM. There can be only one, as it were. Web pages would be organized in one way, with agreed properties and agreed methods. But it's not that simple. Those who make the browsers have their own ideas, particularly Microsoft, although most are now converging on what the W3C suggests. Even so, cross-browser compatibility remains as much of an issue as it was in the 1990s. It's the same old story as in the days of the Netscape and IE browser 'wars', with one browser implementing some functions, another implementing others, and variations even in the functions they share. It becomes very browser specific. Microsoft might have its own DOM. The W3C might propose its ideal DOM #1. And those putting together the Firefox or Opera browsers might pick up some of the W3C 'standard', or not. And of course there have always been differences between the rest and both the 'webkit' Apple Safari browser and Microsoft's port of Internet Explorer to the Apple Macintosh. In other words, with four or five major browsers, and with Microsoft's Explorer, at least, still in use in two or three versions, a DOM for one is not quite the same as for another. So again, it can add MUCH complication when writing javascript to accommodate all of these.
MSXML
Conversion to a tree, manipulation, and saving back out don't just happen. With XML, specifically, this requires some sort of program or set of libraries to translate back and forth and to make changes while the document is held as this internal DOM. They can be called 'parsers', 'core', whatever. One frequently used 'parser' is from Microsoft: msxml.dll, in versions 2.5, 3, 4, etc., with even a version for .NET. If using Microsoft Office, for example, a reference to this msxml library is generally set in that Office program. Then a host of methods and properties become available for manipulating a tree, or nodeset, stored in memory and either taken from the original XML document or built node by node, or both. Msxml installs with newer operating systems, like XP, and also, I believe, with any installation of Internet Explorer 5 on up. Versions, and documentation, are available at msdn.com, for separate download or online viewing.
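The round trip a parser library performs (text in, tree in memory, change, text back out) can be sketched in a few lines. This uses Python's standard xml.dom.minidom rather than msxml, so the method names differ in places from the Microsoft library. The sample XML and the replacement value "Archive" are made up.

```python
from xml.dom import minidom

source = "<root><CatalogName>Main</CatalogName></root>"

doc = minidom.parseString(source)                    # text -> tree
name = doc.getElementsByTagName("CatalogName")[0]    # locate one node
name.firstChild.data = "Archive"                     # change its text node
result = doc.documentElement.toxml()                 # tree -> text
print(result)
```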
Add Node
Node
The following might serve as an example, using Visual BASIC for Applications (VBA) in something like Microsoft Access (the database and database 'front-end'). A DOMDocument is created, and the nodes are declared with the IXMLDOMNode type. Elements are created to stick onto the document. But the important part is attaching them.
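' Relies on a project reference to the Microsoft XML (msxml) library
Dim xmldoc As New DOMDocument
Dim nodData As IXMLDOMNode, nodRoot As IXMLDOMNode, nodFile As IXMLDOMNode
' Create the single root element and attach it to the document
Set nodRoot = xmldoc.createElement("root")
xmldoc.appendChild nodRoot
' strTitle and strCataTplate are String variables assumed to be set elsewhere
sAppendNode xmldoc, nodData, nodRoot, "CatalogName", strTitle
' An empty file element under root, then sourcepath beneath it
Set nodFile = xmldoc.createElement("file")
nodRoot.appendChild nodFile
sAppendNode xmldoc, nodData, nodFile, "sourcepath", strCataTplate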
Reference
This starts off with a couple of type declarations: xmldoc As New DOMDocument, and nodData As IXMLDOMNode. Nothing in the code itself has to load msxml. Using javascript to manipulate a web page, this wouldn't be the case. But in Microsoft Office and development programs, the reference to libraries like msxml.dll is established beforehand. Then, within a program module for the database or the Word document, or whatever, an xml document object (a DOMDocument) is simply declared As New, which creates an instance ready to use. IXMLDOMNode is the unlikely name for a node in an xml document: it's not "node", or "domnode", but this long IXMLDOMNode.
_Append
And sAppendNode is a custom routine:
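Public Sub sAppendNode(xmldoc As DOMDocument, _
        nodNew As IXMLDOMNode, nodSup As IXMLDOMNode, _
        nodName As String, nodText As String)
    ' Create an unattached element named nodName
    Set nodNew = xmldoc.createElement(nodName)
    ' Store the element's text content
    nodNew.Text = nodText
    ' Attach the new element as a child of nodSup
    nodSup.appendChild nodNew
End Sub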
Append_
The subroutine, sAppendNode, uses a DOM method, createElement, to take the name in nodName and create an unattached node, held in nodNew. So if nodName = "file", you now have an element called file, which exists by itself but is ready to be attached to any tree. Attached or not, it is referred to through the variable nodNew. Some text for this file tag/element may also be passed, as nodText. The Text property of the element is used to store that. If the file is, say, "e:/windows/system/x.dll", then the xml for this would be <file>e:/windows/system/x.dll</file>, just like that. And finally, the appendChild method is used to attach nodNew as a child of the node nodSup. Now it's part of the tree, and of the XML document.
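For comparison, the same create-then-attach pattern might look like this in Python's standard xml.dom.minidom. This is a sketch, not a translation of msxml: the helper name append_node is made up, and because minidom has no Text property on elements, the text is added as a separate text node, which is how the W3C DOM models it.

```python
from xml.dom import minidom

def append_node(doc, parent, name, text=None):
    """Create element `name`, optionally give it text, attach it under `parent`."""
    new_node = doc.createElement(name)            # exists, but unattached so far
    if text is not None:
        # The W3C DOM keeps text in its own child node.
        new_node.appendChild(doc.createTextNode(text))
    parent.appendChild(new_node)                  # now it is part of the tree
    return new_node

doc = minidom.Document()
root = doc.createElement("root")
doc.appendChild(root)

append_node(doc, root, "CatalogName", "Main")
file_node = append_node(doc, root, "file")        # empty element
append_node(doc, file_node, "sourcepath", "e:/windows/system/x.dll")

xml_text = root.toxml()
print(xml_text)
```

Returning the new node, rather than handing it back through a parameter as the VBA sub does, is just the more usual Python habit.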
Root
There can only be one root node in an xml document, or more properly one root element, even if it's not called root. It is, here, though. The first append is right to the new xmldoc itself, so you have an xml document consisting of a single root node. All further appends will be to the root, or to something under the root. The first call to the sub creates an element under root, here called CatalogName, with the text for that name taken from the variable strTitle. Similarly, an empty element, file, is appended to root. Then under file goes another element, called sourcepath, holding presumably some URL or local drive path from the string strCataTplate.
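The one-root rule is easy to check for yourself. In this small Python sketch (xml.dom.minidom again, standing in for msxml, which may report the problem differently), a second element appended directly to the document is refused. The element name "second" is made up.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.Document()
doc.appendChild(doc.createElement("root"))        # the one root element

try:
    doc.appendChild(doc.createElement("second"))  # a second top-level element
    second_root_allowed = True
except xml.dom.HierarchyRequestErr:
    second_root_allowed = False

print(second_root_allowed)
```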
Save
You can generate the xml document itself with the simple command: xmldoc.Save "filename.xml".
You don't even have to give the file an .xml extension, but it would be best.
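As a rough equivalent outside VBA, this Python sketch writes a small tree to disk with xml.dom.minidom and reads it back to confirm it arrived intact. The file name catalog.xml and the temporary folder are assumptions made for the example.

```python
import os
import tempfile
from xml.dom import minidom

# Build a two-element document to save.
doc = minidom.Document()
root = doc.createElement("root")
doc.appendChild(root)
name = doc.createElement("CatalogName")
name.appendChild(doc.createTextNode("Main"))
root.appendChild(name)

# Write it out; the encoding argument also goes into the XML declaration.
path = os.path.join(tempfile.mkdtemp(), "catalog.xml")
with open(path, "w", encoding="utf-8") as handle:
    doc.writexml(handle, encoding="utf-8")

# Read the file back into a fresh tree.
reloaded = minidom.parse(path)
saved_name = reloaded.getElementsByTagName("CatalogName")[0].firstChild.data
print(saved_name)
```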
More
It's that simple, to construct an xml document using DOM.
A few more additions to the structure, and you could see something like:
<root>
  <CatalogName>Main</CatalogName>
  <file>
    <sourcepath>E:\templates\xslt\catalogs\default.xsl</sourcepath>
    <targetpath>E:\catalogs\</targetpath>
    <filename>Catalog.htm</filename>
  </file>
</root>
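Going the other way, a structure like this can be read back and picked apart. The Python sketch below (xml.dom.minidom, with a made-up helper child_text) parses the same catalog and collects the three values under file into a dictionary.

```python
from xml.dom import minidom

# Raw string, so the backslashes in the Windows paths are kept as-is.
CATALOG = r"""<root>
  <CatalogName>Main</CatalogName>
  <file>
    <sourcepath>E:\templates\xslt\catalogs\default.xsl</sourcepath>
    <targetpath>E:\catalogs\</targetpath>
    <filename>Catalog.htm</filename>
  </file>
</root>"""

def child_text(parent, tag):
    """Return the text of the first <tag> element found under `parent`."""
    element = parent.getElementsByTagName(tag)[0]
    return "".join(n.data for n in element.childNodes
                   if n.nodeType == n.TEXT_NODE)

doc = minidom.parseString(CATALOG)
file_node = doc.getElementsByTagName("file")[0]
settings = {tag: child_text(file_node, tag)
            for tag in ("sourcepath", "targetpath", "filename")}
catalog_name = child_text(doc.documentElement, "CatalogName")
print(catalog_name, settings)
```

One thing to watch: the line breaks and indentation between elements are kept in the tree as whitespace text nodes, which is why the helper looks elements up by name rather than by position.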
Attribute
Add
Adding an attribute is slightly different. Again using VBA, and as with the sub above, you need some reference to your DOMDocument, the element to attach to, and the name of the attribute and its value. Importantly, in the way DOM handles it, the element is superior to its attributes. The attribute is found at the next level into the hierarchy: for the element IMG, say, the SRC attribute would be found in XPath as IMG/@SRC.
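Public Sub sAddAttr(strAttrName As String, strAttrVal As String, nodT, xmldoc As DOMDocument)
    Dim nodAttr As IXMLDOMAttribute
    ' Create an unattached attribute node and give it a value
    Set nodAttr = xmldoc.createAttribute(strAttrName)
    nodAttr.Text = strAttrVal
    ' Attach the attribute to the element nodT
    nodT.Attributes.setNamedItem nodAttr
End Sub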
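The attribute version can be sketched the same way in Python's xml.dom.minidom. The helper name add_attr and the value "logo.png" are made up; setAttributeNode is used here to attach the attribute, and minidom also offers a one-step setAttribute(name, value) shortcut.

```python
from xml.dom import minidom

def add_attr(doc, element, name, value):
    """Create an attribute node, set its value, attach it to `element`."""
    attr = doc.createAttribute(name)     # unattached attribute node
    attr.value = value
    element.setAttributeNode(attr)       # now it belongs to the element
    return attr

doc = minidom.Document()
img = doc.createElement("IMG")
doc.appendChild(img)
add_attr(doc, img, "SRC", "logo.png")

img_xml = img.toxml()
print(img_xml)
```

The attribute hangs off the element but does not show up among the element's child nodes; it is reached through the element's attribute collection instead.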
More
Obviously, there's a great deal more to the DOM than just how to create and attach elements and attributes. But that's an important part of using the DOM, and may be all you need, at first.
LINKS
| (Robin) Cover Pages | Cover's DOM links |
| Inet.com DOM | Multi-page tutorial intro to DOM |
| Microsoft Help | Microsoft's original compiled help file for msxml 4.0, with DOM reference (click: download de SDK) |
| Microsoft msxml 4 SP2 | Service pack replacement for msxml 4.0 and SP1 |
| TOP XML Microsoft DOM | Online copy of Microsoft XML DOM reference |
| W3 Schools | Multi-page tutorial on DOM |
| W3C DOM | WWW Consortium updated specs and explanation of DOM |
| Apache Xerces | Xerces XML parser for Apache servers |
| TOP XML parsers | Downloads of various Microsoft msxml libraries/parsers, even some older versions no longer available from Microsoft |