XML (Extensible Markup Language) is the latest and greatest development in Web/Internet technology. It is being promoted as the lingua franca of that domain. It starts from a very simple idea, but soon expands into a very broad set of concepts, expectations, and technical resources. Thus, XML should be thought of as a perspective and a range of computing resources that can support several different client/server architectures and, potentially, a host of new applications.
Put simply, XML is a language for creating hierarchically structured tags. These tags normally appear in an XML document. There are currently two basic types of documents: conventional texts, with XML tags interspersed in them, and data objects, such as a record from a database, represented in XML. Both types, however, are referred to as XML documents.
With regard to the first, tags can be embedded within the text of a conventional document, such as an HTML document, to mark key information and thus make it easy to recognize. For example, one could mark the author and title, key concepts as they appear in the text, definitions, etc. Such documents are often intended to be read by a human being.
With regard to the second, there is typically no text outside the context of an XML tag. Tags identify types of information, fields, or other structured information. These documents are usually intended to be read by a program and not a human being.
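The difference between the two types can be sketched in a few lines of Java. This is a minimal illustration, not part of the lessons themselves: it uses the DOM parser that ships with the JDK rather than JDOM, and the class name `DocumentKinds`, the helper `hasMixedContent`, and both sample documents are hypothetical names and data invented for the example. The helper reports whether any element holds both child elements and non-whitespace text, which is the mark of the first type of document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DocumentKinds {

    // Hypothetical sample: a conventional text with tags interspersed in it.
    static final String MARKED_UP_TEXT =
        "<article>An introduction to <concept>XML</concept> by <author>A. Writer</author>.</article>";

    // Hypothetical sample: a data object; all text sits inside a leaf tag.
    static final String DATA_RECORD =
        "<book>\n  <title>Sample Title</title>\n  <author>A. Writer</author>\n</book>";

    // Parses an XML string into a DOM tree using the JDK's built-in parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns true if this node, or any element below it, contains both
    // child elements and non-whitespace text of its own.
    static boolean hasMixedContent(Node node) {
        boolean hasElementChild = false;
        boolean hasText = false;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                if (hasMixedContent(child)) {
                    return true;
                }
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getNodeValue().trim().isEmpty()) {
                hasText = true;
            }
        }
        return hasElementChild && hasText;
    }

    public static void main(String[] args) throws Exception {
        // First type: text and tags are interleaved.
        System.out.println(hasMixedContent(parse(MARKED_UP_TEXT).getDocumentElement()));
        // Second type: only whitespace appears between the tags.
        System.out.println(hasMixedContent(parse(DATA_RECORD).getDocumentElement()));
    }
}
```

Run as written, the first document is reported as mixed content and the second is not, since the only text between the tags of the record is indentation.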
The discussion here will focus on the second type of XML document.
While the potential uses for XML-structured information are unbounded, any given use is likely to begin with one of two basic processes. Either the document is going to be parsed, and the user's program or an application works with the resulting parse tree, or it is going to be transformed into another document using an XSL stylesheet and an XSL Transformation utility. The transformation process is discussed in several subsequent lessons: see Java XSL and Java XML/XSL MVC.
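The two starting points described above can be sketched side by side. This is only an illustration under stated assumptions: it relies on the DOM parser and XSLT processor bundled with the JDK (`javax.xml.parsers` and `javax.xml.transform`) rather than JDOM or any particular XSL utility used in the lessons, and the class name `ParseOrTransform`, the sample record, and the stylesheet are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ParseOrTransform {

    // Hypothetical data-object document used by both processes.
    static final String RECORD =
        "<book><title>Sample Title</title><author>A. Writer</author></book>";

    // Hypothetical stylesheet that turns the record into a line of plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/book\">"
      + "<xsl:value-of select=\"title\"/> by <xsl:value-of select=\"author\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Process 1: parse the document, then work with the resulting parse tree.
    static String titleFromParseTree(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("title").item(0).getTextContent();
    }

    // Process 2: transform the document into another document with a stylesheet.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titleFromParseTree(RECORD));    // Sample Title
        System.out.println(transform(RECORD, STYLESHEET)); // Sample Title by A. Writer
    }
}
```

In the first method the program itself decides what to do with the tree; in the second, the stylesheet carries that logic and the program only supplies the input and collects the output.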
There are numerous on-line articles and tutorials dealing with XML. One of the most visionary discussions is Tim Berners-Lee's Scientific American paper, The Semantic Web, co-authored with James Hendler and Ora Lassila. (Additional information is available on W3C's Semantic Web project at their Web site.) For tutorials, a query for "XML tutorials" is likely to turn up more than you will need. A couple that seem good are W3Schools' and Sun's (the latter discusses XML in the context of Java). There are also numerous books on the subject. Two that I have found most helpful, personally, are the O'Reilly XML in a Nutshell, by Elliotte Rusty Harold and W. Scott Means, and, particularly for XSL, XSLT: Working with XML and HTML by Khun Yee Fung (Addison-Wesley, 2001), but there are many others out there that I haven't looked at (or not closely).
For a list of XML software, a good place to start is the XML page at W3C. More suggestions on technical details are provided below.
XML Document
The reader is referred to one of the tutorials for details of XML syntax. A simple example XML document is provided here.
XML Namespaces
Namespaces provide a mechanism to ensure that tag names are unique. A brief introduction to the concept, with an example, is provided.
XML Parsing
Basic processing of a DOM-like parse tree using JDOM.
XML Traversal
Traversing and processing an existing parse tree.
XML Construction of Parse Tree
Building a parse tree and generating an XML string from it.
XML Schema
Introduction to XML schemas, based on a simple example.