

Web Engineer Bob Betterton
RDB PRIME INC.

Thoughts and Ideas About XML

August 20, 2003

Robert D. Betterton




XML is Not HTML

If you've had some experience writing HTML documents, you should pay close attention to XML's rules for elements. Shortcuts you can get away with in HTML, like forgetting a closing tag, are not allowed in XML. Some important changes you should take note of include:

  • Element names are case-sensitive in XML. HTML allows you to write tags in whatever case you want.
  • In XML, container elements always require both a start and an end tag. In HTML, on the other hand, you can drop the end tag in some cases.
  • Empty XML elements require a slash before the right angle bracket (e.g., <example/>), whereas HTML uses a lone start tag with no final slash.
  • XML elements treat whitespace as part of the content, preserving it unless they are explicitly told not to. But in HTML, most elements throw away extra spaces and line breaks when formatting content in the browser.
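
Three of the rules above can be observed directly with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the element names Note, doc, example, and para are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Case sensitivity: <Note> and <note> are different element names,
# so this start/end pair does not match and the parser rejects it.
try:
    ET.fromstring("<Note>text</note>")
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False

# Empty element: the slash before the right angle bracket makes
# <example/> complete on its own, with no content and no children.
doc = ET.fromstring("<doc><example/></doc>")
empty = doc.find("example")

# Whitespace: the parser hands spaces and line breaks to the
# application unchanged instead of collapsing them.
para = ET.fromstring("<para>  two  spaces\nand a line break  </para>")

print(case_mismatch_accepted)
print(empty.text, len(empty))
print(repr(para.text))
```

Whether the preserved whitespace matters is left to the application reading the document; the parser itself does not discard it.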

Unlike many HTML elements, XML elements are based strictly on function, and not on format. You should not assume any kind of formatting or presentational style based on markup alone. Instead, XML leaves presentation for stylesheets, which are separate documents that map the elements to styles.

Unlearning Bad Habits

Whereas HTML browsers often ignore simple errors in documents, XML applications are not nearly as forgiving. For the HTML reader, there are a few bad habits from which we should first dissuade you:

Attribute values must be in quotation marks -- You can't specify an attribute value such as <picture src=/images/blueball.gif>, an error that HTML browsers often overlook. An attribute value must be inside single or double quotation marks, or the XML parser will flag it as an error. Here is the correct way to specify such a tag:

<picture src="/images/blueball.gif"/>
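
To see the quoting rule enforced, here is a small sketch, assuming Python's standard xml.etree.ElementTree module. The helper name parses is hypothetical, and the picture element is written as an empty element so that each string is a complete document:

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The same element with an unquoted, double-quoted, and single-quoted value.
unquoted = "<picture src=/images/blueball.gif/>"
double_quoted = '<picture src="/images/blueball.gif"/>'
single_quoted = "<picture src='/images/blueball.gif'/>"

for text in (unquoted, double_quoted, single_quoted):
    print(text, "->", "accepted" if parses(text) else "rejected")
```

Only the unquoted form should be rejected; single and double quotation marks are interchangeable as long as the opening and closing marks match.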

A non-empty element must have an opening and closing tag -- Each element that specifies an opening tag must have a closing tag that matches it. If it does not, and it is not an empty element, the XML parser generates an error. In other words, you cannot do the following:

<Paragraph>
This is a paragraph
<Paragraph>
This is another paragraph.

Instead, you must have an opening and closing tag for each paragraph element:

<Paragraph>This is a paragraph.</Paragraph>
<Paragraph>This is another paragraph.</Paragraph>

Tags must be nested correctly -- It is illegal to do the following:

<Italic><Bold>This is incorrect</Italic></Bold>

The closing tag for the Bold element should be inside the closing tag for the Italic element, to match the nearest opening tag and preserve the correct element nesting. It is essential for the application parsing your XML to process the hierarchy of the elements:

<Italic><Bold>This is correct</Bold></Italic>
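
The reason nesting matters is that a parser turns the tags into a tree. A rough sketch of that idea, assuming Python's standard xml.etree.ElementTree module (the outline function is a hypothetical helper, not part of any library):

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return the element hierarchy as a list of indented tag names."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

# Correct nesting yields a tree: Bold is a child of Italic.
correct = ET.fromstring("<Italic><Bold>This is correct</Bold></Italic>")
print("\n".join(outline(correct)))

# Overlapping tags cannot be turned into a tree, so parsing fails.
try:
    ET.fromstring("<Italic><Bold>This is incorrect</Italic></Bold>")
    overlap_error = None
except ET.ParseError as err:
    overlap_error = str(err)
print(overlap_error)
```

With overlapping tags there is no single parent for the text, which is why the parser stops instead of guessing.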

These syntactic rules are the source of many common errors in XML, especially given that some of this behavior can be ignored by HTML browser parsers. An XML document that adheres to these rules (and a few others) is said to be well-formed.
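
A quick way to find out whether a document is well-formed is to hand it to a parser and see whether it complains. The sketch below assumes Python's standard xml.etree.ElementTree module; check_well_formed is a hypothetical helper, and the Doc root element is added only so that each sample is a single complete document:

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, (line, column))."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple locating the problem.
        return False, err.position
    return True, None

# Paragraphs without closing tags, wrapped in an assumed <Doc> root.
unclosed = (
    "<Doc>\n"
    "<Paragraph>\nThis is a paragraph\n"
    "<Paragraph>\nThis is another paragraph.\n"
    "</Doc>"
)

# The same content with every element closed.
closed = (
    "<Doc>\n"
    "<Paragraph>This is a paragraph.</Paragraph>\n"
    "<Paragraph>This is another paragraph.</Paragraph>\n"
    "</Doc>"
)

print(check_well_formed(unclosed))
print(check_well_formed(closed))
```

A check like this covers well-formedness only. It says nothing about whether the element names or their arrangement make sense for a particular application.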



This site is brought to you by
Bob Betterton; 2001 - 2011.

This page was last updated on 08/27/2003
Copyright, RDB Prime Engineering


