
UNIVERSITY OF MARYLAND
UNIVERSITY COLLEGE



Professor Jon W. Mckeeby
Internet Technologies
Course MSIT660 Section 9041

XML, the business-to-business internet application builder. A way of creating web application gateways.

November 07, 2001

Robert D. Betterton








* ABSTRACT

The technology of XML™ (Extensible Markup Language), when used to design internet applications, can accommodate and aid the communication of many heterogeneous systems. This can be done at low cost and with a reduced learning curve. Compared with present-day electronic data interchange (EDI) systems such as ANSI X12, ISO 9735, and UN/EDIFACT, the difference can be substantial. XML will lead the way in helping many small businesses and large corporations with their "data interchange" over internet and extranet connections. XML technology will reduce the cost of bringing business customer and partner databases together. XML allows improved business-to-business inter-application communication. The XML metastructure will allow the data structures within databases to communicate with each other without worrying about the underlying database structures themselves. It also provides a bridge between structured and unstructured data. This paper focuses on XML and its capabilities for making this data interchange possible over the internet. What does it take to make this data interchange with XML possible? Why all the fuss and bother, and why should we care as IT professionals?

XML is undergoing explosive growth as the industry standard for data exchange. It's the cornerstone of many internet technologies, including Microsoft's .NET initiative. XML has broad industry support with tools like editors, browsers, and parsers. XML is one of the most important developments in document syntax in the history of computing. In the last few years it has been adopted in fields as diverse as law, aeronautics, finance, insurance, robotics, multimedia, hospitality, travel, art, construction, telecommunications, software design, agriculture, physics, journalism, theology, retail, and medieval literature.

XML is a way to build Corporate Portals, also called Enterprise Portals (EPs) or Enterprise Information Portals (EIPs). These portals are based on Data Warehousing technologies, using Metadata and XML to integrate both structured and unstructured data throughout the corporation. To facilitate this concept of business-to-business, business-to-customer, customer-to-customer, and application-to-application inter-application communication, XML and metadata structures are used to build a communication structure or gateway between these dissimilar systems, databases, and applications. The Extensible Markup Language (XML) is a new and exciting internet technology that has been developed to address these problems as will be demonstrated in this white paper.

Thus, XML will become the de facto method for inter-application and system-to-system communication over Internet, intranet, and extranet systems. XML will become the backbone for e-commerce and B2B enterprise solutions.




* INTRODUCTION

XML has created a quiet revolution on the Internet. It has everyone's attention. It is the first truly portable data format that was designed for Internet and multi-language support. It's the best technology for solving the existing application integration problems many believe are inhibiting e-commerce. The number of applications for XML is limitless. Here are just a few of the areas where XML has gained momentum.

  • Business-to-business E-Commerce
  • XML vs. EDI
  • Catalogs and Multi-vendor catalogs
  • Data Warehousing and Archiving
  • Data Migration
  • Content Management
  • Corporate Portals
  • Application Integration
  • Meta Searching

As stated by the W3C [World Wide Web Consortium], XML is shorthand for Extensible Markup Language:

The Extensible Markup Language (XML) is the universal format for structured documents and data on the Web.[1]

XML was conceived as a means of regaining the power and flexibility of SGML without most of its complexity. Although a restricted form of SGML [Standard Generalized Markup Language], XML nonetheless preserves most of SGML's power and richness and retains all of SGML's commonly used features.

While retaining these beneficial features, XML removes many of the more complex features of SGML that make the authoring and design of suitable software both difficult and costly. XML is also a growing standard with many technologies and initiatives; see Table One: XML Technologies and Initiatives.

* A LITTLE HISTORY:

All this talk about XML begs the question: What is it anyway? And what about the jumble of abbreviations that cloaks XML from the curious eyes of an HTML coder or content author? How do XSLT, DTD, XLink, XPointer, XML Schema, XML namespace, XSL, and all the rest of the XML alphabet soup fit into the XML equation? It's good to wonder whether XML is really what folks say it is, or just another "Excellent Marketing Language." Why does every company in the Fortune 1000 seem to be catching the XML fire, and why are XML conferences selling out? [2]

As stated above, XML is a subset of the Standard Generalized Markup Language (SGML), which grew out of GML, begun in 1969 when IBM was putting together a system for publishing legal information. SGML is the international standard metalanguage for markup. SGML was standardized by an ANSI group in draft form in 1980. By 1986, SGML had grown into an international standard (ISO 8879:1986). [2]

XML uses tags just like HTML. XML, however, is also extensible: it doesn't have a fixed set of tags as HTML does, which makes HTML inflexible. Examples of HTML's fixed tags are <HR> for a horizontal rule, <BR> for a line break, and <TABLE> for the start of a table structure. With XML you can design your own markup tags. XML is more flexible than a fixed-format markup language such as HTML, and its big win is that it adds context and gives meaning to data. However, XML uses tags only to delimit pieces of data; the custom tags help represent data logically. The interpretation of the data is left to the application that reads it, guided by the DTD or XML Schema. [3]

Figure one below shows a sample XML document: shipping address information for a customer named John Smith, for order number A9999.

[Figure one: myCustomer_XML]


Using such a document, we can identify key relationships about different data items with respect to the entire "customer" entity. This document is self-describing, because tags describe the information it contains.
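To make the idea of a self-describing document concrete, here is a minimal Python sketch that parses a hypothetical version of the customer record. The tag names, attribute name, and address values are assumptions chosen for illustration and are not taken from figure one; only the customer name and order number come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer document; tag names and address values are assumed.
CUSTOMER_XML = """\
<customer>
  <order number="A9999"/>
  <name>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
  </name>
  <shippingAddress>
    <street>123 Main Street</street>
    <city>Anytown</city>
    <state>MD</state>
    <zip>00000</zip>
  </shippingAddress>
</customer>
"""


def summarize(xml_text):
    """Pull a few labeled values out of the customer document by tag name."""
    root = ET.fromstring(xml_text)
    return {
        "order": root.find("order").get("number"),
        "first": root.findtext("name/firstName"),
        "last": root.findtext("name/lastName"),
        "city": root.findtext("shippingAddress/city"),
    }


if __name__ == "__main__":
    print(summarize(CUSTOMER_XML))
```

Because the tags name the data they enclose, the reading program asks for values by name rather than by position in the file.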

XML is simple because its rules for creating a markup language to encapsulate data are straightforward. For example, XML documents contain tags, with data stored between them as plain text. The tags usually come in pairs and can be nested to multiple levels. Similarly, because XML data is stored as ordinary text, you can use a standard text editor to create and edit XML documents. Another plus is XML's support for the Unicode standard, a character-encoding system that supports all major languages. Unicode support lets XML accept virtually all the characters used in the world, which is a huge benefit in developing applications that span national and cultural boundaries.
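The Unicode point can be illustrated with a short Python sketch. It writes names in several scripts into an XML document, serializes it as UTF-8 bytes, and reads it back. The element names `customers` and `name` and the sample names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def round_trip(names):
    """Serialize names to UTF-8 encoded XML, then parse them back."""
    root = ET.Element("customers")  # hypothetical element names
    for value in names:
        ET.SubElement(root, "name").text = value
    data = ET.tostring(root, encoding="utf-8")  # bytes, with an XML declaration
    parsed = ET.fromstring(data)
    return [element.text for element in parsed.findall("name")]


SAMPLE_NAMES = ["Smith", "Müller", "Σωκράτης", "田中"]

if __name__ == "__main__":
    print(round_trip(SAMPLE_NAMES) == SAMPLE_NAMES)
```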

XML documents essentially have a rooted tree/logical structure as shown below, in figure two.

[Figure two: myCustomer_Logical_Root]


For many applications, this structure is powerful enough to represent complex data, and writing software programs that manipulate tree structure data is not difficult.
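As a rough illustration of how little code a tree walk needs, the following Python sketch prints the element hierarchy of a small document by recursion. The sample document and its tag names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only to show the tree shape.
SAMPLE = (
    "<customer>"
    "<name><firstName>John</firstName><lastName>Smith</lastName></name>"
    "<address><city>Anytown</city></address>"
    "</customer>"
)


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(SAMPLE))))
```

The recursion mirrors the document: each element is a node, and its nested elements are its children.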

A Document Type Definition [DTD] defines an XML document's legal structure, or grammar. It specifies what markup tags are available, where they may occur, and how they all fit together. Figure three shows a sample DTD for the customer XML document in figure one.

[Figure three: myCustomer_DTD]


By definition, all XML documents must be well formed, meaning they must obey XML syntax. All elements must match: every element that can contain data must have both a start tag and an end tag, and elements must be properly nested.
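A well-formedness check can be sketched in a few lines of Python. The standard-library parser rejects any input that breaks XML syntax, so a parse error is treated here as "not well formed". Note that this checks syntax only; it does not validate a document against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed("<a><b>x</b></a>"))  # matched and nested tags
    print(is_well_formed("<a><b>x</a>"))      # <b> is never closed
```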

On one level, XML is a protocol for containing and managing information. On another level, it is a family of technologies, [see section two - XML Technologies and Initiatives] that can do everything from formatting documents to filtering data. And on the highest level, it is a philosophy for information handling that seeks maximum usefulness and flexibility for data by refining it to its purest and most structured form. [4]


* THE PROBLEM - PRESENT INTERNET/EDI SITUATION

Many Internet information systems are not able to communicate with each other at reasonable cost. Cost here means dollars, time, systems integration, and learning curves, i.e., training. EDI is perceived by many to have a high barrier to entry, not unlike SGML. It has been said that XML is to SGML what ebXML will be to EDI: a lowering or removing of barriers so that these standards [XML and its initiatives] are available to just about everyone. [2]

Present EDI (Electronic Data Interchange) systems are expensive, are built around legacy systems architecture, have long learning curves, and do not easily interface between heterogeneous systems. A lot of expertise is needed to make these systems a reality, and many small to mid-sized companies cannot afford this type of data interchange.

Customers who come to many Web sites, for example, are demanding the ability to search catalogs based on product attributes such as the size of a hard drive in a given PC. Of course, the manufacturers have all that information and are perfectly capable of supplying it electronically, but details tend to get filtered out of the EDI versions of product data sheets that trickle down through the distributors. One reason for this is that EDI traffic traditionally flowed over proprietary networks that charged users based on the amount of data transmitted. EDI users learned all sorts of tricks to strip out as much information as possible in order to reduce the charges.[5] The downside is that the information becomes skewed or incomplete, leading to uninformed decision making.

XML-formatted data feeds tend to provide much richer content, both because of the extensibility of the data structure and because transmissions aren't as heavily tariffed. XML can also address business processes that EDI doesn't address at all, such as returns and warranty claims.

* THE SOLUTION - XML

How is XML the solution -- what does it have over legacy EDI systems, or any other approach? Because XML promises to improve the way companies exchange and present information over the Internet, it is becoming popular with developers of next-generation business-to-business e-commerce applications. XML can benefit e-commerce by enabling back-end systems (databases) to communicate business transaction information. For example, business partners can standardize on a specific XML syntax that describes a purchase order, or a parts ordering system supplied by a web database application, and automate the information's transfer across the Internet. XML is ideal for building these systems because it allows formatting data for easy-to-process, platform-neutral exchange between business partners.
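As a minimal sketch of two partners standardizing on a purchase-order syntax, the Python code below builds an order on the sending side and reads it back on the receiving side. The element and attribute names (`purchaseOrder`, `buyer`, `item`, `partNumber`, `quantity`) are hypothetical and do not represent any published industry vocabulary.

```python
import xml.etree.ElementTree as ET


def build_purchase_order(order_number, buyer, items):
    """Sender side: turn order data into an XML string."""
    po = ET.Element("purchaseOrder", {"number": order_number})
    ET.SubElement(po, "buyer").text = buyer
    lines = ET.SubElement(po, "items")
    for part_number, quantity in items:
        item = ET.SubElement(lines, "item", {"partNumber": part_number})
        ET.SubElement(item, "quantity").text = str(quantity)
    return ET.tostring(po, encoding="unicode")


def read_purchase_order(xml_text):
    """Receiver side: recover the order number, buyer, and line items."""
    po = ET.fromstring(xml_text)
    items = [
        (item.get("partNumber"), int(item.findtext("quantity")))
        for item in po.iter("item")
    ]
    return po.get("number"), po.findtext("buyer"), items


if __name__ == "__main__":
    message = build_purchase_order("A9999", "John Smith", [("HD-40GB", 2)])
    print(message)
    print(read_purchase_order(message))
```

Neither side needs to know how the other stores its data; both only need to agree on the tags.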

* SUPPORTING FEATURES AND TECHNOLOGIES [6]

Designers and developers are continually adding supporting features and technologies to XML. Some have stated that this has added confusion to the XML debate, but if one studies the supporting technologies, one sees the need for them. This is what XML is all about: an extensible markup language, one that allows you to expand where needed and where it makes sense.

Table One: XML Technologies and Initiatives [2][6]

Technology/Initiative: Description

XHTML: Extensible Hypertext Markup Language is the result of rewriting HTML (version 4.0) as an XML application. XHTML creates a middle ground between HTML and XML. It will open up web access to more devices and increase the capabilities of devices that already offer such access, such as cell phones, personal digital assistants, pocket PCs, and other miniature devices; see http://www.w3.org/TR/xhtml1

XSL: Extensible Stylesheet Language lets you apply rules for formatting, including presentation format (for example, font size), to XML documents. XSL can transform XML documents into different formats such as HTML, PDF, or even audio. Once XSL converts an XML document into HTML, you can view that document using any browser. XSL can also transform one XML document into another; see http://www.w3.org/Style/XSL

XML schema: An XML schema defines the elements that can appear in an XML document along with attributes and their default values, if any. It also defines the document's structure: the parent and child elements, the number of child elements, the sequence in which the elements can appear, and whether an element can be empty or include text. In addition, it can enforce data typing. An XML schema provides a more powerful mechanism than DTDs for describing an XML document's structure; see http://www.oasis-open.org/cover/schemas.html

XML namespace: An XML namespace is a collection of element types and attribute names identified by a uniform resource identifier (URI). Any element type or attribute name in an XML namespace can be uniquely identified by a two-part name: the URI of its XML namespace and its local name. An XML namespace distinguishes between duplicate element types and attributes so that you can mix two or more XML languages in one document without any conflict or ambiguity; see http://www.w3.org/TR/REC-xml-names

XPointer: The XML Pointer language supports making specific references to an XML document's internal structure. XPointer provides a mechanism to refer to elements, character strings, selections, and other parts of the document; see http://www.w3.org/TR/WD-xptr

XLink: XML Linking Language specifies constructs that you can insert into XML documents to describe links between objects. XLink describes the simple unidirectional hyperlinks of today's HTML, as well as more sophisticated bi-directional, multidirectional, and typed links; see http://www.w3.org/TR/xlink

RDF: Resource Description Framework (RDF) provides a uniform way to add metadata to an XML document and is an aid to organizing, categorizing, and cataloging online information. It is an approved recommendation of the W3C, and it is beginning to catch on. It will likely be an important building block for the Semantic Web, which is only in the early stages of development; see http://www.w3.org/RDF/

XML Query: XML Query will deliver a means to query and extract reports from documents. The mission of the XML Query working group is to provide flexible query facilities to extract data from real and virtual documents on the Web, finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases; see http://www.w3.org/XML/Query

XForms: HTML forms, while essential to e-commerce, have become outdated. The XForms initiative will update Web form technology so that, among other improvements, data from forms will be delivered as XML. The current design of Web forms doesn't separate the purpose from the presentation of a form. XForms, in contrast, are composed of separate sections that describe what the form does and how the form looks. This allows flexible presentation options, including classic XHTML forms, to be attached to an XML form definition; see http://www.w3.org/MarkUp/Forms

XML Base: In HTML, the base element lets you specify the document's base URI so that it can automatically resolve relative URIs. In XML, this will be done through the xml:base attribute. The value of xml:base must be a URI. This used to be identified as XBase; see http://www.w3.org/TR/xmlbase

XInclude: XML Inclusions, or XInclude, is an inclusion mechanism for merging XML documents. For example, you could include a separate XML document in another XML document. Many programming languages provide an inclusion mechanism to facilitate modularity, and markup languages often need such a mechanism too. The XInclude proposal introduces a generic mechanism for merging XML documents (as represented by their information sets) for use by applications that need such a facility. The syntax leverages existing XML constructs: elements, attributes, and URI references; see http://www.w3.org/TR/xinclude

XML Signature: This initiative will permit the creation of digital signatures, a method for secure online identification, using XML. It is not yet fully approved, but it soon will be. This will go a long way toward addressing the limitations and drawbacks mentioned about XML security. XML Signature is an XML-compliant syntax used for representing the signature of Web resources and portions of protocol messages (anything referenceable by a URI), together with procedures for computing and verifying such signatures; see http://www.w3.org/Signature

XML Infoset: The XML Information Set (Infoset) provides a level of abstraction for XML documents, using a set of information items that make it easier for XML-related standards and applications to interoperate. The specification defines an abstract data set called the XML Information Set (Infoset). Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document; see http://www.w3.org/TR/xml-infoset

Canonical XML: Canonical XML provides a strict, streamlined way to represent XML documents. XML documents, while physically different, may be logically identical as long as they conform to the Canonical XML specification. The XML 1.0 Recommendation [XML] specifies the syntax of a class of resources called XML documents. The Namespaces in XML Recommendation [Names] specifies additional syntax and semantics for XML documents. It is possible for XML documents which are equivalent for the purposes of many applications to differ in physical representation. For example, they may differ in their entity structure, attribute ordering, and character encoding. It is the goal of this specification to establish a method for determining whether two documents are identical, or whether an application has not changed a document, except for transformations permitted by XML 1.0 and Namespaces in XML; see http://www.w3.org/TR/xml-c14n
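Two rows of the table lend themselves to a short Python sketch: namespaces and Canonical XML. The first function shows how a parser reports the two-part name (namespace URI plus local name) so that two different `address` elements do not collide. The second uses the standard library's `canonicalize` function (available in Python 3.8 and later) to compare two documents that differ only in physical representation. The namespace URIs and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: two vocabularies that both define "address".
ORDER_XML = """\
<order xmlns:ship="http://example.com/shipping"
       xmlns:bill="http://example.com/billing">
  <ship:address>123 Main Street</ship:address>
  <bill:address>PO Box 1</bill:address>
</order>
"""


def addresses(xml_text):
    """Map each child's two-part name, written {URI}local, to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def logically_equal(first_xml, second_xml):
    """Compare two documents after converting both to canonical form."""
    return ET.canonicalize(first_xml) == ET.canonicalize(second_xml)


if __name__ == "__main__":
    for name, text in addresses(ORDER_XML).items():
        print(name, "->", text)
    # Same content, different attribute order, quoting, and empty-tag style.
    print(logically_equal('<item id="1" qty="2"/>', "<item qty='2' id='1'></item>"))
```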


* WHAT MAKES XML THE SOLUTION?

What will it take to make XML the solution for web inter-application communication that will lead companies to want to do business with each other via B2B, C2B, B2C, and A2A using XML?

It's been stated that the real pay-off is the ability to exchange a greater variety of information that is made possible by the extensibility of XML (supplemented at times with file attachments), and the ability to add real-time collaboration to the mix. Finding an interchange format that can be used for transfer of data between databases of different vendors and different operating systems was always difficult. That interchange is one of the major applications of XML.

Thus XML is very important in two classes of applications. Most applications of XML will fall into one of these categories: documents, or data exchange and database connectivity.

Thus, companies are increasingly turning to XML to address the issue of a common data exchange and/or document standard. XML has been instrumental in jump-starting a number of e-commerce and Web-based applications. XML lets companies exchange data over existing Internet connections, can be learned quickly (as described above), and is set up for easy display and programming. Because XML is self-describing (the tags describe the data enclosed within them), it lets companies enhance the efficiency of exchanged data in any format they want to use. Users can even apply a style sheet to an XML file and view it precisely as they wish. The beauty of this concept is that you never need to change the actual XML data when you want to create output for different devices. You only need to use different pieces of software that know how to provide the output needed for a particular output format or piece of hardware, for example XSL style sheets. See the diagram below:[7]

[Diagram: myXML_For_Diff_Outputs]


Because XML provides flexible document-definition and processing capabilities, it lets you reformat data for multiple devices and platforms. Because XML separates display instructions from content definition, Web designers can alter their Web site's look and feel by using Extensible Stylesheet Language [XSL] documents to apply different style sheets to the same XML document.

This then allows you to use the same content for devices such as personal digital assistants (PDAs) and wireless devices that do not use HTML for display processing. See figure four below.


[Figure four: myXML_Formatting_Using_XSL]
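The separation of content from presentation can be sketched in Python. This is not XSL: the Python standard library has no XSLT processor, so two plain functions stand in for two style sheets here. The point is only that one unchanged XML document feeds both an HTML rendering and a plain-text rendering, such as a small device might use. The document and its tag names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical customer record; the same data feeds both renderers.
CUSTOMER_XML = (
    "<customer>"
    "<firstName>John</firstName>"
    "<lastName>Smith</lastName>"
    "<city>Anytown</city>"
    "</customer>"
)


def to_html(root):
    """Stand-in for a browser style sheet: render the fields as an HTML table."""
    rows = "".join(
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text or '')}</td></tr>"
        for child in root
    )
    return f"<table>{rows}</table>"


def to_plain_text(root):
    """Stand-in for a small-device style sheet: one 'label: value' line per field."""
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)


if __name__ == "__main__":
    document = ET.fromstring(CUSTOMER_XML)
    print(to_html(document))
    print(to_plain_text(document))
```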


XML allows online information search and retrieval to be fast and efficient. This is because XML documents store meta-information, i.e., information about information. If we look at each tag, the meaning of each data item is apparent. For example, in the XML document above, the customer's first name, last name, and address are very apparent. Search engines can use this meta-information to efficiently search and retrieve documents. For example, we could process search queries such as [find all documents where customer's last name is Smith]. This is a decided advantage over HTML.
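The query [find all documents where customer's last name is Smith] can be sketched in Python as follows. The document collection, file names, and tag names are hypothetical; a real search engine would index the documents rather than parse each one per query.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of customer documents, keyed by file name.
DOCUMENTS = {
    "order-A9999.xml": "<customer><firstName>John</firstName><lastName>Smith</lastName></customer>",
    "order-B0001.xml": "<customer><firstName>Jane</firstName><lastName>Jones</lastName></customer>",
    "order-C0002.xml": "<customer><firstName>Ann</firstName><lastName>Smith</lastName></customer>",
}


def find_by_last_name(documents, last_name):
    """Return the names of documents whose <lastName> equals last_name."""
    matches = []
    for file_name, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        if root.findtext("lastName") == last_name:
            matches.append(file_name)
    return sorted(matches)


if __name__ == "__main__":
    print(find_by_last_name(DOCUMENTS, "Smith"))
```

The search looks only inside the `lastName` element, so a customer whose first name or street happens to be "Smith" would not match, which is the advantage over searching untagged HTML text.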

One of the hottest application areas for XML is messaging, i.e., the seamless and efficient transfer of data between applications: inter-application communication between, say, two heterogeneous databases. Because XML is text based, all platforms can easily understand it. Thus XML is a perfect medium for exchanging information between organizations across dissimilar platforms.
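A minimal sketch of this kind of exchange, assuming two in-memory SQLite databases with deliberately different table layouts, might look like the Python code below. The table names, column names, and XML tag names are all hypothetical. The source database exports its rows as XML text, and the target database imports that text into its own structure without ever seeing the source schema.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_customers(conn):
    """Source side: serialize customer rows as an XML message."""
    root = ET.Element("customers")
    query = "SELECT first_name, last_name FROM customer ORDER BY last_name"
    for first, last in conn.execute(query):
        customer = ET.SubElement(root, "customer")
        ET.SubElement(customer, "firstName").text = first
        ET.SubElement(customer, "lastName").text = last
    return ET.tostring(root, encoding="unicode")


def import_customers(conn, xml_text):
    """Target side: load the message into a differently shaped table."""
    for customer in ET.fromstring(xml_text).iter("customer"):
        full_name = f"{customer.findtext('firstName')} {customer.findtext('lastName')}"
        conn.execute("INSERT INTO contacts (full_name) VALUES (?)", (full_name,))
    conn.commit()


def demo():
    """Move two rows from one database layout to another through XML."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customer (first_name TEXT, last_name TEXT)")
    source.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("John", "Smith"), ("Jane", "Jones")],
    )
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE contacts (full_name TEXT)")
    import_customers(target, export_customers(source))
    rows = target.execute("SELECT full_name FROM contacts ORDER BY rowid")
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(demo())
```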

* HOWEVER, XML DOES HAVE LIMITATIONS AND DRAWBACKS [maybe]

Despite the hype surrounding XML, it isn't a one-stop solution for all application development issues. It is a poor choice for building internal stand-alone systems. This is true if there is no collaboration with other businesses when there should have been. However, this really depends on the business rules and business requirements for any application. An internal system may be needed that does not collaborate with the rest of the business network. XML also falls short when security and efficient low-level communication are critical.

On the security side of things, the XML technologies and initiatives address the problem of security with the advent of XML Signature and DSML [Directory Services Markup Language] which is an XML vocabulary for defining, reading, and writing LDAP content. DSML takes advantage of XML while also leaning on the strengths of LDAP, in the areas of security and scalability.[2]

It has been stated that XML is limited in terms of the data types it supports: it is a text-based format and does not have facilities for directly supporting binary data or other complex data types. However, the counter to this statement is the use of XML Schema, one of the XML Technologies and Initiatives listed above, in place of the DTD, since XML Schema supports data types and complex data. XML Schema datatypes include int, float, date, boolean, and uriReference (named anyURI in the final W3C Recommendation).
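To show what data typing adds over plain text, here is a small Python sketch that converts element text into typed values. It is not schema validation: the mapping from element names to types is written by hand here as an assumption, whereas a real XML Schema would declare it and a validating parser would enforce it. The element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Converters for a few of the datatypes named above.
CONVERTERS = {
    "int": int,
    "float": float,
    "date": date.fromisoformat,
    "boolean": lambda text: text in ("true", "1"),
}

# Hypothetical element-to-type mapping that a schema would normally declare.
FIELD_TYPES = {
    "quantity": "int",
    "unitPrice": "float",
    "shipDate": "date",
    "giftWrap": "boolean",
}

ITEM_XML = (
    "<item>"
    "<quantity>3</quantity>"
    "<unitPrice>19.5</unitPrice>"
    "<shipDate>2001-11-07</shipDate>"
    "<giftWrap>true</giftWrap>"
    "</item>"
)


def typed_values(xml_text):
    """Convert each child element's text using its declared type."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: CONVERTERS[FIELD_TYPES[child.tag]](child.text)
        for child in root
    }


if __name__ == "__main__":
    print(typed_values(ITEM_XML))
```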

Another possible counter to XML's limitations and drawbacks is the use of Electronic Business XML (ebXML), or the use of XML Schema instead of DTDs. Open interoperability and dynamic computing are key mechanisms in solving business problems, as described in SUPPORTING FEATURES AND TECHNOLOGIES above.

The ebXML architecture defines the Business Process Specification Schema, or BP Schema. This XML Schema supports the specification of business documents and transactions and the required choreography of these transactions that comprise a complete business collaboration.[8] ebXML enables enterprises of any size, in any location, to meet and conduct business through the exchange of XML-based messages.

The ebXML architecture defines a Collaboration Protocol Profile [CPP]. A CPP is an XML document that allows a party to express both the business collaborations in which the parties can engage and the quality and levels of service they can support in delivering their service. A CPP therefore defines the comprehensive set of capabilities of a single business context. A formal CPP description can be published and then universally understood by interested parties.[8] There is a lot more to ebXML, but it is beyond the scope of this paper. At the time of this writing, ebXML is not yet a well-established standard, nothing is chiseled in marble, and things tend to change quickly.

* SUMMARY

XML is the way forward from present EDI systems, and XML will facilitate the Web's B2B, C2B, B2C, and A2A information exchange revolution. XML is a vehicle that will help people communicate smarter. After all, communication is at the core of business, and of humanity, for that matter. If business is largely made of communication, then B2B, likewise, is all about communication, often vital communication. Over time, XML will become part of the Web's standard infrastructure.

ebXML will not toss EDI aside, but it will build on its foundation. A concern for the ebXML group is to not leave behind those who have considerable investments in EDI. But the promotion of ebXML for future system integration and collaboration is very important. ebXML is considered the enabler of a global electronic market.

What will really pay off is the ability to exchange a greater variety of information made possible by the extensibility of XML (supplemented at times with file attachments), and the ability to add real-time collaboration to the mix.[5] With B2B e-commerce expected to reach $1.3 trillion by 2003 and $7.3 trillion by 2004, XML promises to be a key enabling technology.

References

[1] The World Wide Web Consortium (W3C) (2001/10/31). W3C.

[2] Fitzgerald M. (2001). Building B2B Applications with XML (pp. 3-15, p. 164). New York, NY: Wiley Computer Publishing.

[3] Ippolite C. (Sept, 2001). FileMaker Pro Advisor; FileMaker 5.5 on the Web (pp. 20-29). Solana Beach, CA: Advisor Media, Inc.

[4] Ray T.E. (2001). Learning XML (p. 2, pp. 143-189). Sebastopol, CA: O'Reilly & Associates, Inc.

[5] Carr D.F. (July, 2001). Forging 21st-century Value Chains (pp. 26-32). Internet World: The Voice of E-Business and Technology; Danvers, MA: Penton Technology Media.

[6] Jaideep R., Anupama R. (May/June, 2000). XML: Data's Universal Language (pp. 32-36). IT Professional; Technology Solutions for the Enterprise; Piscataway, NJ: IEEE Computer Society.

[7] Kohl K. (Nov, 2000). e-Business Advisor; Making Sense of the XML Standards (pp. 22-27). Solana Beach, CA: Advisor Media, Inc.

[8] Russell D. (June, 2001). XML: Data's Universal Language (pp. 18-20). Web Services Journal; Electronic Business XML (ebXML): Making Web Services Work for Business; Montvale, NJ: Sys-Con Media.



This site is brought to you by
Bob Betterton; 2001 - 2011.

This page was last updated on 08/30/2003
Copyright, RDB Prime Engineering


