What Is XML, the Extensible Markup Language?

XML - Extensible Markup Language

Information in readable form

The exchange of information, including electronic exchange, requires that sender and recipient agree on how the information is to be interpreted. Otherwise the recipient will not understand the information sent, and it will produce either no effect or not the desired one.

In the past, technical limitations (storage space, transmission capacity) often led to exchange formats that were difficult to understand because they relied on numerous abbreviations and codes. With increasing networking within and between companies (supported by concepts such as SOA) and the availability of broadband technology, the need grew for simpler description languages for exchanging data and information.

XML was developed as a viable solution and standardized internationally. The abbreviation XML stands for "Extensible Markup Language". Like HTML, XML is based on SGML concepts.

The basic idea of XML is to structure information and to separate content clearly from design and display instructions. Thanks to this structure and the "tags" (structural elements) used, XML documents are easy for humans to read.

Separation of content and design

In XML, users can define structural elements themselves and thereby create their own special markup "dialects". XML will not replace HTML for a long time to come, but rather supplement it.

A main problem with HTML as a page description language is that an HTML document can carry information about its content only in META tags or comments. There are no HTML tags (commands) that classify the content of a document or of parts of it.

This makes searching for information with search engines very difficult, as an example illustrates. A search for a painting by Vincent van Gogh that shows a vase of sunflowers proceeds as follows:

  • Start the search engine (example: http://www.google.ch)
  • Enter the search term: Vincent van Gogh + vase + sunflower + oil painting

The search engine returns around 42,000 results (December 2011) from brochures, online shops, newsgroups, etc. The AND operator (+) ensures that only results containing all keywords are displayed.
In HTML the search engine can only look for keywords; with XML, page content can be structured and qualified:

<?xml version="1.0" ?>
<AUCTIONBLOCK>
  <ITEM>
    <TITLE>Sunflowers in a Vase</TITLE>
    <ARTIST>Vincent van Gogh</ARTIST>
    <DIMENSIONS>20x30 cm</DIMENSIONS>
    <MATERIALS>Oil on canvas</MATERIALS>
    <YEAR>1888</YEAR>
    <DESCRIPTION>Still life</DESCRIPTION>
    <PREVIEW-SMALL src="12sonnenblumen_klein.jpg" width="300" height="194" alt="Sunflowers in a Vase"/>

    <BIDS>
      <BID>
        <PRICE>6000</PRICE>
        <TIME>3:02:22 PM</TIME>
        <BIDDER>Chris</BIDDER>
        <TIMESTAMP>1307</TIMESTAMP>
      </BID>

      <BID>
        <PRICE>5700</PRICE>
        <TIME>2:58:42 PM</TIME>
        <BIDDER>John</BIDDER>
        <TIMESTAMP>1315</TIMESTAMP>
      </BID>
    </BIDS>

    <TIMESTAMP>1315</TIMESTAMP>
  </ITEM>
</AUCTIONBLOCK>
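
The difference between keyword search and field-based search can be sketched in Python with the standard library's ElementTree module. This is only an illustration: the catalogue is abbreviated, the second ITEM is a hypothetical entry added to show a false hit, and the function names `keyword_hits` and `field_hits` are assumptions rather than part of any search engine.

```python
import xml.etree.ElementTree as ET

# Abbreviated catalogue; the second ITEM is hypothetical and exists
# only to show how a keyword search produces a false hit.
CATALOG_XML = """<AUCTIONBLOCK>
  <ITEM>
    <TITLE>Sunflowers in a Vase</TITLE>
    <ARTIST>Vincent van Gogh</ARTIST>
    <YEAR>1888</YEAR>
  </ITEM>
  <ITEM>
    <TITLE>Poster: vase design after Vincent van Gogh</TITLE>
    <ARTIST>Unknown</ARTIST>
    <YEAR>2011</YEAR>
  </ITEM>
</AUCTIONBLOCK>"""


def keyword_hits(root, *words):
    """Titles of items whose text contains every word, wherever it occurs."""
    hits = []
    for item in root.findall("ITEM"):
        text = " ".join(item.itertext()).lower()
        if all(word.lower() in text for word in words):
            hits.append(item.findtext("TITLE"))
    return hits


def field_hits(root, artist):
    """Titles of items whose ARTIST element equals the given name."""
    return [
        item.findtext("TITLE")
        for item in root.findall("ITEM")
        if item.findtext("ARTIST") == artist
    ]


root = ET.fromstring(CATALOG_XML)
by_keyword = keyword_hits(root, "Vincent van Gogh", "vase")
by_field = field_hits(root, "Vincent van Gogh")
print(by_keyword)  # both items match the keywords
print(by_field)    # only the painting matches the ARTIST field
```

The keyword search cannot tell a painting by van Gogh from a poster that merely mentions him, whereas the field query relies on the ARTIST element to qualify the content.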

The XML document above shows an auction house page with two bids (from Chris and John). These XML definitions are free of formatting or presentation instructions; the content of the document is classified, that is, divided into individual elements. The design is applied with style sheets in the XSL language; the XML file contains only a reference to the style definition to be used. Since XML attaches more importance to structure than HTML does, the following differences arise:

  • every tag must be closed (<TITLE> ... </TITLE>)
  • tags must be closed in the reverse order in which they were opened
  • all attribute values of a tag (src="12sonnenblumen_klein.jpg" or width="300") must be enclosed in quotation marks
  • XML tags are case-sensitive (<TITLE> is not the same as <title>)
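
These rules can be checked mechanically, because an XML parser rejects any document that violates them. The following Python sketch uses the standard library parser; the helper name `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One valid sample and one violation per rule from the list above.
samples = {
    "valid": '<ITEM><TITLE>Sunflowers</TITLE><IMG src="a.jpg"/></ITEM>',
    "unclosed tag": "<ITEM><TITLE>Sunflowers</ITEM>",
    "wrong nesting": "<ITEM><BIDS><BID></BIDS></BID></ITEM>",
    "unquoted attribute": "<ITEM><IMG src=a.jpg/></ITEM>",
    "case mismatch": "<ITEM><TITLE>Sunflowers</title></ITEM>",
}

results = {name: is_well_formed(xml) for name, xml in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample is accepted. An HTML browser would typically tolerate most of the other four, which is the difference in strictness the list describes.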

XSL for design information

An XSL definition is structured similarly to a CSS definition for HTML: design elements are managed centrally in an XSL file. XSL stands for "Extensible Stylesheet Language". In addition to the usual design definitions such as "font-family" or "font-size", XSL also offers options for sorting the output (the "order-by" attribute in the early working-draft dialect, the xsl:sort element in XSLT 1.0). Style definitions can also be applied to several nested XML tags at once (see the example above: AUCTIONBLOCK / ITEM / BIDS / BID).

<?xml version="1.0"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">

    <TABLE STYLE="border:1px solid black">
      <TR STYLE="font-size:12pt; font-family:Verdana; font-weight:bold; text-decoration:underline">
        <TD>Price</TD>
        <TD STYLE="background-color:lightgrey">Time</TD>
        <TD>Bidder</TD>
      </TR>

      <!-- XSLT 1.0 sorts with xsl:sort; the order-by attribute belonged to the obsolete WD-xsl draft -->
      <xsl:for-each select="AUCTIONBLOCK/ITEM/BIDS/BID">
        <xsl:sort select="BIDDER"/>
        <TR STYLE="font-family:Verdana; font-size:12pt; padding:0px 6px">
          <TD>$<xsl:value-of select="PRICE"/></TD>
          <TD STYLE="background-color:lightgrey"><xsl:value-of select="TIME"/></TD>
          <TD><xsl:value-of select="BIDDER"/></TD>
        </TR>
      </xsl:for-each>
    </TABLE>

  </xsl:template>
</xsl:stylesheet>

An external XSL definition is integrated into an XML document using a reference command:

<?xml-stylesheet type="text/xsl" href="review.xsl" ?>

The combination of XML as a structure format and XSL as a design format requires an XSL processor in order to display documents in the browser.

The XSL processor merges the structure data of the XML file and the design data of the XSL file into a single HTML/CSS document that can be displayed on numerous end devices (browser on a PC or smartphone, tablet computer, etc.). The data source (XML) always remains identical; the design is adapted to the end device by using different XSL definitions.
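
The idea of one data source with several presentations can be sketched in Python. The Python standard library contains no XSLT processor, so the hypothetical function `render_bids` below only imitates by hand what a stylesheet would do (select the BID elements, sort them by BIDDER, emit table rows). The `compact` flag stands in for a second, device-specific stylesheet and is likewise an assumption.

```python
import xml.etree.ElementTree as ET
from html import escape

# Abbreviated auction data; the bids are deliberately stored unsorted.
AUCTION_XML = """<AUCTIONBLOCK>
  <ITEM>
    <BIDS>
      <BID><PRICE>5700</PRICE><TIME>2:58:42 PM</TIME><BIDDER>John</BIDDER></BID>
      <BID><PRICE>6000</PRICE><TIME>3:02:22 PM</TIME><BIDDER>Chris</BIDDER></BID>
    </BIDS>
  </ITEM>
</AUCTIONBLOCK>"""


def render_bids(xml_text: str, compact: bool = False) -> str:
    """Render the bids as an HTML table, sorted by bidder name."""
    root = ET.fromstring(xml_text)
    bids = sorted(
        root.findall("ITEM/BIDS/BID"),
        key=lambda bid: bid.findtext("BIDDER", ""),
    )
    # The compact layout drops the TIME column for small screens.
    columns = ["PRICE", "BIDDER"] if compact else ["PRICE", "TIME", "BIDDER"]
    header = "".join(f"<th>{name.title()}</th>" for name in columns)
    rows = []
    for bid in bids:
        cells = "".join(
            f"<td>{escape(bid.findtext(name, ''))}</td>" for name in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"


desktop_html = render_bids(AUCTION_XML)
mobile_html = render_bids(AUCTION_XML, compact=True)
print(desktop_html)
print(mobile_html)
```

Both outputs are derived from the same unchanged XML string; only the presentation rules differ, which mirrors the role of alternative XSL definitions.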

XML schema for type descriptions

The consistent separation of content and design in XML creates the need to specify content in terms of data types. "XML Schema", the successor to the older DTD ("Document Type Definition"), is used for this purpose. XML Schema is abbreviated to XSD (XML Schema Definition).

XML Schema makes it possible to describe the structure of XML documents and to restrict the content of elements and attributes, for example to certain data types such as numbers, dates or text. An XML schema is itself an XML document, which allows more complex (including content-related) relationships to be described.

The following example defines a new type «monatInt», which restricts integers to the valid month numbers 1 to 12, and a list type «monate» built from it for XML documents that use the schema:

<xs:simpleType name="monatInt">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="1"/>
    <xs:maxInclusive value="12"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="monate">
  <xs:list itemType="monatInt"/>
</xs:simpleType>
 
An element of the list type «monate» could look like this in an XML document:

<monate>
   1 2 3 4 5 6 7 8 9 10 11 12
</monate>
 
The individual items of a list (in the example, the month numbers) are separated by whitespace.
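
What a schema validator checks for such a list can be sketched by hand in Python. The standard library has no XSD validator, so the hypothetical function `parse_monate` below only imitates the two rules of the example (whitespace-separated items, each an integer from 1 to 12). It is not a substitute for real schema validation.

```python
import re
from typing import List

# Lexical form of an integer: optional sign followed by ASCII digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_monate(text: str) -> List[int]:
    """Split a whitespace-separated list and check every item is 1..12."""
    months = []
    for token in text.split():
        if not _INTEGER.fullmatch(token):
            raise ValueError(f"not an integer: {token!r}")
        value = int(token)
        if not 1 <= value <= 12:
            raise ValueError(f"month out of range: {value}")
        months.append(value)
    return months


months = parse_monate("\n   1 2 3 4 5 6 7 8 9 10 11 12\n")
print(months)
```

A value such as 13 or a non-numeric token raises a ValueError, which corresponds to a validator rejecting the document.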