WEEK 6: XSD - Schema and XML

 

Reading assignments
BOOK PAGES
XML in easy steps 49-84

LINKS TO XML RELATED SITES

  1. XML.COM
  2. MSDN'S XML DEVELOPER SITE
  3. IBM's XML WEBSITE
  4. IBM'S ALPHAWORKS WEBSITE
  5. W3 Schools XML Tutorial
  6. Kickstart XML Tutorial

Tutorial files from W3C on Schema Documents:

Introduction to XML Schema Definition (XSD):

Section 1: Introduction to XSD

What is an XML Schema?

An XML Schema:

  1. Defines elements that can appear in a document
  2. Defines attributes that can appear in a document
  3. Defines which elements are child elements
  4. Defines the sequence in which the child elements can appear
  5. Defines the number of child elements
  6. Defines whether an element is empty or can include text
  7. Defines default values for attributes

    The purpose of a Schema is to define the legal building blocks of an XML document, just like a DTD.

BUT....

XML Schemas will be used in Web applications as a replacement for DTDs. Here are the reasons why:

  1. XML Schemas are easier to learn than DTDs
  2. XML Schemas are extensible to future additions
  3. XML Schemas are richer and more useful than DTDs
  4. XML Schemas are written in XML
  5. XML Schemas support data types
  6. XML Schemas support namespaces
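
To make reasons 4 and 5 concrete, here is a minimal sketch of one declaration written both ways. The element name birthday is hypothetical and is not taken from the sample files. The namespace URI is the one from the W3C XML Schema Recommendation; the sample files may bind the xsd prefix to an earlier draft namespace.

```xml
<?xml version="1.0"?>
<!-- DTD equivalent, for comparison (DTD syntax is not XML):
       <!ELEMENT birthday (#PCDATA)>
     A DTD can only say "text"; it cannot require a date. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- The schema is itself XML, and the built-in date type
       requires content such as 1999-12-31. -->
  <xsd:element name="birthday" type="xsd:date"/>
</xsd:schema>
```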

Sample files: We will be working with a number of sample files. You should open shippingOrder.xml, shippingOrder_xsd.xml, and shippingOrder.xsd. If you have problems viewing the shippingOrder.xsd document, try opening orderSchema.xml, which shows the schema as 'plain' .xml. These are well-formed and valid XML files, so IE 5.5 should be able to render them. shipOrder.xml is an instance file that references the schema but contains no data. You can modify that file, which links to shipOrder.xsd through shipOrder_xsd.xml.

We will be creating a schema document for our address book, address_book.xml shown in address_book.xsd and linked through address_book_xsd.xml. These can serve as a working reference for your eventual project files.

Section 2: Beginning a schema

  1. Indicating a schema's location - Open orderSchema.xml, which shows the schema for shipOrder.xml. Since an XML schema is an XML document, we begin the schema with the XML declaration, which forms the prolog of the XML document. The root element is <xsd:schema>, which declares the schema namespace. Open shippingOrder_xsd.xml to see the reference to shippingOrder.xsd.
  2. Declaring a namespace - Open address_schema_1.xml to see the namespace declaration. We'll talk more about namespaces next week, but in short a namespace identifies a vocabulary of "global" elements and declarations so that their names do not collide with names from other vocabularies. The file shipOrder.xml contains a reference to the namespace and the schema file.
  3. Annotating a schema - It's a good idea to annotate a schema, especially if it refers to another schema, or is a "work in progress". Open address_schema_2.xml to see my annotation.
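
The three steps above can be sketched as a skeleton schema plus an instance document that points to it. This is an illustration under assumptions, not a copy of the sample files: the annotation text, the placeholder content, and the use of xsi:noNamespaceSchemaLocation (the linking attribute defined by the W3C Recommendation for schemas without a target namespace) are my own choices, and the sample files may link differently.

```xml
<?xml version="1.0"?>
<!-- Hypothetical minimal schema, saved for example as address_book.xsd -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Annotation: human-readable notes about the schema -->
  <xsd:annotation>
    <xsd:documentation>
      Work in progress: schema for the address book.
    </xsd:documentation>
  </xsd:annotation>
  <!-- Placeholder declaration so the skeleton validates something -->
  <xsd:element name="address_book" type="xsd:string"/>
</xsd:schema>
```

An instance document would then reference the schema from its root element:

```xml
<?xml version="1.0"?>
<address_book
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="address_book.xsd">
  placeholder text
</address_book>
```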

Section 3: Simple types

  1. Declare an element with a simple type - The riddle of schema is that simple types aren't so simple, and complex types aren't so complex. Simple types can contain only text: no child elements and no attributes. XML Schema has built-in simple types for most kinds of text. These include string, decimal, boolean, date, time, anyURI, and language. You can also derive a custom simple type of your own for your schema.
  2. xsd:schema - <xsd:schema is the root element of the .xsd document. In it you will declare the location of the namespace by which your schema document is validated.
  3. xsd:element - <xsd:element is where you will declare the "name" of the root element that the schema is applied to in the xml document, and the "type". Open address_schema_3.xml to see "name" and "type".
  4. xsd:complexType - This is where the fun begins. A complex type will typically contain elements, elements and text, attributes, and combinations of all the above. In address_schema_4.xml we start to build out the address_book element declarations.
  5. xsd:sequence - Sequence will contain the elements that nest, or reside within an outer element. Each complexType will have a name that will be used to declare all the elements that it includes. In address_schema_5.xml we start to build out the sequences.
  6. complexTypes and complexContent - After we have declared the sequence of elements for the "record" element, we need to declare and create sequences for "name", "address" and "contact" elements. Each of these will in turn contain elements, and possibly attributes as well. Open address_schema_6.xml, address_schema_7.xml, and address_schema_8.xml to see each of these.
  7. Restriction bases - Restriction bases build on simple types but restrict the data structure, typically to a pattern. We have built a simple document called zipcode.xml that restricts the input to 5 or 9 digits (94020 or 94020-0007). The associated schema for this document is written in a file called zipcode_schema.xml. The instance document zipcode_xsd.xml is validated by the schema file zipcode.xsd. A similar set of files can be seen for phone numbers: phone.xml, phone_xsd.xml, and phone.xsd (or phone_schema.xml).
  8. Using number and date types - We can use number and date types as simple types that keep our data consistent with the needs of a validating application. For number types please see accounting.xml and accounting_schema.xsd. For date types we can specify time, date, month, year, and century. These generally follow the CCYY-MM-DD and hh:mm:ss.sss formats. This is more detail than you'll probably ever need, so please see personal.xml, personal_xsd.xml, and personal.xsd for examples that might fit your address book.
  9. Deriving custom simple types - See picture.xml, picture_xsd.xml, and picture.xsd for derivation of custom simple types.
  10. Enumeration values - Enumeration values specify or limit the entry of data to match only predefined choices. A typical example might be specifying months in a year, or astrological signs. An example of an enumeration value and the associated schema can be found in astrological.xml , astrological_xsd.xml, astrological.xsd, and astrological_schema.xml, respectively.
  11. Specifying patterns - for patterns please refer to the zipcode.xml and zipcode_schema.xml documents.
  12. Specifying ranges - To specify a range of acceptable values, see pg. 86 and address_schema_9.xml.
  13. Predefining content - To predefine an element's value, see pg. 91 and address_schema_10.xml.
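
The restriction techniques in items 7 through 13 can be sketched in one small schema. All type and element names here (zipcodeType, signType, monthNumberType, country) are hypothetical and are not taken from the sample files, and the syntax follows the W3C XML Schema 1.0 Recommendation, which may differ in detail from the draft syntax in older samples.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Pattern: 5 digits, optionally followed by a hyphen and 4 digits -->
  <xsd:simpleType name="zipcodeType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{5}(-\d{4})?"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Enumeration: only the listed values are accepted -->
  <xsd:simpleType name="signType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Aries"/>
      <xsd:enumeration value="Taurus"/>
      <xsd:enumeration value="Gemini"/>
      <!-- remaining signs omitted for brevity -->
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Range: an integer from 1 to 12 inclusive -->
  <xsd:simpleType name="monthNumberType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1"/>
      <xsd:maxInclusive value="12"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Elements that use the custom types and one built-in date type -->
  <xsd:element name="zipcode" type="zipcodeType"/>
  <xsd:element name="sign" type="signType"/>
  <xsd:element name="month" type="monthNumberType"/>
  <xsd:element name="birthday" type="xsd:date"/>

  <!-- Predefined content: the element may only hold this value -->
  <xsd:element name="country" type="xsd:string" fixed="USA"/>
</xsd:schema>
```

With these declarations, <zipcode>94020</zipcode> and <zipcode>94020-0007</zipcode> would both be accepted, while <zipcode>9402</zipcode> would be rejected by the pattern.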

Section 4: Complex types

Overview (pg. 93)

  1. Only elements - An example of defining an element to contain only elements is shown in onlyElements.xml .
  2. Sequence - An example of declaring elements to appear in a given sequence is shown in sequence.xml .
  3. Choice - An example of using choice, where one of several alternative elements may appear, is shown in choice.xml .
  4. Any order - Elements can be declared to appear in any order, as shown in anyOrder.xml .
  5. Groups - You can define and name groups of elements as shown in groups.xml , groups_xsd.xml, and groups.xsd .
  6. Referencing defined elements - After naming the groups, they can be referenced as shown in referenced.xml . This file uses groups declared as name, address, and contact and referenced to the element record. Record is a complex type that uses the sequence of named groups referenced above.
  7. How many - You can control how many times an element appears using minOccurs and maxOccurs. These are shown in the example howMany.xml .
  8. Define text-only elements - Defining elements to contain only text is shown in onlyText.xml .
  9. Define empty elements - Take a look at picture.xml, picture_xsd.xml, and picture.xsd for deriving schema for empty elements.
  10. Mixed content - Thanks to the mixed content attribute, we can create such things as paragraphs with text and elements for creating semantics and meaning, as in story.xml . Look at this document, and the related linking file story_xsd.xml and its schema story.xsd.
  11. Basing complex types on complex types - An example of complex types built on complex types is shown in the example complexComplex.xml .
  12. Declare an element of complex type - An example of declaring an element of complex type is shown in declareComplex.xml .
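
As a rough companion to the list above, the following sketch combines several of these constructs in one schema. The names (nameGroup, recordType, addressType, workRecordType, paragraphType and their children) are hypothetical and only loosely modeled on the address book; the sample files may organize things differently.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Groups: a named group of elements that can be referenced later -->
  <xsd:group name="nameGroup">
    <xsd:sequence>
      <xsd:element name="first" type="xsd:string"/>
      <xsd:element name="last" type="xsd:string"/>
    </xsd:sequence>
  </xsd:group>

  <!-- Only elements, appearing in a fixed sequence -->
  <xsd:complexType name="recordType">
    <xsd:sequence>
      <!-- Referencing the group defined above -->
      <xsd:group ref="nameGroup"/>
      <!-- How many: between one and three phone elements -->
      <xsd:element name="phone" type="xsd:string"
                   minOccurs="1" maxOccurs="3"/>
      <!-- Choice: exactly one of the alternatives -->
      <xsd:choice>
        <xsd:element name="email" type="xsd:string"/>
        <xsd:element name="fax" type="xsd:string"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Any order: each child appears once, in any order -->
  <xsd:complexType name="addressType">
    <xsd:all>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>

  <!-- Complex type based on a complex type: recordType plus one child -->
  <xsd:complexType name="workRecordType">
    <xsd:complexContent>
      <xsd:extension base="recordType">
        <xsd:sequence>
          <xsd:element name="company" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Mixed content: text interleaved with child elements -->
  <xsd:complexType name="paragraphType" mixed="true">
    <xsd:sequence>
      <xsd:element name="emphasis" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Declaring elements of complex type -->
  <xsd:element name="record" type="recordType"/>
  <xsd:element name="paragraph" type="paragraphType"/>
</xsd:schema>
```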

Section 5: Attributes

  1. Declaring attributes - While an attribute is always of simple type (since it contains neither elements nor attributes), it always appears within an element of complex type. Use <xsd:attribute to start the declaration, then give the attribute a "name" and a "type". The type must be a simple type, and will usually be one of the built-in simple types. You can also reference an attribute group, as shown below (WIP).
  2. Requiring an attribute - You can require an attribute by simply adding use="required" within <xsd:attribute . To make "must" the only acceptable value, also add fixed="must". See address_schema_12.xml and picture_schema_2.xml.
  3. Predefining an attribute's content - You can predefine an attribute's content by adding fixed="content" or default="content" within the <xsd:attribute element, where "content" is the predefined value. A fixed value cannot be changed in the instance document, while a default value applies only when the attribute is omitted, as seen in these two examples (WIP).
  4. Defining attribute groups - Attribute groups can be defined and later referenced to save time. Use <xsd:attributeGroup name="label"> to name the group, where "label" is a name you choose (for example "whatever_Atts"). Files are being modified at this time.
  5. Referencing attribute groups - After defining an attribute group above, you can reference it by using <xsd:attributeGroup ref="label" /> where "label" is the named group (above). Files are being modified at this time.
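
Since the sample files for this section are still in progress, here is a minimal sketch of the five steps in the syntax of the W3C XML Schema 1.0 Recommendation, where predefined values are written with the fixed and default attributes. The group name imageAtts and the attribute names are hypothetical, chosen only to echo the picture example.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Defining an attribute group (the name is hypothetical) -->
  <xsd:attributeGroup name="imageAtts">
    <!-- Required: the instance must supply this attribute -->
    <xsd:attribute name="src" type="xsd:anyURI" use="required"/>
    <!-- Default: value used when the attribute is omitted -->
    <xsd:attribute name="format" type="xsd:string" default="jpeg"/>
    <!-- Fixed: if present, the attribute must have exactly this value -->
    <xsd:attribute name="units" type="xsd:string" fixed="pixels"/>
  </xsd:attributeGroup>

  <!-- An empty element of complex type that references the group
       and declares one more attribute of its own -->
  <xsd:element name="picture">
    <xsd:complexType>
      <xsd:attributeGroup ref="imageAtts"/>
      <xsd:attribute name="width" type="xsd:positiveInteger"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, <picture src="photo.jpg" width="200"/> would be accepted, while a picture element without src would be rejected.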

Section 6: Creating schema for nested, empty and mixed models

  1. Nested elements - Creating schema for nested files uses the complex type, sequence, and attribute declarations. These files are shown in nested_elements.xml, nested_elements_xsd.xml, and nested_elements.xsd
  2. Empty elements - Creating schema for the empty model is straightforward. Declare the outer elements as a block, using a complexType declaration, and declare attributes of that block if present. Then declare the inner block as a complexType, with each attribute declared. Look carefully at the three files empty_elements.xml, empty_elements_xsd.xml, and empty_elements.xsd.
  3. Mixed elements - Creating schema for mixed models is only slightly more complicated than empty. Look carefully at the three files mixed_elements.xml, mixed_elements_xsd.xml, and mixed_elements.xsd. Notice that the only difference is that you are using simpleContent and an extension base to allow you to declare the 'mixed content' of attributes and text in the inner elements.
  4. Mixed content - These files contain text and elements mixed together - addressing the issue of 'unstructured text'. Take a look at these files, including their DTD counterparts: story.xml, story_dtd.xml, story_xsd.xml, story.dtd, and story.xsd, or the entire zipped archive story.zip.
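
The difference between the nested, empty, and mixed models can be sketched by declaring the same piece of data three ways. The element names contact, contact_empty, and contact_mixed are hypothetical and exist only so that all three declarations fit in one schema; the sample files keep each model in its own file.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Nested model: data lives in child elements.
       <contact><phone>555-0100</phone></contact> -->
  <xsd:element name="contact">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="phone" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Empty model: data lives in attributes, with no content.
       <contact_empty phone="555-0100"/> -->
  <xsd:element name="contact_empty">
    <xsd:complexType>
      <xsd:attribute name="phone" type="xsd:string" use="required"/>
    </xsd:complexType>
  </xsd:element>

  <!-- Mixed model: text content plus attributes, built with
       simpleContent and an extension base.
       <contact_mixed type="home">555-0100</contact_mixed> -->
  <xsd:element name="contact_mixed">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attribute name="type" type="xsd:string"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```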

Section 7: The address book schema

Below are the nine files that represent the address book files in nested, empty, and mixed models. These are worth viewing and comparing.

  1. address_book_nested.xml
  2. address_book_nested.xsd
  3. address_book_nested_xsd.xml
  4. address_book_empty.xml
  5. address_book_empty.xsd
  6. address_book_empty_xsd.xml
  7. address_book_mixed.xml
  8. address_book_mixed.xsd
  9. address_book_mixed_xsd.xml
  10. address_schema.zip (all 9 above)

Homework: Create an .xsd file for your address book (or whatever theme you are using now). Use the sample files or final projects as a guide if you need help. Create schemas for both the nested and empty models of whatever theme you are using for your projects. Please try to validate your schema (if you are using XML Spy). The files address_book_xsd.xml and address_book.xsd may be useful examples as you design your schema. Email me both your schema file (filename.xsd) and linking document (filename_xsd.xml) by the end of week 7 or the start of week 8. You can download a great collection of sample XSD files with advanced declarations from this link. Or download the entire file with schema for empty, mixed, and nested models - Address_Book_Schema.zip .