WEEK 6: XSD - Schema and XML
Reading assignments
BOOK | PAGES
XML in easy steps | 49-84
LINKS TO XML RELATED SITES
- XML.COM
- MSDN'S XML DEVELOPER SITE
- IBM'S XML WEBSITE
- IBM'S ALPHAWORKS WEBSITE
- W3 Schools XML Tutorial
- Kickstart XML Tutorial
Tutorial files from W3C on Schema Documents:
Introduction to XML Schema Definition (XSD) documents:
Section 1: Introduction to XSD
What is an XML Schema?
An XML Schema:
- Defines elements that can appear in a document
- Defines attributes that can appear in a document
- Defines which elements are child elements
- Defines the sequence in which the child elements can appear
- Defines the number of child elements
- Defines whether an element is empty or can include text
- Defines default values for attributes
The purpose of a Schema is to define the legal building blocks of an XML document,
just like a DTD.
BUT....
- XML Schemas are the successors of DTDs
- XML Schema was originally proposed by Microsoft, but is now a W3C Recommendation.
XML Schemas will be used in Web applications
as a replacement for DTDs. Here are the reasons why:
- XML Schemas are easier to learn than DTDs
- XML Schemas are extensible to future additions
- XML Schemas are richer and more useful than DTDs
- XML Schemas are written in XML
- XML Schemas support data types
- XML Schemas support namespaces
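To make two of these points concrete (schemas are written in XML, and they support data types), here is a minimal sketch. The element name quantity is hypothetical, and the namespace URI is the one from the final W3C Recommendation, which may differ from the one used in the course sample files.

```xml
<?xml version="1.0"?>
<!-- DTD version, for comparison (not XML syntax, and no data type):
     <!ELEMENT quantity (#PCDATA)>
-->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- XSD version: an ordinary XML element, with a built-in data type -->
  <xsd:element name="quantity" type="xsd:positiveInteger"/>
</xsd:schema>
```

With the DTD, any text is accepted as the content of quantity. With the schema, a validating parser would reject a value such as "three" or "-2".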
Sample files:
We will be working with a number of sample files. You should open shippingOrder.xml,
shippingOrder_xsd.xml, and shippingOrder.xsd. If you have problems viewing the
shippingOrder.xsd document, try opening orderSchema.xml, which shows the schema
as 'plain' .xml. These are well-formed and valid XML files, so IE 5.5 should be
able to render them. An instance file that contains a reference to this schema
(without data) is shipOrder.xml. You can modify that file, which links to
shipOrder.xsd through shipOrder_xsd.xml.
We will be creating a schema document for our address book, address_book.xml.
The schema is shown in address_book.xsd and linked through address_book_xsd.xml.
These can serve as a working reference for your eventual project files.
Section 2: Beginning a schema
- Indicating a schema's location - Open orderSchema.xml, which shows the schema
for shipOrder.xml. Since an XML schema is itself an XML document, it begins with
the XML declaration in the prolog. The root element is <xsd:schema>, which
declares the XML Schema namespace. Open shippingOrder_xsd.xml to see the
reference to shippingOrder.xsd.
- Declaring a namespace - Open address_schema_1.xml to see the namespace
declaration. We'll talk more about namespaces next week, but in short a namespace
is a URI that qualifies element and attribute names, so that "global" elements
and declarations from different vocabularies do not collide. The file
shipOrder.xml contains a reference to the namespace and the schema file.
- Annotating a schema - It's a good idea to annotate a schema, especially if it
refers to another schema, or is a "work in progress". Open address_schema_2.xml
to see my annotation.
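The three steps above can be sketched as follows. This is an illustration only, not a copy of the course sample files: the annotation text is invented, and the namespace URIs are those of the final W3C Recommendation (older tools may expect an earlier draft namespace).

```xml
<?xml version="1.0"?>
<!-- address_book.xsd (sketch): prolog, root element with namespace, annotation -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation>
    <xsd:documentation>
      Work-in-progress schema for the address book (illustrative text).
    </xsd:documentation>
  </xsd:annotation>
  <!-- Placeholder declaration so the schema has something to validate -->
  <xsd:element name="address_book" type="xsd:string"/>
</xsd:schema>
```

An instance document can then point at the schema's location. The sketch assumes the schema has no target namespace, which is why xsi:noNamespaceSchemaLocation is used:

```xml
<?xml version="1.0"?>
<address_book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:noNamespaceSchemaLocation="address_book.xsd">
  placeholder text
</address_book>
```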
Section 3: Simple types
- Declare an element with a simple type - The riddle of schema is that simple
types aren't so simple, and complex types aren't so complex. Simple types can
contain only text. XML Schema has built-in simple types for most kinds of text.
These include string, decimal, boolean, date, time, anyURI in the final W3C
Recommendation (the uri-reference of earlier drafts), and language. You can also
use a "custom" simple type, which is one that you invent and name for your schema.
- xsd:schema - <xsd:schema> is the root element of the .xsd document. In it you
declare the XML Schema namespace, which identifies the vocabulary (xsd:element,
xsd:complexType, and so on) that your schema document is written in.
- xsd:element - <xsd:element> is where you will declare the "name" of the root
element that the schema is applied to in the xml document, and its "type". Open
address_schema_3.xml to see "name" and "type".
- xsd:complexType - This
is where the fun begins. A complex type will typically contain elements, elements
and text, attributes, and combinations of all the above. In address_schema_4.xml
we start to build out the address_book element declarations.
- xsd:sequence - Sequence will contain the elements that nest, or reside within
an outer element. Each complexType will have a name that will be used to declare
all the elements that it includes. Open address_schema_5.xml to see how we start
to build out the sequences.
- complexTypes and complexContent
- After we have declared the sequence of elements for the "record"
element, we need to declare and create sequences for "name", "address"
and "contact" elements. Each of these will in turn contain elements,
and possibly attributes as well. Open address_schema_6.xml,
address_schema_7.xml, and address_schema_8.xml
to see each of these.
- Restriction bases - Restriction bases build on simple types but restrict the
data structure, typically to a pattern. We have built a simple document called
zipcode.xml that restricts the input to 5 or 9 digits (94020 or 94020-0007). The
associated schema for this document is written in a file called
zipcode_schema.xml. The instance document zipcode_xsd.xml is validated by the
schema file zipcode.xsd. A similar set of files can be seen for phone numbers:
phone.xml, phone_xsd.xml, and phone.xsd (or phone_schema.xml).
- Using number and date types - We can use number and date types as simple types
that keep our data consistent with the needs of a validating application. For
number types please see accounting.xml and accounting_schema.xsd. For date types
we can specify time, date, month, year, and century. These generally follow the
CCYY-MM-DD and hh:mm:ss.sss formats. This is more detail than you'll probably
ever need, so please see personal.xml, personal_xsd.xml, and personal.xsd for
examples that might fit your address book.
- Deriving custom simple types
- See picture.xml, picture_xsd.xml,
and picture.xsd for derivation of custom simple
types.
- Enumeration values - Enumeration values specify or limit the entry of data to
match only predefined choices. A typical example might be specifying months in a
year, or astrological signs. An example of an enumeration value and the
associated schema can be found in astrological.xml, astrological_xsd.xml,
astrological.xsd, and astrological_schema.xml, respectively.
- Specifying patterns - For patterns please refer to the zipcode.xml and
zipcode_schema.xml documents.
- Specifying ranges - To specify a range of acceptable values, see pg. 86 and
address_schema_9.xml.
- Predefining content - To predefine an element's value, see pg. 91 and
address_schema_10.xml.
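The simple-type techniques in this section (patterns, enumerations, ranges, date types, and predefined content) can be sketched in one small schema. The type and element names here (zipcodeType, signType, monthNumberType, entry, and so on) are hypothetical and are not taken from the sample files. The syntax follows the final W3C Recommendation, so it may differ in detail from files written against earlier drafts.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Pattern: five digits, optionally followed by a hyphen and four more -->
  <xsd:simpleType name="zipcodeType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{5}(-\d{4})?"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Enumeration: only the listed values are accepted (list shortened) -->
  <xsd:simpleType name="signType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Aries"/>
      <xsd:enumeration value="Taurus"/>
      <xsd:enumeration value="Gemini"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Range: whole numbers from 1 to 12 inclusive -->
  <xsd:simpleType name="monthNumberType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1"/>
      <xsd:maxInclusive value="12"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="entry">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="zipcode" type="zipcodeType"/>
        <xsd:element name="sign" type="signType"/>
        <xsd:element name="birth_month" type="monthNumberType"/>
        <!-- Built-in date type, written as CCYY-MM-DD -->
        <xsd:element name="birthday" type="xsd:date"/>
        <!-- Predefined content: the only value allowed is USA -->
        <xsd:element name="country" type="xsd:string" fixed="USA"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, a zipcode of 94020 or 94020-0007 is accepted and 9402 is not. A birth_month of 13 fails the range check, and a birthday must look like 1970-01-31.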
Section 4: Complex types
Overview (pg. 93)
- Only elements - An example of defining an element to contain only elements is
shown in onlyElements.xml.
- Sequence - An example of declaring elements to appear in a given sequence is
shown in sequence.xml.
- Choice - An example of using choice, where only one of several alternative
elements may appear, is shown in choice.xml.
- Any order - Elements can be declared to appear in any order, as shown in
anyOrder.xml.
- Groups - You can define and name groups of elements as shown in groups.xml,
groups_xsd.xml, and groups.xsd.
- Referencing defined elements - After naming the groups, they can be referenced
as shown in referenced.xml. This file uses groups declared as name, address, and
contact and referenced from the element record. Record is a complex type that
uses the sequence of named groups referenced above.
- How many - You can control how many times an element appears using minOccurs
and maxOccurs. These are shown in the example howMany.xml.
- Defining text-only elements - Defining elements to contain only text is shown
in onlyText.xml.
- Define empty elements - Take a look at picture.xml, picture_xsd.xml, and
picture.xsd for deriving schema for empty elements.
- Mixed content - Thanks to the mixed content attribute, we can create such
things as paragraphs with text and elements for creating semantics and meaning,
as in story.xml. Look at this document, and the related linking file
story_xsd.xml and its schema story.xsd.
- Basing complex types on complex types - An example of complex types built on
complex types is shown in the example complexComplex.xml.
- Declare an element of complex type - An example of declaring an element of
complex type is shown in declareComplex.xml.
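Several of the complex-type techniques listed in this section can be combined in one sketch: a named group, a reference to that group, sequence, choice, any order, occurrence limits, and mixed content. All names below (nameGroup, recordType, addressType, noteType, em) are hypothetical and are not taken from the sample files. The syntax follows the final W3C Recommendation.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Groups: a named group of elements that can be referenced later -->
  <xsd:group name="nameGroup">
    <xsd:sequence>
      <xsd:element name="first" type="xsd:string"/>
      <xsd:element name="last" type="xsd:string"/>
    </xsd:sequence>
  </xsd:group>

  <!-- Any order: street and city may appear in either order -->
  <xsd:complexType name="addressType">
    <xsd:all>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>

  <!-- Mixed content: text with optional em elements inside it -->
  <xsd:complexType name="noteType" mixed="true">
    <xsd:sequence>
      <xsd:element name="em" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Only elements, in a fixed sequence -->
  <xsd:complexType name="recordType">
    <xsd:sequence>
      <xsd:group ref="nameGroup"/>
      <xsd:element name="address" type="addressType"/>
      <!-- Choice: exactly one of phone or email -->
      <xsd:choice>
        <xsd:element name="phone" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
      </xsd:choice>
      <!-- How many: the note is optional -->
      <xsd:element name="note" type="noteType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Declaring an element of complex type; one or more records -->
  <xsd:element name="address_book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="record" type="recordType"
                     minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Naming the types separately, as done here, lets the same type be reused by more than one element. Nesting anonymous complexType declarations inside each element is an equally valid style.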
Section 5: Attributes
- Declaring attributes - While an attribute is always of simple type (since it
contains neither elements nor attributes), it always appears within an element
of complex type. Use <xsd:attribute to start the element, then give the
attribute a "name" and a "type". Type will usually (or always) be one of the
built-in simple types. You can also reference an attribute group, as shown below
(WIP).
- Requiring an attribute - You can require an attribute by simply adding
use="required" within <xsd:attribute. You can also add value="must", where
"must" is the only acceptable value allowed. See address_schema_12.xml and
picture_schema_2.xml.
- Predefining an attribute's content - You can predefine an attribute's content
by using use="fixed" and value="content" within the <xsd:attribute element,
where "content" is the predefined value. Use can be "fixed" or "default" as seen
in these two examples (WIP). Note that this use/value pairing follows the draft
schema syntax; in the final W3C Recommendation the same effect is written as
fixed="content" or default="content".
- Defining attribute groups - Attribute groups can be defined and later
referenced to save time. Use <xsd:attributeGroup name="whatever_Atts"> to name
the group, where "whatever_Atts" is the name you choose, as shown in (files are
being modified at this time).
- Referencing attribute groups - After defining an attribute group above, you
can reference it by using <xsd:attributeGroup ref="label" /> where "label" is
the named group (above). Files are being modified at this time.
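Since the attribute sample files are still being modified, here is a stand-in sketch of the declarations described in this section. The names (picture, imageAtts, filename, and so on) are hypothetical. It uses the syntax of the final W3C Recommendation, where a predefined value is written with the fixed or default attribute instead of the use="fixed" with value="..." pairing of the earlier drafts.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Defining an attribute group for later reuse -->
  <xsd:attributeGroup name="imageAtts">
    <!-- Required attributes -->
    <xsd:attribute name="width" type="xsd:positiveInteger" use="required"/>
    <xsd:attribute name="height" type="xsd:positiveInteger" use="required"/>
    <!-- Fixed: if present, the value must be "pixels" -->
    <xsd:attribute name="units" type="xsd:string" fixed="pixels"/>
    <!-- Default: "jpeg" is assumed when the attribute is omitted -->
    <xsd:attribute name="format" type="xsd:string" default="jpeg"/>
  </xsd:attributeGroup>

  <!-- An empty element: a complex type with attributes and no content -->
  <xsd:element name="picture">
    <xsd:complexType>
      <xsd:attribute name="filename" type="xsd:string" use="required"/>
      <!-- Referencing the attribute group defined above -->
      <xsd:attributeGroup ref="imageAtts"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

An instance such as <picture filename="cat.jpg" width="200" height="150"/> would be valid under this sketch. Leaving out width, or writing units="inches", would not.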
Section 6: Creating schema for
nested, empty and mixed models
- Nested elements - Creating schema for nested files uses the complex type,
sequence, and attribute declarations. These files are shown in
nested_elements.xml, nested_elements_xsd.xml, and nested_elements.xsd.
- Empty elements - Creating
schema for the empty model is straightforward. Declare the outer elements
as a block, using a complexType declaration, and declare attributes of that
block if present. Then declare the inner block as a complexType, with each
attribute declared. Look carefully at the three files empty_elements.xml,
empty_elements_xsd.xml, and empty_elements.xsd.
- Mixed elements - Creating
schema for mixed models is only slightly more complicated than empty. Look
carefully at the three files mixed_elements.xml,
mixed_elements_xsd.xml, and mixed_elements.xsd.
Notice that the only difference is that you are using simpleContent and an
extension base to allow you to declare the 'mixed content' of attributes and
text in the inner elements.
- Mixed content - These files
contain text and elements mixed together - addressing the issue of 'unstructured
text'. Take a look at these files, including their DTD counterparts: story.xml,
story_dtd.xml, story_xsd.xml,
story.dtd, and story.xsd,
or the entire zipped archive story.zip.
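The simpleContent and extension base approach mentioned under "Mixed elements" can be sketched as below. The element and attribute names (phone, type) are hypothetical and not taken from mixed_elements.xsd. The syntax follows the final W3C Recommendation.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- An element that holds text and also carries an attribute.
       Example instance: <phone type="home">650-555-0100</phone> -->
  <xsd:element name="phone">
    <xsd:complexType>
      <xsd:simpleContent>
        <!-- The extension base is the simple type of the text content -->
        <xsd:extension base="xsd:string">
          <xsd:attribute name="type" type="xsd:string" use="required"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The element has to be declared as a complex type because it has an attribute. simpleContent then tells the parser that the content between the tags is still plain text, not child elements.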
Section 7: The address book schema
Below are the nine files that represent
the address book files in nested, empty, and mixed models. These are worth viewing
and comparing.
- address_book_nested.xml
- address_book_nested.xsd
- address_book_nested_xsd.xml
- address_book_empty.xml
- address_book_empty.xsd
- address_book_empty_xsd.xml
- address_book_mixed.xml
- address_book_mixed.xsd
- address_book_mixed_xsd.xml
- address_schema.zip (all 9 above)
Homework: Create an .xsd file for your address book (or whatever theme you are
using now). Use the sample files or final projects as a guide if you need help.
Create schemas for both the nested and empty models of your theme. Please try to
validate your schema (if you are using XML Spy). The files address_book_xsd.xml
and address_book.xsd may help as examples in the design of your schema. Email me
both your schema file (filename.xsd) and linking document (filename_xsd.xml) by
the end of week 7 or the start of week 8. You can download a great collection of
sample XSD files with advanced declarations from this link. Or download the
entire file with schemas for the empty, mixed, and nested models -
Address_Book_Schema.zip.