XML for Bioinformatics

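In this section we will look at three XML markup languages for bioinformatics: BSML (Bioinformatic Sequence Markup Language), MAGE-ML (MicroArray Gene Expression Markup Language), and SBML (Systems Biology Markup Language).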

If you have not studied XML before, it is a lot like HTML, with three important differences:

  1. The tags describe what the data is, rather than how it should be displayed (as HTML tags do)
  2. Each markup language is defined by a DTD (Document Type Definition) or schema, and the data must be structured as that DTD or schema specifies
  3. Where no DTD exists, you can make up your own tags, and then describe 'your XML' in a DTD or schema that you share
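
To make points 1 and 3 concrete, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree module. The tag names (gene_list, gene, symbol, organism) and the helper functions are made up for illustration; they are not taken from BSML, MAGE-ML, or SBML. Note that ElementTree only checks that the document is well formed (tags properly nested and closed); it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# A made-up XML vocabulary (an assumption for this sketch).
# The tags say what each piece of data *is*, not how to display it.
SAMPLE = """<gene_list>
  <gene id="g1">
    <symbol>BRCA1</symbol>
    <organism>Homo sapiens</organism>
  </gene>
  <gene id="g2">
    <symbol>TP53</symbol>
    <organism>Homo sapiens</organism>
  </gene>
</gene_list>"""


def read_genes(xml_text):
    """Return (id, symbol, organism) tuples from the sample vocabulary."""
    root = ET.fromstring(xml_text)
    return [
        (gene.get("id"), gene.findtext("symbol"), gene.findtext("organism"))
        for gene in root.findall("gene")
    ]


def is_well_formed(xml_text):
    """True if the text parses as XML; this is not DTD validation."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    for record in read_genes(SAMPLE):
        print(record)
```

Unlike an HTML browser, an XML parser rejects markup with mismatched tags, so is_well_formed("<a><b></a>") returns False.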

Here are some basic background files to look at:

Here are some bioinformatics files in BSML, MAGE-ML, and SBML formats. They will all open correctly in Microsoft Internet Explorer (MSIE) 5.5 or later.

Assignments for this section are as follows:

  1. Fill out simple.xml with your data (use Notepad to open, change, and save the file). When saving, set the file type to 'All Files' and name the file simple.xml, so that Notepad does not append .txt
  2. Fill out address_book.xml, add one more record, and consider adding additional XML elements
  3. Make up your own short XML application, perhaps a recipe book, inventory model, or whatever suits you. Keep it simple.
  4. Look at some RSS feeds, especially those written in Atom, and at their XML source. Can you read what is going on?
  5. As you explore bioinformatics websites, take time to view records, such as GenBank entries, in an XML format. Read the file for content.
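
Assignment 2 can also be done programmatically rather than in Notepad. The sketch below is one possible approach in Python. The layout of address_book.xml is not shown in this lesson, so the element names (address_book, contact, name, email, phone) and the add_contact helper are assumptions; adjust them to match the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book layout (an assumption; the real file may differ).
ADDRESS_BOOK = """<address_book>
  <contact>
    <name>Ada Example</name>
    <email>ada@example.org</email>
  </contact>
</address_book>"""


def add_contact(root, name, email, phone=None):
    """Append a new <contact> record; <phone> is an optional extra element."""
    contact = ET.SubElement(root, "contact")
    ET.SubElement(contact, "name").text = name
    ET.SubElement(contact, "email").text = email
    if phone is not None:
        ET.SubElement(contact, "phone").text = phone
    return contact


if __name__ == "__main__":
    root = ET.fromstring(ADDRESS_BOOK)
    add_contact(root, "Ben Example", "ben@example.org", phone="555-0100")
    # Print the updated document as text.
    print(ET.tostring(root, encoding="unicode"))
```

Adding the optional phone element corresponds to the "additional XML elements" part of the assignment: because you are defining the vocabulary, you decide which elements a record may contain.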

This lesson is copyrighted using an Educational Common License, and may be used freely without restriction for academic purposes.

Robert D. Cormia

rdcormia@earthlink.net