NCBI 'Quick Tour' - Step-by-Step

This section gives you a very quick tour of NCBI: basic navigation, an introduction to the major databases, using BLAST, downloading sequences, using the protein structure tool, and using special sections such as the retrovirus resources and influenza tools. Most researchers will spend a third of their time at NCBI, and you should plan on allocating a similar amount of time to learning these tools.

Step-by-step:

  1. Familiarize yourself with the menu bar of NCBI at http://www.ncbi.nlm.nih.gov/ and explore as much of the home page as possible.
  2. Start by exploring PubMed using the Entrez link (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed). In the search box, run three searches: HIV; HIV immunity; and HIV immunity and CCR5 delta 32 deletion. How does the number of hits returned change from one query to the next? You will spend a lot of time searching PubMed, especially if you have a university account that gives you access to journals.
  3. From the home page of NCBI, http://www.ncbi.nlm.nih.gov/, use the pull-down menu to search different databases. Start by searching 'All Databases' for COMT. What is the page that is returned? Does it help you see all the databases that are organized around an entry? How many databases have one or more entries for COMT? Make a mental note of each one of these databases. In time, try to visit and explore these. Take careful notes: even if you bookmark sections, a notebook will serve as a guide for your future research.
  4. Search Protein and then Nucleotide for COMT. What are the primary accession numbers of COMT in each of these databases? For COMT nucleotide, search with Z26491. Try this link: http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=403303 From this entry, scan down the GenBank record to the CAA81263.1 link (http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=CAA81263.1). Read both records carefully, and note the types of information they contain.
  5. Go to the OMIM database at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM and search for COMT. What are some of the entries that are returned? What does COMT do in the body?
  6. Go to NCBI Structure at http://www.ncbi.nlm.nih.gov/Structure/ and search for COMT. The structure entry that you want is 1VID. Follow the 1VID entry to the MMDB structure entry (http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?form=6&db=t&Dopt=s&uid=4499). You may need to install Cn3D 4.1 and RasMol to view the structures.
  7. Search the SNP database for COMT. How many entries are listed? Explore (carefully) any one of these links. The SNP database is very complicated, so don't get discouraged. You'll also navigate back to this database from the Perlegen Genotype and Haplotype browsers.
  8. Go to GenBank (http://www.ncbi.nlm.nih.gov/Genbank/index.html). Read about the submission requirements. We will cover the NCBI data model later in the quarter, but follow the main entries for COMT Nucleotide and Protein into GenBank.
  9. Go to Entrez (http://www.ncbi.nlm.nih.gov/Entrez/index.html). This is a transient page that will load another. The page that loads is the view into all of NCBI, showing you the various databases that store and share genomic data at NCBI. It's also a good checklist for your travels.
  10. Explore the NCBI Map Viewer (http://www.ncbi.nlm.nih.gov/mapview/). Can you find the gene location for COMT? Make sure to select the human organism. In time you'll be navigating the UCSC Genome browser, and Perlegen's Haplotype and Genotype browsers.
  11. Explore the Taxonomy browser at http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html
  12. NCBI has a great book section (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books) that includes Molecular Biology of the Cell.
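
Optional: the PubMed comparison in step 2 can also be scripted. The sketch below is not part of the original lesson. It assumes NCBI's E-utilities ESearch service (https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi) with its db, term, retmode and retmax parameters, and the function names esearch_url, hit_count and count_hits are made up for illustration. Run as a script, it only prints the URLs it would request; nothing is sent to NCBI unless you call count_hits yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the E-utilities ESearch endpoint, which the lesson does not mention.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The three PubMed searches from step 2, broadest first.
QUERIES = [
    "HIV",
    "HIV immunity",
    "HIV immunity and CCR5 delta 32 deletion",
]


def esearch_url(db, term):
    """Build an ESearch URL asking for a JSON summary with no record IDs."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": 0}
    return ESEARCH + "?" + urlencode(params)


def hit_count(response_text):
    """Pull the total number of hits out of an ESearch JSON response."""
    return int(json.loads(response_text)["esearchresult"]["count"])


def count_hits(db, term):
    """Send the query to NCBI (network access needed) and return the hit count."""
    with urlopen(esearch_url(db, term)) as resp:
        return hit_count(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the request URLs only; paste one into a browser to see the raw response.
    for query in QUERIES:
        print(esearch_url("pubmed", query))
```

Calling count_hits("pubmed", query) for each of the three queries should show the same narrowing of results that you see in the browser. If you try this, send requests slowly and sparingly, since NCBI limits how often a script may query its servers.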

Wild card: In about one page, write down your impressions of NCBI before and after you do this exercise. Of all the databases you will encounter, this is the biggest and the most overwhelming. Think about how NCBI will expand with the growing amount of gene expression data and the SNPs from haplotyping experiments (HapMap.org). As you use Cn3D, think about how it could be used to model variation in protein structure.


This lesson is copyrighted using an Educational Common License, and may be used freely without restriction for academic purposes.

Robert D. Cormia

rdcormia@earthlink.net