Lesson 4 - Points on the Map

Tools and Techniques

In this lesson you will learn how to look at database models and external database links. We'll also try some searches using Query by Example. In the second part of the lesson we'll look at the Query Builder.

This lesson will take a close look at objects that are assigned a single point position on a genetic map (as opposed to an interval). Some examples of this type of information are genes and molecular markers. We'll also look at objects that are closely associated with these "point" objects, such as probes, alleles, polymorphisms, traits, and so on. By looking at examples from several databases, you should get an idea of the different nomenclature and structures that are being used.
 
 

Database models

Every database located here is different in both content and organization. The underlying structure of the database, for example, what fields are used to describe a locus, is defined by its model. While the models are written in a special syntax unique to the ACEDB software, they are fairly easy to understand, even if you're not a database curator. The models can be very useful if you're having trouble locating certain information, or if you are consistently getting no hits on your queries. The models let you see what the fields are called, what type of data they contain (text, numeric), and which data items are hypertext links to other data items.
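To make the idea of reading a model more concrete, here is a minimal Python sketch. The model fragment in it is hypothetical and heavily simplified: it is only loosely styled after ACEDB models, which are tree-structured and richer than this, and it is not taken from any of the databases in this tutorial. The sketch assumes one "tag, type" pair per line, with a leading question mark marking a link to another class.

```python
# Hypothetical, simplified model fragment (NOT copied from a real database).
MODEL = """\
?Locus  Name        Text
        Position    Float
        Map         ?Map
        Probe       ?Probe
"""


def describe_fields(model_text):
    """Return {field name: kind of data} for a simplified model fragment."""
    fields = {}
    for line in model_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # The first line starts with the class being defined, e.g. ?Locus.
        if tokens[0].startswith("?"):
            tokens = tokens[1:]
        if len(tokens) < 2:
            continue
        tag, kind = tokens[0], tokens[1]
        if kind.startswith("?"):
            # Assumption: ?Class means a hypertext link to another class.
            fields[tag] = "link to " + kind[1:]
        elif kind in ("Int", "Float"):
            fields[tag] = "numeric"
        else:
            fields[tag] = "text"
    return fields


fields = describe_fields(MODEL)
for tag, kind in fields.items():
    print(f"{tag}: {kind}")
```

Reading a real model by eye works the same way: find the field name, then check whether its type is text, a number, or a pointer to another class of object.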

Example 4.1 - The database below has examples that illustrate how to look at database models.

AlfaGenes
If you have an AlfaGenes Database comment or question contact either Daniel Z. Skinner or Paul C. St. Amand, Curators.
 
 

Molecular markers

On most of the genetic maps in these databases, the majority of the points are anonymous molecular markers rather than genes of known function. Molecular markers can be generated by a variety of techniques, each with its own advantages and disadvantages, and different sorts of information will accompany the different marker types. These markers may pinpoint genes (for example cDNA, RFLP or EST markers) or proteins (isozyme markers), or they may mark genomic DNA (for example genomic RFLP, RAPD or microsatellite markers). They may be gel-based or PCR-based; they may detect a single locus or multiple loci; they may or may not require sequence information; finally, they may vary in their degree of reliability, difficulty and expense, and in the nature of the polymorphism they detect. This tutorial is not intended to provide a complete background on marker technology, and these techniques continue to be refined. The point is that the information surrounding these markers will vary quite a bit depending on the marker type. A user needs to be somewhat familiar with a particular technology to assess whether a database is providing adequate and useful information.

The following section will introduce a new searching technique, which we'll then use to look at examples of six different marker types from several different databases.

Query by Example

Query by Example (QBE for short) is a fill-in-the-blank approach to searching. You can access QBE whether you entered a database in Browse mode or Query mode. QBE is appropriate when you can provide specific details about what you want to find. QBE searches are case-insensitive and are limited to a single class of information; in other words, you will be returned only one type of record, never a mixture. You simply fill in the fields with the restrictions you want applied to your retrieval. All the items you enter are ANDed together, meaning any hit must satisfy ALL your search criteria. You may use a wildcard (the asterisk, *) at any point in a search criterion. You do not need to put quotes around textual values unless the data you are searching for itself contains quotes, which it probably does not.
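The matching rules described above can be imitated in a few lines of Python. This is only a sketch of the behavior (case-insensitive, asterisk wildcard, every filled-in field must match), not the code the databases actually run. The function names, field names and records are all made up for illustration.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a QBE-style pattern, where * matches anything, into a regex."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE | re.DOTALL)


def qbe_match(record, criteria):
    """Return True only if the record satisfies EVERY filled-in field (AND)."""
    for field, pattern in criteria.items():
        value = str(record.get(field, ""))
        if not wildcard_to_regex(pattern).fullmatch(value):
            return False
    return True


# Hypothetical records of a single class; not real database entries.
loci = [
    {"Name": "Xabc101", "Type": "RFLP", "Chromosome": "1A"},
    {"Name": "Xabc205", "Type": "RAPD", "Chromosome": "1A"},
    {"Name": "Xxyz101", "Type": "RFLP", "Chromosome": "2B"},
]

# Two fields filled in: both must match, and case does not matter.
criteria = {"Name": "xabc*", "Type": "rflp"}
hits = [record["Name"] for record in loci if qbe_match(record, criteria)]
print(hits)  # ['Xabc101']
```

Notice that the second record is rejected even though its name matches: with AND logic, one failed field is enough to exclude a record.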

Example 4.2 - The following example continues the explanation of QBE and lets you try it out for yourself.

SoyBase
If you have a SoyBase Database comment or question contact David Grant, Curator.
 
 

RAPD markers

RAPD stands for Random Amplified Polymorphic DNA. This is a marker technique that requires no cloning or sequencing, is PCR-based and may detect several loci simultaneously. Short (about 10 base pair) random primer sequences are used to amplify DNA, which generally results in a presence/absence polymorphism. This technique is easy, inexpensive and fast, and because a single RAPD primer may detect more than one locus, it is useful for phylogenetic studies. The short primers, however, may be easily affected by annealing conditions, and results may not always be reproducible. This marker technology is not appropriate for use in comparative mapping.

Example 4.3 - The database below has examples that illustrate where to find and how to interpret information about RAPD markers.

GrainGenes
If you have a GrainGenes Database comment or question contact Gerry Lazo, Curator.
 
 

AFLP markers

AFLP stands for Amplified Fragment Length Polymorphism. This technique also requires no cloning or sequencing and is PCR-based. It operates on much the same principle as a RAPD, but the primer consists of a longer (~ 15 bp) fixed portion and a short (2-4 bp) random portion. The long fixed portion gives the primer stability and the short random portion means it will amplify many loci - one may get over 100 loci amplified with a single AFLP primer. Polymorphism is detected as the presence/absence of a band. Because it detects so many loci, it is very useful for fingerprinting. It does, however, require additional laboratory steps and can be a difficult technique to master. This technology is not appropriate for comparative mapping work.

Example 4.4 - The database below has examples that illustrate where to find and how to interpret information about AFLP markers.

SolGenes
The SolGenes Database Project is currently inactive.
 

RFLP markers

RFLP stands for Restriction Fragment Length Polymorphism. This technique requires that you create a library of DNA fragments cloned into some vector. These probe libraries may be based on genomic DNA or cDNA. The technique does not require sequencing, and is gel-based, not PCR-based. The DNA of the organism you are interested in is digested with restriction enzymes and probed with the library of cloned DNA. Matches to the probe DNA are visualized using radioactivity. The technique is slow and requires a significant investment of time. This method detects a single locus, and the fact that it will consistently detect the same locus makes this technology well-suited for mapping, comparative mapping and QTL studies. Polymorphism is generally detected as a difference in the molecular weight (and thus the migration through a gel) of the fragments of host DNA.
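To see why a lost restriction site shows up as a fragment length difference, consider this small Python sketch. It is a toy illustration under stated assumptions: the sequences and the probe are invented, the fragments are far shorter than real ones (which are typically hundreds to thousands of base pairs), and the enzyme is modeled simply as cutting one base into the recognition site GAATTC.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of the recognition site."""
    fragments, last = [], 0
    start = sequence.find(site)
    while start != -1:
        cut = start + cut_offset
        fragments.append(sequence[last:cut])
        last = cut
        start = sequence.find(site, start + 1)
    fragments.append(sequence[last:])
    return fragments


def probed_sizes(sequence, probe):
    """Lengths of the fragments that the probe would hybridize to."""
    return [len(frag) for frag in digest(sequence) if probe in frag]


# Two invented alleles: allele B has lost the second site (GAATTC -> GAGTTC).
allele_a = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 15
allele_b = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 15
probe = "C" * 12  # hypothetical probe matching the middle region

print(probed_sizes(allele_a, probe))  # [26]
print(probed_sizes(allele_b, probe))  # [46]
```

The probe lands on the same region in both alleles, but the fragment carrying that region is longer in allele B, so its band would migrate a shorter distance on the gel.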

Example 4.5 - The database below has examples that illustrate where to find and how to interpret information about RFLP markers.

RiceGenes
If you have a RiceGenes Database comment or question contact Angela Baldo, Curator.
 
 

STS markers

STS stands for Sequence Tagged Site. This is a PCR-based approach that detects a single, unique, sequence-defined point in the genome. This does not require cloning, but does require sequence information. Primers of 18-20 base pairs are designed to amplify some short, unique fragment of DNA whose sequence is known (for example, this might be the sequence of a RAPD PCR product, or an RFLP clone). Polymorphism is generally detected as a size difference in the amplified product; if there is no size difference, restriction enzymes may be used to cut the products and identify polymorphism. Since the primers are longer than RAPD primers and based on a specific sequence, this method reliably detects the same locus and is good for mapping studies. The design and creation of good primers may involve a significant investment.

Example 4.6 - The database below has examples that illustrate where to find and how to interpret information about STS markers.

MilletGenes
If you have a MilletGenes Database comment or question contact Matthew Couchman, Curator.
 
 

EST markers

EST stands for Expressed Sequence Tag. As part of many sequencing projects, partial (end) sequences of cDNA clones are generated. These partial sequences can then be used to design 18-20 base pair primers which provide a unique sequence tag for the gene. Again, this is a PCR-based approach which does not require cloning, but does require sequence information. It detects a unique, expressed region of the genome, and is good for mapping. Polymorphism is generally detected as a size difference of the amplified product. The design and creation of good primers may involve a significant investment.

Example 4.7 - The database below has examples that illustrate where to find and how to interpret information about EST markers.

MaizeDB
If you have a MaizeDB Database comment or question contact Mary Polacco, Curator.
 
 

Microsatellite markers

Microsatellite markers detect hypervariable regions of the genome. A microsatellite (also called simple sequence repeat (SSR), short tandem repeat (STR) or variable number tandem repeat (VNTR)) is a short (2-5 base) motif that is repeated multiple times and is flanked by unique DNA. The microsatellite motif is used as a probe against genomic or cDNA libraries to identify clones containing the motif. These clones are then end sequenced, and primers are designed to amplify the unique DNA flanking the microsatellite motif. This method is repeatable, identifies a single locus, and targets hypervariable regions of the genome. Polymorphism is generally detected as a length difference in the amplified product. This length difference may be very small, for example 2 base pairs. The method does require a significant investment of time and resources, and is appropriate for mapping, QTL studies and fingerprinting.
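The 2 base pair example can be worked through in a short Python sketch. The flanking sequences, the GA motif and the repeat counts here are invented for illustration; the point is only that two alleles differing by a single repeat of a 2-base motif give amplified products that differ in length by exactly 2 bases.

```python
import re


def amplicon(left_flank, motif, repeats, right_flank):
    """Build the amplified product: unique flank + repeated motif + unique flank."""
    return left_flank + motif * repeats + right_flank


def repeat_count(sequence, motif):
    """Number of motif copies in the longest uninterrupted run of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical unique flanking DNA and a 2-base motif.
left, right, motif = "ACGTTGCA", "TGCAACGT", "GA"

allele_1 = amplicon(left, motif, 12, right)
allele_2 = amplicon(left, motif, 13, right)

print(len(allele_1), repeat_count(allele_1, motif))  # 40 12
print(len(allele_2), repeat_count(allele_2, motif))  # 42 13
print(len(allele_2) - len(allele_1))                 # 2
```

Because the primers sit in the unique flanking DNA, both alleles amplify from the same locus, and only the number of repeats between the primers changes the product length.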

Example 4.8 - The database below has examples that illustrate where to find and how to interpret information about microsatellite markers.

GrainGenes
If you have a GrainGenes Database comment or question contact Gerry Lazo, Curator.
