Lesson 4 - Points on the Map
Tools and Techniques
In this lesson you will learn how to look at database models and external
database links. We'll also try performing some searches using Query
by Example. In the second part of the lesson we'll look at the Query Builder.
This lesson will take a close look at objects that are assigned a single
point position on a genetic map (as opposed to an interval). Some examples
of this type of information are genes and molecular markers. We'll also
look at objects that are closely associated with these "point" objects,
such as probes, alleles, polymorphism, traits, and so on. By looking at
examples from several databases, you should get an idea of the different
nomenclature and structures that are being used.
Database models
Every database located here is different in both
content and organization. The underlying structure of a database (for
example, which fields are used to describe a locus) is defined by its model.
While the models are written in a special syntax unique to the ACEDB software,
they are fairly easy to understand, even if you're not a database curator.
The models can be very useful if you're having trouble locating certain
information, or if you are consistently getting no hits on your queries.
The models let you see what the fields are called, what type of data they
contain (text, numeric), and which data items are hypertext links to other
data items.
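To make this concrete, the sketch below represents a model for a made-up Locus class as a Python dictionary. The field names, data types and link targets are hypothetical and are not taken from any real ACEDB model, and the dictionary layout is not ACEDB syntax. It only illustrates the three things a model tells you: what each field is called, what type of data it holds, and whether it links to another class.

```python
# Hypothetical model for a "Locus" class. Field names, types and link
# targets are invented for illustration; real models differ per database.
LOCUS_MODEL = {
    "Name":       {"type": "text",    "links_to": None},
    "Chromosome": {"type": "text",    "links_to": "Chromosome"},
    "Position":   {"type": "numeric", "links_to": None},
    "Probe":      {"type": "text",    "links_to": "Probe"},
}


def describe_model(model):
    """Return one readable line per field: name, data type, link target."""
    lines = []
    for field, info in model.items():
        link = f" -> links to {info['links_to']}" if info["links_to"] else ""
        lines.append(f"{field}: {info['type']}{link}")
    return lines


def linked_fields(model):
    """Return the names of fields that are hypertext links to other classes."""
    return [field for field, info in model.items() if info["links_to"]]


for line in describe_model(LOCUS_MODEL):
    print(line)
```

If a query on a field keeps returning no hits, checking a listing like this is a quick way to confirm that the field exists under the name you expected and holds the type of data you are searching for.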
Example 4.1 - The database below has examples that
illustrate how to look at database models.
AlfaGenes
If you have an AlfaGenes Database comment or question contact either
Daniel Z. Skinner or Paul C. St. Amand, Curators.
Molecular markers
On most of the genetic maps in these databases, the majority of the points
consist of anonymous molecular markers and not genes of known function.
Molecular markers can be generated by a variety of different techniques,
each with its own advantages and disadvantages, and different sorts of information
will accompany the different marker types. These markers may pinpoint genes
(for example cDNA, RFLP or EST markers) or proteins (isozyme markers),
or they may mark genomic DNA (for example genomic RFLP, RAPD or microsatellite
markers). They may be gel-based or PCR-based; they may detect a single
locus or multiple loci; they may or may not require sequence information;
finally, they may vary in their degree of reliability, difficulty and expense,
and the nature of the polymorphism they detect. This tutorial is not intended
to provide a complete background on marker technology, and refinements
to these techniques continue to occur. The point is that the information
surrounding these markers will vary quite a bit depending on the marker
type. A user needs to be somewhat familiar with a particular technology
to assess whether a database is providing adequate and useful information.
The following section will introduce a new searching technique, which
we'll then use to look at examples of six different marker types from several
different databases.
Query by Example
Query by example (QBE for short) is a fill-in-the-blank
approach to searching. You can access QBE whether you entered a database
in Browse mode or Query mode. QBE is appropriate when you can provide
specific details about what you want to find. QBE searches are case-insensitive
and are limited to a single class of information; in other words, a search
returns only one type of record, never a mixture. You simply fill in
the fields with the restrictions you want applied to your retrieval. The
items you enter are ANDed together, meaning any hit must satisfy ALL your
search criteria. You may use a wildcard (the asterisk *) at any point in
a search criterion. You do not need to put quotes around textual values
unless the data you are searching for itself contains quotes, which it
probably does not.
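The matching rules just described can be mimicked in a few lines of Python. This is only a sketch of the behaviour, not how the QBE form is actually implemented; the records, field names and the query_by_example function are invented for illustration. It uses the standard fnmatch module for the asterisk wildcard (note that fnmatch also treats ? and square brackets as special characters, which QBE may not).

```python
from fnmatch import fnmatchcase

# Hypothetical records of a single class; field names and values are invented.
RECORDS = [
    {"Name": "A06", "Type": "RFLP", "Linkage_group": "A1"},
    {"Name": "A10", "Type": "RAPD", "Linkage_group": "B2"},
    {"Name": "B12", "Type": "RFLP", "Linkage_group": "A2"},
]


def query_by_example(records, **criteria):
    """Return records whose fields match ALL criteria (AND), ignoring case.

    An asterisk in a criterion matches any run of characters.
    """
    hits = []
    for record in records:
        if all(
            fnmatchcase(str(record.get(field, "")).lower(), pattern.lower())
            for field, pattern in criteria.items()
        ):
            hits.append(record)
    return hits


# Both criteria must hold: name starts with "a" AND type is RFLP.
for hit in query_by_example(RECORDS, Name="a*", Type="rflp"):
    print(hit["Name"])
```

Leaving a field out of the call corresponds to leaving that blank empty on the form: it places no restriction on the results.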
Example 4.2 - The following example continues
the explanation of QBE and lets you try it out for yourself.
SoyBase
If you have a SoyBase Database comment or question contact David
Grant, Curator.
RAPD markers
RAPD stands for Random Amplified Polymorphic DNA.
This is a marker technique that requires no cloning or sequencing, is PCR-based
and may detect several loci simultaneously. Short (about 10 base pair)
random primer sequences are used to amplify DNA, which generally results
in a presence/absence polymorphism. This technique is easy, inexpensive
and fast, and because a single RAPD primer may detect more than one locus,
it is useful for phylogenetic studies. The short primers, however, may
be easily affected by annealing conditions, and results may not always be
reproducible. This marker technology is not appropriate for
use in comparative mapping.
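A presence/absence polymorphism is usually recorded as a 1 (band present) or 0 (band absent) for each band a primer produces. The sketch below compares two samples scored this way and lists the bands that differ. The band sizes and scores are invented, and polymorphic_bands is a hypothetical helper, not part of any database described here.

```python
# Hypothetical band scores for one RAPD primer: 1 = band present, 0 = absent.
# Band sizes (in base pairs) and scores are invented for illustration.
PARENT_A = {450: 1, 700: 1, 1200: 0, 1500: 1}
PARENT_B = {450: 1, 700: 0, 1200: 1, 1500: 1}


def polymorphic_bands(scores_a, scores_b):
    """Return sorted band sizes present in one sample but absent in the other."""
    all_bands = set(scores_a) | set(scores_b)
    return sorted(
        size for size in all_bands
        if scores_a.get(size, 0) != scores_b.get(size, 0)
    )


print(polymorphic_bands(PARENT_A, PARENT_B))
```

Bands scored the same in both samples (450 and 1500 here) carry no information for telling the two apart; only the differing bands are useful as markers.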
Example 4.3 - The database below has examples that
illustrate where to find and how to interpret information about RAPD markers.
GrainGenes
If you have a GrainGenes Database comment or question contact Gerry
Lazo, Curator.
AFLP markers
AFLP stands for Amplified Fragment Length Polymorphism. This technique
also requires no cloning or sequencing and is PCR-based. It operates on
much the same principle as RAPD, but the primer consists of a longer
(~15 bp) fixed portion and a short (2-4 bp) random portion. The long fixed
portion gives the primer stability and the short random portion means it
will amplify many loci - one may get over 100 loci amplified with a single
AFLP primer. Polymorphism is detected as the presence/absence of a band.
Because it detects so many loci, it is very useful for fingerprinting.
It does, however, require additional laboratory steps and can be a difficult
technique to master. This technology is not appropriate for comparative
mapping work.
Example 4.4 - The database below has examples that
illustrate where to find and how to interpret information about AFLP markers.
SolGenes
The SolGenes Database Project is currently inactive.
RFLP markers
RFLP stands for Restriction Fragment Length Polymorphism.
This technique requires that you create a library of DNA fragments cloned
into some vector. These probe libraries may be based on genomic DNA or cDNA.
It does not require sequencing, and is gel-based, not PCR-based. The DNA
of the organism you are interested in is digested with restriction enzymes
and probed with the library of cloned DNA. Matches to the probe DNA are
visualized using radioactivity. The technique is slow and requires a significant
investment of time. This method detects a single locus, and the fact that
it will consistently detect the same locus makes this technology well-suited
for mapping, comparative mapping and QTL studies. Polymorphism is generally
detected as a difference in the molecular weight (and thus the migration
through a gel) of the fragments of host DNA.
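To see why a change in the DNA shows up as a change in fragment size, the sketch below cuts two invented sequences at every occurrence of a restriction site and reports the fragment lengths. The default site is the EcoRI recognition sequence GAATTC, which is cut after the first G. The sequences and the digest function are made up for illustration, and the sketch leaves out the probe hybridization step, so it reports every fragment rather than only those a probe would detect.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    lengths = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    lengths.append(len(sequence) - start)
    return lengths


# Two invented alleles: allele_2 has lost the second site to a single base change.
allele_1 = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 14
allele_2 = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 14

print(digest(allele_1))
print(digest(allele_2))
```

Losing one site merges two fragments into a single longer one, which migrates more slowly through the gel; that shift is the polymorphism being scored.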
Example 4.5 - The database below has examples that
illustrate where to find and how to interpret information about RFLP markers.
RiceGenes
If you have a RiceGenes Database comment or question contact Angela
Baldo, Curator.
STS markers
STS stands for Sequence Tagged Site. This is a PCR-based
approach that detects a single, unique, sequence-defined point in
the genome. This does not require cloning, but does require sequence information.
Primers of 18-20 base pairs are designed to amplify some short, unique
fragment of DNA whose sequence is known (for example, this might be the
sequence of a RAPD PCR product, or an RFLP clone). Polymorphism is generally
detected as a size difference in the amplified product; if there is no
size difference, restriction enzymes may be used to cut the products and
identify polymorphism. Since the primers are longer than RAPD primers and
based on a specific sequence, this method reliably detects the same locus
and is good for mapping studies. The design and creation of good primers
may involve a significant investment.
Example 4.6 - The database below has examples that
illustrate where to find and how to interpret information about STS markers.
MilletGenes
If you have a MilletGenes Database comment or question contact Matthew
Couchman, Curator.
EST markers
EST stands for Expressed Sequence Tag. As part of many sequencing projects,
partial (end) sequences of cDNA clones are generated. These partial sequences
can then be used to design 18-20 base pair primers which provide a unique
sequence tag for the gene. Again, this is a PCR-based approach which does
not require cloning, but does require sequence information. It detects
a unique, expressed region of the genome, and is good for mapping. Polymorphism
is generally detected as a size difference of the amplified product. The
design and creation of good primers may involve a significant investment.
Example 4.7 - The database below has examples that
illustrate where to find and how to interpret information about EST markers.
MaizeDB
If you have a MaizeDB Database comment or question contact Mary
Polacco, Curator.
Microsatellite markers
Microsatellite markers detect hypervariable regions
of the genome. A microsatellite (also called simple sequence repeat (SSR),
short tandem repeat (STR) or variable number tandem repeat (VNTR)) is a
short (2-5 base) motif that is repeated multiple times and is flanked by
unique DNA. The microsatellite motif is used as a probe against genomic
or cDNA libraries to identify clones containing the motif. These clones
are then end sequenced, and primers are designed to amplify the unique
DNA flanking the microsatellite motif. This method is repeatable, identifies
a single locus, and targets hypervariable regions of the genome. Polymorphism
is generally detected as a length difference in the amplified product.
This length difference may be very small, for example 2 base pairs. The method
does require a significant investment of time and resources, and is appropriate
for mapping, QTL studies and fingerprinting.
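The arithmetic behind such a small length difference is simple: the amplified product is the two unique flanking regions plus the motif repeated some number of times, so alleles that differ by one repeat of a 2-base motif differ by 2 base pairs. The sketch below works this through. The flanking sequences, motif and repeat counts are invented for illustration, and product_length is a hypothetical helper.

```python
def product_length(left_flank, motif, repeats, right_flank):
    """Length in bp of a PCR product spanning flank + repeated motif + flank."""
    return len(left_flank) + len(motif) * repeats + len(right_flank)


# Invented flanking sequences and repeat counts for two alleles of one locus.
LEFT = "ACGTTGCAAGCTTGACCTGA"   # 20 bp of unique DNA
RIGHT = "TTGACCGGTAACGTCAGGTC"  # 20 bp of unique DNA

allele_a = product_length(LEFT, "GA", 15, RIGHT)  # 15 copies of the motif
allele_b = product_length(LEFT, "GA", 16, RIGHT)  # 16 copies of the motif

print(allele_a, allele_b, allele_b - allele_a)
```

Because the flanking DNA is the same in both alleles, the whole difference comes from the repeat count, which is why the two products must be separated on a system able to resolve a 2 base pair difference.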
Example 4.8 - The database below has examples that
illustrate where to find and how to interpret information about microsatellite
markers.
GrainGenes
If you have a GrainGenes Database comment or question contact Gerry
Lazo, Curator.