Lesson 6 - Intervals on the Map - Rearrangements and QTLs

Tools and Techniques

In this lesson we will continue looking at interval data that may appear on, or be associated with, a genetic map. In the previous lesson we looked at cytological regions (which may actually be their own type of map) and introgressed regions. In this lesson we'll look at rearranged regions (such as inversions and translocations) and Quantitative Trait Loci (QTLs). We'll also cover another new query method, fuzzy searching.

 

Fuzzy searching

A fuzzy search is very similar to a WAIS search, but it allows for some uncertainty ("fuzziness"). It uses the UNIX agrep utility, which makes it possible to identify objects that do not exactly match your criteria. It is useful when you are unsure of the spelling or formatting of the information (is it adh-1 or adh1, for example), or when you want to get all variations of a word. Just like a WAIS search, a fuzzy search is performed against an entire database or set of databases, not just a single class, so it is also appropriate when you're not sure which class might contain the information. If your query is successful you will see a list of objects that you can click in the usual way to obtain additional information. If you have included more than one database in your search, the list may contain objects from different databases. Remember, there is no standard structure shared by all the databases, so results will not be uniform across databases. Finally, unlike a WAIS search, a fuzzy search returns all records meeting your criteria, not just the top 40. No score is associated with each hit; instead, hits are returned in alphabetical order.

A big difference between a WAIS search and a fuzzy search is the wildcard character: in a WAIS search the * indicates a wildcard, but in a fuzzy search you must use the #. (The asterisk has another meaning in fuzzy searching and you will probably be disappointed if you try it.) Also, in a fuzzy search the wildcard character can be used anywhere in the search string, not just as a suffix. Another difference is that the Boolean operators AND and OR are indicated differently. In a fuzzy search you must use the semicolon (;) to indicate an AND (e.g. pesticide;fertilizer finds records that contain both these words), and the comma (,) to indicate an OR (e.g. pesticide, fertilizer finds records that contain one or both words).
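To make the wildcard and Boolean rules concrete, here is a minimal Python sketch that imitates them on plain text records. It is not the agrep implementation used by the databases; the function names are hypothetical, the # is assumed to stand for any run of characters (including none), and the sketch handles only one operator type per query.

```python
import re

def term_to_regex(term):
    """Turn one search term into a case-insensitive, whole-word pattern."""
    # Assumption: '#' stands for any run of characters, including none.
    pieces = [re.escape(piece) for piece in term.strip().split("#")]
    return re.compile(r"\b" + ".*".join(pieces) + r"\b", re.IGNORECASE)

def record_matches(query, record):
    """';' means AND, ',' means OR (one operator type per query in this sketch)."""
    if ";" in query and "," in query:
        raise ValueError("this sketch does not handle mixed AND/OR queries")
    if ";" in query:
        return all(term_to_regex(t).search(record) for t in query.split(";"))
    return any(term_to_regex(t).search(record) for t in query.split(","))

# Hypothetical records, for illustration only.
print(record_matches("pesticide;fertilizer", "Pesticide and fertilizer trial"))
print(record_matches("pesticide;fertilizer", "Fertilizer trial"))
print(record_matches("pesticide, fertilizer", "Fertilizer trial"))
print(record_matches("adh#1", "Locus adh-1"), record_matches("adh#1", "Locus adh1"))
```

The last line shows why the wildcard helps with the adh-1 versus adh1 question: a single query covers both spellings.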

There are several options you can set that affect the fuzzy search, and these appear at the bottom of the fuzzy search form window. By default, the search is case-insensitive, but you may force it to be case-sensitive. The search also assumes that the terms you enter are separate words, surrounded by spaces or tabs. If you want to find anything that has your search term either as a stand-alone term or embedded within another word, you can unselect the "Treat string as a word" option. For example, if your search term was the word correct, and you had unselected the "Treat string as a word" option, you would get records containing correct, correctly, corrected, incorrect, and so on.
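The effect of these two options can be imitated with a short Python sketch. The function name and its parameters are hypothetical, and the regular-expression word boundary (\b) is only an approximation of "surrounded by spaces or tabs".

```python
import re

def find_hits(term, records, as_word=True, case_sensitive=False):
    """Return the records that contain term under the chosen options."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(term)
    if as_word:
        # Approximates the "Treat string as a word" option.
        pattern = r"\b" + pattern + r"\b"
    matcher = re.compile(pattern, flags)
    return [record for record in records if matcher.search(record)]

records = ["correct", "correctly", "corrected", "incorrect", "Correct answer"]

print(find_hits("correct", records))                       # whole word, any case
print(find_hits("correct", records, as_word=False))        # embedded matches too
print(find_hits("correct", records, case_sensitive=True))  # exact case only
```

With the defaults only the stand-alone word is found; turning off the word option also brings in correctly, corrected and incorrect.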

Finally, fuzzy searching supports some degree of mismatching. A mismatch may be a substitution of one character for another, or an insertion or deletion of a character. For example, massechusets matches massachusetts with two errors (one substitution and one insertion). Even with mismatching allowed, it is possible to insist that certain parts of the pattern match exactly. Any pattern inside angle brackets <> must match the text exactly, even when mismatches are allowed in the rest of the pattern. For example, <mathemat>ics matches mathematical with two errors (replacing the last s with an a and adding the l), but mathe<matics> does not match mathematical no matter how many errors we allow, because the "matics" ending is required.
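The error counts in these examples can be checked with a standard edit-distance calculation. The Python sketch below is an illustration only, not how agrep works internally; the function names are hypothetical, it compares whole strings, and it supports at most one angle-bracketed part per pattern.

```python
def edit_distance(a, b):
    """Smallest number of substitutions, insertions and deletions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def fuzzy_match(pattern, text, max_errors):
    """Whole-string match; a part written as <...> must appear unchanged."""
    if "<" not in pattern:
        return edit_distance(pattern, text) <= max_errors
    prefix, rest = pattern.split("<", 1)
    exact, suffix = rest.split(">", 1)
    start = text.find(exact)
    while start != -1:
        errors = (edit_distance(prefix, text[:start])
                  + edit_distance(suffix, text[start + len(exact):]))
        if errors <= max_errors:
            return True
        start = text.find(exact, start + 1)
    return False

print(edit_distance("massechusets", "massachusetts"))
print(fuzzy_match("<mathemat>ics", "mathematical", 2))
print(fuzzy_match("mathe<matics>", "mathematical", 10))
```

The last call fails however many errors are allowed, because the text never contains the protected string "matics".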

The ability to use wildcards (remember, it's the #) anywhere in the search string and the fact that all the records meeting the criteria are returned are the main advantages of the fuzzy search over a WAIS search. Fuzzy searching may take longer, however, particularly if you are searching many databases and are allowing mismatches.

Example 6.1 - The following example continues the explanation of fuzzy searching and lets you try it out for yourself.

BeanGenes
If you have a BeanGenes Database comment or question contact Phil McClean, Curator.

 

Rearranged regions

Chromosomes may mutate, causing changes in their structure. These possible changes include the deletion or duplication of a segment of DNA; an inversion, in which a segment breaks off, rotates 180 degrees and reattaches itself; and a translocation, in which segments from two different chromosomes are swapped (so chromosome 1 has a bit of chromosome 2 and chromosome 2 has a bit of chromosome 1, for example). These sorts of chromosomal rearrangements can have significant effects on the viability, fertility and phenotype of the organism, so it is useful to know where they might be located and which germplasm lines carry them.
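One way to picture these rearrangements is to treat a chromosome as an ordered list of markers and see how each change alters the order. The Python sketch below does this with made-up marker names; the function names are hypothetical, the duplication is assumed to be tandem, and the translocation is modelled as a simple swap of the segments beyond one breakpoint on each chromosome.

```python
def delete(chrom, start, end):
    """Remove the segment chrom[start:end]."""
    return chrom[:start] + chrom[end:]

def duplicate(chrom, start, end):
    """Repeat the segment chrom[start:end] right after itself (tandem copy)."""
    return chrom[:end] + chrom[start:end] + chrom[end:]

def invert(chrom, start, end):
    """Reverse the order of the segment chrom[start:end]."""
    return chrom[:start] + chrom[start:end][::-1] + chrom[end:]

def translocate(chrom_a, chrom_b, break_a, break_b):
    """Swap the segments lying beyond each breakpoint."""
    return (chrom_a[:break_a] + chrom_b[break_b:],
            chrom_b[:break_b] + chrom_a[break_a:])

# Hypothetical marker orders, for illustration only.
chr1 = ["a1", "a2", "a3", "a4", "a5"]
chr2 = ["b1", "b2", "b3", "b4"]

print(delete(chr1, 1, 3))
print(duplicate(chr1, 1, 3))
print(invert(chr1, 1, 4))
print(translocate(chr1, chr2, 3, 2))
```

Comparing marker order between a standard map and a rearranged line in this way is often how an inversion or translocation shows up in map data.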

Transposable (movable) elements may cause rearrangements. The end result of a DNA transposition is the insertion of a DNA sequence between two base pairs in a recipient DNA molecule, generating a short duplication (3 to 12 bp) of the target site sequence at the ends of the transposed element. Transposition events generate insertion mutations which disrupt the integrity of the target DNA. Excision of the element may be perfect or imperfect (meaning when it moves again it may leave behind a "footprint" of where it has been). Transposable elements can even carry transcription initiation and/or termination signals, so they may affect gene expression.
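The target site duplication and the idea of a footprint can be sketched on short strings. In the Python example below the sequences and function names are invented for illustration, the element is written in lowercase so it stands out, and the footprint is simplified to the leftover copy of the duplicated target site.

```python
def insert_element(target, position, element, site_length):
    """Insert element so the site_length bases at position flank it on both sides."""
    split = position + site_length
    site = target[position:split]
    return target[:split] + element + site + target[split:]

def excise(mutant, position, element, site_length, perfect=True):
    """Remove the element; an imperfect excision leaves the extra site copy behind."""
    start = position + site_length
    if mutant[start:start + len(element)] != element:
        raise ValueError("element not found at the expected position")
    without_element = mutant[:start] + mutant[start + len(element):]
    if perfect:
        # Also remove the duplicated copy of the target site.
        return without_element[:start] + without_element[start + site_length:]
    return without_element

original = "AAACCGTTT"  # hypothetical target sequence
mutant = insert_element(original, 3, "tagg", 3)

print(mutant)
print(excise(mutant, 3, "tagg", 3, perfect=True))
print(excise(mutant, 3, "tagg", 3, perfect=False))
```

In the printed mutant the target site CCG appears on both sides of the element; after an imperfect excision the second CCG remains as the footprint.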

Example 6.2 - The databases below have examples that illustrate where to find and how to interpret information about chromosome rearrangements. These sorts of chromosomal changes may be documented with text, graphically, or both. The first example below includes text and graphics, the second is entirely textual. Investigate as many as you like!

GrainGenes
If you have a GrainGenes Database comment or question contact Gerry Lazo, Curator.

MaizeDB
If you have a MaizeDB Database comment or question contact Mary Polacco, Curator.

 

Quantitative Trait Loci (QTLs)

Most traits are not controlled by a single gene, but rather by a set of genes acting in concert. Traits that are controlled by multiple genes, such as yield, tend to show more subtle, gradual variation in the resulting phenotype, leading to a continuous distribution of phenotypic values. These are known as "quantitative" traits. Traits controlled by one or a very few genes, such as certain disease resistances, tend to have more drastic, discrete changes in expression, and these are termed "qualitative" traits.

Because the change in phenotype is more drastic for single-gene traits, many of these genes have been mapped to relatively precise locations by observing the segregation of the trait in a mapping population. This approach cannot be used for quantitative traits, however, because it is too difficult to distinguish changes in phenotype, so special QTL-mapping studies must be done, and a different set of mapping software used. The result of such a mapping study is a set of intervals, not points, scattered over the genome, in which a gene or genes affecting the trait is thought to lie. Depending on how the study was done, these intervals may be centered over a single marker, or lie between two markers. The study also yields a list of markers whose particular alleles tend to be associated with certain changes in the traits being evaluated.
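A QTL can therefore be thought of as a named interval on the map rather than a single position. The Python sketch below shows one possible way to represent the two cases described above; the marker names, positions, QTL names and the 5 cM half-width are all invented for illustration and do not come from any of the databases.

```python
from dataclasses import dataclass

@dataclass
class QTL:
    name: str
    start_cm: float
    end_cm: float

    def contains(self, position_cm):
        return self.start_cm <= position_cm <= self.end_cm

# Hypothetical marker positions on one linkage group, in centimorgans.
markers = {"mkA": 10.0, "mkB": 25.0, "mkC": 40.0}

def qtl_between(name, left_marker, right_marker):
    """Interval lying between two flanking markers."""
    ends = sorted([markers[left_marker], markers[right_marker]])
    return QTL(name, ends[0], ends[1])

def qtl_centered(name, marker, half_width_cm):
    """Interval centered over a single marker."""
    center = markers[marker]
    return QTL(name, center - half_width_cm, center + half_width_cm)

def qtls_at(position_cm, qtls):
    """Names of the QTL intervals that cover a map position."""
    return [q.name for q in qtls if q.contains(position_cm)]

qtls = [qtl_between("yield-1", "mkA", "mkB"), qtl_centered("yield-2", "mkC", 5.0)]
print(qtls_at(12.5, qtls))
print(qtls_at(38.0, qtls))
print(qtls_at(30.0, qtls))
```

A position query like this is essentially what you do by eye when a database draws QTLs as bars alongside the genetic map.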

Information on complex traits such as yield may be represented in many different ways in the various databases. You may see QTLs displayed graphically as intervals on the genetic map, or they may be described in text records. Sometimes complex traits are documented by field trial results, but no QTL mapping study has been done, so no map locations are associated with yield components. The examples in this section try to cover all these possible data formats. In addition, they revisit all the different query methods you've learned so far: Query by Example, Query Builder, WAIS searches and fuzzy searches.

Example 6.3 - The databases below have examples that illustrate where to find and how to interpret information about quantitative traits. Investigate as many as you like!

RiceGenes
If you have a RiceGenes Database comment or question contact Marci Blinstrub, Curator.

SoyBase
If you have a SoyBase Database comment or question contact David Grant, Curator.

CottonDB
If you have a CottonDB Database comment or question contact Savita Kini, Curator.

GrainGenes
If you have a GrainGenes Database comment or question contact Gerry Lazo, Curator.
