from Dave Matthews and Ken Kephart
 last update 95.06.14
 
 
                 How to Create Germplasm Records for GrainGenes
 
 The Germplasm class includes cultivars, breeding lines, genetic stocks, wild
 accessions, etc. of wheat, barley, oats, rice, maize -- any plant that's in
 the database.  Pathogens will be put in a different class such as 'Isolate'.
 
 
 FIELDS AVAILABLE
 -----------------------------------------------------------------------------
 Field label              Sample Entry               Description of field
 -----------------------------------------------------------------------------
 
 Germplasm                "Advance CIav3845"         Primary name for the
                                                     accession. See NAMING
                                                     GERMPLASM ACCESSIONS
                                                     listed below.
 
 Species                  "Avena sativa"             Name of a Species object.
                                                     See discussion on SPECIES
                                                     AND SUBSPECIES listed
                                                     below.
 
 Subspecies               "Triticum turgidum         See discussion on
                          ssp. durum"                SPECIES AND SUBSPECIES
                                                     listed below.
 
 Donor_species            "Aegilops searsii"         For addition and
                                                     substitution lines,
                                                     defines the species of
                                                     the progenitor used as
                                                     chromosome donor.
 
 Type                     Cultivar                   Defines the stock type of
                                                     the germplasm record.
                                                     Acceptable values include
                                                     Cultivar, Substitution,
                                                     Amphiploid, Aneuploid,
                                                     Deletion, Alien_addition,
                                                     Mutation, Marker,
                                                     Alloplasmic_line,
                                                     Germplasm,
                                                     Elite_germplasm,
                                                     Synthetic, Isogenic.
 
 Other_name               "Rust Proof"               Lists additional common
                                                     names, experimental
                                                     numbers or other synonyms
                                                     the germplasm is known by
                                                     other than the primary
                                                     name.
 
 Collection_and_ID        WGRC TA3076                Field requires two values.
                                                     Name of a Collection
                                                     object (e.g. WGRC)
                                                     followed by the
                                                     identifier for the
                                                     germplasm (e.g. TA3076) in
                                                     that particular
                                                     collection.
 
 Cross_number             CD81653                    Accession identifier used
                                                     by CIMMYT.
 
 Chromosome_configuration 20''+1'1A                  See GENETIC STOCKS, below.
 
 Abbreviation             CS                         Official abbreviation of
                                                     the germplasm name.
 
 (Pairing_configuration)                             This old field is now
                                                     replaced by
                                                     Chromosome_configuration.
 
 Chromosome_number        42tt                       Usually just an integer.
 
 Female_parent            "Era CItr13986"            Name of a Germplasm object
                                                     used as the cytoplasm
                                                     source in developing a
                                                     germplasm.  Seldom used;
                                                     the Pedigree field is
                                                     preferred.
 
 Male_parent              "Justin CItr13462"         Name of a Germplasm object
                                                     used as the pollen source
                                                     in developing a germplasm.
                                                     Seldom used; the Pedigree
                                                     field is preferred.
 
 Pedigree                 "Era / Justin"             Identifies the parents and
                                                     crossing sequences used to
                                                     produce the cultivar with a
                                                     modified version of the
                                                     Purdy system.  See
                                                     PEDIGREES discussion below.
 
 Selection_history        L0559-0L-2AP-0AP           Identifies the plants
                                                     selected at each generation
                                                     of selfing since the
                                                     original cross.
 
 Market_class             "USDA/FGIS Hard            Text description of the
                          Red Spring Wheat"          market class based on USDA
                                                     Federal Grain Inspection
                                                     Service standards.
 
 Trait_score              "Dwarf bunt, GRIN" 88      Field requires two values.
                                                     Trait and study in which
                                                     the germplasm was
                                                     evaluated, followed by
                                                     numeric score.
 
 Trait_description        "Awn-color, GRIN" 2=Blue   Alternative to Trait_score,
                                                     to be used when the
                                                     value is not strictly
                                                     numeric.
 
 (Characteristic)         "Awn-color: Blue"          This field is no longer
                                                     used, but exists in older
                                                     germplasm records.  Usage
                                                     replaced by Trait_score
                                                     and Trait_description
                                                     fields.
 
 (Pathology)              "Powdery Mildew"           Name of a Pathology object
                                                     for a pathology to which
                                                     this germplasm is
                                                     resistant.  USE OF THIS
                                                     FIELD IS CURRENTLY
                                                     DEPRECATED.
 
 Rearrangement            T13ak                      Name of a Rearrangement
                                                     object, for a translocation
                                                     or deletion present in this
                                                     stock.
 
 Derived_from             "Shasta CItr17651"         Name of a Germplasm
                                                     object, the progenitor of
                                                     a mutation, monosomic,
                                                     etc., or nuclear
                                                     background of an addition
                                                     or substitution.
 
 Chromosome_donor         "Imperial (rye)"           Name of the Germplasm
                                                     providing the new
                                                     chromosome for an addition
                                                     or substitution.
 
 Developed_by             "Arkansas AES"             Text citing person or
                                                     organization responsible
                                                     for development of the
                                                     germplasm.
 
 Development_site         USA-Wyoming                Country and optional
                                                     state/province.
 
 Collection_site          "Akzaburt; near mountain   Text describing where a 
                          Dash-Agl"                  wild accession was 
                                                     collected.
 
 Date_of_release          <1971                      Year, or estimate thereof.
 
 Registration_no          CV-755                     Crop Science cultivar (CV-)
                                                     or germplasm (GP-)
                                                     registration number.
 
 Remark                   "Likes it hot"             Free form text providing
                                                     any additional information.
                                                     
 Reference                CRS-31-491                 Name of a Reference object.
                                                     
 Image                    "Rye chromosomes"          Name of an Image object.
 
 Data_source              CIMMYT 93.09.21            Field requires two values.
                                                     Name of a Colleague
                                                     object, followed by the
                                                     date on which the
                                                     Colleague submitted the
                                                     data to GrainGenes, in
                                                     yy.mm(.dd) format.
 
 Polymorphism             "BCD385 EcoRI"             See POLYMORPHISMS below.
 
 Coefficient_of_parentage "Shasta CItr17651" 0.6     Field requires two values.
                                                     Name of a Germplasm
                                                     object, followed by a
                                                     numeric COP score from
                                                     0.0 to 1.0.
 
 ------------------------------------------------------------------------------
 
 Notes: 
    The words "Name of an object" used above mean "GrainGenes ID of a database
 record of the indicated class".  For example, whereas "Avena sativa L." is the
 name of a species in a loose sense, it is not the name of a Species object in
 the GrainGenes database and would be incorrect here, because the GrainGenes ID
 for this species is "Avena sativa".
    All text values containing embedded blanks must be bracketed by quotation
 marks (").
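
The quoting rule above can be sketched in a few lines of Python.  This is
only an illustration and not part of GrainGenes: the helper names ace_quote
and format_field are hypothetical.  Because this document does not say how a
quotation mark inside a value should be written, the sketch simply refuses
such values.

```python
def ace_quote(value):
    """Quote a value if it contains blanks (hypothetical helper)."""
    text = str(value)
    if '"' in text:
        # Escaping rules are not described in this document, so give up.
        raise ValueError("value contains a quotation mark: " + text)
    # Quotes are required around embedded blanks and optional otherwise.
    if text == "" or any(ch.isspace() for ch in text):
        return '"' + text + '"'
    return text


def format_field(label, *values):
    """Build one data line: the field label followed by its value(s)."""
    return " ".join([label] + [ace_quote(v) for v in values])


print(format_field("Species", "Avena sativa"))
print(format_field("Abbreviation", "CS"))
print(format_field("Trait_score", "Dwarf bunt, GRIN", 88))
```

Fields that require two values, such as Trait_score, are passed as separate
arguments so that each value is quoted on its own.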
 
 
 
 ------------------------------------------------------------------------------
 EXAMPLES                                     SYNTAX NOTES
 ------------------------------------------------------------------------------
 
 Germplasm : "Chinese Spring CItr14108"       Must include the " : ".
 Species "Triticum aestivum"
 Subspecies "Triticum aestivum ssp. aestivum"
 Abbreviation CS                              If the value of a field does not
 Development_site China                       contain any blank spaces, the
 Date_of_release 1932                         quotes (") are optional.
 Data_source CIMMYT 93.09
 Data_source "Kephart, Kenneth D." 94.02.09   If a field has multiple values,
                                              list them one below the other.
 
 Germplasm : "Chinese Spring-Thatcher Tetrasomic Substitution 6D TH (6D CS)"
 Species "Triticum aestivum"
 Type "Substitution"
 Type "Aneuploid"
 Abbreviation "CS-TH TS6D"
 Chromosome_number 44
 Chromosome_configuration "20''+1''''6D TH(6D CS)"
 Derived_from "Chinese Spring CItr14108"
 Chromosome_donor "Thatcher CItr10003"
 Development_site "USDA-ARS, Columbia, MO"
 Collection_and_ID "USDA-ARS, Columbia, MO"
 Data_source "Raupp, W. John" 94.11.15
 
 ------------------------------------------------------------------------------
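
For readers who want to check a record before submitting it, the syntax
shown in the examples above can be read back with a short Python sketch.
The function name parse_record and the returned structure are assumptions
made for illustration; the sketch handles only what the examples show (a
header line containing " : ", then one field per line, with quoted or
unquoted values) and expects the syntax notes in the right-hand column to
have been removed.

```python
import re

# A token is either a double-quoted string or a run of non-blank characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def split_tokens(line):
    """Split one line into values, honoring double quotes."""
    return [quoted or bare for quoted, bare in TOKEN.findall(line)]


def parse_record(text):
    """Return (class_name, record_name, fields) for one record.

    fields maps each field label to a list of value tuples, one tuple
    per line, so repeated fields keep all of their values.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = split_tokens(lines[0])
    if len(header) != 3 or header[1] != ":":
        raise ValueError('first line must look like: Germplasm : "name"')
    fields = {}
    for ln in lines[1:]:
        label, *values = split_tokens(ln)
        fields.setdefault(label, []).append(tuple(values))
    return header[0], header[2], fields


sample = '''
Germplasm : "Chinese Spring CItr14108"
Species "Triticum aestivum"
Abbreviation CS
Data_source CIMMYT 93.09
Data_source "Kephart, Kenneth D." 94.02.09
'''
print(parse_record(sample))
```

Keeping every value as a tuple, even for single-valued fields, means
two-value fields such as Data_source need no special handling.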
 
 
 NAMING GERMPLASM ACCESSIONS
 ---------------------------
 
 Each Germplasm in the database is known by a single primary name, which is one
 of its common names if it has any, followed by its accession number in some
 recognized germplasm collection.  If information must be added to GrainGenes
 about a Germplasm for which only a common name is known, the identifier should
 be the common name followed by the name of the crop.
 
                Examples: "Astro CIav9160", "Lamar (oat)"
 
 What is the "common" name?  Sometimes more than one exists.  GRIN, for
 example, has six categories of names: 1) Varietal, 2) Local, 3)
 Institution, 4) Common, 5) Donor and 6) Other.  In exporting GRIN data to
 GrainGenes, we have used the lowest-numbered category for which a name is
 given.  If additional names are given, they are placed in the Other_name
 field.
 
 There are several trivial aspects of Germplasm names -- punctuation, blanks,
 etc. -- that tend to be used inconsistently.  Here are some of the conventions
 that are followed in GrainGenes to prevent such naming inconsistencies.  When
 in doubt, send email to kephart.
 
 "CI" Numbers:  Use the complete "Cereal Investigations" prefix for the correct
 crop species and do not embed a blank between prefix and accession number.
 All wheat accessions identified by a "CI" number use the prefix "CItr" (e.g.
 "CItr1353", not "CI 1353" or "CItr 1353").  Other CI prefixes for small grains
 include "CIho" (barley), "CIav" (oat) and "CIse" (rye).
 
 "PI" Numbers: Do not embed a blank between the "PI" prefix and accession
 number (e.g. "PI38347", not "PI 38347").
 
 CIMMYT Cross ID - Selection ID numbers: "CID-SID: 541-6"
 
 Abbreviations of US states: Do follow with a blank.  "MO 11769", not
 "MO11769".
 
 "RL" and similar accession numbers: Do not embed a blank between prefix and
 accession number (e.g. "RL6002", not "RL 6002").
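
The blank-handling conventions above could be applied mechanically along the
following lines.  This is a minimal Python sketch under stated assumptions:
the function name normalize_accession is hypothetical, the list of state
abbreviations holds only the one example given here and would need to be
extended, and a bare "CI" prefix is left untouched because the correct crop
suffix cannot be guessed from the number alone.

```python
import re

# Prefixes that are written with no blank before the number.
NO_BLANK_PREFIXES = ("CItr", "CIho", "CIav", "CIse", "PI", "RL")
# State abbreviations are followed by a blank.  Illustrative list only.
STATE_PREFIXES = ("MO",)


def normalize_accession(text):
    """Normalize the blank between an accession prefix and its number."""
    text = text.strip()
    match = re.fullmatch(r"([A-Za-z]+)\s*(\d+)", text)
    if not match:
        return text  # not a simple prefix-plus-number identifier
    prefix, number = match.groups()
    if prefix in NO_BLANK_PREFIXES:
        return prefix + number
    if prefix in STATE_PREFIXES:
        return prefix + " " + number
    return text  # unknown prefix: leave for a human to decide


for raw in ("CItr 1353", "PI 38347", "RL 6002", "MO11769", "CI 1353"):
    print(raw, "->", normalize_accession(raw))
```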
 
 
 SPECIES AND SUBSPECIES
 ----------------------
 
 All primary Germplasm records should have a value in the Species field.  An
 optional Subspecies field is also available; it should contain the full
 species and subspecies name.  Examples:
 
                Germplasm : "Advance CIav3845"
                Species "Avena sativa"
 
                Germplasm : TA1
                Species "Triticum timopheevii"
                Subspecies "Triticum timopheevii ssp. araraticum"
 
 Note also that the authority is not included as part of either the Species
 or the Subspecies value.  This information is in the GrainGenes record for
 the Species itself.
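
A simple consistency check for these two fields might look like the Python
sketch below.  The function name check_species_fields is hypothetical, and
the test that the Subspecies value begins with the Species value is an
assumption drawn from the examples above rather than a stated GrainGenes
rule.  It does not detect a taxonomic authority such as "L." appended to a
name.

```python
def check_species_fields(species, subspecies=None):
    """Return a list of problems found in the Species/Subspecies values."""
    problems = []
    if not species:
        problems.append("Species is missing")
    elif subspecies is not None and not subspecies.startswith(species + " "):
        # The Subspecies value should repeat the full species name.
        problems.append(
            "Subspecies '%s' does not begin with Species '%s'"
            % (subspecies, species)
        )
    return problems


print(check_species_fields("Triticum timopheevii",
                           "Triticum timopheevii ssp. araraticum"))
print(check_species_fields("Triticum turgidum", "durum"))
```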
 
 
 GENETIC STOCKS
 --------------
 
 The conventions for naming wheat genetic stocks, for formulating their
 official abbreviations, and for certain fields like Chromosome_configuration
 are described in WJ Raupp, B Friebe and BS Gill, "Suggested guidelines for
 the nomenclature and abbreviation of the genetic stocks of wheat, Triticum
 aestivum L. em Thell., and its relatives",
 http://wheat.pw.usda.gov/ggpages/GeneticStockNaming.html.
 
 
 PEDIGREES
 ---------
 
 The pedigree identifies the parents and crossing sequences used to produce the
 cultivar.  The method used to illustrate pedigrees is a slightly modified
 version of the system proposed by Purdy et al. in 1968 (see Crop Sci.
 8:405-406).  Use of abbreviations has been minimized.  Crosses are
 symbolized by combinations of slash marks ("/") with female and male parents
 listed on the left and right sides, respectively.  Numbers indicate the order
 in which crosses were made:
 
                           /  = primary cross
 
                          /2/ = secondary cross
 
                          /3/ = tertiary cross
 
                          /X/ = Xth level cross, etc.
 
 Higher numbers indicate more recent crosses in the sequence.  The most recent
 or final cross used to create a cultivar is indicated by the highest number
 within the pedigree.  For example, the pedigree of "Scout" hard red winter
 wheat is:
 
      -------------------------------------------------------------------
      Example 1:  Comparison of Purdy pedigree nomenclature to a tree
                  diagram of the pedigree of Scout hard red winter wheat.
 
            Scout = Nebred /2/ Hope / Turkey Red /3/ Cheyenne / Ponca
 
                                       OR
 
                         Hope        Turkey Red
                           |______ / _____|
                                   |
                                   |
                Nebred            "?"
                   |_____ /2/ _____|
                           |              Cheyenne        Ponca
                           |                  |______ / ____|
                           |                          |
                           |                          |
                          "?"                        "?"
                           |___________ /3/ __________|
                                         |
                                         |
                                       Scout
      -------------------------------------------------------------------
 
 In narrative terms, an unidentified progeny of a primary cross between "Hope"
 hard red spring wheat and "Turkey Red" hard red winter wheat was selected and
 crossed to "Nebred" hard red winter wheat.  One of the progeny selected from
 the "Nebred/2/Hope/Turkey Red" sequence of crosses was crossed to another
 unidentified progeny derived by crossing "Cheyenne" and "Ponca" hard red
 winter wheats.  The cultivar Scout was selected from progeny resulting from
 the final or "/3/" cross.  Specific generations and selection techniques
 involved are not indicated, but may be obtained from the referenced
 literature.
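
The way cross numbers determine the tree can be sketched in Python: split
the pedigree at the highest-numbered cross, then repeat within each side.
This is an illustration only.  The function name parse_pedigree and the
tuple representation are assumptions, and the sketch handles plain "/" and
"/N/" separators only.  It does not interpret backcross notation or
parenthesized breeding lines.

```python
import re

# "/N/" is an Nth-level cross; a lone "/" is a primary (level 1) cross.
SEPARATOR = re.compile(r"/(\d+)/|/")


def parse_pedigree(text):
    """Return a parent name, or a (cross_level, [parents]) tuple."""
    matches = list(SEPARATOR.finditer(text))
    if not matches:
        return text.strip()
    levels = [int(m.group(1)) if m.group(1) else 1 for m in matches]
    top = max(levels)  # the most recent cross has the highest number
    pieces, start = [], 0
    for m, level in zip(matches, levels):
        if level == top:
            pieces.append(text[start:m.start()])
            start = m.end()
    pieces.append(text[start:])
    # Female parent comes first (left), male parent second (right).
    return (top, [parse_pedigree(piece) for piece in pieces])


scout = "Nebred /2/ Hope / Turkey Red /3/ Cheyenne / Ponca"
print(parse_pedigree(scout))
```

For the Scout pedigree this yields a level 3 cross whose female side is
"Nebred /2/ (Hope / Turkey Red)" and whose male side is "Cheyenne / Ponca",
matching the tree diagram in Example 1.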
 
 Single slash marks are also used where the parents are known, but the exact
 sequence of a series of crosses is unknown.  Backcrossing sequences are
 indicated by use of an asterisk ("*") preceded or followed by a number to
 indicate the total number of crosses made with the recurrent parent (see
 Examples 2 and 3).  Left and right parentheses are used to bracket both the
 pedigree and designation of breeding lines contained within a cultivar's
 pedigree (see Example 4).  Commas are used to separate breeding line pedigrees
 from designations within the parentheses.
 
 
      -------------------------------------------------------------------
      Example 2:  Pedigree with three backcrosses of female recurrent parent
                  for TAM 107 hard red winter wheat.
 
                            TAM 107 = TAM 105*4 / Amigo
 
                                         OR
 
                                                  TAM 105      Amigo
                                                     |____ / ____|
                                                           |
                                         TAM 105           |
         1st backcross>                     |____ *2 / ____|
                                                    |
                                  TAM 105           |
         2nd backcross>              |____ *3 / ____|
                                             |
                         TAM 105             |
         3rd backcross>     |_____ *4 / _____|
                                    |
                                    |
                                 TAM 107
 
 
      Example 3:  Pedigree with three backcrosses of male recurrent parent
                  for Blueboy II soft red winter wheat.
 
                      Blueboy II = Agent / Tascosa /2/ 4*Blueboy
 
                                          OR
 
               Agent      Tascosa
                 |____ / ____|
                       |
                       |          Blueboy
                       |____ /2/ ____|
                              |
                              |            Blueboy
          1st backcross>      |____ /2/ 2* ____|
                                       |
                                       |            Blueboy
          2nd backcross>               |____ /2/ 3* ____|
                                                |
                                                |            Blueboy
          3rd backcross>                        |____ /2/ 4* ____|
                                                         |
                                                         |
                                                     Blueboy II
 
 
      Example 4:  Use of parentheses to delineate breeding line used in the
                  pedigree of Pitic 62 hard red spring wheat.
 
            Pitic 62 = Yaktana 54 /2/ (Sel. 26-1c, Norin 10 / Brevor)
                                       ^
                                       Indicates Sel. 26-1c as the male
                                       parent of the highest order cross for
                                       Pitic 62, with its own pedigree of
                                       "Norin 10 / Brevor".
                                      OR
 
                                   Norin 10          Brevor
                                      |________ / ______|
                                                |
                                                |
                   Yaktana 54              Sel. 26-1c
                       |__________ /2/ _________|
                                    |
                                    |
                                 Pitic 62
 
      -------------------------------------------------------------------
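
The backcross and parenthesis notations in Examples 2 to 4 apply to a single
parent within a cross, so they can be interpreted one parent at a time.  The
Python sketch below is illustrative only: the function name parse_parent and
the dictionary keys are assumptions, and the number of backcrosses is taken
as the total number of crosses minus one, as in Examples 2 and 3.

```python
import re


def _entry(name, pedigree=None, crosses=1):
    """Package the result; 'crosses' counts all crosses with this parent."""
    return {"name": name, "pedigree": pedigree,
            "crosses": crosses, "backcrosses": crosses - 1}


def parse_parent(token):
    """Interpret one parent from a pedigree string."""
    token = token.strip()
    # "(designation, pedigree)": a breeding line with its own pedigree.
    if token.startswith("(") and token.endswith(")"):
        designation, _, pedigree = token[1:-1].partition(",")
        return _entry(designation.strip(), pedigree.strip() or None)
    # "Name*N": recurrent parent written before the asterisk.
    match = re.fullmatch(r"(.+?)\s*\*\s*(\d+)", token)
    if match:
        return _entry(match.group(1), crosses=int(match.group(2)))
    # "N*Name": recurrent parent written after the asterisk.
    match = re.fullmatch(r"(\d+)\s*\*\s*(.+)", token)
    if match:
        return _entry(match.group(2), crosses=int(match.group(1)))
    return _entry(token)


for token in ("TAM 105*4", "4*Blueboy", "(Sel. 26-1c, Norin 10 / Brevor)"):
    print(parse_parent(token))
```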
 
 Narratives providing more detailed information are used where necessary for
 clarification.  Pedigrees of cultivars screened from another cultivar are
 listed as "pure line selections".  Pedigrees of cultivars phenotypically
 selected from mixtures or out-crosses in commercial fields are listed as
 "farmer selections" with the original source material identified wherever
 possible.
 
 
 POLYMORPHISMS
 -------------
 
 The Polymorphism field can accept just the Name of a Polymorphism record (e.g.
 "BCD385 EcoRI").  Alternatively this name can be followed by additional
 information about what fragments of this polymorphism are present or absent in
 this Germplasm.  This is done by adding the word "Present" or "Absent",
 followed by a list of fragments/molecules.  Example:
 
                Germplasm : "Advance CIav3845"
                Polymorphism "BCD385 EcoRI"  Present  14.9   13.4
                Polymorphism "BCD385 EcoRI"  Absent   17.2   12.4   4.6
                Polymorphism "BCD719 EcoRI"  Present   6.6    4.4
                Polymorphism "BCD719 EcoRI"  Absent   14.5
 
 The fragments may be designated by their sizes in kb, as here, but any text
 string is allowed.  Thus isozyme or protein polymorphisms could also be
 described in this field.
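
As a sketch of how such lines might be read by a program, the Python below
groups the fragments listed for each Polymorphism under "Present" and
"Absent".  The function name collect_polymorphisms and the returned
dictionary layout are assumptions for illustration.  Fragments are kept as
text, since any text string is allowed.

```python
import re

# A token is either a double-quoted string or a run of non-blank characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def collect_polymorphisms(lines):
    """Map each Polymorphism name to its Present and Absent fragments."""
    result = {}
    for line in lines:
        tokens = [quoted or bare for quoted, bare in TOKEN.findall(line)]
        if len(tokens) < 2 or tokens[0] != "Polymorphism":
            continue  # some other field, or a blank line
        entry = result.setdefault(tokens[1], {"Present": [], "Absent": []})
        # The name alone is valid; fragment details are optional.
        if len(tokens) > 2 and tokens[2] in entry:
            entry[tokens[2]].extend(tokens[3:])
    return result


record = [
    'Germplasm : "Advance CIav3845"',
    'Polymorphism "BCD385 EcoRI"  Present  14.9   13.4',
    'Polymorphism "BCD385 EcoRI"  Absent   17.2   12.4   4.6',
    'Polymorphism "BCD719 EcoRI"  Present   6.6    4.4',
    'Polymorphism "BCD719 EcoRI"  Absent   14.5',
]
print(collect_polymorphisms(record))
```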