Sampling Strategies





Germplasm collections may consist of thousands of accessions, so an association analysis can usually use only part of a collection.
Preferably, the accessions selected to represent the germplasm collection should capture all the diversity present in the entire collection. The set of selected accessions is often referred to as a core collection.

The simplest approach is to take a random sample from the germplasm collection. A more sophisticated approach is to take random samples from predetermined groups of accessions within the collection. This strategy is called stratified random sampling.
Stratified random sampling raises two problems. First, the groups of accessions have to be defined. Preferably, accessions in the same group should be more alike than accessions in different groups. Second, the size of the sample to be taken from each group has to be determined.
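
As an illustration of stratified random sampling, the following minimal Python sketch assumes that each accession has already been assigned to a group and that sample sizes are allocated in proportion to group size. Proportional allocation is only one possible choice and is an assumption made here. The function name stratified_sample and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(groups, total_size, seed=0):
    """Draw a stratified random sample of accessions.

    groups     -- dict mapping accession id -> group label (assumed given)
    total_size -- target number of accessions in the sample
    seed       -- seed for the random number generator (for repeatability)
    """
    rng = random.Random(seed)

    # Collect the accessions belonging to each group.
    by_group = defaultdict(list)
    for accession, label in groups.items():
        by_group[label].append(accession)

    n_total = len(groups)
    sample = []
    for label in sorted(by_group):
        members = sorted(by_group[label])
        # Proportional allocation (an assumption): the share of the sample
        # equals the group's share of the collection, with at least one
        # accession per group and never more than the group contains.
        k = max(1, round(total_size * len(members) / n_total))
        k = min(k, len(members))
        # Simple random sample without replacement within the group.
        sample.extend(rng.sample(members, k))
    return sample
```

Because each group's allocation is rounded separately, the size of the returned sample can differ slightly from total_size.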

In order to define groups of accessions we need information about the resemblance of accessions. Measures of similarity between accessions may be based on different types of information, e.g. passport data, phenotypic data and genotypic data. Marker data are becoming increasingly popular for calculating similarities between accessions.
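
To make the idea of a marker-based resemblance measure concrete, the sketch below computes a simple mismatch distance between two accessions: the proportion of markers, among those scored in both, at which the alleles differ. This particular measure, the use of None for missing scores, and the names marker_distance and distance_matrix are assumptions chosen for illustration. Other coefficients may suit a given marker type better.

```python
def marker_distance(a, b):
    """Proportion of markers at which two accessions differ.

    a, b -- equal-length sequences of marker scores; None marks a missing
            score, and such positions are skipped (an assumption).
    """
    compared = [(x, y) for x, y in zip(a, b)
                if x is not None and y is not None]
    if not compared:
        raise ValueError("no markers scored in both accessions")
    mismatches = sum(1 for x, y in compared if x != y)
    return mismatches / len(compared)


def distance_matrix(profiles):
    """Pairwise distances for a dict mapping accession id -> marker scores.

    Returns a dict keyed by (id_1, id_2) for every ordered pair of ids.
    """
    ids = sorted(profiles)
    return {(i, j): marker_distance(profiles[i], profiles[j])
            for i in ids for j in ids}
```

A distance matrix of this kind could then be passed to a clustering method to define the groups used in stratified sampling.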

Some methods are not based on random sampling but on choosing the set of accessions with maximum diversity. They use some form of optimization to search for that set.
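
One simple way to approximate such an optimization is a greedy heuristic that repeatedly adds the accession farthest from those already chosen. The sketch below is an illustration only: it is not the algorithm of any particular program, it is not guaranteed to find the most diverse subset, and the function greedy_diverse_subset and its dist argument are hypothetical names.

```python
def greedy_diverse_subset(ids, dist, size):
    """Greedily pick `size` accessions that are far apart from each other.

    ids  -- iterable of accession ids
    dist -- function dist(i, j) returning the distance between two accessions
    size -- number of accessions to select
    """
    ids = sorted(ids)
    if not 0 < size <= len(ids):
        raise ValueError("size must be between 1 and the number of accessions")

    # Start from the accession with the largest total distance to all others.
    first = max(ids, key=lambda i: sum(dist(i, j) for j in ids))
    selected = [first]

    while len(selected) < size:
        remaining = [i for i in ids if i not in selected]
        # Add the accession whose nearest selected neighbour is farthest away
        # (a max-min criterion; other diversity criteria are possible).
        nxt = max(remaining, key=lambda i: min(dist(i, s) for s in selected))
        selected.append(nxt)
    return selected
```

The dist function can be any resemblance measure expressed as a distance, for example one derived from marker data.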


A GCP-funded introductory text on germplasm sampling can be found on these pages.

Software for sampling, developed or adapted by the GCP, can be found here and here at Cropforge.

Core Hunter, developed by the GCP, is a program for sampling genetic resources from a reference data set in order to establish core subsets. Users can specify the sampling intensity, the genetic measures to be used as selection criteria, or both.




  

GCP Bioinformatics
and Biometrics