[adegenet-forum] sub-sampling individuals from a genind object

Thibaut Jombart jombart at biomserv.univ-lyon1.fr
Tue Nov 25 16:06:42 CET 2008

Dear Stephen,

Stephen Petersen wrote:
> Hello,
> I am having trouble as I try to sub sample from a population. I would like
> to sub sample after I have created the genind object so that I can separate
> populations then randomly pull samples from a specific population.
I am not sure I understand what you want to do, so sorry if I miss the 
point. If your purpose is to separate the populations of your genind 
object, you can use seppop:
 > # sim2pop is a genind with 130 genotypes in two populations
 > # (with 100 and 30 genotypes respectively)
 > data(sim2pop)
 > foo <- seppop(sim2pop)
 > foo
$`Pop A`
   ### Genind object ###
- genotypes of individuals -

S4 class:  genind
@call: .local(x = x, i = i, j = j, drop = drop)

@tab:  100 x 241 matrix of genotypes

@ind.names: vector of  100 individual names
@loc.names: vector of  20 locus names
@loc.nall: number of alleles per locus
@loc.fac: locus factor for the  241 columns of @tab
@all.names: list of  20 components yielding allele names for each locus
@ploidy:  2

Optionnal contents:
@pop:  factor giving the population of each individual
@pop.names:  factor giving the population of each individual

@other: a list containing: xy

$`Pop B`

   ### Genind object ###
- genotypes of individuals -

S4 class:  genind
@call: .local(x = x, i = i, j = j, drop = drop)

@tab:  30 x 241 matrix of genotypes

@ind.names: vector of  30 individual names
@loc.names: vector of  20 locus names
@loc.nall: number of alleles per locus
@loc.fac: locus factor for the  241 columns of @tab
@all.names: list of  20 components yielding allele names for each locus
@ploidy:  2

Optionnal contents:
@pop:  factor giving the population of each individual
@pop.names:  factor giving the population of each individual

@other: a list containing: xy

Here, foo is a list with two components, which are genind objects 
containing the genotypes from populations A and B respectively.
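
As a small sketch of how such a list can be inspected, the lines below 
list the component names and the number of genotypes in each. Note that 
nInd() is an accessor from more recent adegenet releases and is an 
assumption here; with the version used in this message, nrow(x$tab) gives 
the same count.

```r
library(adegenet)

data(sim2pop)
foo <- seppop(sim2pop)

# Names of the list components (one per population)
names(foo)

# Number of genotypes in each component; nInd() is assumed to be available
sizes <- sapply(foo, nInd)
sizes

# A single population can be pulled out by position or by name
popA <- foo[[1]]
```

Each component is an ordinary genind object, so anything that works on 
sim2pop should also work on popA.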
Now, if you want to sample randomly, say 10 genotypes from each 
population, you can use:
 > mySamp <- lapply(foo, function(x) x[sample(1:nrow(x$tab), 10)])
 > mySamp
$`Pop A`

   ### Genind object ###
- genotypes of individuals -

S4 class:  genind
@call: .local(x = x, i = i, j = j, drop = drop)

@tab:  10 x 241 matrix of genotypes

@ind.names: vector of  10 individual names
@loc.names: vector of  20 locus names
@loc.nall: number of alleles per locus
@loc.fac: locus factor for the  241 columns of @tab
@all.names: list of  20 components yielding allele names for each locus
@ploidy:  2

Optionnal contents:
@pop:  factor giving the population of each individual
@pop.names:  factor giving the population of each individual

@other: a list containing: xy

$`Pop B`

   ### Genind object ###
- genotypes of individuals -

S4 class:  genind
@call: .local(x = x, i = i, j = j, drop = drop)

@tab:  10 x 241 matrix of genotypes

@ind.names: vector of  10 individual names
@loc.names: vector of  20 locus names
@loc.nall: number of alleles per locus
@loc.fac: locus factor for the  241 columns of @tab
@all.names: list of  20 components yielding allele names for each locus
@ploidy:  2

Optionnal contents:
@pop:  factor giving the population of each individual
@pop.names:  factor giving the population of each individual

@other: a list containing: xy

mySamp is a list with 2 genind objects, each containing 10 genotypes 
taken at random from populations A and B.
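
If the draw needs to be repeatable, or if some populations may hold fewer 
genotypes than requested, a slightly more defensive variant could look 
like the sketch below. The names n_keep and idx are only illustrative, 
and nInd() is assumed to be available (it comes from later adegenet 
versions than the one used above).

```r
library(adegenet)

data(sim2pop)
set.seed(1)      # fix the random seed so the same genotypes are drawn each time
n_keep <- 10     # hypothetical target sample size per population

foo <- seppop(sim2pop)
mySamp <- lapply(foo, function(x) {
  n <- nInd(x)
  # never ask for more genotypes than the population contains
  idx <- sample(seq_len(n), min(n_keep, n))
  x[idx, ]
})

# number of genotypes retained in each population
sapply(mySamp, nInd)
```

Sampling is without replacement, as in the one-liner above, so no 
genotype appears twice within a population.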
You can put these samples back into a single genind object by using repool:
 >  x <- repool(mySamp)
 > x

   ### Genind object ###
- genotypes of individuals -

S4 class:  genind
@call: repool(mySamp)

@tab:  20 x 175 matrix of genotypes

@ind.names: vector of  20 individual names
@loc.names: vector of  20 locus names
@loc.nall: number of alleles per locus
@loc.fac: locus factor for the  175 columns of @tab
@all.names: list of  20 components yielding allele names for each locus
@ploidy:  2

Optionnal contents:
@pop:  factor giving the population of each individual
@pop.names:  factor giving the population of each individual

@other: - empty -

 > x$pop
 [1] P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P2 P2 P2 P2 P2 P2 P2 P2 P2 P2
Levels: P1 P2

x contains 20 genotypes randomly drawn from populations A and B.
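
A quick way to check the pooled object is to count genotypes overall and 
per population, as sketched below. The accessors nInd() and pop() are 
assumed to exist (they belong to more recent adegenet releases); x$pop, 
as used above, is the equivalent in this version. In the output above, 
@tab has 175 columns instead of 241, presumably because alleles absent 
from the sampled genotypes are not retained when pooling.

```r
library(adegenet)

data(sim2pop)
set.seed(1)

# draw 10 genotypes per population, then pool them back together
mySamp <- lapply(seppop(sim2pop), function(x) x[sample(nInd(x), 10), ])
x <- repool(mySamp)

# total number of genotypes in the pooled object
nInd(x)

# number of genotypes per population
counts <- table(pop(x))
counts
```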

> I'm not sure if am on the correct track but I am able to sample the names
> (indlist <- sample(data$ind.names, 5, replace = FALSE, prob = NULL) from my
> population but I can't seem to sample the alleles (data$tab ?) or the entire
> record for a set of individuals.
The simplest way to subset a genind object is to use the [] operator:
x[ind, all], where 'ind' is the index of individuals kept and 'all' is 
the index of alleles kept. As with matrices, when an index is omitted, 
everything is kept. For instance:
x[1:5, ]

contains only the first 5 genotypes in x, with all alleles. You can also 
use the argument loc="..." to specify the loci to be retained (... being a 
vector naming the retained loci). For instance:
x[1:5, loc="L03"]

contains the first 5 genotypes of x, but only for locus 3.
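
To go from a sample of individual names, as in your attempt, to the 
entire record of those individuals, one option is to turn the names into 
row positions and pass those to the [] operator. This is only a sketch: 
indNames(), locNames(), nInd() and nLoc() are accessors from more recent 
adegenet versions and are assumptions here (x$ind.names and x$loc.names 
are the equivalents in this version), and idx, sub and sub1 are 
illustrative names.

```r
library(adegenet)

data(sim2pop)
set.seed(1)

# sample 5 individual names without replacement
indlist <- sample(indNames(sim2pop), 5)

# convert the names to row positions, then subset the whole object
idx <- match(indlist, indNames(sim2pop))
sub <- sim2pop[idx, ]

# same individuals, restricted to the third locus
sub1 <- sim2pop[idx, loc = locNames(sim2pop)[3]]

nInd(sub)
nLoc(sub1)
```

Using match() keeps the genotypes in the order in which the names were 
drawn.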

> Any help will be appreciated
> Thanks
> Stephen
Please have a look at the data manipulation section of the adegenet 
tutorial (starting p.10), which covers these topics.

Best regards,

> Stephen D. Petersen, Ph.D.
> Post Doctoral Researcher
> Fisheries & Oceans Canada, ArcticNet
> Winnipeg, Manitoba
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum

Dr Thibaut JOMBART
CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
Universite Lyon 1
43 bd du 11 novembre 1918
69622 Villeurbanne Cedex
Tél. :
Fax :
jombart at biomserv.univ-lyon1.fr