[adegenet-forum] help with scaleGEN

Jombart, Thibaut t.jombart at imperial.ac.uk
Thu Sep 19 13:41:06 CEST 2013


I haven't seen many, but one can think of a few cases, yes. 

In multiallelic markers such as microsatellites, one may want to give the same 'weight' to each marker, and thus use a scaling so that the total variance (i.e. summed over alleles) is the same for all markers. But this is already a bit different from standardizing alleles, at least in practice (on a theoretical level, the procedure is nearly identical: we divide vectors/matrices by their norm).

Same idea could apply to SNPs of different genes. 
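
To make the per-marker idea concrete, here is a minimal sketch in base R. It is not adegenet code: the frequency matrix `freq`, the `marker` labels and the helper `scale_by_marker` are all made up for illustration, and dividing each marker's centred columns by the square root of that marker's total variance is only one possible way to equalize the weights.

```r
# Toy allele frequencies: 4 individuals, 2 markers (hypothetical data).
# Marker "m1" has 3 alleles, marker "m2" has 2 alleles.
freq <- matrix(c(1.0, 0.0, 0.0, 0.5, 0.5,
                 0.5, 0.5, 0.0, 1.0, 0.0,
                 0.0, 0.5, 0.5, 0.0, 1.0,
                 0.0, 0.0, 1.0, 0.5, 0.5),
               nrow = 4, byrow = TRUE)
colnames(freq) <- c("m1.a", "m1.b", "m1.c", "m2.a", "m2.b")
marker <- factor(c("m1", "m1", "m1", "m2", "m2"))  # marker of each allele column

# Hypothetical helper: centre each allele, then divide all alleles of a
# marker by one common factor so that each marker has total variance 1.
scale_by_marker <- function(x, marker) {
  x <- scale(x, center = TRUE, scale = FALSE)    # centre columns only
  allele_var <- apply(x, 2, var)                 # variance of each allele
  marker_var <- tapply(allele_var, marker, sum)  # total variance per marker
  w <- as.numeric(sqrt(marker_var[as.character(marker)]))
  sweep(x, 2, w, "/")                            # rescale column by column
}

scaled <- scale_by_marker(freq, marker)

# Total variance per marker after scaling: 1 for both markers
# (up to floating point error).
total_var <- tapply(apply(scaled, 2, var), marker, sum)
print(total_var)
```

Within a marker, alleles keep their relative variances; only the marker as a whole is brought to a common scale, which is where this differs from standardizing each allele separately.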

Cheers
Thibaut

________________________________________
From: Danica Fabrigar [danica_714 at hotmail.com]
Sent: 19 September 2013 09:57
To: Jombart, Thibaut; adegenet-forum at lists.r-forge.r-project.org
Subject: RE: [adegenet-forum] help with scaleGEN

Hi Thibaut,

Thank you for the clarification. I got confused myself there.

What you've said makes a lot of sense. Are there cases in genetics in which scaling would be a good idea?


Regards,
Danica



 ________________________________________
> From: t.jombart at imperial.ac.uk
> To: danica_714 at hotmail.com; adegenet-forum at lists.r-forge.r-project.org
> Subject: RE: [adegenet-forum] help with scaleGEN
> Date: Wed, 18 Sep 2013 14:53:53 +0000
>
> Hello,
>
> I think some clarification should help here.
>
> "scaling" means transforming a variable so that its variance is 1. It is usually used to remove the effects of variances that are inherently different across a bunch of variables (typically because of different units). In genetics, most of the time, I think scaling is a bad idea: all variables have the same unit, and differences in variances are probably meaningful.
>
> missing="mean" refers to the procedure for replacing missing data. They are set to the origin, which is the mean of the corresponding allele frequencies (typically the 'non-informative' point in PCA).
>
> Best
> Thibaut
>
>
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Danica Fabrigar [danica_714 at hotmail.com]
> Sent: 18 September 2013 11:03
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] help with scaleGEN
>
> Hi adegenet users,
>
> I am having some trouble understanding how scaleGen is supposed to be used when plotting a PCA.
>
> I get very different results when running the following two commands (note: "scale=FALSE" is omitted in the second object):
>
> A)
> obj <- scaleGen(mosquitoind, scale=FALSE, missing="mean")
> pca.obj <- dudi.pca(obj, center=FALSE, scale=FALSE, scannf=FALSE, nf=3)
>
> B)
> obj2 <- scaleGen(mosquitoind, missing="mean")
> pca.obj2 <- dudi.pca(obj2, center=FALSE, scale=FALSE, scannf=FALSE, nf=3)
>
>
> I guess my question is, what is the appropriate way of using scaleGen if I want to scale my missing data to the mean allele frequency?
>
>
> Thanks in advance,
> Danica
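
For anyone reading this thread later, the two points made in the quoted reply (what scaling does, and what replacing missing data by the mean does) can be sketched in base R. This is only an illustration under stated assumptions, not the scaleGen implementation: the matrix `freq` is toy data and `fill_with_mean` is a hypothetical helper. It also assumes that scaleGen scales by default, which would explain why leaving out scale=FALSE in the question changed the results.

```r
# Toy allele frequency matrix with one missing value (hypothetical data).
freq <- matrix(c(1.0, 0.0,
                 0.5, 0.5,
                 NA,  1.0,
                 0.0, 1.0),
               nrow = 4, byrow = TRUE,
               dimnames = list(NULL, c("loc1.a", "loc2.a")))

# Hypothetical helper: replace NAs in each column by that column's mean.
fill_with_mean <- function(x) {
  apply(x, 2, function(col) {
    col[is.na(col)] <- mean(col, na.rm = TRUE)
    col
  })
}

filled <- fill_with_mean(freq)

# Centring only (analogous to scale=FALSE): the replaced entry ends up
# at 0, the origin, and each allele keeps its own variance.
centred <- scale(filled, center = TRUE, scale = FALSE)

# Centring and scaling: every allele is forced to variance 1.
standardized <- scale(filled, center = TRUE, scale = TRUE)

print(centred[3, "loc1.a"])         # the formerly missing value, now at the origin
print(apply(centred, 2, var))       # variances differ between alleles
print(apply(standardized, 2, var))  # variances all equal to 1
```

In other words, the missing-data replacement and the scaling are two independent choices: the first decides where missing entries sit, the second decides whether differences in variance between alleles are kept or erased.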

