[Genabel-commits] r2003 - tutorials/GenABEL_general

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Wed Jul 8 22:05:23 CEST 2015


Author: lckarssen
Date: 2015-07-08 22:05:22 +0200 (Wed, 08 Jul 2015)
New Revision: 2003

Modified:
   tutorials/GenABEL_general/strat0.Rnw
Log:
Summary: Remove unwanted whitespace from the stratification chapter of the GenABEL tutorial.


Modified: tutorials/GenABEL_general/strat0.Rnw
===================================================================
--- tutorials/GenABEL_general/strat0.Rnw	2015-07-08 19:51:55 UTC (rev 2002)
+++ tutorials/GenABEL_general/strat0.Rnw	2015-07-08 20:05:22 UTC (rev 2003)
@@ -17,7 +17,7 @@
 
 %
 % would be good to have confounder figure here
-% 
+%
 
 There are two major types of confounders leading to induced correlation in
 genetic association studies. One type is ``good'' confounding of association by
@@ -35,7 +35,7 @@
 Due to genetic drift, these two populations will have very different frequencies
 at many loci throughout the genome. At the same time, these two populations are
 different phenotypically (prevalence of different diseases, mean value of
-quantitative traits) due to accumulated genetic and cultural differences. 
+quantitative traits) due to accumulated genetic and cultural differences.
 Therefore any of these traits will show
 association with multiple genomic loci. While some of these associations may
 be genuine genetic associations in a sense that either the polymorphisms
@@ -47,13 +47,13 @@
 genetic association study in which two very distinct populations are so blindly
 mixed and analysed not taking this mixture into account. However, a more subtle
 scenario where several slightly genetically different populations are mixed in
-the same study is frequently the case and a matter of concern in GWA studies. 
+the same study is frequently the case and a matter of concern in GWA studies.
 
-In this chapter, we will define what is genetic structure, 
-and how it can be quantified (section \ref{sec:genstruct}); 
-what are the effects of genetic structure on the standard 
-association tests (section \ref{sec:effects_on_tests}) and 
-specific association tests which take possible genetic 
+In this chapter, we will define what is genetic structure,
+and how it can be quantified (section \ref{sec:genstruct});
+what are the effects of genetic structure on the standard
+association tests (section \ref{sec:effects_on_tests}) and
+specific association tests which take possible genetic
 structure into account (section \ref{sec:tests_in_structured_populations}).
 
 {\bf The text of this chapter is in large part based on a chapter of a
@@ -66,212 +66,212 @@
 \section{Genetic structure of populations}
 \label{sec:genstruct}
 
-A major unit of genetic structure is a 
-genetic population. Different definitions 
-of genetic population are available,  
-for example 
-\href{http://en.wikipedia.org/wiki/Population}{Wikipedia 
-defines population (biol.)} 
-as ''the collection of inter-breeding organisms of a 
-particular species''. The genetics of populations is 
+A major unit of genetic structure is a
+genetic population. Different definitions
+of genetic population are available,
+for example
+\href{http://en.wikipedia.org/wiki/Population}{Wikipedia
+defines population (biol.)}
+as ''the collection of inter-breeding organisms of a
+particular species''. The genetics of populations is
 \href{http://en.wikipedia.org/wiki/Population_genetics}{
-''the study of the allele frequency distribution and change 
-under the influence of \ldots evolutionary processes''}. 
+''the study of the allele frequency distribution and change
+under the influence of \ldots evolutionary processes''}.
 \index{population genetics}
-In the framework of population genetics, the main 
-characteristics of interest of a group of 
+In the framework of population genetics, the main
+characteristics of interest of a group of
 individuals are their genotypes, frequencies of alleles
-in this group, and the dynamics of these distributions 
-in time. 
-While the units of interest of population genetics 
-are alleles, the units that evolutionary processes 
-act upon are organisms. 
+in this group, and the dynamics of these distributions
+in time.
+While the units of interest of population genetics
+are alleles, the units that evolutionary processes
+act upon are organisms.
 Therefore a definition of a genetic population should
-be based on the chance that different alleles, present 
+be based on the chance that different alleles, present
 in the individuals in question can mix together;
-if such chance is zero, 
-we may consider such groups as different populations, 
-each described by its own genotypic and allelic 
+if such chance is zero,
+we may consider such groups as different populations,
+each described by its own genotypic and allelic
 frequencies and their dynamic.
-Based on these considerations, a genetic 
-population may be defined 
-in the following way: 
+Based on these considerations, a genetic
+population may be defined
+in the following way:
 
 \emph{
-Two individuals, $I_1$ and $I_2$, belong to the same 
-population if (a) the probability that they would 
-have an offspring in common is greater than zero and 
-(b) this probability is much higher than the probability 
-of $I_1$ and $I_2$ having an offspring in common with 
-some individual $I_3$, which is said to belong to another 
+Two individuals, $I_1$ and $I_2$, belong to the same
+population if (a) the probability that they would
+have an offspring in common is greater than zero and
+(b) this probability is much higher than the probability
+of $I_1$ and $I_2$ having an offspring in common with
+some individual $I_3$, which is said to belong to another
 genetic population.}
 \index{genetic population!prospective definition}
 
 Here, to have an offspring in common
-does not imply having a direct offspring, but rather a 
-common descendant in a number of generations. 
+does not imply having a direct offspring, but rather a
+common descendant in a number of generations.
 
-However, in gene discovery in general and GWA studies 
-in particular we are usually not interested 
-in the future dynamics of allele and genotype distributions. 
-What is the matter of concern in genetic association 
-studies is potential common 
-ancestry -- that is that individuals 
-may share common ancestors and thus share in common 
-the alleles, which are exact copies of the same ancestral 
-allele. Such alleles are called ''identical-by-descent'', 
+However, in gene discovery in general and GWA studies
+in particular we are usually not interested
+in the future dynamics of allele and genotype distributions.
+What is the matter of concern in genetic association
+studies is potential common
+ancestry -- that is that individuals
+may share common ancestors and thus share in common
+the alleles, which are exact copies of the same ancestral
+allele. Such alleles are called ''identical-by-descent'',
 or IBD for short.\index{identity by descent}\index{IBD}
-If the chance of IBD is high, this reflects high degree 
-of genetic relationship. 
-As a rule, relatives 
-share many features, both environmental and genetic, 
-which may lead to confounding. 
+If the chance of IBD is high, this reflects high degree
+of genetic relationship.
+As a rule, relatives
+share many features, both environmental and genetic,
+which may lead to confounding.
 
-Genetic relationship between a pair of individuals 
-is quantified using the ''coefficient of kinship'', 
-which measures the chance that gametes, sampled 
+Genetic relationship between a pair of individuals
+is quantified using the ''coefficient of kinship'',
+which measures the chance that gametes, sampled
 at random from these individuals, are IBD.\index{coefficient!of
-kinship}\index{kinship!coefficient}\label{def:kinship} 
+kinship}\index{kinship!coefficient}\label{def:kinship}
 
-Thus for the purposes of gene-discovery 
-we can define a genetic population 
-in retrospective terms, based on the 
-concept of IBD: 
+Thus for the purposes of gene-discovery
+we can define a genetic population
+in retrospective terms, based on the
+concept of IBD:
 
 \emph{
-Two individuals, $I_1$ and $I_2$, belong to the same 
-genetic population if (a) their genetic relationship, measured 
-with the coefficient of kinship, 
+Two individuals, $I_1$ and $I_2$, belong to the same
+genetic population if (a) their genetic relationship, measured
+with the coefficient of kinship,
 is greater than zero and (b) their kinship is much higher
 than kinship between them and some individual $I_3$, which is
 said to belong to another genetic population.}
 \index{genetic population!retrospective definition}
 \label{def:population}
 
-One can see that this definition is quantitative and 
-rather flexible (if not to say arbitrary): what we call 
-a ''population'' depends on the choice of the threshold 
-for the ''much-higher'' probability. Actually, what 
-you define as ''the same'' genetic population depends 
-in large part on the scope and aims of your study. 
-In human genetics literature you may find references to 
-a particular genetically isolated population, population of some 
-country (e.g. ''German population'', ''population of United Kingdom''), 
-European, Caucasoid or even general human population. Defining a 
-population is about deciding on some probability threshold. 
+One can see that this definition is quantitative and
+rather flexible (if not to say arbitrary): what we call
+a ''population'' depends on the choice of the threshold
+for the ''much-higher'' probability. Actually, what
+you define as ''the same'' genetic population depends
+in large part on the scope and aims of your study.
+In human genetics literature you may find references to
+a particular genetically isolated population, population of some
+country (e.g. ''German population'', ''population of United Kingdom''),
+European, Caucasoid or even general human population. Defining a
+population is about deciding on some probability threshold.
 
-In genetic association studies, it is frequently assumed that 
-study participants are ''unrelated'' and ''come from the same 
-genetic population''.  Here, ''unrelated'' means, that while 
-study participants come from the same population (so, there is 
-non-zero kinship between them!), this kinship is so low that it 
-has very little effect on the statistical testing procedures 
-used to study association between genes and phenotypes. 
+In genetic association studies, it is frequently assumed that
+study participants are ''unrelated'' and ''come from the same
+genetic population''.  Here, ''unrelated'' means, that while
+study participants come from the same population (so, there is
+non-zero kinship between them!), this kinship is so low that it
+has very little effect on the statistical testing procedures
+used to study association between genes and phenotypes.
 
-In the following sections we will consider the effects of population 
-structure on the distribution of genotypes in a study population. 
-We will start with the assumption of zero kinship between study 
+In the following sections we will consider the effects of population
+structure on the distribution of genotypes in a study population.
+We will start with the assumption of zero kinship between study
 participants, which would allow us to formulate the Hardy-Weinberg principle
-(section \ref{subsec:HWE}). 
-In effect, there is no such thing as zero kinship between 
-any two organisms, however, when kinship is very low, the effects 
-of kinship on genotypic distribution are minimal, as we will see in 
-section \ref{subsec:inbreeding}. The effects of substructure -- 
-that is when the study sample consists of several genetic populations -- 
+(section \ref{subsec:HWE}).
+In effect, there is no such thing as zero kinship between
+any two organisms, however, when kinship is very low, the effects
+of kinship on genotypic distribution are minimal, as we will see in
+section \ref{subsec:inbreeding}. The effects of substructure --
+that is when the study sample consists of several genetic populations --
 onto genotypic distribution will be considered in section \ref{subsec:wahlund}.
-Finally, we will generalize the obtained results for the 
-case of arbitrary structures and will see what are the effects 
-of kinship onto joint distribution of genotypes and phenotypes 
-in section \ref{subsec:phenocorr}. 
+Finally, we will generalize the obtained results for the
+case of arbitrary structures and will see what are the effects
+of kinship onto joint distribution of genotypes and phenotypes
+in section \ref{subsec:phenocorr}.
 
 \subsection{Hardy-Weinberg equilibrium}
 \label{subsec:HWE}
-To describe genetic structure of populations 
+To describe genetic structure of populations
 we will use rather simplistic model
-approximating genetic processes in natural populations. Firstly, we will 
-assume that the population under consideration has infinitely 
-large size, which implies that we can work in terms of probabilities, 
-and no random processes take place. 
-Secondly, we accept non-overlapping 
+approximating genetic processes in natural populations. Firstly, we will
+assume that the population under consideration has infinitely
+large size, which implies that we can work in terms of probabilities,
+and no random processes take place.
+Secondly, we accept non-overlapping
 
 $$\textrm{generation}  \Rightarrow \textrm{gametic pool} \Rightarrow \textrm{generation}$$
 \index{generation -- gametic pool -- generation model}
 \label{ggpg_model}
 
-\noindent model. This model assumes that a set of individuals 
-contributes gametes to the gametic pool, and dies out. The gametes 
-are sampled randomly from this pool in pairs to form individuals 
-of the second generation. The selection acts on individuals, while 
-mutation occurs when the gametic pool is formed. The key point 
-of this model is the abstraction of a gametic pool: if you use that, 
-you do not need to consider all pair-wise mating between male and 
-female individuals; you rather consider some abstract infinitely 
-large pool, where gametes are contributed to with the frequency 
-proportional to that in previous generation. Interestingly, this 
-rather artificial construct has a great potential to describe 
-the phenomena we indeed observe in nature. 
+\noindent model. This model assumes that a set of individuals
+contributes gametes to the gametic pool, and dies out. The gametes
+are sampled randomly from this pool in pairs to form individuals
+of the second generation. The selection acts on individuals, while
+mutation occurs when the gametic pool is formed. The key point
+of this model is the abstraction of a gametic pool: if you use that,
+you do not need to consider all pair-wise mating between male and
+female individuals; you rather consider some abstract infinitely
+large pool, where gametes are contributed to with the frequency
+proportional to that in previous generation. Interestingly, this
+rather artificial construct has a great potential to describe
+the phenomena we indeed observe in nature.
 
-In this section, we will derive the Hardy-Weinberg law (the analog 
-of Mendel's law for populations). The question to be 
-answered is, if some alleles at some locus segregate  
-according to Mendel's laws and aggregate totally at random, what 
-would be the genotypic distribution in a population? 
+In this section, we will derive the Hardy-Weinberg law (the analog
+of Mendel's law for populations). The question to be
+answered is, if some alleles at some locus segregate
+according to Mendel's laws and aggregate totally at random, what
+would be the genotypic distribution in a population?
 
-Let us consider two alleles, wild type normal allele ($N$) and 
-a mutant ($D$), segregating at some locus in the population 
-and apply the ''generation $\Rightarrow$ gametic pool $\Rightarrow$ 
-generation'' model. 
-Let us denote the frequency of the $D$ allele in the gametic 
+Let us consider two alleles, wild type normal allele ($N$) and
+a mutant ($D$), segregating at some locus in the population
+and apply the ''generation $\Rightarrow$ gametic pool $\Rightarrow$
+generation'' model.
+Let us denote the frequency of the $D$ allele in the gametic
 pool as $q$, and the frequency of the other allele, $N$, as
-$p=1-q$.  
-Gametes containing alleles $N$ and $D$ are sampled at random to 
-form diploid individuals of the next generation. 
-The probability to sample a ''$N$'' gamete is $p$, and the 
-probability that the second sampled gamete is also ''$N$'' is 
-also $p$. According to the rule, which states that joint probability 
-of two independent events is a product of their probabilities, 
-the probability to sample ''$N$'' and ''$N$'' is 
-$p \cdot p = p^2$. In the same manner, the probability to 
-sample ''$D$'' and then ''$D$'' is $q \cdot q = q^2$. The 
-probability to sample first the mutant and then the normal allele 
-is $q \cdot p$, the same is the probability to 
-sample ''$N$'' first and ''$D$'' second. In most situations, we 
-do not (and can not) distinguish heterozygous genotypes $DN$ 
-and $ND$ and refer to both of them as ''$ND$''. In this 
-notation, frequency of $ND$ will be 
-$q \cdot p + p \cdot q = 2 \cdot p \cdot q $. 
-Thus, we have computed the genotypic distribution for a population 
-formed from a gametic pool in which the frequency of $D$ allele 
-was $q$. 
+$p=1-q$.
+Gametes containing alleles $N$ and $D$ are sampled at random to
+form diploid individuals of the next generation.
+The probability to sample a ''$N$'' gamete is $p$, and the
+probability that the second sampled gamete is also ''$N$'' is
+also $p$. According to the rule, which states that joint probability
+of two independent events is a product of their probabilities,
+the probability to sample ''$N$'' and ''$N$'' is
+$p \cdot p = p^2$. In the same manner, the probability to
+sample ''$D$'' and then ''$D$'' is $q \cdot q = q^2$. The
+probability to sample first the mutant and then the normal allele
+is $q \cdot p$, the same is the probability to
+sample ''$N$'' first and ''$D$'' second. In most situations, we
+do not (and can not) distinguish heterozygous genotypes $DN$
+and $ND$ and refer to both of them as ''$ND$''. In this
+notation, frequency of $ND$ will be
+$q \cdot p + p \cdot q = 2 \cdot p \cdot q $.
+Thus, we have computed the genotypic distribution for a population
+formed from a gametic pool in which the frequency of $D$ allele
+was $q$.
 
-To obtain the next generation, the next gametic pool is generated. 
-The frequency of $D$ in the next gametic pool is 
-$q^2 + \frac{1}{2}\cdot 2 \cdot p \cdot q$. 
-Here, $q^2$ is the probability that a gamete-contributing 
-individual has genotype $DD$; $2\cdot p \cdot q$ is the probability that 
-a gamete-contributing individual is $ND$, and $\frac{1}{2}$ is 
+To obtain the next generation, the next gametic pool is generated.
+The frequency of $D$ in the next gametic pool is
+$q^2 + \frac{1}{2}\cdot 2 \cdot p \cdot q$.
+Here, $q^2$ is the probability that a gamete-contributing
+individual has genotype $DD$; $2\cdot p \cdot q$ is the probability that
+a gamete-contributing individual is $ND$, and $\frac{1}{2}$ is
 the probability that $ND$ individual contributes $D$ allele
-(only half of the gametes contributed by individuals with $ND$ 
-genotype are $D$); see Figure \ref{fig:allelic_freq}. 
-Thus the frequency of $D$ in the gametic pool is 
+(only half of the gametes contributed by individuals with $ND$
+genotype are $D$); see Figure \ref{fig:allelic_freq}.
+Thus the frequency of $D$ in the gametic pool is
 $q^2 + \frac{1}{2}\cdot 2 \cdot p \cdot q = q \cdot (q + p) = q$
--- exactly the same as it was in previous gametic pool. 
+-- exactly the same as it was in previous gametic pool.
 
 \begin{figure}
 \center
 \includegraphics[width=0.80\textwidth]{allelic_freq}
 \caption{
-Genotypic and allelic frequency distribution in a 
+Genotypic and allelic frequency distribution in a
 population; $q=P(D)=P(DD)+\frac{1}{2}\cdot P(DN)$.
 }
 \label{fig:allelic_freq}
 \end{figure}
 
-Thus, if assumptions of random segregation 
-and aggregation hold, the expected frequencies of $NN$, $ND$ 
-and $DD$ genotypes are stable over generations and 
-can be related to the allelic frequencies using the 
-following relation   
+Thus, if assumptions of random segregation
+and aggregation hold, the expected frequencies of $NN$, $ND$
+and $DD$ genotypes are stable over generations and
+can be related to the allelic frequencies using the
+following relation
 
 \begin{equation}
 \label{eq:HWE2}
@@ -285,68 +285,68 @@
 \label{Hardy-Weinberg equilibrium}
 
 
-There are many situations in which random segregation and 
-aggregation, and, consequently, Hardy-Weinberg equilibrium, 
+There are many situations in which random segregation and
+aggregation, and, consequently, Hardy-Weinberg equilibrium,
 are violated. It is very important to
-realize that, especially if the study participants are believed 
-to come from the same genetic population, most of the times when 
-deviation from HWE is detected, this 
-deviation is due to technical reasons, i.e. genotyping 
-error. Therefore testing for HWE is a part of the 
-genotypic quality control procedure in most studies. 
-Only when the possibility of technical errors is 
-eliminated, other possible explanations may be 
+realize that, especially if the study participants are believed
+to come from the same genetic population, most of the times when
+deviation from HWE is detected, this
+deviation is due to technical reasons, i.e. genotyping
+error. Therefore testing for HWE is a part of the
+genotypic quality control procedure in most studies.
+Only when the possibility of technical errors is
+eliminated, other possible explanations may be
 considered.
-In a case when deviation from HWE can not be explained 
-by technical reasons, the most frequent explanation would 
-be that the sample tested is composed of representatives 
-of different genetic populations, or more subtle 
-genetic structure. However, unless study participants 
-represent a mixture of very distinct genetic 
-populations -- the chances of which going unnoticed 
-are low -- the effects of genetic structure on HWE 
-are difficult to detect, at least for any single marker, 
-as you will see in the next sections. 
+In a case when deviation from HWE can not be explained
+by technical reasons, the most frequent explanation would
+be that the sample tested is composed of representatives
+of different genetic populations, or more subtle
+genetic structure. However, unless study participants
+represent a mixture of very distinct genetic
+populations -- the chances of which going unnoticed
+are low -- the effects of genetic structure on HWE
+are difficult to detect, at least for any single marker,
+as you will see in the next sections.
 \index{deviation from Hardy-Weinberg equilibrium}
 \index{Hardy-Weinberg equilibrium!deviation from}
 
 \subsection{Inbreeding}
 \label{subsec:inbreeding}
 
-Inbreeding is preferential breeding between (close) relatives.\index{inbreeding} 
-An extreme example of inbreeding is selfing, a breeding system 
-observed in some plants. Inbreeding is not uncommon in animal 
-and human populations. Here, the main reasons 
-for inbreeding are usually geographical (e.g. mice live in 
-very small interbred colonies -- demes -- which are usually 
-established by few mice and are quite separated 
+Inbreeding is preferential breeding between (close) relatives.\index{inbreeding}
+An extreme example of inbreeding is selfing, a breeding system
+observed in some plants. Inbreeding is not uncommon in animal
+and human populations. Here, the main reasons
+for inbreeding are usually geographical (e.g. mice live in
+very small interbred colonies -- demes -- which are usually
+established by few mice and are quite separated
 from other demes) or cultural (e.g. noble families
-of Europe). 
+of Europe).
 
-Clearly, such preferential breeding between relatives 
-violates the assumption of random aggregation, underlying the 
-Hardy-Weinberg principle. Relatives are likely to share the 
-same alleles, inherited from common ancestors. Therefore 
-their progeny has an increased chance of being 
-\emph{autozygous}\index{autozygosity} -- that is to 
-inherit a copy of exactly the same ancestral allele 
-from both parents. An autozygous genotype is always 
-homozygous, therefore inbreeding should increase the 
-frequency of homozygous, and decrease the frequency of 
+Clearly, such preferential breeding between relatives
+violates the assumption of random aggregation, underlying the
+Hardy-Weinberg principle. Relatives are likely to share the
+same alleles, inherited from common ancestors. Therefore
+their progeny has an increased chance of being
+\emph{autozygous}\index{autozygosity} -- that is to
+inherit a copy of exactly the same ancestral allele
+from both parents. An autozygous genotype is always
+homozygous, therefore inbreeding should increase the
+frequency of homozygous, and decrease the frequency of
 heterozygous, genotypes.
 
-Inbreeding is quantified by the \emph{coefficient of 
+Inbreeding is quantified by the \emph{coefficient of
 inbreeding},\index{coefficient!of inbreeding}\index{inbreeding!coefficient of}
-which is defined as the probability of autozygosity. 
-This coefficient may characterize an individual, or 
-a population in general, in which case this is expectation 
-that a random individual from the population is 
-autozygous at a random locus. The coefficient of 
-inbreeding is closely related to the coefficient of 
+which is defined as the probability of autozygosity.
+This coefficient may characterize an individual, or
+a population in general, in which case this is expectation
+that a random individual from the population is
+autozygous at a random locus. The coefficient of
+inbreeding is closely related to the coefficient of
 kinship, defined earlier for a pair of individuals as
-the probability that two alleles sampled 
+the probability that two alleles sampled
 at random from these individuals, are IBD. It is easy to see
-that the coefficient of inbreeding for a person is 
+that the coefficient of inbreeding for a person is
 the same as the kinship between its parents.
 \index{coefficient!of inbreeding, relation to kinship}
 \index{coefficient!of kinship, relation to inbreeding}
@@ -354,58 +354,58 @@
 \begin{figure}
 \center
 \includegraphics[width=1.00\textwidth]{inbred_family}
-\caption{Inbred family structure (A) and probability of 
-individual ''G'' being autozygous for the ''Red'' ancestral 
+\caption{Inbred family structure (A) and probability of
+individual ''G'' being autozygous for the ''Red'' ancestral
 allele
 }
 \label{fig:inbred_family}
 \end{figure}
 
 Let us compute the inbreeding coefficient for the person {\bf J}
-depicted in figure \ref{fig:inbred_family}. {\bf J} is a child 
-of {\bf G} and {\bf H}, who are cousins. {\bf J} could be autozygous 
-for, e.g., the ''red'' allele of founder great-grandparent {\bf A}, 
-which could have been transmitted through the meioses 
+depicted in figure \ref{fig:inbred_family}. {\bf J} is a child
+of {\bf G} and {\bf H}, who are cousins. {\bf J} could be autozygous
+for, e.g., the ''red'' allele of founder great-grandparent {\bf A},
+which could have been transmitted through the meioses
 {\bf A $\Rightarrow$ D}, {\bf D $\Rightarrow$ G}, and
-{\bf G $\Rightarrow$ J}, and also through the path 
+{\bf G $\Rightarrow$ J}, and also through the path
 {\bf A $\Rightarrow$ E}, {\bf E $\Rightarrow$ H}, and
 {\bf H $\Rightarrow$ J} (Figure \ref{fig:inbred_family} {\bf B}).
-What is the chance for {\bf J} to be autozygous for the 
-''red'' allele? The probability that this particular founder 
-allele is transmitted to {\bf D} is $1/2$, the same is the probability 
-that the allele is transmitted from {\bf D} to {\bf G}, and 
-the probability that the allele is transmitted from 
-{\bf G} to {\bf J}. Thus the probability that the ''red'' allele 
+What is the chance for {\bf J} to be autozygous for the
+''red'' allele? The probability that this particular founder
+allele is transmitted to {\bf D} is $1/2$, the same is the probability
+that the allele is transmitted from {\bf D} to {\bf G}, and
+the probability that the allele is transmitted from
+{\bf G} to {\bf J}. Thus the probability that the ''red'' allele
 is transmitted from {\bf A} to {\bf J} is $1/2 \cdot 1/2 \cdot 1/2 = 1/2^3 = 1/8$.
-The same is the chance that that allele is transmitted from 
-{\bf A} to {\bf E} to {\bf H} to {\bf J}, therefore the probability 
-that {\bf J} would be autozygous for the red allele is 
-$1/2^3 \cdot 1/2^3 = 1/2^6 = 1/64$. However, we are interested in 
-autozygosity for any founder allele; and there are four such 
-alleles (''red'', ''green'', ''yellow'' and ''blue'', figure 
-\ref{fig:inbred_family} {\bf B}). For any of these the probability 
-of autozygosity is the same, thus the total probability of 
-autozygosity for {\bf J} is $4\cdot 1/64 = 1/2^4 = 1/16$.  
+The same is the chance that that allele is transmitted from
+{\bf A} to {\bf E} to {\bf H} to {\bf J}, therefore the probability
+that {\bf J} would be autozygous for the red allele is
+$1/2^3 \cdot 1/2^3 = 1/2^6 = 1/64$. However, we are interested in
+autozygosity for any founder allele; and there are four such
+alleles (''red'', ''green'', ''yellow'' and ''blue'', figure
+\ref{fig:inbred_family} {\bf B}). For any of these the probability
+of autozygosity is the same, thus the total probability of
+autozygosity for {\bf J} is $4\cdot 1/64 = 1/2^4 = 1/16$.
 
-Now we shall estimate the expected genotypic probability 
-distribution for a person characterized with some 
-arbitrary coefficient of inbreeding, $F$ -- or for a population 
-in which average inbreeding is $F$. Consider a locus with two 
-alleles, $A$ and $B$, with frequency of $B$ denoted as $q$, and 
-frequency of $A$ as $p=1-q$. If the person is autozygous 
-for some founder allele, the founder allele may be either 
-$A$, leading to autozygous genotype $AA$, or the founder 
-allele may be $B$, leading to genotype $BB$. The chance that 
-the founder allele is $A$ is $p$, and the chance that the 
-founder allele is $B$ is $q$. If the person 
-is not autozygous, then the expected genotypic frequencies 
-follow HWE. Thus, the probability of genotype 
-$AA$ is $(1-F)\cdot p^2 + F\cdot p$, where the first term corresponds 
-to probability that the person is $AA$ given it is not inbred ($p^2$), 
-multiplied by the probability that it is not inbred ($1-F$), and 
-the second term corresponds to probability that a person is 
-$AA$ given it is inbred ($p$), multiplied by the probability that the 
-person is inbred ($F$). These computations can be easily done for all 
+Now we shall estimate the expected genotypic probability
+distribution for a person characterized with some
+arbitrary coefficient of inbreeding, $F$ -- or for a population
+in which average inbreeding is $F$. Consider a locus with two
+alleles, $A$ and $B$, with frequency of $B$ denoted as $q$, and
+frequency of $A$ as $p=1-q$. If the person is autozygous
+for some founder allele, the founder allele may be either
+$A$, leading to autozygous genotype $AA$, or the founder
+allele may be $B$, leading to genotype $BB$. The chance that
+the founder allele is $A$ is $p$, and the chance that the
+founder allele is $B$ is $q$. If the person
+is not autozygous, then the expected genotypic frequencies
+follow HWE. Thus, the probability of genotype
+$AA$ is $(1-F)\cdot p^2 + F\cdot p$, where the first term corresponds
+to probability that the person is $AA$ given it is not inbred ($p^2$),
+multiplied by the probability that it is not inbred ($1-F$), and
+the second term corresponds to probability that a person is
+$AA$ given it is inbred ($p$), multiplied by the probability that the
+person is inbred ($F$). These computations can be easily done for all
 genotypic classes leading to the expression for HWE under inbreeding.
 
 \begin{equation}
@@ -418,28 +418,28 @@
 \end{equation}
 \index{Hardy-Weinberg equilibrium!under inbreeding}
 
-How much is inbreeding expected to modify genotypic distribution 
-in human populations? The levels of inbreeding observed in 
-human genetically isolated populations typically 
-vary from $0.001$ (low inbreeding) to $0.05$ (relatively high), 
-see \cite{rudan2003,pardo2005}. The genotypic distribution 
-for $q=0.5$ and different values of the inbreeding coefficient is 
+How much is inbreeding expected to modify genotypic distribution
+in human populations? The levels of inbreeding observed in
+human genetically isolated populations typically
+vary from $0.001$ (low inbreeding) to $0.05$ (relatively high),
+see \cite{rudan2003,pardo2005}. The genotypic distribution
+for $q=0.5$ and different values of the inbreeding coefficient is
 shown in figure \ref{fig:HWE_under_inbreeding}.
 
 \begin{figure}
 \center
 \includegraphics[width=1.00\textwidth]{HWE_under_inbreeding}
 \caption{
-Genotypic probability distribution for a locus with 50\% frequency of 
-the $B$ allele; black bar, no inbreeding; red, $F=0.001$; green, $F=0.01$; 
+Genotypic probability distribution for a locus with 50\% frequency of
+the $B$ allele; black bar, no inbreeding; red, $F=0.001$; green, $F=0.01$;
 blue, $F=0.05$
 }
 \label{fig:HWE_under_inbreeding}
 \end{figure}
 
 What is the power to detect deviation from HWE due to inbreeding?
-For that, we need to estimate the expectation of 
-the $\chi^2$ statistic (the non-centrality parameter, NCP) used 
+For that, we need to estimate the expectation of
+the $\chi^2$ statistic (the non-centrality parameter, NCP) used
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/genabel -r 2003

