<html><head>

<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">

</head><body style="font-family: Arial; font-size: 14pt;" wsmode="reply"

 bgcolor="#FFFFFF" text="#000000"><div style="font-size: 

14pt;font-family: Arial;">Dear Thibaut<br><br>Thanks for the prompt 

reply! <br>Unfortunately I do not see how that improves on the example 

given. <br>When one uses allelic data, there are simple (automatic) ways
 to build a genind object that includes the pop factor or even xy 
coordinates. That is because the available file-reading functions 
offer that possibility (read.genepop retains the pop info; 
read.genalex retains the pop and xy info), so no further 
manipulation is needed. I was looking for something similar: perhaps not a 
file-reading function, because read.fasta does not include that, but a set 
of scripts that will do it. <br>I saw another previous suggestion of 

yours, <span>but it still requires an extra file:</span><br><small>popFac

 <- read.csv("oneColumnFileWithMyGroupsInIt.csv")<br>popFac <- 

factor(unlist(popFac))<br>pop(obj) <- popFac</small><br><br>and in 

any case I could not understand how to use it, as I get an error:<br><br><small>data.dnabin

 <- fasta2DNAbin("Engraulis_P3_mtDNA.fas")<br>popFac <- 

read.csv("Engraulis_P3_mtDNA_pops.csv")<br>popFac <- 

factor(unlist(popFac))<br>pop(data.dnabin) <- popFac</small><br><br>Error

 in (function (classes, fdef, mtable)  : <br>  unable to find an 

inherited method for function ‘pop<-’ for signature ‘"DNAbin"’<br><br>It

 would be neat to have a way of reading the first two letters of each 
sequence label from the fasta/phylip files and using them as a factor. 
I am not familiar enough with R to do it myself. I just use the packages, 
and most of the time I have a hard time getting things to work, because the 
starting examples use R data files, which are not very useful for beginners.<br><br>In

 any case I appreciate your efforts towards programming for the 

community!<br><br><br>Best<br>Rita<br><br><br><br><br><blockquote 

style="border: 0px none;" 

cite="mid:2CB2DA8E426F3541AB1907F98ABA657075F13A67@icexch-m2.ic.ac.uk" 

type="cite"><div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div

 style="display:table;width:100%;border-top:1px solid 

#EDEEF0;padding-top:5px">       <div 

style="display:table-cell;vertical-align:middle;padding-right:6px;"><img

 photoaddress="t.jombart@imperial.ac.uk" photoname="Jombart, Thibaut" 

src="cid:part1.05070704.06000907@gmail.com" 

name="compose-unknown-contact.jpg" height="25px" width="25px"></div>   <div

style="display:table-cell;white-space:nowrap;vertical-align:middle;width:100%">

        <a moz-do-not-send="true" href="mailto:t.jombart@imperial.ac.uk" 

style="color:#737F92 

!important;padding-right:6px;font-weight:bold;text-decoration:none 

!important;">Jombart, Thibaut</a></div>   <div 

style="display:table-cell;white-space:nowrap;vertical-align:middle;">   

  <font color="#9FA2A5"><span style="padding-left:6px">December 16, 2013

 5:33 AM</span></font></div></div></div><div 

style="color:#888888;margin-left:24px;margin-right:24px;" 

__pbrmquotes="true" class="__pbConvBody"><pre wrap="">Hello, 

Yes, there are simpler ways. sub/gsub and regular expressions are immensely useful for extracting information contained in the labels of sequences.

For instance:

##

&gt; lab <- c("AD01012","AD666","FR1212","AD0101","FR9873")

&gt; lab

[1] "AD01012" "AD666"   "FR1212"  "AD0101"  "FR9873" 

&gt; pop <- gsub("[[:digit:]]","",lab)

&gt; pop

[1] "AD" "AD" "FR" "AD" "FR"

##

For some useful examples, see ?sub and ?regexp
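
Applied to labels of the form described in the original question (two letters followed by four digits), the same idea might look like the sketch below. The labels are made up for illustration and pop_fac / pop_fac2 are just illustrative names; with real data the labels would presumably come from the sequence names, e.g. rownames(dna) for a DNAbin matrix.

```r
# Made-up labels for illustration: two letters (population) + four digits.
# With real data these would come from the sequence labels,
# e.g. rownames(dna) for a DNAbin matrix (assumption, not tested here).
lab <- c("AD0101", "AD0666", "FR1212", "CD1495", "FR9873")

# Option 1: keep the first two characters of each label
pop_fac <- factor(substr(lab, 1, 2))

# Option 2: same result with a regular expression that strips the digits
pop_fac2 <- factor(gsub("[[:digit:]]", "", lab))

# Both approaches should agree for labels of this form
stopifnot(identical(pop_fac, pop_fac2))

# Number of sequences per population
table(pop_fac)
```

The resulting factor could then be supplied when building a genind object, for instance through the pop argument of DNAbin2genind (see ?DNAbin2genind for the exact usage).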

Cheers

Thibaut

________________________________________

From: <a class="moz-txt-link-abbreviated" href="mailto:adegenet-forum-bounces@lists.r-forge.r-project.org">adegenet-forum-bounces@lists.r-forge.r-project.org</a> [<a class="moz-txt-link-abbreviated" href="mailto:adegenet-forum-bounces@lists.r-forge.r-project.org">adegenet-forum-bounces@lists.r-forge.r-project.org</a>] on behalf of Rita Castilho [<a class="moz-txt-link-abbreviated" href="mailto:rita.castil@gmail.com">rita.castil@gmail.com</a>]

Sent: 16 December 2013 05:02

To: <a class="moz-txt-link-abbreviated" href="mailto:adegenet-forum@lists.r-forge.r-project.org">adegenet-forum@lists.r-forge.r-project.org</a>

Subject: [adegenet-forum] DNAbin and pop

Hi!

I am new to R and I have a lot of trouble going from a phylip or fasta file to a genind or DNAbin object containing pop information.

My files are always phylip or fasta files, and each sequence has a label composed of two letters followed by 4 digits (e.g. CD1495). The first two letters determine the population to which the sequence belongs.

Is there a quicker way than the code below? The grouping factor can easily be deduced from the individual labels, which would save the task of reading that information into R separately.

#reading data

dna <- fasta2DNAbin('data.fas')

# setting pops

data.pop <- as.factor(rep(c('AD', 'CD', 'FR', 'GE', 'RE', 'OT', 'YU', 'AU'), c(17, 11, 12, 12, 25, 14, 13, 20)))

Many thanks

Rita

</pre></div>
</blockquote></div></body></html>