<div>Dear Dr. Lobry and Colleagues,</div>


<div><font face="arial, helvetica, clean, sans-serif"><span style="border-collapse:collapse;line-height:15px"><br>The code is running fine and the codon count table is generated.</span></font></div><div><font class="Apple-style-span" face="arial, helvetica, clean, sans-serif"><span class="Apple-style-span" style="border-collapse: collapse; line-height: 15px;"><br>

</span></font></div><div><font face="arial, helvetica, clean, sans-serif"><span style="border-collapse:collapse;line-height:15px">However, it seems to produce a codon table only for the first virus on the list, NC_000898, and not for all the available viruses. Another concern is that the virus NC_000898 has 104 CDS, but only 6 CDS seem to be selected here; is that so? Additionally, I do not want to include the &quot;Met&quot;, &quot;Trp&quot; and &quot;Stop&quot; codons in the correspondence analysis.</span></font></div>
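<div>A minimal sketch of two checks on a codon count table, using base R only and a toy table in place of the real one. It rests on assumptions that are not in this message: the row names imitate a <code>genome.PEn</code> naming pattern, the accession NC_999999 is made up for illustration, and the standard genetic code is assumed when listing the Met, Trp and stop codons. Since <code>head()</code> prints only the first six rows by default, counting rows per genome is a more direct way to see which viruses are present.</div>
<pre>
```r
# Rebuild the 64 codon names (aaa, aac, ..., ttt) without any package.
bases <- c("a", "c", "g", "t")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$first, grid$second, grid$third)

# Toy stand-in for tabcod: 8 CDS from 2 genomes (row names are hypothetical).
cds.names <- c(paste0("NC_000898.PE", 1:5), paste0("NC_999999.PE", 1:3))
set.seed(1)
toy <- matrix(rpois(8 * 64, lambda = 5), nrow = 8,
              dimnames = list(cds.names, codons))

# Check 1: how many CDS per genome? (strip everything after the first dot)
genome <- sub("\\..*$", "", rownames(toy))
cds.per.genome <- table(genome)
print(cds.per.genome)

# Check 2: drop Met (atg), Trp (tgg) and the stop codons (taa, tag, tga),
# assuming the standard genetic code.
excluded <- c("atg", "tgg", "taa", "tag", "tga")
toy59 <- toy[, !(colnames(toy) %in% excluded), drop = FALSE]
print(dim(toy59))
```
</pre>
<div>On the real data the same two steps would be applied to <code>tabcod</code> in place of <code>toy</code>; columns are selected by name, so the result does not depend on the column order.</div>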

<div><font face="arial, helvetica, clean, sans-serif"><span style="border-collapse:collapse;line-height:15px">

<br></span></font></div>One minor addition: I want to select only those CDS with length &gt;150 bp (i.e. &gt;50 codons).<br><br><div>I really appreciate your help. Thank you very much.<div><br></div><div>Sincerely yours,</div><div>

Sourav Roy Choudhury</div><div><br></div><div><span style="font-family:arial,helvetica,clean,sans-serif;font-size:13px;border-collapse:collapse;color:rgb(136, 136, 136);line-height:15px">SOURAV ROY CHOUDHURY.<br style="line-height:1.2em;outline-style:none">


Dept of Neuroscience<br style="line-height:1.2em;outline-style:none">University of Calcutta <br style="line-height:1.2em;outline-style:none">

West Bengal, INDIA</span></div><br><br><br><br><br><br><div class="gmail_quote">On Thu, Nov 5, 2009 at 1:48 AM, Jean lobry <span dir="ltr">&lt;<a href="mailto:lobry@biomserv.univ-lyon1.fr" target="_blank">lobry@biomserv.univ-lyon1.fr</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">


 The code is running fine and I can reproduce the output as<br>

 same as yours. It seems that the data is accessible.<br>

</blockquote>

<br></div>

Perfect.<div><br>

<br>

<blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">

 Yes, these are absolutely those genomes that I intend to<br>

 work with. I want to consider only all the CDS (or Open<br>

 Reading Frame) for the statistical observation.<br>

</blockquote>

<br></div>

OK, let&#39;s make one more baby step: can you compute the table<br>

of codon counts with the following code?<br>

<br>

######<br>

library(seqinr)<br>

choosebank(&quot;refseqViruses&quot;)<br>

#<br>

# Now we want to extract all complete CDS from<br>

# these genomes:<br>

#<br>

query(&quot;herpcds&quot;, &quot;TID=548681 AND T=CDS AND NO K=PARTIAL&quot;)<br>

n &lt;- herpcds$nelem<br>

n # there are 4887 CDS<br>

mnemo.cds &lt;- getName(herpcds)<br>

head(mnemo.cds) # just to see what they look like<br>

#<br>

# We pre-allocate a table for codon counts:<br>

#<br>

tabcod &lt;- matrix(0, nrow = n, ncol = 64)<br>

rownames(tabcod) &lt;- mnemo.cds # cds names<br>

colnames(tabcod) &lt;- words()   # codon names<br>

head(tabcod) # this is empty for now<br>

#<br>

# Now we loop over all CDS to compute codon usage (this<br>

# may take a while)<br>

#<br>

pb &lt;- txtProgressBar(min = 1, max = n, initial = 1, style = 3)<br>

for(i in 1:n){<br>

  setTxtProgressBar(pb, i) # to show progression<br>

  tabcod[i, ] &lt;- uco(getSequence(herpcds$req[[i]]))<br>

}<br>

close(pb)<br>

head(tabcod) # shouldn&#39;t be empty now<br>

#<br>

# We save this table on disk for further usage<br>

#<br>

save(tabcod, file = &quot;tabcod.RData&quot;)<br>

######<div><div></div><div><br>

<br>

Best,<br>

-- <br>

Jean R. Lobry            (<a href="mailto:lobry@biomserv.univ-lyon1.fr" target="_blank">lobry@biomserv.univ-lyon1.fr</a>)<br>

Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,<br>

43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE<br>

allo  : +33 472 43 27 56     fax    : +33 472 43 13 88<br>

<a href="http://pbil.univ-lyon1.fr/members/lobry/" target="_blank">http://pbil.univ-lyon1.fr/members/lobry/</a><br>

<br>

<br>

</div></div></blockquote></div><br>

</div>
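<div>The length filter requested above (more than 150 bp, i.e. more than 50 codons) could be applied to a codon count table directly, because the row sum of such a table is the number of codons counted in that CDS. What follows is a minimal sketch on toy data: the CDS names and counts are invented, and whether a terminal stop codon should count toward the 50 is an assumption left open.</div>
<pre>
```r
# Toy codon counts for 4 hypothetical CDS (only 3 codon columns for brevity).
toy <- rbind(short1 = c(10, 5, 3),
             long1  = c(40, 30, 20),
             edge50 = c(20, 20, 10),
             long2  = c(25, 25, 1))
colnames(toy) <- c("aaa", "aac", "aag")

# Number of codons per CDS = row sum of the count table.
n.codons <- rowSums(toy)

# Keep only CDS with strictly more than 50 codons (more than 150 bp).
kept <- toy[n.codons > 50, , drop = FALSE]
print(rownames(kept))
```
</pre>
<div>The comparison is strict, so a CDS with exactly 50 codons (<code>edge50</code> here) is dropped; <code>drop = FALSE</code> keeps the result a matrix even if a single row survives.</div>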