<div>Dear Dr. Lobry and Colleagues,</div>
<div><br>The code runs fine and the codon count table is generated.</div><div><br>
</div><div>However, the output seems to contain a codon table only for the first virus on the list, NC_000898, and not for all of the available viruses. Another concern is that NC_000898 has 104 CDS, but only 6 CDS seem to be selected here. Is that so? Additionally, I do not want to include the "Met", "Trp" and "Stop" codons in the correspondence analysis.</div>
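<div><br>To make the request concrete, here is a minimal sketch of the column selection I have in mind. It assumes the codon table has lower-case codon names like those returned by words(), and it assumes the standard genetic code (Met = atg, Trp = tgg, Stop = taa, tag, tga). The toy matrix and the names toy, excluded and tabcod.syn are made up for illustration only. It also shows that head() prints only the first six rows by default, so a six-row display does not by itself mean that only 6 CDS were selected:</div>
<pre>
```r
# Toy stand-in for the real table: 8 CDS x 64 codons (values are made up).
bases <- c("a", "c", "g", "t")
codons <- apply(expand.grid(bases, bases, bases, stringsAsFactors = FALSE),
                1, paste, collapse = "")
set.seed(1)
toy <- matrix(rpois(8 * 64, lambda = 5), nrow = 8, ncol = 64,
              dimnames = list(paste0("CDS", 1:8), codons))

# head() shows only the first 6 rows by default; the table itself has 8.
nrow(head(toy))  # 6
nrow(toy)        # 8

# Codons to exclude, ASSUMING the standard genetic code:
# Met (atg), Trp (tgg) and the three stop codons (taa, tag, tga).
excluded <- c("atg", "tgg", "taa", "tag", "tga")
tabcod.syn <- toy[, !(colnames(toy) %in% excluded), drop = FALSE]
dim(tabcod.syn)  # 8 rows, 59 columns
```
</pre>
<div>Selecting columns by name rather than by position avoids depending on the order of the codons in the table.</div>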
<div><br></div>One minor addition: I want to select only those CDS longer than 150 bp, that is, more than 50 codons.<br><br><div>I really appreciate your help. Thank you very much.<div><br></div><div>Sincerely yours,</div><div>
Sourav Roy Choudhury</div><div><br></div><div><span style="font-family:arial,helvetica,clean,sans-serif;font-size:13px;border-collapse:collapse;color:rgb(136, 136, 136);line-height:15px">SOURAV ROY CHOUDHURY.<br style="line-height:1.2em;outline-style:none">
Dept of Neuroscience<br style="line-height:1.2em;outline-style:none">University of Calcutta <br style="line-height:1.2em;outline-style:none">
West Bengal, INDIA</span></div><br><br><br><br><br><br><div class="gmail_quote">On Thu, Nov 5, 2009 at 1:48 AM, Jean lobry <span dir="ltr"><<a href="mailto:lobry@biomserv.univ-lyon1.fr" target="_blank">lobry@biomserv.univ-lyon1.fr</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
The code is running fine and I can reproduce the same<br>
output as yours. It seems that the data are accessible.<br>
</blockquote>
<br></div>
Perfect.<div><br>
<br>
<blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Yes, these are exactly the genomes that I intend to<br>
work with. I want to consider only the CDS (or Open<br>
Reading Frames) for the statistical analysis.<br>
</blockquote>
<br></div>
OK, let's make one more baby step: can you compute the table<br>
of codon counts with the following code?<br>
<br>
######<br>
library(seqinr)<br>
choosebank("refseqViruses")<br>
#<br>
# Now we want to extract all complete CDS from<br>
# these genomes:<br>
#<br>
query("herpcds", "TID=548681 AND T=CDS AND NO K=PARTIAL")<br>
n <- herpcds$nelem<br>
n # there are 4887 CDS<br>
mnemo.cds <- getName(herpcds)<br>
head(mnemo.cds) # just to see what they look like<br>
#<br>
# We pre-allocate a table for codon counts:<br>
#<br>
tabcod <- matrix(0, nrow = n, ncol = 64)<br>
rownames(tabcod) <- mnemo.cds # cds names<br>
colnames(tabcod) <- words() # codon names<br>
head(tabcod) # this is empty for now<br>
#<br>
# Now we loop over all CDS to compute codon usage (this<br>
# may take a while)<br>
#<br>
pb <- txtProgressBar(min = 1, max = n, initial = 1, style = 3)<br>
for(i in 1:n){<br>
setTxtProgressBar(pb, i) # to show progression<br>
tabcod[i, ] <- uco(getSequence(herpcds$req[[i]]))<br>
}<br>
close(pb)<br>
head(tabcod) # shouldn't be empty now<br>
#<br>
# We save this table on disk for further usage<br>
#<br>
save(tabcod, file = "tabcod.RData")<br>
######<div><div></div><div><br>
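<div><br>As a possible follow-up to the loop above: each row of tabcod holds the codon counts of one CDS, so the row sum should approximate the CDS length in codons, and a length filter can be sketched with rowSums(). This is only an illustration on a made-up table. The names ncodons, min.codons and tabcod.long are hypothetical, and it assumes that every codon of a CDS is counted in one of the 64 columns (a stop codon present in the sequence would be counted too):</div>
<pre>
```r
# Toy codon-count table: 3 CDS x 64 codons (values are made up).
tabcod <- matrix(0, nrow = 3, ncol = 64,
                 dimnames = list(c("cdsA", "cdsB", "cdsC"), NULL))
tabcod["cdsA", 1:10] <- 3    # 30 codons in total
tabcod["cdsB", 1:10] <- 5    # 50 codons in total
tabcod["cdsC", 1:20] <- 4    # 80 codons in total

# Number of codons counted per CDS.
ncodons <- rowSums(tabcod)

# Keep only CDS with strictly more than 50 codons (that is, > 150 bp).
min.codons <- 50
tabcod.long <- tabcod[ncodons > min.codons, , drop = FALSE]
rownames(tabcod.long)  # "cdsC"
```
</pre>
<div>Using drop = FALSE keeps the result a matrix even when a single CDS passes the filter.</div>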
<br>
Best,<br>
-- <br>
Jean R. Lobry (<a href="mailto:lobry@biomserv.univ-lyon1.fr" target="_blank">lobry@biomserv.univ-lyon1.fr</a>)<br>
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,<br>
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE<br>
allo : +33 472 43 27 56 fax : +33 472 43 13 88<br>
<a href="http://pbil.univ-lyon1.fr/members/lobry/" target="_blank">http://pbil.univ-lyon1.fr/members/lobry/</a><br>
<br>
<br>
</div></div></blockquote></div><br>
</div>