[genoPlotR-help] genoPlot R

Lionel Guy guy.lionel at gmail.com
Mon Aug 18 22:09:29 CEST 2014


Hi Shatavia,

You are trying to compare contigs (or chromosomes) using blastp (i.e. protein against protein) results. That won't work, because blastp results have coordinates in protein space (i.e. where 1 is the first amino acid of the protein and goes to the end of the protein, irrespective of its position on the contig) and contigs (or chromosomes) are numbered from 1 (beginning of contig/chromorome) to the end of the contig/chromosome. 

You have two solutions:
- use blastn (if your contigs/chromosomes are very similar) or tblastx (if they are not). Note that with tblastx, you may run into a problem that is that (last time I checked) the number of hits per sequence in tblastx comparisons is limited to 400. Thus, if you compare large contigs, you will get only the top hits which are not necessarily those you're after. It may be that the newer versions of tblastx allow to change this. You may also use the -query option of tblastx to limit the blast to a specific region of your query (and possibly -subject_loc if you blast one sequence against another.
- recode protein coordinates into nucleotide positions along the contig. This is not trivial but can be done. I have an experimental Perl script that I'm willing to share, but this doesn't take .ptt files as input. Instead it uses an ad hoc format (four columns: id start end strand). 

I'd start with trying with blastn, then try tblastx. Finally, try recoding the positions if none of the above works.

Kind regards,

Lionel

On 18 Aug 2014, at 20:56 , Morrison, Shatavia Sharday (CDC/OID/NCIRD) (CTR) <xxh5 at cdc.gov> wrote:

> Hi Lionel,
>  
> Thank you for your reply.  Here are the items that you asked for. All the files are attached. R Code is in the email.
>  
> Best,
> Shatavia
>  
> Here is the R Code:
>  
> org_16BC <- read_dna_seg_from_ptt("NC_015470_6BC.ptt_pttExtract.ptt")
> org_2VS225 <- read_dna_seg_from_ptt("NC_018621_VS225.ptt.txt_pttExtract.ptt")
> # Need pairwise comparison blast results one per comparison set
> blastResults <- read_comparison_from_blast("genoOUT.txt")
> genomeList<- list(org_16BC,org_2VS225)
> comparisonList <- list(blastResults)
> xlims <- list(c(77800,79308,176734,179346,209685,210215),c(77763,79271,176672,179293,209626,210156))
>  
>  
> #compSegs <- list(c(77800,210215),c(176734,179346),c(209685,210215),c(77763,79271),c(176672,179293),c(209626,210156),c(77792,79294),c(176718,179297),c(209630,210160))
> #compSegs1 <- list(c(77800,79308),c(77763,210156),c(77792,210160))
>  
> plot_gene_map(dna_segs = genomeList, comparisons= comparisonList,xlims=xlims,main="Test GenoPlotR", gene_type="arrows",dna_seg_scale= FALSE, scale=TRUE)
>  
>  
> Best,
> Shatavia
> From: Lionel Guy [mailto:guy.lionel at gmail.com] 
> Sent: Monday, August 18, 2014 2:53 PM
> To: Morrison, Shatavia Sharday (CDC/OID/NCIRD) (CTR)
> Cc: Help for genoPlotR
> Subject: Re: genoPlot R
>  
> Hi Shatavia,
>  
> Thanks for using genoPlotR. To be able to answer your question, I would need you to send me the whole code that you are using, including the part where you load the genes and the comparison, and the actual files containing the comparison and the genes. 
>  
> Best regards, 
>  
> Lionel
> 
> On 18 août 2014, at 19:36, "Morrison, Shatavia Sharday (CDC/OID/NCIRD) (CTR)" <xxh5 at cdc.gov> wrote:
> 
> Hello Lionel,
>  
> I’m running into some issues with the genoPlotR package.  Currently I’m trying to generate a plot similar to the one in the Getting Started with genoplotR manual, quick start page 2.  I’m not interested in generating a phylogenetic tree with the graph image, I just want to show the genes and their corresponding blast results synteny.
>  
> As a test I’m using three genes from two different genomes.  The code listed below generates the attached image. I would like to see the comparisons of the blast results, which are missing.  I have removed the self-hits from the blast results as I thought this is why genoPlotR was not creating the comparisons since there were multiple hits in the file, now that I have removed them I’m still having the same issue.
>  
> What am I missing from my plot gene map function? 
>  
> plot_gene_map(dna_segs = genomeList, comparisons= comparisonList,xlims=xlims,main="Test GenoPlotR", gene_type="arrows",dna_seg_scale= FALSE, scale=TRUE)
>  
> Best,
> Shatavia
> <TestGenoR.tiff>
> <NC_015470_6BC.ptt_pttExtract.ptt><NC_018621_VS225.ptt.txt_pttExtract.ptt><genoOUT.txt>

--
Lionel Guy 
Fålhagsleden 13, SE-75324 Uppsala | email: guy.lionel at gmail.com | mobile: +46 (0)73 9760618 | phone: +46 (0)18 410 7398



More information about the genoPlotR-help mailing list