From lionel.guy at icm.uu.se  Sun Feb  1 22:38:43 2015
From: lionel.guy at icm.uu.se (Lionel Guy)
Date: Sun, 1 Feb 2015 21:38:43 +0000
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net>
Message-ID: <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se>

Hi Gordon,

Thanks for your interest in genoPlotR.

The dna_segs argument of plot_gene_map should be a list of dna_seg objects. To plot a single dna_seg, pass a list containing just that one object:

plot_gene_map(list(hpv.df))

That should do the trick.
HTH,

Lionel

> On 1 Feb 2015, at 21:17 , Gordon Robertson wrote:
>
> Hi Lionel,
>
> I'm exploring (assessing) the genoPlotR package as a simple and extensible way of drawing information on viral integration into human genes in cancer. There are a number of components to this. I'm trying to take a first step.
>
> I'm also looking at the Gviz package.
>
> As a first step I've downloaded an HPV16 reference genome GenBank-format file from PaVE (attached). The file loads easily into genoPlotR. Now I'd like to plot the genes in the genome.
>
> library(genoPlotR)
>
> # HPV16 reference genome
> hpv.df <- read_dna_seg_from_file("/Users/grobertson/resources/genomic_data/viral/PaVE/HPV16.ref.PaVE.GenBank.format.txt",
>     tagsToParse = c("CDS", "misc_feature", "gene"), fileType = "detect",
>     meta_lines = 2, gene_type = "blocks", header = TRUE,
>     extra_fields = c("db_xref", "transl_table"))
>
> str(hpv.df)
> # Classes 'dna_seg' and 'data.frame': 25 obs. of 19 variables:
> # $ name : chr "E1" "E2" "E4" "E5_ALPHA" ...
> # $ start : num 865 2756 3333 3850 83 ...
> # ...
>
> plot_gene_map(hpv.df)
> Error in plot_gene_map(hpv.df) :
>   Argument dna_segs must be a list of dna_seg objects
>
> Would you be willing to help me resolve this? I apologize for emailing so early in assessing your package, but I have very limited time for resolving this visualization issue. I'm using R 3.1.2 on OS X, and working in the latest RStudio.
>
> Thank you.
>
> Gordon
> --
> Gordon Robertson
> agrobertson at telus.net

--
Lionel Guy
Department for Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
postal address: Box 582, SE-751 23 Uppsala; visiting address: BMC D7:3, Husargatan 3, SE-752 37 Uppsala
phone: +46 73 976 0618
lionel.guy at imbim.uu.se

From lionel.guy at icm.uu.se  Mon Feb  2 08:48:00 2015
From: lionel.guy at icm.uu.se (Lionel Guy)
Date: Mon, 2 Feb 2015 07:48:00 +0000
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca>
Message-ID: <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>

Hi Gordon,

I haven't worked extensively with eukaryotic sequences, so the gbk parser might behave strangely in some cases, for example with spliced sequences. That said, there are more than two blocks; they are just all the same color. Try

hpv.df$col <- rainbow(nrow(hpv.df))
hpv.df$gene_type <- "arrows"
annot <- annotation(x1=middle(hpv.df), text=hpv.df$name, rot=0)
plot_gene_map(list(hpv.df), annotations=annot)

That should help you see them better.

To plot different genes on different lines, you'll either have to separate them into different dna_segs, or write your own plotting functions. See the genoPlotR vignette or the help page for the function gene_types.

Kind regards,

Lionel

> On 2 Feb 2015, at 0:04 , Gordon Robertson wrote:
>
> Thanks Lionel
>
> That worked. It gave me a two-block graphic (attached). What I'd like is something like the CDS (gene) view (attached). How would I get that?
>
> Thanks,
>
> Gordon
> --
> Gordon Robertson
> Canada's Michael Smith Genome Sciences Centre
> BC Cancer Agency
> Vancouver BC Canada
> www.bcgsc.ca
> T: 604-707-5900 x675416
> Skype: a.gordon.robertson

--
Lionel Guy
Department for Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
postal address: Box 582, SE-751 23 Uppsala; visiting address: BMC D7:3, Husargatan 3, SE-752 37 Uppsala
phone: +46 73 976 0618
lionel.guy at imbim.uu.se

From grobertson at bcgsc.ca  Mon Feb  2 08:51:46 2015
From: grobertson at bcgsc.ca (Gordon Robertson)
Date: Sun, 1 Feb 2015 23:51:46 -0800
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca> <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>
Message-ID: <6715CF33-E841-4B5D-9191-8E054FA74769@bcgsc.ca>

Thanks Lionel,

It's close to midnight Sunday in Vancouver. I'll look at this on Monday.

Gordon

From grobertson at bcgsc.ca  Mon Feb  2 17:58:49 2015
From: grobertson at bcgsc.ca (Gordon Robertson)
Date: Mon, 2 Feb 2015 08:58:49 -0800
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca> <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>
Message-ID:

Thank you. Your code generated a graphic with a single line of dark blue arrows (for genes), with gene symbols above the arrows.

I've looked briefly at the vignette and the help page for gene_types, but I do not have time right now to work with this information.

Gordon
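Lionel's suggestion of separating genes onto different dna_segs could be sketched roughly as below. This is only an illustration under stated assumptions: assign_tracks() is a hypothetical helper (not part of genoPlotR) that greedily puts overlapping genes on different lines, the gene table is illustrative rather than read from the PaVE file, and genome_len is an assumed sequence length. Passing the same xlims for every dna_seg is meant to keep the lines on a common coordinate scale.

```r
# Hypothetical helper: greedy assignment of genes to lines ("tracks") so that
# genes on the same line do not overlap. Not a genoPlotR function.
assign_tracks <- function(start, end) {
  track <- integer(length(start))
  track_end <- numeric(0)              # last occupied coordinate of each track
  for (i in order(start)) {
    free <- which(track_end < start[i])
    if (length(free) == 0) {           # no free track: open a new one
      track_end <- c(track_end, end[i])
      track[i] <- length(track_end)
    } else {                           # reuse the first free track
      track[i] <- free[1]
      track_end[free[1]] <- end[i]
    }
  }
  track
}

# Illustrative gene table. The names E1, E2, E4, E5_ALPHA and the starts
# 865, 2756, 3333, 3850, 83 appear in the str() output quoted in the thread;
# the remaining names and starts and all end coordinates are assumptions.
genes <- data.frame(
  name   = c("E6", "E7", "E1", "E2", "E4", "E5_ALPHA", "L2", "L1"),
  start  = c(83, 562, 865, 2756, 3333, 3850, 4237, 5639),
  end    = c(559, 858, 2813, 3853, 3620, 4100, 5658, 7156),
  strand = 1,
  stringsAsFactors = FALSE
)
genes$track <- assign_tracks(genes$start, genes$end)

genome_len <- 7906  # assumed sequence length; adjust to the actual file

if (requireNamespace("genoPlotR", quietly = TRUE)) {
  library(genoPlotR)
  cols <- c("name", "start", "end", "strand")
  # One dna_seg per track, all drawn over the same coordinate range.
  segs <- lapply(seq_len(max(genes$track)), function(k) {
    dna_seg(genes[genes$track == k, cols])
  })
  xlims <- rep(list(c(1, genome_len)), length(segs))
  plot_gene_map(dna_segs = segs, xlims = xlims)
}
```

With a dna_seg read from a file, the same helper could be applied to its start and end columns before splitting; colours and annotations would then need to be set per dna_seg.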
From lionel.guy at imbim.uu.se  Mon Feb  2 20:41:56 2015
From: lionel.guy at imbim.uu.se (Lionel Guy)
Date: Mon, 2 Feb 2015 19:41:56 +0000
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To:
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca> <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se>
Message-ID: <4B52FDF2-1F3F-48D5-9CE2-E197B710CC48@icm.uu.se>

Hi Gordon,

My complete code is:

library(genoPlotR)
hpv.df <- read_dna_seg_from_file("/Users/lguy/Desktop/tmp/HPV16.ref.PaVE.gbk",
    tagsToParse = c("CDS"), fileType = "detect",
    meta_lines = 2, gene_type = "blocks", header = TRUE,
    extra_fields = c("db_xref", "transl_table"))
hpv.df$col <- rainbow(nrow(hpv.df))
hpv.df$gene_type <- "arrows"
annot <- annotation(x1=middle(hpv.df), text=hpv.df$name, rot=0)
plot_gene_map(list(hpv.df), annotations=annot)

And it yields the following plot (attached).

HTH,

Lionel

--
Lionel Guy
Department for Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
postal address: Box 582, SE-751 23 Uppsala; visiting address: BMC D7:3, Husargatan 3, SE-752 37 Uppsala
phone: +46 73 976 0618
lionel.guy at imbim.uu.se
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 18565 bytes
Desc: PastedGraphic-1.png

From grobertson at bcgsc.ca  Mon Feb  2 21:25:50 2015
From: grobertson at bcgsc.ca (Gordon Robertson)
Date: Mon, 2 Feb 2015 12:25:50 -0800
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <4B52FDF2-1F3F-48D5-9CE2-E197B710CC48@icm.uu.se>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca> <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se> <4B52FDF2-1F3F-48D5-9CE2-E197B710CC48@icm.uu.se>
Message-ID:

Thank you.
I get the same graphic on my Mac. The colours are good.

The issue that we're looking at is drawing breakpoints between a virus and a human gene, where we've determined the breakpoints from RNA-seq viral-human transcripts. So it's a visualization use case that's close to what genoPlotR is built to do.

I'm very focused on other aspects of the analysis, so I do not have time to invest in good visualization in the short term.

Could I ask: if I have a list of viral breakpoints, e.g. coordinate 822, how would I draw a vertical line at this location, just below the current graphic? I've attached a fake set of lines, with the line length (height) indicating that some breakpoints have more evidence than others.

Gordon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: screenshot_587.tif
Type: image/tif
Size: 11266 bytes
Desc: screenshot_587.tif

From grobertson at bcgsc.ca  Mon Feb  2 23:53:53 2015
From: grobertson at bcgsc.ca (Gordon Robertson)
Date: Mon, 2 Feb 2015 14:53:53 -0800
Subject: [genoPlotR-help] genoPlotR question: visualizing an HPV reference genome
In-Reply-To: <4B52FDF2-1F3F-48D5-9CE2-E197B710CC48@icm.uu.se>
References: <83C11896-84EC-4DD3-9D0F-A18015A1551C@telus.net> <8025D1E7-B5AB-4AAD-9552-F6F22CB101BA@icm.uu.se> <6453D422-D50C-4C4A-A585-1885EAFABAF8@bcgsc.ca> <8AD0DA23-A09B-4D45-BB67-C55C934E9A25@icm.uu.se> <4B52FDF2-1F3F-48D5-9CE2-E197B710CC48@icm.uu.se>
Message-ID:

Lionel,

I've discussed the HPV16 graphic from your package with a senior genomics person here. She recognized that it was useful to be able to read GenBank PaVE files directly. I've passed on the script that I was using, which uses your code to get coloured genes.

Thanks for your help. We will look more at the package as time permits.

Gordon
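
The breakpoint question raised earlier in the thread (a vertical line at a coordinate such as 822, with height reflecting the amount of evidence) does not get a reply in this archive. One possible approach, offered only as an unverified sketch, is genoPlotR's seg_plot mechanism combined with segmentsGrob from the grid package. The breakpoint table is hypothetical: position 822 is the example from the thread, while the other positions and all read counts are made up. The gene table and the xlims range are likewise illustrative. Where the extra plot is placed relative to the gene track is decided by plot_gene_map and has not been checked here.

```r
library(grid)

# Hypothetical breakpoints: 822 is the coordinate mentioned in the thread;
# the other positions and all supporting-read counts are invented.
bp <- data.frame(pos = c(822, 2100, 3400), reads = c(12, 3, 7))

# Line height proportional to the evidence, scaled so the tallest line is 1.
bp$height <- bp$reads / max(bp$reads)

if (requireNamespace("genoPlotR", quietly = TRUE)) {
  library(genoPlotR)

  # Illustrative genes; with real data, use the dna_seg read from the file.
  genes <- dna_seg(data.frame(
    name   = c("E6", "E7", "E1"),
    start  = c(83, 562, 865),
    end    = c(559, 858, 2813),
    strand = 1,
    stringsAsFactors = FALSE
  ))

  # One vertical segment per breakpoint, from 0 up to its scaled height.
  bp_plot <- seg_plot(
    func = segmentsGrob,
    args = list(x0 = bp$pos, x1 = bp$pos,
                y0 = rep(0, nrow(bp)), y1 = bp$height,
                default.units = "native",
                gp = gpar(col = "red"))
  )

  plot_gene_map(dna_segs = list(genes),
                seg_plots = list(bp_plot),
                seg_plot_height = 1,
                seg_plot_height_unit = "inches",
                xlims = list(c(1, 4000)))  # assumed plotting range
}
```

Scaling the heights to a maximum of 1 keeps the y range predictable; raw read counts could be passed as y1 instead if an absolute axis is preferred.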