[Rcpp-devel] add new components to list without specifying list size initially
Walrus Foolhill
walrus.foolhill at gmail.com
Fri Aug 12 23:26:12 CEST 2011
Thanks for your advice, I now understand how to manipulate one-level lists:
fn <- cxxfunction(signature(l_in="list"),
body='
using namespace Rcpp;
List l(l_in);
IntegerVector lf = l["foo"];
CharacterVector lb = l["bar"];
for(int i=0; i<lf.size(); ++i)
Rprintf("l[%s][%i] %i\\n", "foo", i, lf[i]);
for(int i=0; i<lb.size(); ++i)
Rprintf("l[%s][%i] %s\\n", "bar", i, std::string(lb[i]).c_str());
', plugin="Rcpp", verbose=TRUE)
z <- fn(list(foo=c(1,2,3,4),bar=c("bar1","bar2")))
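A note on the %s conversions above: Rprintf follows printf conventions, so %s needs a const char*, which is why the loop goes through std::string(...).c_str(). Here is the same rule as a minimal sketch in plain C++ (no Rcpp, so it compiles on its own). The helper name format_entry and the buffer size are assumptions made up for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Rcpp): formats one entry the way the
// Rprintf calls above do, but into a std::string so it can be inspected.
std::string format_entry(const std::string& name, int index,
                         const std::string& value) {
    char buf[128];  // assumed large enough for this illustration
    // %s expects a const char*. A std::string object cannot be passed
    // through the variadic "..." of printf-style functions, so each
    // string argument is handed over via c_str().
    int written = std::snprintf(buf, sizeof buf, "l[%s][%i] %s",
                                name.c_str(), index, value.c_str());
    assert(written >= 0);  // snprintf reports encoding errors as negative
    return std::string(buf);
}
```

For example, format_entry("bar", 0, "bar1") yields the text l[bar][0] bar1.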
But what about 2-level lists? Why doesn't the following code compile?
fn <- cxxfunction(signature(l_in="list"),
body='
using namespace Rcpp;
List l(l_in);
List lf(l["foo"]);
', plugin="Rcpp", verbose=TRUE)
z <- fn(list(foo=list(bar=1)))
And what does the following message mean? "error: call of overloaded
‘Vector(Rcpp::internal::generic_name_proxy<19>)’ is ambiguous"
I had a look at "runit.Vector.R" on r-forge, but couldn't find any test
involving 2-level (or more) lists, although on SO in June 2010 (
http://stackoverflow.com/questions/3088650/how-do-i-create-a-list-of-vectors-in-rcpp/3088744#3088744),
you said that it should work.
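For what it's worth, the two snippets above differ in how the list element is used: IntegerVector lf = l["foo"]; is copy-initialization, while List lf(l["foo"]); is direct-initialization. In standard C++ that distinction matters when the right-hand side is a proxy that can convert to many types. The sketch below reproduces that situation in plain C++ with two made-up classes, Proxy and Vec. They are assumptions for illustration, not Rcpp's real classes, and whether the same workaround applies to a given Rcpp version is untested here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a list-element proxy: it converts to whatever
// type the caller asks for. This is NOT Rcpp's actual proxy class.
struct Proxy {
    int payload;
    template <typename T>
    operator T() const { return T(payload); }
};

// Hypothetical stand-in for a vector class with several one-argument
// constructors, each of which the proxy could feed.
struct Vec {
    int n;
    Vec(int size) : n(size) {}
    Vec(const Vec& other) : n(other.n) {}
    Vec(const std::string& s) : n(static_cast<int>(s.size())) {}
};

// Copy-initialization: the constructors may not use a user-defined
// conversion on their argument here, so only Proxy::operator Vec() applies.
Vec via_copy_init(const Proxy& p) {
    Vec v = p;
    return v;
}

// Alternative: convert to one concrete type first, then construct from it.
Vec via_intermediate(const Proxy& p) {
    int size = p;
    Vec v(size);
    return v;
}

// Direct-initialization, by contrast, considers every constructor, and the
// proxy can convert to int, Vec and std::string equally well:
//   Vec v(p);   // expected to be rejected as an ambiguous overloaded call
```

With these definitions, via_copy_init(Proxy{3}).n and via_intermediate(Proxy{3}).n are both 3, while uncommenting the direct-initialization line is expected to fail with an "ambiguous" diagnostic much like the one quoted above.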
I checked that I can create a 2-level list, but the code below doesn't
compile if I uncomment the last Rprintf line:
fn <- cxxfunction(signature(),
body='
using namespace Rcpp;
IntegerVector vi(2);
vi[0] = 2;
vi[1] = 8;
List ll = List::create(Named("bar")=vi);
Rprintf("ll.size %i\\n", ll.size());
List l = List::create(Named("foo")=ll);
Rprintf("l.size %i\\n", l.size());
//Rprintf("l.ll.size %i\\n", l["foo"].size());
return l;
', plugin="Rcpp", verbose=TRUE)
print(fn())
Thus once again I'm stuck, but if I know how to access 2-level lists, I
think I will be able to go back to my original problem, and stop sending
emails on this mailing list ;)
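As for the original problem, Dirk's earlier suggestion in this thread (collect results in a C++ data structure first, then build the R object once the size is known) can be sketched without any Rcpp types. The code below is only a sketch under assumptions: Interval, Links and map_probes are hypothetical names, each data frame row is represented by a plain struct, and a single chromosome is handled. Converting the resulting map to an R list is left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical plain-C++ record standing in for one data frame row.
struct Interval {
    std::string name;
    int start;
    int end;
};

// Gene name -> names of the probes contained in that gene.
typedef std::map<std::string, std::vector<std::string> > Links;

// For each gene, collect the probes lying entirely inside it.
// Assumes the probes are sorted by start coordinate, as the R function does.
Links map_probes(const std::vector<Interval>& genes,
                 const std::vector<Interval>& probes) {
    Links links;
    for (const Interval& gene : genes) {
        for (const Interval& probe : probes) {
            if (probe.start >= gene.start && probe.end <= gene.end) {
                // operator[] creates the entry on the first hit only, so
                // genes without probes never appear in the result.
                links[gene.name].push_back(probe.name);
            } else if (probe.start > gene.end) {
                break;  // sorted input: later probes start even further right
            }
        }
    }
    return links;
}
```

With the chr1 data from the thread (genes g1 11-90 and g2 111-190, probes p1 81-85 and p2 95-100), the map ends up with the single entry g1 -> {"p1"}, which matches the links example. The container grows as needed, and its final size is known before anything is handed back to R.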
On Fri, Aug 12, 2011 at 8:09 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> On 12 August 2011 at 01:22, Walrus Foolhill wrote:
> | Ok, I started with smaller examples. I understand more or less how to
> | manipulate IntegerVectors, but not StringVectors (see below), and thus I
> | can't even start manipulating a simple list of StringVectors. Even so I
> | looked at mailing lists, StackOverflow, package pdf, source code on R-Forge...
> |
> | The following code tells me "warning: cannot pass objects of non-POD type
> | ‘struct Rcpp::internal::string_proxy<16>’ through ‘...’; call will abort
> | at runtime": why does it complain about printing the string in vec_s[i]?
>
> Again, simpler helps. That is the standard C / C++ error message of
>
> std::string foo = "bar";
> printf("String is %s \n", foo);
>
> where you need foo.c_str() to pass a char* to printf.
>
> | fn <- cxxfunction(signature(l_in="list"),
> | body='
> | using namespace Rcpp;
> | List l = List(l_in);
> | Rprintf("list size: %d\\n", l.size());
> |
> | IntegerVector vec_i= IntegerVector(2);
> | vec_i[0] = 1;
> | vec_i[1] = 2;
> | List l2 = List::create(_["vec"] = vec_i);
> | Rprintf("vec_i size: %d\\n", vec_i.size());
> | for(int i=0; i<vec_i.size(); ++i)
> | Rprintf("vec_i[%d]=%d\\n", i, vec_i[i]);
> |
> | StringVector vec_s = StringVector::create("toto");
> | vec_s[0] = "toto";
> | Rprintf("vec_s size: %d\\n", vec_s.size());
> | for(int i=0; i<vec_s.size(); ++i)
> | Rprintf("vec_s[%d]=%s\\n", i, vec_s[i]);
>
> Try vec_s[i].c_str() instead.
>
> Dirk
>
> | return l2;
> | ',
> | plugin="Rcpp", verbose=TRUE)
> | print(fn(list(a=c(1,2,3), b=c("a","b","c"))))
> |
> | Moreover, how can I access the component of a list given as input, as
> | "l_in" above? Should I use l.begin()? or l[1]? or l["a"]? none of them
> | seems to compile successfully.
> |
> | On Thu, Aug 11, 2011 at 8:54 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> |
> |
> | Howdy,
> |
> | On 11 August 2011 at 20:44, Walrus Foolhill wrote:
> | | Ok, thanks for your answer, but I wasn't clear enough. So here are more
> | | details of what I want to do.
> | |
> | | I have one list named "probes":
> | | probes <- list(chr1=data.frame(name=c("p1","p2"),
> | | start=c(81,95),
> | | end=c(85,100),
> | | stringsAsFactors=FALSE))
> | |
> | | I also have one list named "genes":
> | | genes <- list(chr1=data.frame(name=c("g1","g2"), start=c(11,111),
> | | end=c(90,190)),
> | | chr2=data.frame(name="g3", start=11, end=90))
> | |
> | | I need to compare those two lists in order to obtain the following list,
> | | which contains, for each gene, the name of the probes included in it:
> | | links <- list(chr1=list(g1=c("p1")))
> | |
> | | Here is my R function (assuming that the probes are sorted based on their
> | | start and end coordinates):
> | |
> | | fun.l <- function(genes, probes){
> | | links <- lapply(names(genes), function(chr.name){
> | | if(! chr.name %in% names(probes))
> | | return(NULL)
> | |
> | | res <- list()
> | |
> | | genes.c <- genes[[chr.name]]
> | | probes.c <- probes[[chr.name]]
> | |
> | | for(gene.name in genes.c$name){
> | | gene <- genes.c[genes.c$name == gene.name,]
> | | res[[gene.name]] <- vector()
> | | for(probe.name in probes.c$name){
> | | probe <- probes.c[probes.c$name == probe.name,]
> | | if(probe$start >= gene$start && probe$end <= gene$end)
> | | res[[gene.name]] <- append(res[[gene.name]], probe.name)
> | | else if(probe$start > gene$end)
> | | break
> | | }
> | | if(length(res[[gene.name]]) == 0)
> | | res[[gene.name]] <- NULL
> | | }
> | |
> | | if(length(res) == 0)
> | | res <- NA
> | | return(res)
> | | })
> | | names(links) <- names(genes)
> | | links <- Filter(function(links.c){!is.null(links.c)}, links)
> | | return(links)
> | | }
> | |
> | | And here is the beginning of my attempt using Rcpp:
> | |
> | | src <- '
> | | using namespace Rcpp;
> | |
> | | List genes = List(genes_in);
> | | int genes_nb_chr = genes.length();
> | | std::vector<std::string> genes_chr = genes.names();
> | |
> | | List probes = List(probes_in);
> | | int probes_nb_chr = probes.length();
> | |
> | | std::vector< std::vector<std::string> > links;
> | |
> | | // the main task is performed in this loop
> | | for(int chrnum=0; chrnum<genes_nb_chr; ++chrnum){
> | | DataFrame genes_c = DataFrame(genes[chrnum]);
> | | // ... add code to map probes on genes, that is fill "links" ...
> | | }
> | |
> | | return wrap(links);
> | | '
> | |
> | | funC <- cxxfunction(signature(genes_in="list",
> | | probes_in="list"),
> | | body=src, plugin="Rcpp")
> | |
> | | The problem starts quite early: when I compile this piece of code, I get
> | | "error: call of overloaded ‘DataFrame(Rcpp::internal::generic_proxy<19>)’
> | | is ambiguous".
> |
> | Try a simpler mock-up. I don't have it in me to work through this now.
> | DataFrames are a little different from C++ -- start by trying to summarize
> | in just a vector, or collection of vectors.
> |
> | | What should I do to go through the "probes" and "genes" lists given as
> | | input? Maybe more generically, how can we go through a list of lists (of
> | | lists...) with Rcpp?
> | |
> | | 2nd (small) question: I can't manage to use Rprintf when using inline, for
> | | instance Rprintf("%d\n", i);, it complains about the quotes. What should
> | | I do to print a statement from within the for loop?
> |
> | The backslashes need escaping as in
> |
> | R> printing <- cxxfunction(, plugin="Rcpp", body=' Rprintf("foo\\n"); ')
> | R> printing()
> | foo
> | NULL
> | R>
> |
> | | Thanks in advance. As my question is very long, I won't mind if you tell
> | | me to find another way by myself. But maybe one of you can put me on the
> | | right track.
> |
> | You are doing well, but you have a decent-sized problem. Try breaking it
> | into smaller pieces and getting a handle on each problem in turn.
> |
> | Dirk
> |
> | |
> | | On Thu, Aug 11, 2011 at 7:00 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
> | |
> | |
> | | On 11 August 2011 at 03:06, Walrus Foolhill wrote:
> | | | Hello,
> | | | I need to create a list and then fill it sequentially by adding
> | | | components in a for loop. Here is an example that works:
> | | |
> | | | library(inline)
> | | | src <- '
> | | | Rcpp::List mylist(2);
> | | | for(int i=0; i<2; ++i)
> | | | mylist[i] = i;
> | | | mylist.names() = CharacterVector::create("a","b");
> | | | return mylist;
> | | | '
> | | | fun <- cxxfunction(body=src, plugin="Rcpp")
> | | | print(fun())
> | | |
> | | | But what I really want is to create an empty list and then fill it, that
> | | | is, without specifying its number of components beforehand... This is
> | | | because I don't know in advance at which step of the for loop I will need
> | | | to create a new component. Here is an example that obviously doesn't
> | | | work, but that should show what I am looking for:
> | | |
> | | | Rcpp::List mylist;
> | | | CharacterVector names = CharacterVector::create("a", "b");
> | |
> | | If you know how long names is, you know how long mylist is going to be ....
> | |
> | | | for(int i=0; i<2; ++i){
> | | | mylist.add(names[i], IntegerVector::create());
> | | | mylist[names[i]].push_back(i);
> | |
> | | I don't understand what that is trying to do.
> | |
> | | | }
> | | | return mylist;
> | | |
> | | | Do you know how I could achieve this? Thanks.
> | |
> | | Rcpp::List is an alias for Rcpp::GenericVector, and derives from Vector.
> | | You can look at the public member functions -- there are things like
> | |
> | | push_back()
> | | push_front()
> | | insert()
> | |
> | | etc that behave like STL functions __but are inefficient as we (almost
> | | always) need to copy the whole object__ so they are not recommended.
> | |
> | | When I had to deal with returning 'unknown quantities of data' I was mostly
> | | able to either turn it into a 'fixed or known columns, unknown rows' problem
> | | (easy, just grow row-wise) or I 'cached' in a C++ data structure first
> | | before returning to R via Rcpp structures -- and then I knew the dimensions
> | | for the to-be-created object too.
> | |
> | | Dirk
> | |
> | |
> | | --
> | | Two new Rcpp master classes for R and C++ integration scheduled
> for
> | | New York (Sep 24) and San Francisco (Oct 8), more details are
> at
> | | http://dirk.eddelbuettel.com/blog/2011/08/04#
> | | rcpp_classes_2011-09_and_2011-10
> | |
> | |
> |
>