From santosh.srinivas at gmail.com Tue Jan 6 05:44:14 2015
From: santosh.srinivas at gmail.com (Santosh Srinivas)
Date: Tue, 6 Jan 2015 10:14:14 +0530
Subject: [datatable-help] Append missing characters to character variable to bring to a standard length
Message-ID:

Hello All,

I am trying to create a character variable in my data table so that it is *at least* 6 characters long.
The last line seems to be going wrong.

Please advise. Reproducible code below:

require("data.table")
value <- sample(60:1000,100) #Random data between 60 and 1000
dt <- as.data.table(value)
dt[,value:=as.character(value)] #Cast as character
dt[, value_MISSINGDIGITS:=6-nchar(value)] #Check for # of characters missing
dt[, value_MISSINGDIGITS:=value_MISSINGDIGITS*(value_MISSINGDIGITS>0)] #Handle negative values

# This works till here!
# The missing character count works correctly above

# The below fails. I am trying to generate dummy 0s to fill the missing characters. The values do not get generated accurately
dt[,value_MISSINGPART:=substr("000000",0,value_MISSINGDIGITS)]

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] fractalrock_1.1.0   quantmod_0.4-0      TTR_0.22-0          xts_0.9-7           zoo_1.7-11          Defaults_1.1-1
 [7] futile.logger_1.3.7 futile.any_1.3.0    lambda.r_1.1.6      timeDate_3011.99    lubridate_1.3.3     data.table_1.9.4

loaded via a namespace (and not attached):
 [1] Rcpp_0.11.3          chron_2.3-45         digest_0.6.4         futile.options_1.0.0 grid_3.1.2
 [6] lattice_0.20-29      memoise_0.2.1        plyr_1.8.1           reshape2_1.4         stringr_0.6.2
[11] tools_3.1.2

-------------- next part --------------
An HTML attachment was scrubbed...
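A minimal sketch of why the substr() line in the message above produces the same pad for every row. It relies on base R's documented behaviour rather than on anything stated in the post: substr() returns one result per element of its first argument, so a single "000000" gives a single string, which := then recycles down the column. The sample values and the column names n_missing, pad and padded are made up for illustration.

```r
library(data.table)

# Illustrative data (hypothetical values, not the poster's random sample)
dt <- data.table(value = as.character(c(60, 123, 1000, 1234567)))

# Number of characters missing to reach width 6, floored at 0
dt[, n_missing := pmax(6L - nchar(value), 0L)]

# substr() returns one result per element of its FIRST argument,
# so a single "000000" yields a single string, whatever n_missing holds
one_value <- substr("000000", 1, dt$n_missing)

# Repeating the template once per row gives one pad string per row
dt[, pad := substr(rep("000000", .N), 1, n_missing)]

# Append the pad so that every value is at least 6 characters long
dt[, padded := paste0(value, pad)]
```

Values that are already 6 characters or longer get an empty pad and are left unchanged.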

From mel at mbacou.com Tue Jan 6 06:10:48 2015
From: mel at mbacou.com (Bacou, Melanie)
Date: Tue, 06 Jan 2015 00:10:48 -0500
Subject: [datatable-help] Append missing characters to character variable to bring to a standard length
In-Reply-To:
References:
Message-ID: <54AB6E58.6090302@mbacou.com>

substr() needs argument fixed=TRUE or else it uses regex expression for string matching and replacements.
I'd suggest you look into stringr::str_pad(x, 6) or formatC(x, width = 6, format = "d", flag = "0") for faster ways to achieve this.
--Mel.
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

--
Melanie BACOU
International Food Policy Research Institute
Snr. Program Manager, HarvestChoice
Work +1(202)862-5699
E-mail m.bacou at cgiar.org
Visit www.harvestchoice.org

From harishv_99 at yahoo.com Tue Jan 6 09:01:50 2015
From: harishv_99 at yahoo.com (Harish)
Date: Tue, 6 Jan 2015 08:01:50 +0000 (UTC)
Subject: [datatable-help] fread() without quoting?
Message-ID: <1487850865.3079353.1420531310885.JavaMail.yahoo@jws10085.mail.ne1.yahoo.com>

I noticed that fread() has now started treating quotes (") as special. I have large data files to read where I do not want to use this quoting feature. Is there a way to disable quoting when I read a file?

Regards,
Harish
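A hedged sketch for the fread() question above, assuming a data.table release newer than the one current when this message was posted: later releases of fread() have a quote argument, and quote = "" turns off the special handling of double quotes. The text argument used here to supply inline data is also a later addition, and the sample input is hypothetical.

```r
library(data.table)

# Hypothetical input with stray double quotes inside fields
txt <- 'id,label\n1,"abc\n2,d"ef\n'

# quote = "" asks fread() to treat double quotes as ordinary characters
dt <- fread(text = txt, sep = ",", header = TRUE, quote = "")
```

With a real file, the path would take the place of the text argument.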

From santosh.srinivas at gmail.com Tue Jan 6 09:36:52 2015
From: santosh.srinivas at gmail.com (Santosh Srinivas)
Date: Tue, 6 Jan 2015 14:06:52 +0530
Subject: [datatable-help] Append missing characters to character variable to bring to a standard length
In-Reply-To: <54AB6E58.6090302@mbacou.com>
References: <54AB6E58.6090302@mbacou.com>
Message-ID:

Thank you Mel. str_pad is what I was looking for.

require("data.table")
value <- sample(60:1000,100) #Random data between 60 and 1000
dt <- as.data.table(value)
dt[,value:=stringr::str_pad(value, 6, pad="0", side="right")]

From fperickson at wisc.edu Wed Jan 14 23:06:36 2015
From: fperickson at wisc.edu (Frank Erickson)
Date: Wed, 14 Jan 2015 17:06:36 -0500
Subject: [datatable-help] best way of eval-ing a list of quoted expressions
Message-ID:

Hi,

I'm wondering what the most idiomatic or efficient approach is...?
Here's my example:

expr_nonlin = list(
  early = quote(tt/TT*(tt/TT < .2)),
  late = quote(tt/TT*(tt/TT > .8))
)

# eval on a single expr works
data.table(tt=1,TT=100)[,early:=eval(expr_nonlin$early)][]

# lapply eval does not work
data.table(tt=1,TT=100)[,names(expr_nonlin):=lapply(expr_nonlin,eval)][]

# (1) envir fixes it
DT <- data.table(tt=1,TT=100)
DT[,names(expr_nonlin):=lapply(expr_nonlin,eval,envir=DT)][]

# (2) or a for loop
DT <- data.table(tt=1,TT=100)
for (i in names(expr_nonlin)) DT[,(i):=eval(expr_nonlin[[i]])]

(1) and (2) both work. Is either preferable?

(1) calls [.data.table fewer times, but messes around with environments, which always seem fragile.

------------------

One more quick question: In approach (1), is there a way to skip the names(zzz):= part? I see that this doesn't work:

DT <- data.table(tt=1,TT=100)
DT[,do.call(`:=`,lapply(expr_nonlin,eval,envir=DT))][]

Thanks,

Frank

From jmtruppia at gmail.com Thu Jan 15 17:07:40 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Thu, 15 Jan 2015 16:07:40 +0000
Subject: [datatable-help] datatable-help Digest, Vol 59, Issue 2
References:
Message-ID:

I don't know what you are trying to achieve, but I usually quote the list, instead of generating a list of quotes. I think that your issue is similar to something I've faced in the past, and I usually solve it like this

dt <- data.table(a = runif(10))
ee <- quote(list(3 * a, a +2))
dt[, c("b", "c") := eval(ee)]

I still don't know how to define the column names in the quoted expression, instead of in the `:=` call.

Hope it helps

From fperickson at wisc.edu Fri Jan 16 15:19:37 2015
From: fperickson at wisc.edu (Frank Erickson)
Date: Fri, 16 Jan 2015 09:19:37 -0500
Subject: [datatable-help] datatable-help Digest, Vol 59, Issue 2
In-Reply-To:
References:
Message-ID:

Thanks! I keep forgetting that I can eval() a larger statement. I think I'll stick to an option that keeps the names close to the definitions for now, though.

By the way, you replied to the r-forge forum "digest," outside of the original thread. I might've missed the reply if there weren't so few active conversations.

From jmtruppia at gmail.com Fri Jan 16 15:56:28 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Fri, 16 Jan 2015 14:56:28 +0000
Subject: [datatable-help] best way of eval-ing a list of quoted expressions (Frank Erickson)
References:
Message-ID:

With a little more work you could keep the definitions with the names. Here is how

dt <- data.table(a = runif(10))
ee <- quote(list(3 * a, a +2))
dt[, names(ee)[-1] := eval(ee)]

Sorry, hope this now sticks to the thread
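For reference, a small side-by-side sketch of approaches (1) and (2) from the original question in this thread. The three-row input is made up for illustration, and the names DT1, DT2 and nm are not from the thread. It only shows that both approaches add the same columns; it does not settle which one is preferable.

```r
library(data.table)

expr_nonlin <- list(
  early = quote(tt / TT * (tt / TT < .2)),
  late  = quote(tt / TT * (tt / TT > .8))
)

# (1) one call to [.data.table; each expression is evaluated with DT1 as envir
DT1 <- data.table(tt = c(1, 50, 90), TT = 100)
DT1[, names(expr_nonlin) := lapply(expr_nonlin, eval, envir = DT1)]

# (2) one call per expression; eval() resolves tt and TT inside j
DT2 <- data.table(tt = c(1, 50, 90), TT = 100)
for (nm in names(expr_nonlin)) DT2[, (nm) := eval(expr_nonlin[[nm]])]
```

Approach (1) names the table twice (once as the object being modified, once as envir), which is the part the question calls fragile; approach (2) avoids that at the cost of one [.data.table call per expression.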

From fperickson at wisc.edu Fri Jan 16 17:34:49 2015
From: fperickson at wisc.edu (Frank Erickson)
Date: Fri, 16 Jan 2015 11:34:49 -0500
Subject: [datatable-help] best way of eval-ing a list of quoted expressions (Frank Erickson)
In-Reply-To:
References:
Message-ID:

Ah, good idea. That's what I'll do next time.

I hadn't realized that "names" could look inside a quoted/language object.

Thanks!
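A short sketch of what names() returns for a quoted call, which is what the names(ee)[-1] idiom relies on. The names only exist when the arguments of the quoted list() are themselves named; the variable names nm and cols and the three-row table are illustrative.

```r
library(data.table)

# A quoted call to list() with named arguments
ee <- quote(list(b = 3 * a, c = a + 2))

# names() on a call has one entry per element; the first slot is the
# function being called (here `list`), which carries an empty name
nm <- names(ee)

# Dropping that first slot leaves the intended column names
cols <- names(ee)[-1]

dt <- data.table(a = c(1, 2, 3))
dt[, (cols) := eval(ee)]

# With unnamed arguments there are no names to recover
no_names <- names(quote(list(3 * a, a + 2)))
```

This keeps each column name next to its definition inside the quoted expression, at the cost of requiring every argument to be named.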

From jmtruppia at gmail.com Fri Jan 16 17:43:47 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Fri, 16 Jan 2015 16:43:47 +0000
Subject: [datatable-help] best way of eval-ing a list of quoted expressions (Frank Erickson)
References:
Message-ID:

Forgot to change the second line. Here it goes

dt <- data.table(a = runif(10))
ee <- quote(list(b = 3 * a, c =a +2))
dt[, names(ee)[-1] := eval(ee)]

From jmtruppia at gmail.com Mon Jan 19 22:45:12 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Mon, 19 Jan 2015 21:45:12 +0000
Subject: [datatable-help] Using := in .onLoad
Message-ID:

Hi, I'm using data.table inside my own packages. I'm having some trouble using data.table functions on my .onLoad method.
I'm actually depending (not importing) data.table, but even then, I can't get := to work in .onLoad.
I'm using the `:=`(a = x, b = y) version, and getting

Error in `:=`(a = x) :
  Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

If I comment that, and then run it from the console after the package is loaded, it works.

Any ideas?

From aragorn168b at gmail.com Mon Jan 19 23:17:59 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Mon, 19 Jan 2015 23:17:59 +0100
Subject: [datatable-help] Using := in .onLoad
In-Reply-To:
References:
Message-ID:

Juan,

`:=` is designed to be used only within the frame of data.table.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmtruppia at gmail.com Mon Jan 19 23:31:13 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Mon, 19 Jan 2015 22:31:13 +0000
Subject: [datatable-help] Using := in .onLoad
References:
Message-ID:

Arun, what does "within the frame of data.table" mean?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aragorn168b at gmail.com Mon Jan 19 23:35:02 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Mon, 19 Jan 2015 23:35:02 +0100
Subject: [datatable-help] Using := in .onLoad
In-Reply-To:
References:
Message-ID:

Within the square brackets in "DT[ .... ]" and even there, only in `j` (as the error message points out).
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmtruppia at gmail.com Tue Jan 20 12:33:14 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 11:33:14 +0000
Subject: [datatable-help] Using := in .onLoad
References:
Message-ID:

I already know that.

I usually struggle when using data.table inside my own packages. I start by listing it in imports only (and not importing anything) but end up depending on it, so that users can manipulate the data.tables with [ and :=.
I end up importing some or all of data.table also, because I don't know how to call := with the :: notation.

My specific problem was with using := during .onLoad.
I already know how to use it, and know that it can only be used inside [.
Maybe it is imperative to @import data.table to be able to use :=?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aragorn168b at gmail.com Tue Jan 20 14:01:00 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 20 Jan 2015 14:01:00 +0100
Subject: [datatable-help] Using := in .onLoad
In-Reply-To:
References:
Message-ID:

Juan,

> I already know that.

Okay, great.

> I usually struggle when using data.table inside my own packages. I start
> by listing it in imports only (and not importing anything)

"imports only (and not importing anything)"? - what do you mean?

> but end up depending on it, so that users can manipulate the
> data.tables with [ and :=.

It depends on whether you want the data.table NAMESPACE to be attached or not. I think these posts may help:

http://r.789695.n4.nabble.com/Re-R-CMD-check-checking-in-development-version-of-R-td4696125.html#none
http://stackoverflow.com/questions/8637993/better-explanation-of-when-to-use-imports-depends

> I end up importing some or all of data.table also, because I don't know
> how to call := with the :: notation.

I don't really follow this. Why do you want to use `::` along with `:=`? In the last post we just discussed that it can't be done, as it is not designed to be used outside of `[...]`, and even there only in very specific ways. Check data.table:::`:=`. It is designed to error. This post might help: http://stackoverflow.com/q/7033106/559784

> My specific problem was with using := during .onLoad.

It'd be useful to know what you are trying to do, along with your code.

> I already know how to use it, and know that it can only be used inside [.
> Maybe it is imperative to @import data.table to be able to use :=?

I don't follow exactly how you are trying to use := to answer this.

Arun
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmtruppia at gmail.com Tue Jan 20 14:22:02 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 13:22:02 +0000
Subject: [datatable-help] Using := in .onLoad
References:
Message-ID:

Let's follow an example. I'm developing my own package. I add data.table to Imports in my DESCRIPTION file. This means I have to use the :: operator each time I want to call a data.table function. Can I use := in this setup?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aragorn168b at gmail.com Tue Jan 20 14:32:24 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 20 Jan 2015 14:32:24 +0100
Subject: [datatable-help] Using := in .onLoad
In-Reply-To:
References:
Message-ID:

On Tue, Jan 20, 2015 at 2:22 PM, Juan Manuel Truppia wrote:

> Let's follow an example. I'm developing my own package. I add data.table
> to Imports in my DESCRIPTION file. This means I have to use the :: operator
> each time I want to call a data.table function. Can I use := in this setup?

"I have to use / Can I use" - do you mean the user of your package, or yourself as the developer trying to use data.table functions and := from within your package?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmtruppia at gmail.com Tue Jan 20 14:36:01 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 13:36:01 +0000
Subject: [datatable-help] Using := in .onLoad
References:
Message-ID:

Myself as the developer trying to use data.table functions from within my package.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aragorn168b at gmail.com Tue Jan 20 14:41:19 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 20 Jan 2015 14:41:19 +0100
Subject: [datatable-help] Using := in .onLoad
In-Reply-To:
References:
Message-ID:

There should be no need to use :: to call data.table functions then. You can use all the exported functions as such. And `:=` within DT[...] would / should work just fine.

Now that I understand your setup, why do you think you need to use :: for every data.table function? What happens when you don't? And why did you get an error with :=?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jmtruppia at gmail.com Tue Jan 20 14:48:44 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 13:48:44 +0000
Subject: [datatable-help] Using := in .onLoad
References:
Message-ID:

Yes, if I just list data.table in Imports in my DESCRIPTION file, then I have to use :: every time I call functions that are outside the base package. The only workaround is using import(data.table) in the NAMESPACE.
-------------- next part --------------
An HTML attachment was scrubbed...
From aragorn168b at gmail.com Tue Jan 20 14:51:29 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 20 Jan 2015 14:51:29 +0100
Subject: [datatable-help] Using := in .onLoad

Yes you've to add import(data.table) to your NAMESPACE file. What's difficult about that? How do you import other packages?

And this is also FAQ 6.9 - http://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf

From mickcooney at gmail.com Tue Jan 20 14:52:55 2015
From: mickcooney at gmail.com (Mick Cooney)
Date: Tue, 20 Jan 2015 13:52:55 +0000
Subject: [datatable-help] Using := in .onLoad

Hi Juan,

That isn't really a workaround, it is the way you are supposed to import things these days. The functionality for NAMESPACE files etc changed a while ago with one of the version releases of R (I think it was something like 2.13, but don't quote me). If you import it via a namespace it should work fine.

BTW, if you are developing a package, I highly recommend using Hadley Wickham's devtools package. Combine that with roxygen2 for documentation, and a huge amount of the bookkeeping and maintenance of packages is taken care of for you.

--
Mick Cooney
mickcooney at gmail.com

From jmtruppia at gmail.com Tue Jan 20 15:01:04 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 14:01:04 +0000
Subject: [datatable-help] Using := in .onLoad

Yes, I use Hadley's tools regularly, and it's Hadley himself who recommends not importing the whole namespace, to avoid polluting it (check his R packages book).

From aragorn168b at gmail.com Tue Jan 20 15:13:16 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 20 Jan 2015 15:13:16 +0100
Subject: [datatable-help] Using := in .onLoad

In that case, you can use `importFrom()` as mentioned here:
http://cran.r-project.org/doc/manuals/R-exts.html#Specifying-imports-and-exports

From jmtruppia at gmail.com Tue Jan 20 15:23:00 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Tue, 20 Jan 2015 14:23:00 +0000
Subject: [datatable-help] Using := in .onLoad

Yes, I have that option, but is it enough to just import := ? Per the FAQ you mentioned, I should be importing the whole data.table package. Should I also import [ ?
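
A minimal sketch of the selective-import route, assuming a hypothetical package whose NAMESPACE carries an importFrom() line for data.table. The function name add_flag and the column names are made up for illustration. `[` is dispatched by S3 on the class of the object, so it is not something that gets listed in importFrom(); whether importing only := is sufficient for every use inside a package is not settled in this thread.

```r
# NAMESPACE of the hypothetical package (shown here as comments only):
#   importFrom(data.table, data.table, ":=")

# Adds a logical column by reference; := appears only inside DT[...], in j.
add_flag <- function(DT) {
  DT[, flag := TRUE]
  invisible(DT)
}

# Outside a package, the same function can be tried once data.table is loaded.
library(data.table)
DT <- data.table(x = 1:3)
add_flag(DT)
names(DT)
```

Because := works by reference, DT in the calling environment gains the new column without any reassignment.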

From mickcooney at gmail.com Tue Jan 20 15:23:47 2015
From: mickcooney at gmail.com (Mick Cooney)
Date: Tue, 20 Jan 2015 14:23:47 +0000
Subject: [datatable-help] Using := in .onLoad

Yep, though for something like data.table (and I've had similar issues with xts - though that could be my poor coding), where a lot of the syntax itself gets changed and not just the availability of functions, you are probably better off just biting the bullet and importing it.

That said, I use data.table for everything, so importing it isn't that big a deal for me. I'm not nearly as good a coder as Hadley Wickham though, so I might be storing problems up for the future.
:)

--
Mick Cooney
mickcooney at gmail.com

From btupper at bigelow.org Thu Jan 22 18:11:34 2015
From: btupper at bigelow.org (Ben Tupper)
Date: Thu, 22 Jan 2015 12:11:34 -0500
Subject: [datatable-help] number of rows selected in .SD subset

Hello,

I have been learning to use data.table and studying the vignette located here...

https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html

Section 2f. shows how to subset a data.table to select an arbitrary number of rows in each .SD. That's really handy.

2. Aggregations
f. Subset .SD for each group: ans <- flights[, head(.SD, 2), by=month]

In a similar way, I can get the last row of the .SD using either tail, nrow or dim (I don't think it matters much, but dim seems to be a bit faster*).

ans <- flights[,.SD[dim(.SD)[1]], by=month]

I got to wondering if the number of rows in .SD might be exposed in each grouping iteration. Is there an equivalent to .N for the subset data.table, .SD? Something like .SDN or the like?

Thanks for data.table!

Ben

* After reading this discussion
http://r.789695.n4.nabble.com/What-is-the-fastest-way-to-determine-that-data-table-is-empty-td4638348.html#a4638451
I tried out a couple of methods for getting the last element of a grouping using nrow(), tail() and dim().
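
The three variants compared in this post can be checked for equivalence on a small table before timing them. The sketch below is an illustration only: flights_toy and its values are made up as a stand-in for the flights data, which is assumed to have a month column.

```r
library(data.table)

# Toy stand-in for the flights data (hypothetical values).
flights_toy <- data.table(
  month     = c(1L, 1L, 2L, 2L, 2L),
  dep_delay = c(5L, -2L, 10L, 0L, 7L)
)

# Last row of each group, written three ways.
last1 <- flights_toy[, tail(.SD, 1), by = month]
last2 <- flights_toy[, .SD[dim(.SD)[1]], by = month]
last3 <- flights_toy[, .SD[nrow(.SD)], by = month]

# TRUE when all three approaches select the same rows.
same <- isTRUE(all.equal(last1, last2)) && isTRUE(all.equal(last1, last3))
```

Nothing is timed here; the point is only that the three expressions pick the same last row per group.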

# using tail
> microbenchmark( last1 <- flights[, tail(.SD, 1), by=month] )
Unit: milliseconds
                                         expr      min       lq     mean   median       uq      max neval
 last1 <- flights[, tail(.SD, 1), by = month] 16.65898 16.89704 18.26415 17.37007 19.20147 40.12966   100

# using dim
> microbenchmark( last2 <- flights[,.SD[dim(.SD)[1]], by=month] )
Unit: milliseconds
                                             expr      min       lq     mean   median       uq      max neval
 last2 <- flights[, .SD[dim(.SD)[1]], by = month] 15.51243 15.87788 17.40978 16.19426 17.83308 59.22429   100

# using nrow
> microbenchmark( last3 <- flights[,.SD[nrow(.SD)], by=month] )
Unit: milliseconds
                                           expr      min       lq     mean   median       uq      max neval
 last3 <- flights[, .SD[nrow(.SD)], by = month] 15.63919 15.92073 17.28836 16.52588 18.33867 24.92624   100

> identical(last1, last2)
[1] TRUE
> identical(last1, last3)
[1] TRUE

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

From aragorn168b at gmail.com Thu Jan 22 20:11:32 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Thu, 22 Jan 2015 20:11:32 +0100
Subject: [datatable-help] number of rows selected in .SD subset

Ben,

Great to hear that you're going through the vignette.

To get the last row, you can similarly do:

DT[, tail(.SD, 1L), by=month] # ~ as you say
DT[, .SD[.N], by=month] # ~ since .N contains the number of observations in this group
DT[, .SD[(.N-1L):.N], by=month] # ~ last two rows per group

However, `.SD[...]` per group is slightly slower (especially on many groups) as it has to go through `[.data.table` (`[` is an S3 generic, and dispatching to the right method takes time, which can get noticeable on large groups), and not all cases are optimised.

You can also use `.I` (which is deliberately not mentioned in the vignette to keep things smooth and straightforward). Using it you could do:

idx = DT[, .I[1L], by=month][, V1]
DT[idx]

`.I` contains the row number in `x` (it doesn't reset per group).
So we can get the row indices for each group for the first element, and then simply subset. We hope to improve this subset in the future (to take care of this optimisation internally).

Similarly:

idx = DT[, .I[.N], by=month][, V1]
DT[idx]

will get the last element for each group.

Otherwise, how do you find the vignette so far?

HTH,
Arun

From btupper at bigelow.org Thu Jan 22 20:35:09 2015
From: btupper at bigelow.org (Ben Tupper)
Date: Thu, 22 Jan 2015 14:35:09 -0500
Subject: [datatable-help] number of rows selected in .SD subset
Message-ID: <8595409E-C040-45C5-B2D0-2350385D0FE7@bigelow.org>

Hi Arun,

The vignette is very helpful; the ?data.table help page is so rich and dense that I end up wandering quite a bit. The vignette does a nice job laying it out logically. I'm sure it has been a huge effort.

> DT[, .SD[.N], by=month] # ~ since .N contains the number of observations in this group

Doh!
Now I see it is stated clearly right under my nose: ".N is a special in-built variable that holds the number of observations in the current group." I'm not sure why I thought .N was for the original data.table instead of the grouping.

I have switched my script to use the above and it is lightning fast now. I'm going to start wearing a seatbelt and helmet...

Thank you!
Ben

P.S. How did you get download.file to read an https URL for the example? Here's what I get...

flights <- fread("https://raw.githubusercontent.com/wiki/arunsrinivasan/flights/NYCflights14/flights14.csv")
Error in download.file(input, tt, mode = "wb") : unsupported URL scheme

So, I downloaded it manually using a browser and used my local copy instead.

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

From jmtruppia at gmail.com Thu Jan 22 20:53:36 2015
From: jmtruppia at gmail.com (Juan Manuel Truppia)
Date: Thu, 22 Jan 2015 19:53:36 +0000
Subject: [datatable-help] number of rows selected in .SD subset

I hadn't seen the vignette. It's really great Arun, nice work! I see that currently in the master branch there is only one other vignette. Do you have more in your plans or in other branches? I'll gladly review them and submit pull requests.

Thanks again!

From aragorn168b at gmail.com Thu Jan 22 21:02:59 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Thu, 22 Jan 2015 21:02:59 +0100
Subject: [datatable-help] number of rows selected in .SD subset
In-Reply-To: <8595409E-C040-45C5-B2D0-2350385D0FE7@bigelow.org>
References: <8595409E-C040-45C5-B2D0-2350385D0FE7@bigelow.org>

Ben,

`fread()` https support has been added in 1.9.5 (current devel). The vignettes are supposed to roll out with 1.9.8, but we are trying to get as much as we can into 1.9.6 already.

From aragorn168b at gmail.com Thu Jan 22 21:04:38 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Thu, 22 Jan 2015 21:04:38 +0100
Subject: [datatable-help] number of rows selected in .SD subset

Juan,

Great! Thanks much. You can see all the vignettes planned and the ones finished here: https://github.com/Rdatatable/data.table/issues/944

I'd be glad to hear your feedback (probably on that page?) and of course you're welcome to submit PRs.

Arun

From statquant at outlook.com Sun Jan 25 12:33:47 2015
From: statquant at outlook.com (statquant3)
Date: Sun, 25 Jan 2015 03:33:47 -0800 (PST)
Subject: [datatable-help] whole data.table copied "warning"
Message-ID: <1422185627942-4702267.post@n4.nabble.com>

I am experiencing a problem with data.table and can't find out what the problem is. I am getting the warning "this data table had to be copied over". First, I cannot reproduce the bug on a small example, so I realize this post is mostly a bottle thrown into the sea...

Here is what I do:

1. I have several tables on a distant kdb server that I retrieve through a proprietary package. Those tables are copied and some data.frames are created.
(I think it has nothing to do with it, but still, you never know.) Because I have several tables on that server, I created a function wrapping the "fetching" of a table: a data.frame is created and I setDT it to make it a data.table.
2. I call this function within a sapply command and end up with a named list of data.tables.
3. I modify some of those tables in an R function, once again using lapply, so I end up with a modified list of data.tables.
4. I use attach() to be able to work on each data.table by name.
5. Later, when I try to add a column with ":=", I get a warning saying that the whole table had to be copied.

Am I doing something obviously wrong here?

pseudo code:

fetchingData = function(tableName, connectionToServer){
  DF = connectAndFetchTable(connectionToServer, tableName)
  DT = setDT(DF)
  return(DT)
}

modifyDataTable = function(DT){
  if('thisColName' %in% colnames(DT)){
    DT[,thisColName:=someTransformation(thisColName)]
  }
  ...
}

In the main code:

myDataTableList = sapply(c('tableA','tableB','tableC','tableD'), FUN=fetchingData, connectionToServer=myCon)
myDataTableList = lapply(myDataTableList, FUN=modifyDataTable)
attach(myDataTableList)

tableB[,newColumn:=1L]
*** getting the warning here ***

Note: I am using R 3.0.2.

Finally, let me say that I have been using data.table for a long time now, so I know about := and the set function (which are usually forgotten and trigger this warning, like when doing DT$new = stuff).

If any of you gets it I'd be grateful (sorry I can't reproduce it correctly).

--
View this message in context: http://r.789695.n4.nabble.com/whole-data-table-copied-warning-tp4702267.html
Sent from the datatable-help mailing list archive at Nabble.com.
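
One variation on the pseudo code above keeps the tables in the list and skips attach(), which, per R's documentation, copies the list elements into a new environment on the search path. This is a sketch under assumptions, not a diagnosis: fetch_table is a hypothetical stand-in for the proprietary fetch, a recent R (3.1 or later) and data.table are assumed, and whether the copy made by attach() is what triggers the warning in the post is not confirmed.

```r
library(data.table)

# Hypothetical stand-in for the proprietary fetch: returns a plain data.frame.
fetch_table <- function(table_name) {
  data.frame(id = 1:3, source = table_name, stringsAsFactors = FALSE)
}

table_names <- c("tableA", "tableB")

# lapply() always returns a list (sapply() may try to simplify the result).
tables <- lapply(table_names, function(nm) setDT(fetch_table(nm)))
names(tables) <- table_names

# Work on the list element directly instead of attach()-ing the list,
# so := acts on the object stored in the list.
tables$tableB[, newColumn := 1L]

# set() is the function form of := and is convenient in loops.
for (nm in names(tables)) {
  set(tables[[nm]], j = "checked", value = TRUE)
}
```

Accessing the tables as tables$tableB is slightly more typing than attach(), but it keeps a single copy of each table in play.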

From aragorn168b at gmail.com Sun Jan 25 19:06:07 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Sun, 25 Jan 2015 19:06:07 +0100
Subject: [datatable-help] whole data.table copied "warning"
In-Reply-To: <1422185627942-4702267.post@n4.nabble.com>
References: <1422185627942-4702267.post@n4.nabble.com>

The over-allocation of pointers gets (silently) lost when we assign using `<-` in R. That's why we insist on using `:=`. And this is why we use an external pointer to detect if the over-allocation has been lost. In that case `:=` cannot add the column in place, as the over-allocation is gone. So we have to take a shallow copy and over-allocate once again.

The message should say that the data was "shallow" copied. So it shouldn't affect performance (unless you do this quite repeatedly). But it is best to avoid the message altogether by using it the idiomatic way.

I'm not sure if the R version matters here or not.

HTH
Arun.

From fjbuch at gmail.com Tue Jan 27 15:15:44 2015
From: fjbuch at gmail.com (Farrel Buchinsky)
Date: Tue, 27 Jan 2015 09:15:44 -0500
Subject: [datatable-help] unique() not working. Is it me or the code?

Look what happens when I execute the example. Any idea why I am getting the error?

> DT <- data.table(A = rep(1:3, each=4), B = rep(1:4, each=3), C = rep(1:2, 6), key = "A,B")
> duplicated(DT)
 [1] FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE TRUE
> unique(DT)
Error in .Call("Cwhichwrapper", x, bool) :
  "Cwhichwrapper" not resolved from current namespace (data.table)

Here is my session info

Running from within RStudio Version 0.99.179

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] knitr_1.9 zoo_1.7-11 RGoogleDocs_0.7-0 reshape2_1.4.1
[5] ggplot2_1.0.0 stringi_0.4-1 stringr_0.6.2 car_2.0-22
[9] plyr_1.8.1 data.table_1.9.5 lubridate_1.3.3 RODBC_1.3-10

loaded via a namespace (and not attached):
[1] bitops_1.0-6 chron_2.3-45 colorspace_1.2-4 digest_0.6.8
[5] evaluate_0.5.5 formatR_1.0 grid_3.1.2 gtable_0.1.2
[9] lattice_0.20-29 MASS_7.3-35 memoise_0.2.1 munsell_0.4.2
[13] nnet_7.3-8 proto_0.3-10 Rcpp_0.11.4 RCurl_1.95-4.5
[17] scales_0.2.4 tools_3.1.2 XML_3.98-1.1

Thanks a lot. Farrel

Farrel J. Buchinsky MBChB, BSc(HONS)MED, FACS
(412) 567-7870 [mobile, work, home]
Director, Pediatric Otolaryngology: Allegheny General Hospital
Director, Respiratory Papillomatosis Program: Allegheny-Singer Research Institute
320 E. North Ave.
Pittsburgh, PA 15212

From statquant at outlook.com Tue Jan 27 15:53:47 2015
From: statquant at outlook.com (statquant3)
Date: Tue, 27 Jan 2015 06:53:47 -0800 (PST)
Subject: [datatable-help] whole data.table copied "warning"
In-Reply-To:
References: <1422185627942-4702267.post@n4.nabble.com>
Message-ID: <1422370427251-4702370.post@n4.nabble.com>

Arun, this is what I get...
I still think something is wrong, no?

Warning message:
In `[.data.table`(DT, , `:=`(KEY, paste0(date, symbol, sep = "_"))) :
  Invalid .internal.selfref detected and fixed by taking a copy of the whole table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. Also, in R<=v3.0.2, list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to copy named objects); please upgrade to R>v3.0.2 if that is biting. If this message doesn't help, please report to datatable-help so the root cause can be fixed.

--
View this message in context: http://r.789695.n4.nabble.com/whole-data-table-copied-warning-tp4702267p4702370.html
Sent from the datatable-help mailing list archive at Nabble.com.

From statquant at outlook.com Tue Jan 27 15:55:43 2015
From: statquant at outlook.com (statquant3)
Date: Tue, 27 Jan 2015 06:55:43 -0800 (PST)
Subject: [datatable-help] unique() not working. Is it me or the code?
In-Reply-To:
References:
Message-ID: <1422370543762-4702371.post@n4.nabble.com>

Can't reproduce on R 3.0.1, data.table_1.9.3

R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] grid stats graphics grDevices datasets utils methods base

other attached packages:
[1] xts_0.9-7 gridExtra_0.9.1 zoo_1.7-11 KdbR_1.0 bit64_0.9-3 bit_1.1-11 data.table_1.9.3 reshape2_1.2.2 stringr_0.6.2 plyr_1.8.1 Rcpp_0.11.1 optparse_1.0.2 getopt_1.20.0 msversion_1.0.1

loaded via a namespace (and not attached):
[1] lattice_0.20-15

--
View this message in context: http://r.789695.n4.nabble.com/unique-not-working-Is-it-me-or-the-code-tp4702367p4702371.html
Sent from the datatable-help mailing list archive at Nabble.com.
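When unique() fails with a namespace error like the one reported in this thread while duplicated() still works, it may help to confirm which data.table installation R is actually loading before re-running the example. A minimal sketch, assuming data.table is installed; the variable names are illustrative. The `by` argument is given explicitly because later data.table releases changed the default for duplicated() and unique() from the key columns to all columns.

```r
library(data.table)

# Which data.table is installed, and where R found it.
installedVersion <- packageVersion("data.table")
installedPath <- find.package("data.table")
print(installedVersion)
print(installedPath)

# The example from the post, with the key columns passed explicitly via `by`.
DT <- data.table(A = rep(1:3, each = 4), B = rep(1:4, each = 3),
                 C = rep(1:2, 6), key = c("A", "B"))
dupFlags <- duplicated(DT, by = c("A", "B"))
uniqueRows <- unique(DT, by = c("A", "B"))
print(dupFlags)
print(uniqueRows)
```

If the printed version or path is not the one expected, a stale or partially updated installation is a plausible suspect, though that is an assumption rather than something this check proves.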

From fjbuch at gmail.com Tue Jan 27 16:26:04 2015
From: fjbuch at gmail.com (Farrel Buchinsky)
Date: Tue, 27 Jan 2015 10:26:04 -0500
Subject: [datatable-help] unique() not working. Is it me or the code?
In-Reply-To:
References:
Message-ID:

Thank you Jeff Zemla
That fixed it. I reinstalled data.table and it all worked. I wonder how that happened. I just installed R 3.1.2 and am almost certain that I reinstalled data.table yesterday after upgrading R.

Farrel Buchinsky
Google Voice Tel: (412) 567-7870

On Tue, Jan 27, 2015 at 10:06 AM, Jeff Zemla wrote:

> R 3.1.0
> data table 1.9.5
>
> can't replicate. did you try reinstalling data table?

From aragorn168b at gmail.com Tue Jan 27 16:32:29 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 27 Jan 2015 16:32:29 +0100
Subject: [datatable-help] unique() not working. Is it me or the code?
In-Reply-To:
References:
Message-ID:

Farrel,
Glad it's resolved!

From aragorn168b at gmail.com Tue Jan 27 16:34:15 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 27 Jan 2015 16:34:15 +0100
Subject: [datatable-help] whole data.table copied "warning"
In-Reply-To: <1422370427251-4702370.post@n4.nabble.com>
References: <1422185627942-4702267.post@n4.nabble.com> <1422370427251-4702370.post@n4.nabble.com>
Message-ID:

The part "by taking a copy" should be "by taking a shallow copy". Other than that, nothing is wrong. If you get this message, then your data.table has somehow lost its over-allocation. I can't help much without seeing a minimal reproducible example :-(.
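
The over-allocation discussed in this thread can be inspected with truelength(), and one way to sidestep attach() is to keep the tables in the named list and use `:=` on the list element directly. A minimal sketch under assumptions: a version of R newer than 3.0.2 and a current data.table; the table and column names are invented, and this is an illustration rather than a confirmed fix for the poster's case.

```r
library(data.table)

# Hypothetical stand-ins for the tables fetched from the server.
tables <- list(
  tableA = data.table(id = 1:3),
  tableB = data.table(id = 4:6)
)

# length() counts the columns in use; truelength() counts the column
# slots allocated. A healthy data.table has spare slots.
used <- length(tables[["tableB"]])
allocated <- truelength(tables[["tableB"]])
spareSlotsBefore <- allocated - used
print(spareSlotsBefore)

# Add a column by reference directly on the list element (no attach()).
tables[["tableB"]][, newColumn := 1L]

print(names(tables[["tableB"]]))
```

Working on the list element keeps a single copy of each table, so the column added by `:=` is visible through the list without reassigning anything.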