From guizhida at gmail.com Thu Sep 3 19:58:20 2015
From: guizhida at gmail.com (George Gui)
Date: Thu, 3 Sep 2015 12:58:20 -0500
Subject: [datatable-help] .SD segmentation fault.

Hi,

I hit a memory bug when running the following script, which selects a subset of the rows of a data.table. The bug somehow goes away if I first make a copy of the input data.table (the commented-out line below):

library(data.table)
load('test_bug.RData')
Test.Bug <- function(tmp_move, ID){
  print(ID)
  #tmp_move <- copy(tmp_move)
  coverage <- tmp_move[, .(c.count= sum(dummy== 0)), by= group]
  groups_selected <- unique(coverage[c.count>120, group])
  tmp_move2 <- tmp_move[group %in% groups_selected]
  return(tmp_move2)
}
move[, Test.Bug(.SD, ID), by= ID]

 *** caught segfault ***
address 0x7fc3910d2824, cause 'memory not mapped'

Traceback:
 1: bmerge(i, x, leftcols, rightcols, io <- FALSE, xo, roll = 0,
        rollends = c(FALSE, FALSE), nomatch = 0L, verbose = verbose)
 2: `[.data.table`(tmp_move, group %in% groups_selected)
 3: tmp_move[group %in% groups_selected]
 4: Test.Bug(.SD, ID)
 5: `[.data.table`(move, , Test.Bug(.SD, ID), by = ID)
 6: move[, Test.Bug(.SD, ID), by = ID]

--
Zhida(George) Gui
Mathematics and Economics Major
Email:guizhida at gmail.com
Cell:773-614-2597
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_bug.R
Type: application/octet-stream
Size: 358 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_bug.RData
Type: application/octet-stream
Size: 2113 bytes
Desc: not available

From f_j_rod at hotmail.com Tue Sep 8 13:10:46 2015
From: f_j_rod at hotmail.com (Frank S.)
Date: Tue, 8 Sep 2015 13:10:46 +0200
Subject: [datatable-help] Avoid warnings when obtaining fictitious rows

Hi Arun and sorry for my insistence.
If I'm right, by wrapping the entire expression in print() we can see that one value is missing in $visit for id=2:

> DT[, print(list(start, visit=as.Date(c(paste0(year(start[1]):(year(end[1])-1),"-12-31"))),end)), by=id]
[[1]]   # Data for subject id=2
[1] "2007-01-01" "2007-01-01" "2007-01-01"

$visit
[1] "2007-12-31" "2008-12-31"   ########## <~~~ here's the issue that results in warning.

[[3]]
[1] "2009-05-01" "2009-05-01" "2009-05-01"

But the output I want for id=2 keeps only the "complete" observations for the columns start, visit and end. That is, I want to keep only the first two rows, because the next 31 December would be 2009-12-31, which is later than the corresponding end=2009-05-01. That is:

   id      start      visit        end
1:  2 2007-01-01 2007-12-31 2009-05-01
2:  2 2007-01-01 2008-12-31 2009-05-01

Trying to follow your suggestions, I've written:

> DT[unique(DT), {
    N = as.Date(c(paste0(year(start[1]):(year(end[1])-1),"-12-31")))
    .(start, visit = visit[N], end)
  }, by=id]

But it returns an error message. Can you help me find a way out of this mess?

From aragorn168b at gmail.com Thu Sep 10 23:59:09 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Thu, 10 Sep 2015 23:59:09 +0200
Subject: [datatable-help] .SD segmentation fault.

There's no way to reproduce this. And there's no version info (sessionInfo()). There's little we can do with it. I suggest you have a look at https://github.com/Rdatatable/data.table/wiki/Support and provide us with the necessary data/code for us to reproduce.

--
Arun

On 3 Sep 2015 at 19:58:41, George Gui (guizhida at gmail.com) wrote:

Hi,
I had a memory bug when I tried to run the following script that wants to select a subset rows of data.table, the bug is somehow fixed by making a copy of the input data.table however:

library(data.table)
load('test_bug.RData')
Test.Bug <- function(tmp_move, ID){
  print(ID)
  #tmp_move <- copy(tmp_move)
  coverage <- tmp_move[, .(c.count= sum(dummy== 0)), by= group]
  groups_selected <- unique(coverage[c.count>120, group])
  tmp_move2 <- tmp_move[group %in% groups_selected]
  return(tmp_move2)
}
move[, Test.Bug(.SD, ID), by= ID]

 *** caught segfault ***
address 0x7fc3910d2824, cause 'memory not mapped'

Traceback:
 1: bmerge(i, x, leftcols, rightcols, io <- FALSE, xo, roll = 0,
        rollends = c(FALSE, FALSE), nomatch = 0L, verbose = verbose)
 2: `[.data.table`(tmp_move, group %in% groups_selected)
 3: tmp_move[group %in% groups_selected]
 4: Test.Bug(.SD, ID)
 5: `[.data.table`(move, , Test.Bug(.SD, ID), by = ID)
 6: move[, Test.Bug(.SD, ID), by = ID]

--
Zhida(George) Gui
Mathematics and Economics Major
Email:guizhida at gmail.com
Cell:773-614-2597
_______________________________________________
datatable-help mailing list
datatable-help at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

From danielrenenowak at gmx.de Mon Sep 21 12:19:32 2015
From: danielrenenowak at gmx.de (daniel.nowak)
Date: Mon, 21 Sep 2015 03:19:32 -0700 (PDT)
Subject: [datatable-help] Progress Bar
Message-ID: <1442830772833-4712534.post@n4.nabble.com>

Is there anything like a Progress Bar included in the data.table package when I apply a function via the by command to subsets of a data.table?
I have looked it up and just came up with some threads about the fread function, so it is about data import rather than data processing.
Maybe something like the percentage of groups for which the calculation has already been executed?
Many thanks for any comment or idea,
Daniel

--
View this message in context: http://r.789695.n4.nabble.com/Progress-Bar-tp4712534.html
Sent from the datatable-help mailing list archive at Nabble.com.

From aragorn168b at gmail.com Tue Sep 22 19:32:01 2015
From: aragorn168b at gmail.com (Arunkumar Srinivasan)
Date: Tue, 22 Sep 2015 19:32:01 +0200
Subject: [datatable-help] Progress Bar
In-Reply-To: <1442830772833-4712534.post@n4.nabble.com>
References: <1442830772833-4712534.post@n4.nabble.com>

No, it doesn't. Please file a FR on the project page, if you wish. Please read: https://github.com/Rdatatable/data.table/wiki/Support before filing (even though most of it is not relevant for your FR).

--
Arun

On 21 September 2015 at 12:27:40, daniel.nowak (danielrenenowak at gmx.de) wrote:

Is there anything like a Progress Bar included in the data.table package when I apply a function via the by command to subsets of a data.table?
I have looked it up and just came up with some threads about the fread function, so it is about data import rather than data processing.
Maybe something like the percentage of groups for which the calculation has already been executed?

Many thanks for any comment or idea,
Daniel

From fperickson at wisc.edu Thu Sep 24 17:01:49 2015
From: fperickson at wisc.edu (Frank Erickson)
Date: Thu, 24 Sep 2015 11:01:49 -0400
Subject: [datatable-help] error in using new rowid function

Hi,

I ran the example from the doc, as seen below.
--Frank

> DT = data.table(x=c(20,10,10,30,30,20), y=c("a", "a", "a", "b", "b", "b"), z=1:6)
>
> rowid(DT$x) # 1,1,2,1,2,2
Error in rowidv(list(...), prefix = prefix) :
  Internal error: invalid ties.method for frankv(), should have been caught before. Please report to datatable-help
> rowidv(DT, cols="x") # same as above
Error in rowidv(DT, cols = "x") :
  Internal error: invalid ties.method for frankv(), should have been caught before. Please report to datatable-help
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.7

loaded via a namespace (and not attached):
[1] chron_2.3-47
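
The numbering that the comment in the doc example expects (1,1,2,1,2,2) can also be computed without rowid(). What follows is only a sketch and is not taken from the thread. It assumes that data.table is installed, and the names rid and base_rid are made up for illustration. It says nothing about what causes the internal error reported above.

```r
library(data.table)

# Same example table as in the report above
DT = data.table(x = c(20, 10, 10, 30, 30, 20),
                y = c("a", "a", "a", "b", "b", "b"),
                z = 1:6)

# Number the rows within each value of x, keeping the original row order.
# 'rid' is a hypothetical column name chosen for this sketch.
DT[, rid := seq_len(.N), by = x]
print(DT$rid)    # 1 1 2 1 2 2

# The same numbering with base R only, using ave()
base_rid <- ave(seq_along(DT$x), DT$x, FUN = seq_along)
print(base_rid)  # 1 1 2 1 2 2
```

Both forms count occurrences of each x value in order of appearance, which is what the "# 1,1,2,1,2,2" comment in the doc example describes.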