From michael.gahan at gmail.com  Wed Dec  3 04:47:38 2014
From: michael.gahan at gmail.com (Mike.Gahan)
Date: Tue, 2 Dec 2014 19:47:38 -0800 (PST)
Subject: [datatable-help] Rolling Joins Replicated in Java MapReduce
Message-ID: <1417578458560-4700329.post@n4.nabble.com>

Hello all,

I absolutely love the rolling join capabilities of data.table. It is extremely useful for the work I do. However, sometimes I work with data that is too large to fit into RAM (even when using a large server). I want to implement this rolling join code in a Java MapReduce setting to be able to leverage some of the other resources available at the company I work for. Unfortunately I am not an experienced Java programmer. I figured that a project like this would provide an excellent incentive to learn this skill.

My question is this: what current data.table code for rolling joins would be most useful to reference in starting this project? I am guessing the bmerge.c code has much of what I want. Any other code in the data.table package I should be aware of? Any other advice that might make this process go more smoothly? I know the function is based on a modified binary search algorithm. Are there any libraries anyone is aware of that might help this along?

I really appreciate all help.
Mike

--
View this message in context: http://r.789695.n4.nabble.com/Rolling-Joins-Replicated-in-Java-MapReduce-tp4700329.html
Sent from the datatable-help mailing list archive at Nabble.com.
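A note for readers following this thread: with roll=TRUE, a rolling join matches each lookup value to the last row whose key is less than or equal to it, which is why a search over a sorted key column sits at the core of it. The base R sketch below shows that matching rule only. It is not the bmerge.c implementation, and the function name roll_lookup and the sample quote/trade data are made up for illustration.

```r
# Hypothetical helper (not part of data.table): for each element of `x`,
# return the value attached to the largest key that is <= x,
# or NA when x comes before every key.
roll_lookup <- function(keys, values, x) {
  stopifnot(!is.unsorted(keys), length(keys) == length(values))
  idx <- findInterval(x, keys)    # interval search; 0 means x < keys[1]
  idx[idx == 0L] <- NA_integer_   # nothing to roll forward from
  values[idx]
}

# Made-up sample data: quotes arrive at times 10, 20, 30.
quote_time  <- c(10, 20, 30)
quote_price <- c(100.0, 101.5, 99.8)
trade_time  <- c(5, 10, 25, 40)

roll_lookup(quote_time, quote_price, trade_time)
# NA 100.0 101.5 99.8  (the first trade precedes every quote)
```

On a data.table keyed by time, quotes[J(trade_time), roll=TRUE] is expected to pair the rows the same way, although that comparison is not run here. A port to another language would mainly need a sorted key array and a search for the last key less than or equal to the lookup value.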
From my.r.help at gmail.com  Wed Dec  3 07:44:11 2014
From: my.r.help at gmail.com (Michael Smith)
Date: Wed, 03 Dec 2014 14:44:11 +0800
Subject: [datatable-help] Rolling Joins Replicated in Java MapReduce
In-Reply-To: <1417578458560-4700329.post@n4.nabble.com>
References: <1417578458560-4700329.post@n4.nabble.com>
Message-ID: <547EB13B.7010504@gmail.com>

Maybe it is easier to build what you're looking for by contributing to plyrmr:

https://github.com/RevolutionAnalytics/plyrmr

It already implements "plyr for Hadoop" on top of the rmr2 package. Not sure whether merging is already implemented, but using rmr2 it should not be prohibitively difficult (hopefully).

Best,
M
From danielrlabar at gmail.com  Wed Dec  3 22:41:35 2014
From: danielrlabar at gmail.com (Dan LaBar)
Date: Wed, 3 Dec 2014 16:41:35 -0500
Subject: [datatable-help] Rolling Joins Replicated in Java MapReduce
Message-ID:

You may want to look into Spark SQL. There is currently discussion on adding support for range joins, which I think are similar to rolling joins in data.table. I started looking into rmr2, but Hive and Spark SQL look like better options for my use cases.
From mdowle at mdowle.plus.com  Mon Dec  8 15:44:25 2014
From: mdowle at mdowle.plus.com (Matt Dowle)
Date: Mon, 08 Dec 2014 14:44:25 +0000
Subject: [datatable-help] Video of talk at H2O World
Message-ID: <5485B949.2080101@mdowle.plus.com>

Hi,

A video of my talk at H2O World in San Francisco recently:

https://www.youtube.com/watch?v=MvH1eTdsekA

 0:00  Examples from two insurance companies using data.table
12:00  What is data.table, benchmarks dplyr and pandas
16:55  Overlap joins
20:00  Rolling joins
22:30  data.table radix sorting is better than hashing (dplyr and pandas)
23:00  H2O (just parallel file reading and grouping as quick test)
30:00  Quick rerun of talk at Bay Area R User Group (sorting benchmark, automatic indexes flows through to dplyr, numeric rounding)
33:10  My status
36:45  Questions
49:26  End

Comments/suggestions very welcome.

Matt

From mdowle at mdowle.plus.com  Mon Dec  8 16:03:27 2014
From: mdowle at mdowle.plus.com (Matt Dowle)
Date: Mon, 08 Dec 2014 15:03:27 +0000
Subject: [datatable-help] Video of talk at H2O World
In-Reply-To: <5485B949.2080101@mdowle.plus.com>
References: <5485B949.2080101@mdowle.plus.com>
Message-ID: <5485BDBF.4040200@mdowle.plus.com>

As a few have asked already, will upload slides later. It was a collection of different files and part was just an R script. I'll need to merge together ...
From jeales at gmail.com  Mon Dec  8 16:25:47 2014
From: jeales at gmail.com (James Eales)
Date: Mon, 8 Dec 2014 15:25:47 +0000
Subject: [datatable-help] Video of talk at H2O World
In-Reply-To: <5485BDBF.4040200@mdowle.plus.com>
References: <5485B949.2080101@mdowle.plus.com> <5485BDBF.4040200@mdowle.plus.com>
Message-ID:

Matt,

Very impressive show of what data.table can do.

It would be helpful to have a wider set of these more 'advanced' data.table function calls in the FAQ.

I keep discovering more features, even after reading the FAQ, R-help and intro vignette multiple times (this is not a criticism of the docs, but praise for DT's flexibility).

Learning by example, even if you don't understand it fully the first time, can be very powerful.

James

From mdowle at mdowle.plus.com  Mon Dec  8 23:49:10 2014
From: mdowle at mdowle.plus.com (Matt Dowle)
Date: Mon, 08 Dec 2014 22:49:10 +0000
Subject: [datatable-help] Video of talk at H2O World
In-Reply-To:
References: <5485B949.2080101@mdowle.plus.com> <5485BDBF.4040200@mdowle.plus.com>
Message-ID: <54862AE6.8040502@mdowle.plus.com>

James,

Thanks. Just to avoid crossed-wires, which features do you mean exactly?

Thanks, Matt
From jeales at gmail.com  Tue Dec  9 18:58:48 2014
From: jeales at gmail.com (James Eales)
Date: Tue, 9 Dec 2014 17:58:48 +0000
Subject: [datatable-help] Video of talk at H2O World
In-Reply-To: <54862AE6.8040502@mdowle.plus.com>
References: <5485B949.2080101@mdowle.plus.com> <5485BDBF.4040200@mdowle.plus.com> <54862AE6.8040502@mdowle.plus.com>
Message-ID:

fread(), and in particular that you can paste content directly into the terminal, i.e.
fread("ctrl-v")

fread(), that it can read directly from a massive gzipped text file using a call to a system command, with no hassle, i.e.
fread("gunzip -c massive_file.txt")

foverlaps(), just that it exists and how quick it is for region overlaps (I do a lot of genomics)

subset.data.table() allows negation of column selection, i.e.
subset(DT,select=-unwanted_column)

data.table allows chaining of different selection statements:
DT[value<0.5][value>0.4][id %in% my_interesting_id_list]

I discover more every time I use it; I just thought some more complex examples (like the every-roof-in-the-UK machine learning example from your talk) would be helpful to illustrate the range of expressions you can supply to a data.table.

The docs are very good and hugely comprehensive; it's just that sometimes it's best to start with a complex example and then take it apart.
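Two of the features James lists in this thread, chained selections and negated column selection in subset(), can be tried on a toy table. The sketch below assumes the data.table package is installed. The data are made up, and only the column and variable names (value, id, unwanted_column, my_interesting_id_list) follow James's snippets.

```r
library(data.table)

# Made-up data; the names mirror the snippets quoted in the thread.
DT <- data.table(id = c("a", "b", "c", "d"),
                 value = c(0.10, 0.42, 0.45, 0.90),
                 unwanted_column = 1:4)
my_interesting_id_list <- c("b", "d")

# Chained selections: each [ ] filters the result of the previous one.
chained <- DT[value < 0.5][value > 0.4][id %in% my_interesting_id_list]

# The same rows selected with one combined condition, for comparison.
combined <- DT[value < 0.5 & value > 0.4 & id %in% my_interesting_id_list]

# Drop a column by negating it in subset().
trimmed <- subset(DT, select = -unwanted_column)

chained$id      # "b"
names(trimmed)  # "id" "value"
```

Chaining reads left to right and keeps each step short, while the combined form makes a single pass over the table; which one reads better is largely a matter of taste.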
URL: From jhibschman at gmail.com Fri Dec 12 16:23:48 2014 From: jhibschman at gmail.com (Johann Hibschman) Date: Fri, 12 Dec 2014 10:23:48 -0500 Subject: [datatable-help] Ordering of filter expressions Message-ID: I just ran into an issue where d[X==round(X)] gives different results from d[round(X) == X]. Why would that happen? Here's the exact example: > z.dev[YIELD == round(YIELD),] Error in eval(expr, envir, enclos) : object 'YIELD' not found > z.dev[round(YIELD) == YIELD,] runId dealName cusip scenarioId shockId pathOrder PRICE100 YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 6 4.739 0.431 400 418 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 6 4.706 0.427 400 424 Thanks, Johann P.S. data.table is 1.9.4, R is 3.1.2. From jhibschman at gmail.com Mon Dec 15 15:26:21 2014 From: jhibschman at gmail.com (Johann Hibschman) Date: Mon, 15 Dec 2014 09:26:21 -0500 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: References: Message-ID: I finally had time to put together a minimal example: > d <- data.table(a=1:2, b=1:2) > d[round(a) == a] a b 1: 1 1 2: 2 2 > d[a == round(a)] Error in eval(expr, envir, enclos) : object 'a' not found Is this a bug, or am I missing something about the scoping rules? R 3.1.2, data.table 1.9.4, on Windows 7. Thanks, Johann On Fri, Dec 12, 2014 at 10:18 AM, Johann Hibschman wrote: > I just ran into an issue where d[X==round(X)] gives different results > from d[round(X) == X]. Why would that happen? 
> > Here's the exact example: > >> z.dev[YIELD == round(YIELD),] > Error in eval(expr, envir, enclos) : object 'YIELD' not found >> z.dev[round(YIELD) == YIELD,] > runId dealName cusip scenarioId shockId pathOrder PRICE100 > YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP > 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 > 6 4.739 0.431 400 418 > 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 > 6 4.706 0.427 400 424 > > Thanks, > Johann From aragorn168b at gmail.com Mon Dec 15 15:53:42 2014 From: aragorn168b at gmail.com (Arunkumar Srinivasan) Date: Mon, 15 Dec 2014 15:53:42 +0100 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: References: Message-ID: I can't reproduce this in 1.9.5 (current devel): http://github.com/Rdatatable/data.table On Mon, Dec 15, 2014 at 3:26 PM, Johann Hibschman wrote: > > I finally had time to put together a minimal example: > > > d <- data.table(a=1:2, b=1:2) > > d[round(a) == a] > a b > 1: 1 1 > 2: 2 2 > > d[a == round(a)] > Error in eval(expr, envir, enclos) : object 'a' not found > > Is this a bug, or am I missing something about the scoping rules? > > R 3.1.2, data.table 1.9.4, on Windows 7. > > Thanks, > Johann > > On Fri, Dec 12, 2014 at 10:18 AM, Johann Hibschman > wrote: > > I just ran into an issue where d[X==round(X)] gives different results > > from d[round(X) == X]. Why would that happen? 
> > > > Here's the exact example: > > > >> z.dev[YIELD == round(YIELD),] > > Error in eval(expr, envir, enclos) : object 'YIELD' not found > >> z.dev[round(YIELD) == YIELD,] > > runId dealName cusip scenarioId shockId pathOrder PRICE100 > > YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP > > 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 > > 6 4.739 0.431 400 418 > > 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 > > 6 4.706 0.427 400 424 > > > > Thanks, > > Johann > _______________________________________________ > datatable-help mailing list > datatable-help at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mel at mbacou.com Mon Dec 15 21:30:10 2014 From: mel at mbacou.com (Bacou, Melanie) Date: Mon, 15 Dec 2014 15:30:10 -0500 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: References: Message-ID: <548F44D2.3060106@mbacou.com> I can reproduce in 1.9.4: > library(data.table) data.table 1.9.4 For help type: ?data.table *** NB: by=.EACHI is now explicit. See README to restore previous behaviour. 
Warning message: package 'data.table' was built under R version 3.1.1 > d <- data.table(a=1:2, b=1:2) > d[round(a) == a] a b 1: 1 1 2: 2 2 > d[a == round(a)] Error in eval(expr, envir, enclos) : object 'a' not found > versionInfo() Error: could not find function "versionInfo" > sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics utils datasets grDevices methods base other attached packages: [1] data.table_1.9.4 rj_1.1.3-1 loaded via a namespace (and not attached): [1] chron_2.3-45 plyr_1.8.1 Rcpp_0.11.3 reshape2_1.4 rj.gd_1.1.3-1 [6] stringr_0.6.2 tools_3.1.0 On 12/15/2014 9:53 AM, Arunkumar Srinivasan wrote: > I can't reproduce this in 1.9.5 (current devel): > http://github.com/Rdatatable/data.table > > On Mon, Dec 15, 2014 at 3:26 PM, Johann Hibschman > > wrote: > > I finally had time to put together a minimal example: > > > d <- data.table(a=1:2, b=1:2) > > d[round(a) == a] > a b > 1: 1 1 > 2: 2 2 > > d[a == round(a)] > Error in eval(expr, envir, enclos) : object 'a' not found > > Is this a bug, or am I missing something about the scoping rules? > > R 3.1.2, data.table 1.9.4, on Windows 7. > > Thanks, > Johann > > On Fri, Dec 12, 2014 at 10:18 AM, Johann Hibschman > > wrote: > > I just ran into an issue where d[X==round(X)] gives different > results > > from d[round(X) == X]. Why would that happen? 
> > > > Here's the exact example: > > > >> z.dev[YIELD == round(YIELD),] > > Error in eval(expr, envir, enclos) : object 'YIELD' not found > >> z.dev[round(YIELD) == YIELD,] > > runId dealName cusip scenarioId shockId pathOrder PRICE100 > > YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP > > 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 > > 6 4.739 0.431 400 418 > > 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 > > 6 4.706 0.427 400 424 > > > > Thanks, > > Johann > _______________________________________________ > datatable-help mailing list > datatable-help at lists.r-forge.r-project.org > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help > > > > _______________________________________________ > datatable-help mailing list > datatable-help at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help -- Melanie BACOU International Food Policy Research Institute Snr. Program Manager, HarvestChoice Work +1(202)862-5699 E-mail m.bacou at cgiar.org Visit www.harvestchoice.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From aragorn168b at gmail.com Mon Dec 15 21:38:28 2014 From: aragorn168b at gmail.com (Arunkumar Srinivasan) Date: Mon, 15 Dec 2014 21:38:28 +0100 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: <548F44D2.3060106@mbacou.com> References: <548F44D2.3060106@mbacou.com> Message-ID: 1.9.5 is the current devel version. Bugs from 1.9.4 are likely to be fixed there. On Mon, Dec 15, 2014 at 9:30 PM, Bacou, Melanie wrote: > > I can reproduce in 1.9.4: > > > library(data.table) > data.table 1.9.4 For help type: ?data.table > *** NB: by=.EACHI is now explicit. See README to restore previous > behaviour. 
> Warning message: > package 'data.table' was built under R version 3.1.1 > > d <- data.table(a=1:2, b=1:2) > > d[round(a) == a] > a b > 1: 1 1 > 2: 2 2 > > d[a == round(a)] > Error in eval(expr, envir, enclos) : object 'a' not found > > > versionInfo() > Error: could not find function "versionInfo" > > sessionInfo() > R version 3.1.0 (2014-04-10) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 > [2] LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics utils datasets grDevices methods base > > other attached packages: > [1] data.table_1.9.4 rj_1.1.3-1 > > loaded via a namespace (and not attached): > [1] chron_2.3-45 plyr_1.8.1 Rcpp_0.11.3 reshape2_1.4 rj.gd_1.1.3-1 > [6] stringr_0.6.2 tools_3.1.0 > > > > On 12/15/2014 9:53 AM, Arunkumar Srinivasan wrote: > > I can't reproduce this in 1.9.5 (current devel): > http://github.com/Rdatatable/data.table > > On Mon, Dec 15, 2014 at 3:26 PM, Johann Hibschman > wrote: >> >> I finally had time to put together a minimal example: >> >> > d <- data.table(a=1:2, b=1:2) >> > d[round(a) == a] >> a b >> 1: 1 1 >> 2: 2 2 >> > d[a == round(a)] >> Error in eval(expr, envir, enclos) : object 'a' not found >> >> Is this a bug, or am I missing something about the scoping rules? >> >> R 3.1.2, data.table 1.9.4, on Windows 7. >> >> Thanks, >> Johann >> >> On Fri, Dec 12, 2014 at 10:18 AM, Johann Hibschman >> wrote: >> > I just ran into an issue where d[X==round(X)] gives different results >> > from d[round(X) == X]. Why would that happen? 
>> > >> > Here's the exact example: >> > >> >> z.dev[YIELD == round(YIELD),] >> > Error in eval(expr, envir, enclos) : object 'YIELD' not found >> >> z.dev[round(YIELD) == YIELD,] >> > runId dealName cusip scenarioId shockId pathOrder PRICE100 >> > YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP >> > 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 >> > 6 4.739 0.431 400 418 >> > 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 >> > 6 4.706 0.427 400 424 >> > >> > Thanks, >> > Johann >> _______________________________________________ >> datatable-help mailing list >> datatable-help at lists.r-forge.r-project.org >> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help >> > > > _______________________________________________ > datatable-help mailing listdatatable-help at lists.r-forge.r-project.orghttps://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help > > > -- > Melanie BACOU > International Food Policy Research Institute > Snr. Program Manager, HarvestChoice > Work +1(202)862-5699 > E-mail m.bacou at cgiar.org > Visit www.harvestchoice.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffzemla at gmail.com Mon Dec 15 23:00:53 2014 From: jeffzemla at gmail.com (Jeff Zemla) Date: Mon, 15 Dec 2014 17:00:53 -0500 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: References: <548F44D2.3060106@mbacou.com> Message-ID: This can be fixed in 1.9.4 by doing: options(datatable.auto.index=FALSE) though I recommend upgrading to 1.9.5 instead On Mon, Dec 15, 2014 at 3:38 PM, Arunkumar Srinivasan wrote: > > 1.9.5 is the current devel version. Bugs from 1.9.4 are likely to be fixed > there. > > On Mon, Dec 15, 2014 at 9:30 PM, Bacou, Melanie wrote: >> >> I can reproduce in 1.9.4: >> >> > library(data.table) >> data.table 1.9.4 For help type: ?data.table >> *** NB: by=.EACHI is now explicit. See README to restore previous >> behaviour. 
>> Warning message: >> package 'data.table' was built under R version 3.1.1 >> > d <- data.table(a=1:2, b=1:2) >> > d[round(a) == a] >> a b >> 1: 1 1 >> 2: 2 2 >> > d[a == round(a)] >> Error in eval(expr, envir, enclos) : object 'a' not found >> >> > versionInfo() >> Error: could not find function "versionInfo" >> > sessionInfo() >> R version 3.1.0 (2014-04-10) >> Platform: x86_64-w64-mingw32/x64 (64-bit) >> >> locale: >> [1] LC_COLLATE=English_United States.1252 >> [2] LC_CTYPE=English_United States.1252 >> [3] LC_MONETARY=English_United States.1252 >> [4] LC_NUMERIC=C >> [5] LC_TIME=English_United States.1252 >> >> attached base packages: >> [1] stats graphics utils datasets grDevices methods base >> >> other attached packages: >> [1] data.table_1.9.4 rj_1.1.3-1 >> >> loaded via a namespace (and not attached): >> [1] chron_2.3-45 plyr_1.8.1 Rcpp_0.11.3 reshape2_1.4 rj.gd_1.1.3-1 >> [6] stringr_0.6.2 tools_3.1.0 >> >> >> >> On 12/15/2014 9:53 AM, Arunkumar Srinivasan wrote: >> >> I can't reproduce this in 1.9.5 (current devel): >> http://github.com/Rdatatable/data.table >> >> On Mon, Dec 15, 2014 at 3:26 PM, Johann Hibschman >> wrote: >>> >>> I finally had time to put together a minimal example: >>> >>> > d <- data.table(a=1:2, b=1:2) >>> > d[round(a) == a] >>> a b >>> 1: 1 1 >>> 2: 2 2 >>> > d[a == round(a)] >>> Error in eval(expr, envir, enclos) : object 'a' not found >>> >>> Is this a bug, or am I missing something about the scoping rules? >>> >>> R 3.1.2, data.table 1.9.4, on Windows 7. >>> >>> Thanks, >>> Johann >>> >>> On Fri, Dec 12, 2014 at 10:18 AM, Johann Hibschman >>> wrote: >>> > I just ran into an issue where d[X==round(X)] gives different results >>> > from d[round(X) == X]. Why would that happen? 
>>> > >>> > Here's the exact example: >>> > >>> >> z.dev[YIELD == round(YIELD),] >>> > Error in eval(expr, envir, enclos) : object 'YIELD' not found >>> >> z.dev[round(YIELD) == YIELD,] >>> > runId dealName cusip scenarioId shockId pathOrder PRICE100 >>> > YIELD MOD_DURN MOD_CONVEXITY DISC_MARGIN SPREAD_BP >>> > 1: 10556 HVML0501 41161PLE1 772 0 3 54.5094 >>> > 6 4.739 0.431 400 418 >>> > 2: 10556 HVML0501 41161PLE1 773 0 3 52.9452 >>> > 6 4.706 0.427 400 424 >>> > >>> > Thanks, >>> > Johann >>> _______________________________________________ >>> datatable-help mailing list >>> datatable-help at lists.r-forge.r-project.org >>> >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help >>> >> >> >> _______________________________________________ >> datatable-help mailing listdatatable-help at lists.r-forge.r-project.orghttps://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help >> >> >> -- >> Melanie BACOU >> International Food Policy Research Institute >> Snr. Program Manager, HarvestChoice >> Work +1(202)862-5699 >> E-mail m.bacou at cgiar.org >> Visit www.harvestchoice.org >> >> > _______________________________________________ > datatable-help mailing list > datatable-help at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jhibschman at gmail.com Mon Dec 15 23:18:39 2014 From: jhibschman at gmail.com (Johann Hibschman) Date: Mon, 15 Dec 2014 17:18:39 -0500 Subject: [datatable-help] Ordering of filter expressions In-Reply-To: References: <548F44D2.3060106@mbacou.com> Message-ID: Thanks. Since I'm running this on Windows, I figure I'll just flip my conditions (round(x)==x) for now and wait until 1.9.5 is released, and if something still doesn't work, I'll try that option. 

On Mon, Dec 15, 2014 at 5:00 PM, Jeff Zemla wrote:
> This can be fixed in 1.9.4 by doing:
>
> options(datatable.auto.index=FALSE)
>
> though I recommend upgrading to 1.9.5 instead

From csjoqvis at abo.fi Wed Dec 17 22:52:29 2014
From: csjoqvis at abo.fi (csjoqvis)
Date: Wed, 17 Dec 2014 13:52:29 -0800 (PST)
Subject: [datatable-help] Import matrix?
Message-ID: <1418853149811-4700867.post@n4.nabble.com>

Hi,

I have a txt.file with 32 columns and 32 rows (below) that I want R to treat as a distance matrix.

   1   2   3   4...
1  NA
2  0.4 NA
3  0.3 0.7 NA
4  0.9 0.6 0.1 NA
.
.
.

How can I do this?

Conny

--
View this message in context: http://r.789695.n4.nabble.com/Import-matrix-tp4700867.html
Sent from the datatable-help mailing list archive at Nabble.com.
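
One possible approach, sketched under the assumption that the file looks like the excerpt in the question: a header row of column labels, then one row per item holding a row label followed by the lower triangle, with NA on the diagonal. The inline `txt` string and all variable names are stand-ins for illustration; for a real file the lines would come from `readLines()` on its path.

```r
# Stand-in for the file contents (assumed layout); the real file has 32 rows.
txt <- "1 2 3 4
1 NA
2 0.4 NA
3 0.3 0.7 NA
4 0.9 0.6 0.1 NA"

# For a real file: lines <- readLines("path/to/file.txt")
lines  <- strsplit(txt, "\n", fixed = TRUE)[[1]]
labels <- strsplit(trimws(lines[1]), "\\s+")[[1]]   # header row
rows   <- strsplit(trimws(lines[-1]), "\\s+")       # ragged data rows

n <- length(rows)
m <- matrix(NA_real_, n, n, dimnames = list(labels, labels))
for (i in seq_len(n)) {
  vals <- as.numeric(rows[[i]][-1])   # drop the leading row label
  m[i, seq_along(vals)] <- vals       # fill row i up to the diagonal
}

d <- as.dist(m)   # as.dist() uses only the strictly lower triangle
print(d)
```

Reading the rows line by line avoids relying on `read.table()` guessing the column count from the first few (short) rows of a ragged file. The resulting `dist` object can be passed to functions such as `hclust()`.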