From EJBernstein at wellington.com  Wed Aug 16 22:44:48 2017
From: EJBernstein at wellington.com (Bernstein, Elliot J)
Date: Wed, 16 Aug 2017 20:44:48 +0000
Subject: [datatable-help] Data Table Subset Question
Message-ID: <9fb87885971d491fb2857f0f1b634cac@WDCAMSPRDMBX2.wellmanage.com>

Is there a way to subset a data table by the result of a grouped
aggregation without adding an interim column to the table? For example,
if I want to select all rows for which the group mean value of x is less
than 10, I can do the following:

    library(data.table)
    data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))
    data[, mean.x := mean(x), by = .(g)]
    data[mean.x < 10,]

But I'm not really interested in "mean.x". Can I do the same thing
without adding it to the table?

Thanks.

- Elliot

From fperickson at wisc.edu  Thu Aug 17 00:23:42 2017
From: fperickson at wisc.edu (Frank Erickson)
Date: Wed, 16 Aug 2017 18:23:42 -0400
Subject: [datatable-help] Data Table Subset Question
In-Reply-To: <9fb87885971d491fb2857f0f1b634cac@WDCAMSPRDMBX2.wellmanage.com>
References: <9fb87885971d491fb2857f0f1b634cac@WDCAMSPRDMBX2.wellmanage.com>
Message-ID:

One idiom for testing group-level conditions is:

    data[, if (mean(x) < 10) .SD, by=g]

This might be slower in the special case of taking a mean. See ?GForce.

There's a request for an idiom like SQL HAVING over here:
https://github.com/Rdatatable/data.table/issues/788

--Frank

On Wed, Aug 16, 2017 at 4:44 PM, Bernstein, Elliot J
<EJBernstein at wellington.com> wrote:

> Is there a way to subset a data table by the result of a grouped
> aggregation without adding an interim column to the table? For example,
> if I want to select all rows for which the group mean value of x is less
> than 10, I can do the following:
>
>     library(data.table)
>     data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))
>     data[, mean.x := mean(x), by = .(g)]
>     data[mean.x < 10,]
>
> But I'm not really interested in "mean.x". Can I do the same thing
> without adding it to the table?
>
> Thanks.
>
> - Elliot
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

From EJBernstein at wellington.com  Thu Aug 17 16:05:21 2017
From: EJBernstein at wellington.com (Bernstein, Elliot J)
Date: Thu, 17 Aug 2017 14:05:21 +0000
Subject: [datatable-help] Data Table Subset Question
In-Reply-To:
References: <9fb87885971d491fb2857f0f1b634cac@WDCAMSPRDMBX2.wellmanage.com>
Message-ID: <72521e5e2fee46689cde9ec1bf42ed9b@WDCAMSPRDMBX2.wellmanage.com>

Thanks!

- Elliot

From simply.red at gmx.net  Sat Aug 26 18:44:38 2017
From: simply.red at gmx.net (Happyglider)
Date: Sat, 26 Aug 2017 09:44:38 -0700 (MST)
Subject: [datatable-help] Connect tables in R
Message-ID: <1503765878671-4744994.post@n4.nabble.com>

Hi all,

I'm new to R and would like to know how I can connect two
tables/matrices (e.g. through use of a key). I tried googling my
question, but unfortunately wasn't successful. I'd be very happy if
somebody could help me. If the question is too simple and was answered
somewhere else that I didn't find, just post the link. :)

Here's my question:

I have two tables. The tables were imported into R from Excel. One table
is filled with soccer players, the other with soccer clubs.

In the first table, soccer players are listed. Each player has an ID, a
name, an age and a nationality.

The second table lists soccer clubs. Of course players compete for a
club, so the IDs of the players are also listed in the second table.
Furthermore, the second table states which kind of nutrition products
the players of a club use.

I now want to collect information from the two tables. For example, I
want to find out whether players from a certain age group or a certain
nationality prefer different nutrition products than older/younger
players from another country.
If I'm correct, I need to connect the two tables in a first step (but I
don't know how to do that in R) and then collect the information in a
second step. I could do that with the help of MS Access, but my
university wants me to solve the problem with R.

I hope you understand my question, as I probably described it in quite a
complicated way (English is not my first language). Still, I hope that
someone can help me.

Looking forward to some answers, and thank you in advance for your help,

Happyglider

--
View this message in context: http://r.789695.n4.nabble.com/Connect-tables-in-R-tp4744994.html
Sent from the datatable-help mailing list archive at Nabble.com.

From zkokia at 163.com  Mon Aug 28 13:18:23 2017
From: zkokia at 163.com (Kokia Z)
Date: Mon, 28 Aug 2017 19:18:23 +0800 (CST)
Subject: [datatable-help] Matrix multiplication in R: requires numeric/complex matrix/vector arguments
Message-ID: <3b277382.943e.15e28905eed.Coremail.zkokia@163.com>

=======================================
Error in Ws %*% as.matrix(endog) :
  requires numeric/complex matrix/vector arguments
=======================================

where "Ws" is an N*N spatial weight matrix and "endog" is N*3 (we have
three endogenous variables).

I cannot understand why this error occurs, because when I set only one
endogenous variable my code works well. What's more, none of the three
endogenous variables is a dummy. Is it because two of these endogenous
variables are interactions? Or because the package 'splm/spgm' cannot
allow for extra endogenous variables?

Here is part of my code:

    c11_omega <- interaction(mydata$c11,mydata$omega)
    d11_omega <- interaction(mydata$d11,mydata$omega)
    endog = ~ omega + c11_omega + d11_omega
    instruments = ~ XI + c11_XI + d11_XI

and the regression is:

    fm <- ex ~ c11 + d11 + control variables

Thanks for your help!

From bioglp at gmail.com  Tue Aug 29 11:01:23 2017
From: bioglp at gmail.com (glaporta)
Date: Tue, 29 Aug 2017 02:01:23 -0700 (MST)
Subject: [datatable-help] Connect tables in R
In-Reply-To: <1503765878671-4744994.post@n4.nabble.com>
References: <1503765878671-4744994.post@n4.nabble.com>
Message-ID: <1503997283795-4745108.post@n4.nabble.com>

Hi,

I can suggest two ways to do that: see ?merge, or install the sqldf
package, which lets you query data frames with SQL.

All the best,
Gianandrea

--
View this message in context: http://r.789695.n4.nabble.com/Connect-tables-in-R-tp4744994p4745108.html
Sent from the datatable-help mailing list archive at Nabble.com.
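
For readers of the archive, the ?merge suggestion in the "Connect tables in R" thread might look roughly like the sketch below. It is only an illustration: the data, the column names (player_id, club, product) and the age cut-off of 25 are all invented here, and the real tables imported from Excel will differ. It uses base R only, first joining the two tables on the shared player ID and then counting products per age group.

```r
# Hypothetical data: all names and values are made up for illustration.
players <- data.frame(
  player_id   = 1:6,
  name        = c("Ana", "Ben", "Carl", "Dino", "Emil", "Finn"),
  age         = c(19, 23, 31, 28, 20, 34),
  nationality = c("DE", "DE", "IT", "IT", "DE", "IT"),
  stringsAsFactors = FALSE
)

# One row per club/player pair, with the product used at that club.
clubs <- data.frame(
  club      = rep(c("FC North", "SC South"), each = 3),
  player_id = 1:6,
  product   = rep(c("bar", "shake"), each = 3),
  stringsAsFactors = FALSE
)

# Step 1: connect the two tables through the shared key column.
joined <- merge(players, clubs, by = "player_id")

# Step 2: derive an age group (assumed cut-off: 25) and cross-tabulate
# age group against nutrition product.
joined$age_group <- cut(
  joined$age,
  breaks = c(0, 25, Inf),
  labels = c("25 or younger", "over 25")
)
counts <- table(joined$age_group, joined$product)
print(counts)
```

The same cross-tabulation can be repeated with `joined$nationality` in place of `joined$age_group`. By default `merge` keeps only IDs present in both tables; `all.x = TRUE` would also keep players without a club row.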