From memonsalim2008 at yahoo.co.in Fri Nov 3 10:23:03 2017
From: memonsalim2008 at yahoo.co.in (memonsalim2008)
Date: Fri, 3 Nov 2017 02:23:03 -0700 (MST)
Subject: [datatable-help] Time Series Graphics - cannot plot more than 10 series.
Message-ID: <1509700983751-0.post@n4.nabble.com>

Hi, can anyone help me plot more than 10 series in Time Series Graphics?

--
Sent from: http://r.789695.n4.nabble.com/datatable-help-f2315188.html

From memedede379 at gmail.com Mon Nov 13 14:44:09 2017
From: memedede379 at gmail.com (leebrimlow)
Date: Mon, 13 Nov 2017 06:44:09 -0700 (MST)
Subject: [datatable-help] non-numeric argument to binary operator -R
Message-ID: <1510580649004-0.post@n4.nabble.com>

I have this code:

dataset <- maml.mapInputPort(1)
library(tm) ## Text mining library

## Set the column names
colnames(dataset) <- c("sentiment", "tweets")

## Extract text data and coerce the vector to a tm corpus
tweet.text <- Corpus(VectorSource(dataset['tweets']))

## Apply transformations to the corpus
tweet.text <- tm_map(tweet.text, content_transformer(removeNumbers))
tweet.text <- tm_map(tweet.text, content_transformer(removePunctuation))
tweet.text <- tm_map(tweet.text, content_transformer(stripWhitespace))
tweet.text <- tm_map(tweet.text, content_transformer(tolower))

## Transform the processed corpus back to a vector of
## character strings in a dataframe
tweet_content <- unlist(sapply(tweet.text, '[', "content"))
outframe <- data.frame(tweets = enc2utf8(tweet_content),
                       sentiment = dataset$sentiment / as.numeric(2 - 1),
                       stringsAsFactors = F,
                       row.names = NULL)

## Output the result
maml.mapOutputPort("outframe")

The error message is:

requestId = 4291be2aafb34195b285d9a930804d81 errorComponent=Module. taskStatusCode=400.
{"Exception":{"ErrorId":"FailedToEvaluateRScript","ErrorCode":"0063","ExceptionType":"ModuleException","Message":"Error 0063: The following error occurred during evaluation of R script:\r\n---------- Start of error message from R ----------\r\nnon-numeric argument to binary operator\r\n\r\n\r\nnon-numeric argument to binary operator\r\n----------- End of error message from R -----------"}}
Error: Error 0063: The following error occurred during evaluation of R script:
---------- Start of error message from R ----------
non-numeric argument to binary operator
non-numeric argument to binary operator
----------- End of error message from R -----------
Process exited with error code -2

Any help please!

From jordan.browne at sickkids.ca Mon Nov 13 14:57:10 2017
From: jordan.browne at sickkids.ca (Jordan Browne)
Date: Mon, 13 Nov 2017 13:57:10 +0000
Subject: [datatable-help] Remove from mailing list
Message-ID:

Remove me from mailing list

From jnowacki at gmail.com Mon Nov 20 16:55:03 2017
From: jnowacki at gmail.com (Jon17)
Date: Mon, 20 Nov 2017 08:55:03 -0700 (MST)
Subject: [datatable-help] Fitting Data to Arbitrary Line (Regression but different)
Message-ID: <1511193303563-0.post@n4.nabble.com>

I'm trying to describe the variability of the blue dots from the red line
(always set at 1). What's the best way to do this? This is kind of like
regression, but I'm fixing the red line in place. Regular regression would
change the slope and position of the red line, and I don't want that.
What's the easiest way to do this in R?

Actual Data
0.774229 1.162761 1.109631 1.372468 1.13379 1.153927 1.118349 1.137478
1.22055 1.207787 1.226321 1.215817 1.226673 1.24001 1.241038 1.254003
1.253196 1.253837 1.255753 1.262815 1.252019 1.262826 1.253053 1.255433
1.258043 1.248196 1.244444 1.240364 1.2393 1.217359 1.198974 1.196677
1.174224 1.159929 1.122434 1.089592 1.03896 0.988716 0.960175 0.944184
0.901207 0.894323 0.879981 0.886207 0.868376 0.861888 0.856275 0.855425
0.85457 0.855529 0.855288 0.850127 0.852516 0.849213 0.858875 0.864621
0.878321 0.893041 0.893581 0.905733 0.919137 0.922423 0.930123 0.921668
0.920419 0.909795 0.908668 0.894328 0.89776 0.891001 0.879262 0.870372
0.853152 0.842036 0.834984 0.800609 0.796472 0.767675 0.747799 0.73456
0.706946 0.704581 0.699557 0.675703 0.543421 0.55512 0.662283 0.629829
0.510493 0.086025 0.271311

From npandis at yahoo.com Mon Nov 20 22:47:01 2017
From: npandis at yahoo.com (Nikolaos Pandis)
Date: Mon, 20 Nov 2017 21:47:01 +0000 (UTC)
Subject: [datatable-help] Stratified confdence intervals
References: <1878649147.17618.1511214421476.ref@mail.yahoo.com>
Message-ID: <1878649147.17618.1511214421476@mail.yahoo.com>

Hi,

I was wondering if someone could possibly suggest a solution to the following:
I have fitted a linear model in
R [lm command] which includes both categorical and continuous variables and
one interaction between a categorical and a continuous variable. The
interaction is significant, and I was wondering if there is a way, after
fitting the model, to calculate with a command the stratified estimates and
associated confidence intervals for the interaction variables.

Many thanks,
Nikos

From max-gloeckner at web.de Sun Nov 26 02:35:27 2017
From: max-gloeckner at web.de (xam)
Date: Sat, 25 Nov 2017 18:35:27 -0700 (MST)
Subject: [datatable-help] Random Walk - stock price
Message-ID: <1511660127733-0.post@n4.nabble.com>

Hello everybody,

first of all, I am a beginner in R. At the moment I'm trying to figure out a
solution for my R problem. I want to create a random walk code for a stock
price. This is my current status:

days=250
plot(rnorm(250),type="l",xlab="days",ylab="stock price",main="stock price XY")

plus, for example, mean() and sd() (based on historical data):

plot(rnorm(250,mean(),sd()),type="l",xlab="days",ylab="stock price",main="stock price XY")

Problem: The random walk simulation starts at 0. The stock price is, for
example, 100 and I want to start the random walk simulation from this price.
How can I do that?

Thx.

From dgkarthik.mf at gmail.com Mon Nov 27 16:18:20 2017
From: dgkarthik.mf at gmail.com (KarthikDG)
Date: Mon, 27 Nov 2017 08:18:20 -0700 (MST)
Subject: [datatable-help] RCurl_SOAP Request with Attachment
Message-ID: <1511795900939-0.post@n4.nabble.com>

I am trying to send a SOAP request with an attachment, but without success.
Could you please advise on the code below? Thanks in advance.

library(RCurl)
library(XML)
body = ' 1111111 false CAESCorrespondentie 100 false false image001.pdf X.X. XXX XXXXX XXXXXXXX 1111XX 11 ? ?
XXXXXXXX XXXXXX XXX 2017-11-06 XXXXXX XXXXX XXXX XXXXXXXXXXX XXXXXXXXXXXXX XXXXXXXXXXXX false false false false Email false false '
reader = basicTextGatherer()
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),
                            followlocation = TRUE,
                            httpheader = c(Accept = "text/xml", Accept = "multipart/*",
                                           'Content-Type' = "text/xml; charset=utf-8",
                                           SOAPAction = "INTRANET URL"),
                            userpwd = "USERID:PASSWORD",
                            netrc = TRUE, timeout = 100,
                            postfields = body,
                            writefunction = reader$update,
                            verbose = TRUE, httpauth = AUTH_BASIC))
response_IncomingCommunicationService <- getForm("INTRANET URL",
    file = fileUpload(filename = "C:\\Users\\USERNAME\\Desktop\\image001.pdf",
                      contentType = "application/pdf"),
    upload = TRUE)
response_IncomingCommunicationService_html <- htmlTreeParse(response_IncomingCommunicationService)$children$html
response_IncomingCommunicationService_html
closeAllConnections()

For the above code, I am getting the status HTTP 1.1 200 OK, but the response
indicates no attachment. Could you please check and let me know whether I am
missing anything?

From izabellamurphy at aol.co.uk Mon Nov 27 18:22:51 2017
From: izabellamurphy at aol.co.uk (Izzy_M)
Date: Mon, 27 Nov 2017 10:22:51 -0700 (MST)
Subject: [datatable-help] Problem(s) finding p-values for numerous spearman correlations
Message-ID: <1511803371083-0.post@n4.nabble.com>

Hi everyone,

I am very, very new to R, and I'm trying to work out the p-values for
thousands of Spearman correlation scores.

I have imported a large dataset from a CSV file (366 obs. of 73775 variables)
into R Studio. Along the x-axis, I have a series of words, the y-axis
contains dates, and the data is the relative frequencies of each of the words
on that particular date. Essentially, I am trying to see if the frequency of
any/all of the given words increases significantly over the course of a year.

After some trial and error (and a lot of Googling!), I have code which
successfully stores the Spearman correlation values in a matrix:

x <- my_data[1:73775]
y <- my_data[1]
corrs3 <- round(cor(x, y, method = "spearman", use="complete.obs"), 3)

This code stores the words in one column of the matrix and their Spearman
value in the second column.

However, what I need to do now is to calculate the corresponding p-values for
each of the variables. I have been able to do this for individual variables by
running the following code (although I do get a warning saying "Cannot compute
exact p-value with ties", but I've been told that this isn't a major
problem?):

cor.test(1:73775, my_data$romcom, method = "spearman")

However, what I would ideally like to do is store the p-value next to the
Spearman value in the matrix (if that is possible). The consensus seems to be
that Hmisc is the ideal tool for this kind of thing, so I installed that
library, and I've been attempting to run it as follows:

flattenCorrMatrix <- function(cormat, pmat) {
  ut <- upper.tri(cormat)
  data.frame(
    row = rownames(cormat)[row(cormat)[ut]],
    column = rownames(cormat)[col(cormat)[ut]],
    cor = (cormat)[ut],
    p = pmat[ut]
  )
}
x <- my_data[1:73775]
y <- my_data[1]
library(Hmisc)
res2 <- rcorr(as.matrix(my_data[x,y]))
flattenCorrMatrix(res2$r, res2$P)

However, I get an error message stating: "Unsupported index type: tbl_df".
I'm unsure how to fix this.

I've also tried bypassing Hmisc and using the following:

x <- my_data[1:73775]
y <- my_data[1]
corrs3 <- round(cor.test(x, y, method = "spearman", use="complete.obs"), 3)

But this returns the error message:

Error in cor.test.default(x, y, method = "spearman", use = "complete.obs") :
  'x' and 'y' must have the same length

More Googling suggested that the "corr.test" function from the psych library
would be better.
However, when I use the following code:

x <- my_data[1:73775]
y <- my_data[1]
library("psych")
corr.test(x, y = NULL, use = "pairwise", method="spearman", ci=TRUE)

I get the following error message:

Error: cannot allocate vector of size 40.6 Gb

I'm really out of options now, and I would really appreciate any suggestions!

Thanks!

From EJBernstein at wellington.com Tue Nov 28 18:50:26 2017
From: EJBernstein at wellington.com (Bernstein, Elliot J)
Date: Tue, 28 Nov 2017 17:50:26 +0000
Subject: [datatable-help] Column Name Masking Variable
Message-ID:

I'm running into an issue where a data.table has a column with the same name
as a variable, and I would like to reference the variable in an i expression.
For example,

library(data.table)
x <- data.table(y = 1:10, z = 11:20)
setkey(x, y)
z <- data.table(y = 1:5, w = 21:25)
setkey(z, y)

I want to extract the subset of x with values of y that are not in the y
column of the data.table z. I can do the following:

ind <- setdiff(x[,y], z[,y])
x[ind]

But I can't do the following:

x[setdiff(y, z[,y])]

because it tries to use the column z instead of the data.table z. Is there any
way to get around that? I tried using "with = FALSE", but that only seems to
apply to the j expression.

Thank you for your help.

- Elliot
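
One possible way around the masking described in the last message, offered as
a minimal sketch rather than a definitive answer: compute the y values outside
of x[...], where z can only mean the data.table, and then subset using a name
that is not a column of x. The names keep, res_subset and res_anti are
introduced here for illustration and are not from the original post. The
second form assumes data.table's not-join syntax, x[!z, on = "y"], and relies
on a lone symbol in i being looked up in the calling scope rather than among
the columns of x.

```r
library(data.table)

# Data from the post: x has a column named z, and z is also a data.table.
x <- data.table(y = 1:10, z = 11:20)
setkey(x, y)
z <- data.table(y = 1:5, w = 21:25)
setkey(z, y)

# Option 1: evaluate the set difference outside of x[...], so `z` is
# unambiguously the data.table. `keep` is a hypothetical helper name
# chosen because x has no column called `keep`.
keep <- setdiff(x$y, z$y)
res_subset <- x[y %in% keep]

# Option 2 (assumption: not-join syntax): keep the rows of x whose y
# has no match in the data.table z.
res_anti <- x[!z, on = "y"]

print(res_subset)
print(res_anti)
```

Note that x[ind] with an integer vector ind selects rows by position, not by
key value. It gives the intended rows in the posted example only because y is
1:10, so row numbers and key values coincide.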