[datatable-help] tstrsplit throwing error after melt
Carl Sutton
suttoncarl at ymail.com
Thu Jan 19 22:36:37 CET 2017
Hi
I had thought I was finished with this aspect of the project but yesterday this error appeared.
Error in strsplit(as.character(x), ...) : object 'variable' not found
This occurs immediately after melting a 363-column data table. A head of the data shows:
> head(races_1$variable)
[1] raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1
116 Levels: raceDate_1 raceDate_2 raceDate_3 raceDate_4 ... Winner_7
Now admittedly melt returns the variable column as a factor, but that works just fine with toy data. I am
perplexed as to why it bombs on real data. The error message appears nonsensical because the column
obviously does exist. Also, since the error names STRSPLIT, not TSTRSPLIT, it may just be one of base R's
unhelpful error messages. Helpful to know there is an error, but....
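On the wording of the message: the call it names, strsplit(as.character(x), ...), suggests that tstrsplit converts its input with as.character() and hands it to base strsplit() before transposing the result. That would explain both why a factor column is accepted and why the error mentions strsplit rather than tstrsplit. A minimal sketch on made-up values (the vector `v` is illustrative, not taken from the real data):

```r
library(data.table)

# Illustrative factor mimicking the melted 'variable' column (made-up values)
v <- factor(c("raceDate_1", "raceDate_2", "Winner_7"))

# tstrsplit on a factor: split on "_" and return one vector per piece
via_tstrsplit <- tstrsplit(v, "_")

# The same pieces built by hand from base strsplit plus data.table::transpose
via_base <- transpose(strsplit(as.character(v), "_"))

print(via_tstrsplit)
print(identical(via_tstrsplit, via_base))
```

If the two results agree, the factor type of the column is unlikely to be the source of the error.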
Here is the code leading up to this error:
suppressMessages(library(data.table))
races.names <- colnames(races)
id_vars <- races.names[1:14]
measure_vars <- races.names[15:363] # yes, I have to reset the mode afterwards.
system.time(races_1 <- melt(races, id.vars = id_vars, measure.vars = measure_vars))
# Separate variable name from prior race numbers
# sequence (1:10)
races_1 <- races[, c("MPdata","PriorRaceSeq") :=
tstrsplit(variable, "_")]
> races_1 <- races[, c("MPdata","PriorRaceSeq") :=
+ tstrsplit(variable, "_")]
Error in strsplit(as.character(x), ...) : object 'variable' not found
And the str() output of the data after this:
> str(races_1)
Classes ‘data.table’ and 'data.frame': 48511 obs. of 16 variables:
$ TrackToday : chr "AQU" "AQU" "AQU" "AQU" ...
$ DateToday : int 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 ...
$ RaceNumberToday : int 1 1 1 1 1 1 1 1 2 2 ...
$ PostPositionToday : int 1 2 3 4 5 6 7 8 1 2 ...
$ DistanceToday : int 1320 1320 1320 1320 1320 1320 1320 1320 1320 1320 ...
$ SurfaceToday : chr "d" "d" "d" "d" ...
$ RaceTypeToday : chr "AO" "AO" "AO" "AO" ...
$ RaceClassToday : chr "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" ...
$ PurseToday : int 51000 51000 51000 51000 51000 51000 51000 51000 60000 60000 ...
$ ClaimingPriceToday: int 50000 50000 50000 50000 50000 50000 50000 50000 NA NA ...
$ MorningLineOdds : num 2 20 4 30 2.5 8 5 15 20 2 ...
$ HorseName : chr "FUNKY MUNKY MAMA" "STARSHIP WARPSPEED" "SIGGI THE ALIEN" "SHANDREA" ...
$ HDWrunStyle : chr "E " "P " "EP " "E " ...
$ DaysSinceLastRace : int 78 45 31 17 39 30 17 50 NA NA ...
$ variable : Factor w/ 349 levels "raceDate_1","raceDate_2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ value : chr "20111015" "20111117" "20111201" "20111215" ...
- attr(*, ".internal.selfref")=<externalptr>
Toy data that works, thanks to a prior help request:
library(data.table)
library(tidyr)
# data table for melt and columns split
dt1 <- data.table(a_1 = 1:10, b_2 = 20:29,
                  folks = c("art", "brian", "ed", "rich", "dennis",
                            "frank", "derrick", "paul", "fred", "numnuts"),
                  a_2 = 2:11, b_1 = 21:30)
melted <- melt(dt1, id = "folks")[, c("varType", "varIndex") :=
                                    tstrsplit(variable, "_")][, variable := NULL]
What is also puzzling is that the next statement sets column "variable" to NULL and that works. Thus logic
says it is not the column that is missing but "something" else entirely.
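One possible explanation, offered as a guess from the code shown above rather than a confirmed diagnosis: the failing statement applies := to races (the original wide table), while the variable column exists only in the melted races_1. The toy example does not hit this because the := call is chained directly onto the result of melt(). A small sketch of that situation, using made-up tables `wide` and `long` as stand-ins for races and races_1, and assuming no object named variable exists in the calling environment:

```r
library(data.table)

# Small made-up wide table standing in for 'races'
wide <- data.table(id = 1:3, a_1 = 1:3, a_2 = 4:6, b_1 = 7:9)

# melt() returns a new long table; 'wide' itself is left unchanged
long <- melt(wide, id.vars = "id")

# 'variable' exists only in the melted result
print("variable" %in% names(wide))
print("variable" %in% names(long))

# Calling := on the unmelted table fails because 'variable' is not one of its columns
err <- tryCatch(
  wide[, c("varType", "varIndex") := tstrsplit(variable, "_")],
  error = function(e) conditionMessage(e)
)
print(err)

# The same call on the melted table adds the two columns by reference
long[, c("varType", "varIndex") := tstrsplit(variable, "_")]
print(head(long))
```

If this is what happened, running the split on races_1 instead of races would be the thing to try. Since := modifies the table by reference, assigning the result back with <- is not needed.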
Any ideas greatly appreciated.
Carl Sutton