[datatable-help] tstrsplit throwing error after melt

Carl Sutton suttoncarl at ymail.com
Thu Jan 19 22:36:37 CET 2017


Hi
I had thought I was finished with this aspect of the project, but yesterday this error appeared:
Error in strsplit(as.character(x), ...) : object 'variable' not found

This occurs immediately after melting a 363-column data table.  The head of the variable column shows:

> head(races_1$variable)
[1] raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1
116 Levels: raceDate_1 raceDate_2 raceDate_3 raceDate_4 ... Winner_7

Now admittedly melt returns the variable column as a factor, but that works just fine with toy data, so I am 
perplexed why it fails on real data.  The error message appears nonsensical because the column obviously 
does exist.  Also, since the error names strsplit rather than tstrsplit, it may just be one of the 
nonsensical base error messages.  Helpful to know there is an error, but....
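
On the strsplit versus tstrsplit point: as far as I understand it, tstrsplit calls strsplit(as.character(x), ...) internally and then transposes the result, which would explain both why strsplit is named in the error and why a factor column is not a problem by itself. A minimal sketch of that, using a made-up factor `f` rather than the real data:

```r
library(data.table)

# Hypothetical factor with names shaped like the melted "variable" column
f <- factor(c("raceDate_1", "raceDate_2", "Winner_7"))

# tstrsplit accepts the factor directly; it is converted with as.character()
# before splitting, and the pieces come back as a list of character vectors
parts <- tstrsplit(f, "_", fixed = TRUE)

prefix <- parts[[1]]   # the text before the underscore
suffix <- parts[[2]]   # the text after the underscore
print(prefix)
print(suffix)
```

If this runs cleanly, the factor type of the column is unlikely to be the cause.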

Here is the code leading up to this error:
suppressMessages(library(data.table))
        races.names <- colnames(races)
        id_vars <- races.names[1:14]
        measure_vars <- races.names[15:363]  #  yes, I have to reset the mode afterwards.
        system.time(races_1 <- melt(races, id = id_vars,
                                    measure = measure_vars))
        #  Separate variable name from prior race numbers
        #  sequence (1:10)
        races_1 <- races[, c("MPdata","PriorRaceSeq") := 
                        tstrsplit(variable, "_")]

        
>         races_1 <- races[, c("MPdata","PriorRaceSeq") := 
+                         tstrsplit(variable, "_")]
Error in strsplit(as.character(x), ...) : object 'variable' not found

And the str() of the data after this:
> str(races_1)
Classes ‘data.table’ and 'data.frame':	48511 obs. of  16 variables:
 $ TrackToday        : chr  "AQU" "AQU" "AQU" "AQU" ...
 $ DateToday         : int  20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 ...
 $ RaceNumberToday   : int  1 1 1 1 1 1 1 1 2 2 ...
 $ PostPositionToday : int  1 2 3 4 5 6 7 8 1 2 ...
 $ DistanceToday     : int  1320 1320 1320 1320 1320 1320 1320 1320 1320 1320 ...
 $ SurfaceToday      : chr  "d" "d" "d" "d" ...
 $ RaceTypeToday     : chr  "AO" "AO" "AO" "AO" ...
 $ RaceClassToday    : chr  "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" ...
 $ PurseToday        : int  51000 51000 51000 51000 51000 51000 51000 51000 60000 60000 ...
 $ ClaimingPriceToday: int  50000 50000 50000 50000 50000 50000 50000 50000 NA NA ...
 $ MorningLineOdds   : num  2 20 4 30 2.5 8 5 15 20 2 ...
 $ HorseName         : chr  "FUNKY MUNKY MAMA" "STARSHIP WARPSPEED" "SIGGI THE ALIEN" "SHANDREA" ...
 $ HDWrunStyle       : chr  "E  " "P  " "EP " "E  " ...
 $ DaysSinceLastRace : int  78 45 31 17 39 30 17 50 NA NA ...
 $ variable          : Factor w/ 349 levels "raceDate_1","raceDate_2",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ value             : chr  "20111015" "20111117" "20111201" "20111215" ...
 - attr(*, ".internal.selfref")=<externalptr> 

Toy data that works, thanks to a prior help request:
library(data.table)
library(tidyr)
#  data table for melt and columns split
dt1 <- data.table(a_1 = 1:10, b_2 = 20:29,folks = c("art","brian","ed",
         "rich","dennis","frank", "derrick","paul","fred","numnuts"),
          a_2 = 2:11, b_1 = 21:30)
melted <- melt(dt1, id = "folks")[,c("varType","varIndex") :=
                 tstrsplit(variable,"_")][,variable:=NULL]

What is also puzzling is that the next statement sets column "variable" to NULL, and that works.  Logic 
therefore says it is not the column that is missing but "something" else entirely.
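
One thing that may be worth checking, assuming the code posted above is exactly what was run: the failing statement applies := to races (the wide table before melt), whereas the toy example chains := directly onto the result of melt. The sketch below uses hypothetical stand-ins (`wide`, `long`, and the column names `id`, `a_1`, `a_2`, `type`, `idx` are invented for illustration) to show that the same call fails on the unmelted table and succeeds on the melted one. It is only a guess at the cause, not a confirmed diagnosis.

```r
library(data.table)

# Hypothetical stand-ins: 'wide' plays the role of races, 'long' of races_1
wide <- data.table(id = 1:2, a_1 = 3:4, a_2 = 5:6)
long <- melt(wide, id.vars = "id")

# On the wide table there is no column called "variable", so the lookup fails;
# the error is caught here so the message can be inspected
err <- tryCatch(
  wide[, c("type", "idx") := tstrsplit(variable, "_")],
  error = function(e) conditionMessage(e)
)
print(err)

# On the melted table the column exists and the split columns are added
long[, c("type", "idx") := tstrsplit(variable, "_")]
print(names(long))
```

If that is what is happening, running the split on races_1 instead of races should be enough to confirm it.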

Any ideas greatly appreciated.

Carl Sutton
 
