[datatable-help] leading spaces on column names from CSV file
Matt Dowle
mdowle at mdowle.plus.com
Wed Jan 22 20:40:50 CET 2014
Hi Jim,
Ok yes, good idea. The root of this is that data.table allows
leading and trailing spaces (and all special characters) in column
names. The confusion is that by="a, b, c" doesn't work because of the
spaces: column name " b" is different from "b". But for fread that
doesn't make sense, and it should drop the leading spaces by default,
yes. I've added this to the top of fread.c, where I'm logging the fread
ToDos.
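Until then, one possible workaround is to strip the whitespace from the
names yourself after reading. This is only a sketch: the table and its
column names below are made up to mimic a file whose header has leading
spaces, and the regular expression is just one way of trimming.

```r
library(data.table)

# Hypothetical table standing in for the result of fread() on a file
# whose header line is "a, b, c" (note the spaces after the commas).
DT <- data.table(a = 1:2, b = 3:4, c = 5:6)
setnames(DT, c("a", " b", " c"))   # reproduce the leading spaces

# Trim leading and trailing whitespace from every column name.
# setnames() changes the names by reference, so DT is not copied.
clean <- gsub("^[[:space:]]+|[[:space:]]+$", "", names(DT))
setnames(DT, clean)

names(DT)                          # "a" "b" "c"

# Grouping by the trimmed names now refers to existing columns.
counts <- DT[, .N, by = "a,b,c"]
```

The same two lines (gsub then setnames) can be run directly on whatever
fread returns.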
Thanks, Matt
On 22/01/14 16:21, jholtman wrote:
> I have a CSV file that I had been reading with 'read.csv' and it turns out
> that the column names had leading spaces in some instances, but read.csv
> would remove them.
>
> I tried to read the file with fread and it was keeping the leading spaces.
> Here is a dump of the session:
>
>> positXY <- read.csv(positionFile, as.is = TRUE)
>>
>>
>> str(positXY)
> 'data.frame': 12 obs. of 8 variables:
> $ ieee : chr "1972cd01004b1200" "6375cd01004b1200" "5875cd01004b1200" "1972cd01004b1200" ...
> $ startTime : chr "13:46" "13:46" "13:46" "13:53" ... # no leading space on next three
> $ endTime : chr "13:51" "13:51" "13:51" "13:58" ...
> $ x : int 65 65 65 65 65 65 65 65 65 65 ...
> $ y : int 45 45 45 45 45 45 45 45 45 45 ...
> $ z : num 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 ...
> $ deviceDirection: chr "east" "west" "north" "south" ...
> $ test. : logi NA NA NA NA NA NA ...
>> positXY <- fread(positionFile)
>>
>> str(positXY)
> Classes ‘data.table’ and 'data.frame': 12 obs. of 8 variables:
> $ ieee : chr "1972cd01004b1200" "6375cd01004b1200" "5875cd01004b1200" "1972cd01004b1200" ...
> $ startTime : chr "13:46" "13:46" "13:46" "13:53" ... # leading space on next three
> $ endTime : chr "13:51" "13:51" "13:51" "13:58" ...
> $ x : int 65 65 65 65 65 65 65 65 65 65 ...
> $ y : int 45 45 45 45 45 45 45 45 45 45 ...
> $ z : num 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 ...
> $ deviceDirection: chr "east" "west" "north" "south" ...
> $ test# : int NA NA NA NA NA NA NA NA NA NA ...
> - attr(*, ".internal.selfref")=<externalptr>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats grDevices utils datasets graphics methods
> [7] base
>
> other attached packages:
> [1] data.table_1.8.10 bitops_1.0-6
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.2
>
>
> Is it possible to have the leading spaces removed automatically, or via a
> parameter?
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/leading-spaces-on-column-names-from-CSV-file-tp4683986.html
> Sent from the datatable-help mailing list archive at Nabble.com.