[datatable-help] leading spaces on column names from CSV file

Jim Holtman jholtman at gmail.com
Thu Jan 23 01:44:17 CET 2014


Matt

Thanks for making the change. When will the update be available? 



-------- Original message --------
From: Matt Dowle <mdowle at mdowle.plus.com>
Date: 01/22/2014 14:40 (GMT-05:00)
To: jholtman <jholtman at gmail.com>, datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] leading spaces on column names from CSV file

Hi Jim,
OK, yes, good idea. The root of this is that data.table allows
leading and trailing spaces (and all special characters) in column
names. That is behind the confusion where by="a, b, c" doesn't work
because of the spaces: column name " b" is different from "b". But for
fread that doesn't make sense, and it should drop the leading spaces by
default, yes. I've added this to the top of fread.c, where I'm logging
the fread ToDos.
Thanks, Matt
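
Until fread handles this itself, one possible workaround is to trim the names after reading. A minimal sketch, assuming the data.table package is installed; the small table below is a made-up stand-in for the fread result (its values are borrowed from the session quoted below), and the regular expression is just one way to strip the padding:

```r
library(data.table)

# Hypothetical stand-in for a table returned by fread() whose header
# fields were padded with spaces in the CSV file.
DT <- data.table(ieee      = c("1972cd01004b1200", "6375cd01004b1200"),
                 startTime = c("13:46", "13:46"),
                 x         = c(65L, 65L))
setnames(DT, c("ieee", " startTime", " x"))  # simulate the leading spaces

# Strip leading and trailing whitespace from every column name.
# setnames() changes the names by reference, so no copy of DT is made.
setnames(DT, gsub("^\\s+|\\s+$", "", names(DT)))

names(DT)
```

After this, DT[, x] and by="startTime" refer to the columns without having to remember the padding.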


On 22/01/14 16:21, jholtman wrote:
> I have a CSV file that I had been reading with 'read.csv' and it turns out
> that the column names had leading spaces in some instances, but read.csv
> would remove them.
>
> I tried to read the file with fread and it kept the leading spaces.
> Here is a dump of the session:
>
>> positXY <- read.csv(positionFile, as.is = TRUE)
>>
>>
>> str(positXY)
> 'data.frame':   12 obs. of  8 variables:
>   $ ieee           : chr  "1972cd01004b1200" "6375cd01004b1200" "5875cd01004b1200" "1972cd01004b1200" ...
>   $ startTime      : chr  "13:46" "13:46" "13:46" "13:53" ...  # no leading space on next three
>   $ endTime        : chr  "13:51" "13:51" "13:51" "13:58" ...
>   $ x              : int  65 65 65 65 65 65 65 65 65 65 ...
>   $ y              : int  45 45 45 45 45 45 45 45 45 45 ...
>   $ z              : num  3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 ...
>   $ deviceDirection: chr  "east" "west" "north" "south" ...
>   $ test.          : logi  NA NA NA NA NA NA ...
>> positXY <- fread(positionFile)
>>
>> str(positXY)
> Classes ‘data.table’ and 'data.frame':  12 obs. of  8 variables:
>   $ ieee           : chr  "1972cd01004b1200" "6375cd01004b1200" "5875cd01004b1200" "1972cd01004b1200" ...
>   $  startTime     : chr  "13:46" "13:46" "13:46" "13:53" ...  # leading space on next three
>   $  endTime       : chr  "13:51" "13:51" "13:51" "13:58" ...
>   $  x             : int  65 65 65 65 65 65 65 65 65 65 ...
>   $ y              : int  45 45 45 45 45 45 45 45 45 45 ...
>   $ z              : num  3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 3.4 ...
>   $ deviceDirection: chr  "east" "west" "north" "south" ...
>   $  test#         : int  NA NA NA NA NA NA NA NA NA NA ...
>   - attr(*, ".internal.selfref")=<externalptr>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     grDevices utils     datasets  graphics  methods
> [7] base
>
> other attached packages:
> [1] data.table_1.8.10 bitops_1.0-6
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.2
>
>
> Is it possible to have the leading spaces removed automatically, or via a
> parameter?
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/leading-spaces-on-column-names-from-CSV-file-tp4683986.html
> Sent from the datatable-help mailing list archive at Nabble.com.
