[datatable-help] fread'ing logicals

Matthew Dowle mdowle at mdowle.plus.com
Mon Sep 16 01:35:27 CEST 2013


Good.  Now committed in v1.8.11 (rev 966).  Also, drop and select are done.

o  fread's drop, select and NULL in colClasses are implemented, to drop or
      select columns by name or by number. See examples in ?fread.

o  fread now detects T,F,True,False,TRUE and FALSE as type logical, 
consistent with read.csv.
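
A minimal sketch of how the two items above might be used. The inline data
and its column names (id, flag, score, note) are made up for illustration
and are not taken from ?fread. It uses the TRUE/FALSE spellings only, since
whether the single-letter T and F are detected may depend on the data.table
version installed.

```r
library(data.table)

# Hypothetical inline data: fread treats a string containing "\n" as the data itself.
csv <- "id,flag,score,note\n1,TRUE,0.5,a\n2,FALSE,1.5,b\n"

# Read everything; the TRUE/FALSE column should come back as type logical.
all_cols <- fread(csv)

# Keep columns by name or by number.
kept_by_name   <- fread(csv, select = c("id", "flag"))
kept_by_number <- fread(csv, select = c(1, 2))

# Drop columns by name or by number.
dropped_by_name   <- fread(csv, drop = c("score", "note"))
dropped_by_number <- fread(csv, drop = 3:4)

# Drop via "NULL" in colClasses, in the style of read.csv.
nulled <- fread(csv, colClasses = c(score = "NULL", note = "NULL"))

print(sapply(all_cols, class))
print(names(kept_by_name))
```

All five select/drop calls are meant to return the same two columns, so in
practice it comes down to whichever spelling is easiest to read.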

I pasted the new examples from ?fread to this answer as well:
http://stackoverflow.com/a/18702011/403310

Hope this covers everything in this area, but please shout if anyone 
can think of anything further.

Matthew


On 15/09/13 22:42, Eduard Antonyan wrote:
>
> +1 for T and F, but definitely not because it's that way in read.csv 
> (which imo is not a good reason), but rather because those are 
> commonly used substitutes for TRUE and FALSE.
>
> On Sep 14, 2013 5:29 AM, "Arunkumar Srinivasan" <aragorn168b at gmail.com>
> wrote:
>
>     Matthew,
>
>     +1 for retaining T and F like read.csv.
>     +1 for the dropins() feature as well.
>
>     Arun
>
>     On Saturday, September 14, 2013 at 11:53 AM, Matthew Dowle wrote:
>
>>     On 14/09/13 06:48, Chinmay Patil wrote:
>>>     I didn't mean changes in data.table's interface but the way
>>>     data.table works in itself compared to normal data frames. I
>>>     know there are valid reasons for structuring data.table's
>>>     interface the way it is but not all users get it immediately.
>>
>>     The bottom line in my mind is that even if base syntax was sped up
>>     (assignment to an unnamed data.frame needn't copy the whole
>>     data.frame, for example), I would still move from
>>     subset()/transform()/with()/DF[i,j]<-value syntax to i, j and by
>>     inside [...] with .SD, .I, .N and := in j. That syntax lets me do
>>     things I need to do that aren't always so easy with base syntax
>>     (like adding columns by reference by group).
>>
>>     And base R syntax is indeed being sped up by pqR, Renjin, Riposte,
>>     TERR, CXXR and fastr, which may feed into GNU R. Once that is mature
>>     and the dust has settled, I would still move from data.frame to
>>     data.table on each of them. Maybe we should market the things that
>>     data.table does that base R doesn't, rather than speed differences.
>>
>>>
>>>     As for data.table, I am not complaining, just relaying complaints
>>>     I have heard from other users.
>>>     I personally love data.table and am willing to put in the effort to
>>>     learn the best ways to use it, while most users aren't.
>>
>>     Great. data.table is for people like you.
>>
>>     So we'll keep the default fread'ing of "T" and "F" as logicals then,
>>     for consistency with read.csv.
>>
>>     And I still hope to produce a drop-in replacement for read.csv which
>>     returns a data.frame but uses fread under the hood. That will speed
>>     up existing code, but users can use the extra features of fread if
>>     they want, too.
>>
>>     Matthew
>>
>>>
>>>     Chinmay
>>>
>>>     On 14 Sep, 2013, at 1:29 PM, Steve Lianoglou
>>>     <lianoglou.steve at gene.com> wrote:
>>>
>>>>     Thanks for the quick response.
>>>>
>>>>     As for the "learning curve" stuff -- no real comment there, but:
>>>>
>>>>>     For example, I recently heard complaints about data.table itself
>>>>>     due to changes in interface
>>>>     Could you provide some concrete examples about which changes have
>>>>     stumped users? Perhaps we can learn from these critiques. I had
>>>>     thought we were pretty good about discussing any (breaking)
>>>>     changes on list, but I'd be interested to see where this has
>>>>     failed so it might perhaps be avoided in the future.
>>>>
>>>>>     and learning curve that data.table comes with... I hear
>>>>>     similar complaints about some packages like ggplot2, plyr..
>>>>>
>>>>>     Even though all these are great packages.. people don't like
>>>>>     radical changes
>>>>>     to interfaces as it makes refactoring older code even more
>>>>>     painful.
>>>>     Still curious to hear what radical changes have come down the pipe.
>>>>
>>>>     Thanks for taking the time to comment.
>>>>
>>>>     Cheers,
>>>>     -steve
>>>>
>>>>     -- 
>>>>     Steve Lianoglou
>>>>     Computational Biologist
>>>>     Bioinformatics and Computational Biology
>>>>     Genentech
>>>     _______________________________________________
>>>     datatable-help mailing list
>>>     datatable-help at lists.r-forge.r-project.org
>>>     <mailto:datatable-help at lists.r-forge.r-project.org>
>>>     https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>
>
>
