[datatable-help] fread'ing logicals
Matthew Dowle
mdowle at mdowle.plus.com
Mon Sep 16 01:35:27 CEST 2013
Good. Now committed in v1.8.11 (rev 966). drop and select are also done.
o fread's drop, select and NULL in colClasses are implemented, to drop
or select columns by name or by number. See examples in ?fread.
o fread now detects T,F,True,False,TRUE and FALSE as type logical,
consistent with read.csv.
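
A minimal sketch of the two items above, in the style of the ?fread examples. The inline data strings and variable names are made up for illustration, and the classes detected for the logical columns may differ between data.table versions, so check against your own install:

```r
library(data.table)

# Illustrative inline data; a string containing "\n" is read as the data itself
data <- "A,B,C,D\n1,3,5,7\n2,4,6,8\n"

# Drop columns by name or by number
by_name   <- fread(data, drop = c("B", "C"))
by_number <- fread(data, drop = 2:3)

# Keep only the selected columns
picked <- fread(data, select = c("A", "D"))

# "NULL" in colClasses drops a column, as in read.csv
via_null <- fread(data, colClasses = c(B = "NULL", C = "NULL"))

# Columns holding T/F or TRUE/FALSE, which the second item says are
# detected as logical (illustrative data, not from the thread)
flags <- fread("id,short,long\n1,T,TRUE\n2,F,FALSE\n")
print(sapply(flags, class))
```

All four of the drop/select calls should leave just columns A and D.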
I pasted the new examples from ?fread to this answer as well:
http://stackoverflow.com/a/18702011/403310
Hope this covers everything in this area, but please shout if anyone
can think of anything further.
Matthew
On 15/09/13 22:42, Eduard Antonyan wrote:
>
> +1 for T and F, but definitely not because it's that way in read.csv
> (which imo is not a good reason), but rather because those are
> commonly used substitutes for TRUE and FALSE.
>
> On Sep 14, 2013 5:29 AM, "Arunkumar Srinivasan"
> <aragorn168b at gmail.com> wrote:
>
> Matthew,
>
> +1 for retaining T and F like read.csv.
> +1 for the dropins() feature as well.
>
> Arun
>
> On Saturday, September 14, 2013 at 11:53 AM, Matthew Dowle wrote:
>
>> On 14/09/13 06:48, Chinmay Patil wrote:
>>> I didn't mean changes in data.table's interface but the way
>>> data.table works in itself compared to normal data frames. I
>>> know there are valid reasons for structuring data.table's
>>> interface the way it is but not all users get it immediately.
>>
>> The bottom line in my mind is that even if base syntax was sped up
>> (assignment to an unnamed data.frame needn't copy the whole data.frame
>> for example), I would still move from
>> subset()/transform()/with()/DF[i,j]<-value syntax, to i,j and by inside
>> [...] with .SD,.I,.N and := in j. I can do things with that syntax
>> that I need to do which aren't always so easy with base syntax (like
>> adding columns by reference by group).
>>
>> And base R syntax is indeed being sped up by pqR, Renjin, Riposte, TERR,
>> CXXR, fastr which may feed into GNU R. Once that is mature and the dust
>> has settled, I would still move from data.frame to data.table on each of
>> them. Maybe we should market the things that data.table does that base
>> R doesn't, rather than speed differences.
>>
>>>
>>> As for data.table, I am not complaining, just relaying complaints I
>>> have heard from other users.
>>> I personally love data.table and am willing to put in the effort to
>>> learn the best ways to use it, while most users aren't.
>>
>> Great. data.table is for people like you.
>>
>> So we'll keep the default fread'ing of "T" and "F" as logicals then for
>> consistency with read.csv.
>>
>> And I still hope to produce a drop-in replacement for read.csv which
>> returns a data.frame but uses fread under the hood. That will speed up
>> existing code, but users can use the extra features of fread if they
>> want, too.
>>
>> Matthew
>>
>>>
>>> Chinmay
>>>
>>> On 14 Sep, 2013, at 1:29 PM, Steve Lianoglou
>>> <lianoglou.steve at gene.com> wrote:
>>>
>>>> Thanks for the quick response.
>>>>
>>>> As for the "learning curve" stuff -- no real comment there, but:
>>>>
>>>>> For eg. I recently heard complaints about data.table itself due to
>>>>> changes in interface
>>>> Could you provide some concrete examples about which changes have
>>>> stumped users? Perhaps we can learn from these critiques. I had
>>>> thought we were pretty good about discussing any (breaking) changes on
>>>> list, but I'd be interested to see where this has failed so it might
>>>> perhaps be avoided in the future.
>>>>
>>>>> and learning curve that data.table comes with... I hear
>>>>> similar complaints about some packages like ggplot2, plyr..
>>>>>
>>>>> Even though all these are great packages.. people don't like radical
>>>>> changes to interfaces as it makes refactoring older code even more
>>>>> painful.
>>>> Still curious to hear what radical changes have come down the pipe.
>>>>
>>>> Thanks for taking the time to comment.
>>>>
>>>> Cheers,
>>>> -steve
>>>>
>>>> --
>>>> Steve Lianoglou
>>>> Computational Biologist
>>>> Bioinformatics and Computational Biology
>>>> Genentech
>>> _______________________________________________
>>> datatable-help mailing list
>>> datatable-help at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help