[datatable-help] fread'ing logicals

Matthew Dowle mdowle at mdowle.plus.com
Sat Sep 14 11:53:21 CEST 2013


On 14/09/13 06:48, Chinmay Patil wrote:
> I didn't mean changes in data.table's interface but the way data.table works in itself compared to normal data frames. I know there are valid reasons for structuring data.table's interface the way it is but not all users get it immediately.

The bottom line in my mind is that even if base syntax were sped up
(assignment to an unnamed data.frame needn't copy the whole data.frame,
for example), I would still move from
subset()/transform()/with()/DF[i,j]<-value syntax to i, j and by inside
[...], with .SD, .I, .N and := in j.  That syntax lets me do things I
need to do that aren't always so easy with base syntax (like adding
columns by reference by group).
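
As a rough illustration of the syntax described above, here is a minimal
sketch. The table DT and its columns grp and x are made up for this
example and do not come from the thread.

```r
library(data.table)

# Hypothetical toy data: two groups, five rows.
DT <- data.table(grp = c("a", "a", "b", "b", "b"), x = c(1, 2, 3, 4, 5))

# := in j with by: add a column by reference, computed per group.
# DT is modified in place rather than copied.
DT[, grp_mean := mean(x), by = grp]

# .N is the number of rows in the current group.
DT[, n_in_grp := .N, by = grp]

# .SD is the subset of DT for the current group; here, its first row.
first_rows <- DT[, .SD[1], by = grp]

# .I holds the row numbers in DT; here, the row with the largest x per group.
max_rows <- DT[, .I[which.max(x)], by = grp]$V1
```

Doing the same with base syntax usually means some combination of ave(),
split() and merge(), plus an assignment back into the data.frame.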

And base R syntax is indeed being sped up by pqR, Renjin, Riposte, TERR,
CXXR and fastr, which may feed into GNU R. Once those are mature and the
dust has settled, I would still move from data.frame to data.table on
each of them.  Maybe we should market the things that data.table does
that base R doesn't, rather than speed differences.

>
> As for data.table, I am not complaining, just relaying complaints I have heard from other users.
> I personally love data.table and am willing to put in the effort to learn the best ways to use it, while most users aren't.

Great.  data.table is for people like you.

So we'll keep the default fread'ing of "T" and "F" as logicals then for 
consistency with read.csv.
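
For reference, a small sketch of the read.csv behaviour in question, and
of overriding the column type when "T" and "F" are really meant as
strings. The inline CSV text and its column names id and flag are
invented for illustration. The fread call assumes a data.table version
that accepts the text= argument.

```r
library(data.table)

# Hypothetical input: a column containing only "T" and "F".
csv <- "id,flag\n1,T\n2,F\n3,T\n"

# Base R: type.convert() turns a column of only T/F into logical.
df <- read.csv(text = csv)
class(df$flag)

# Asking for character explicitly keeps the letters as they are.
df_chr <- read.csv(text = csv, colClasses = c(flag = "character"))
class(df_chr$flag)

# The same explicit override with fread.
dt_chr <- fread(text = csv, colClasses = c(flag = "character"))
class(dt_chr$flag)
```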

And I still hope to produce a drop-in replacement for read.csv which
returns a data.frame but uses fread under the hood. That will speed up
existing code, and users can still use the extra features of fread if
they want.
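
Very roughly, the idea might look like the sketch below. The function
name read_csv_fast is hypothetical and is not part of data.table. A true
drop-in replacement would also have to map read.csv's arguments and
defaults onto fread's, which this sketch does not attempt.

```r
library(data.table)

# Hypothetical wrapper: read with fread but hand back a plain data.frame.
# Extra arguments are passed straight through to fread.
read_csv_fast <- function(file, ...) {
  fread(file, data.table = FALSE, ...)
}

# Usage on a small temporary file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), tmp, row.names = FALSE)
df <- read_csv_fast(tmp)
```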

Matthew

>
> Chinmay
>
> On 14 Sep, 2013, at 1:29 PM, Steve Lianoglou <lianoglou.steve at gene.com> wrote:
>
>> Thanks for the quick response.
>>
>> As for the "learning curve" stuff -- no real comment there, but:
>>
>>> For example, I recently heard complaints about data.table itself due to
>>> changes in interface
>> Could you provide some concrete examples about which changes have
>> stumped users? Perhaps we can learn from these critiques. I had
>> thought we were pretty good about discussing any (breaking) changes on
>> list, but I'd be interested to see where this has failed so it might
>> perhaps be avoided in the future.
>>
>>> and learning curve that data.table comes with... I hear
>>> similar complaints about some packages like ggplot2, plyr..
>>>
>>> Even though all these are great packages.. people don't like radical changes
>>> to interfaces as it makes refactoring older code even more painful.
>> Still curious to hear what radical changes have come down the pipe.
>>
>> Thanks for taking the time to comment.
>>
>> Cheers,
>> -steve
>>
>> -- 
>> Steve Lianoglou
>> Computational Biologist
>> Bioinformatics and Computational Biology
>> Genentech
