[datatable-help] Coercion to character

Damian Betebenner dbetebenner at nciea.org
Sun Apr 15 11:40:24 CEST 2012


I started seeing character vectors pop up in places where I had never had them before, but on further investigation that turned out to be an issue with my own setup, not data.table.

With regard to characters (and data.table's ability to use them as a key now), I did notice that data.table and data.frame handle stringsAsFactors differently by default:

DF <- data.frame(X=letters[1:10], Y=rnorm(10))
sapply(DF, class)

        X         Y 
 "factor" "numeric"

DT <- data.table(X=letters[1:10], Y=rnorm(10))
sapply(DT, class)

          X           Y 
"character"   "numeric"
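
One way to sidestep the difference is to set stringsAsFactors explicitly in both calls rather than relying on either default. This is only a sketch: it assumes the installed data.table version accepts a stringsAsFactors argument in data.table(), and the names df_classes and dt_classes are made up for illustration.

```r
library(data.table)

# Set stringsAsFactors explicitly so the column class does not depend
# on the default of either constructor.
DF <- data.frame(X = letters[1:10], Y = rnorm(10), stringsAsFactors = FALSE)
DT <- data.table(X = letters[1:10], Y = rnorm(10), stringsAsFactors = FALSE)

# Column classes of each object, as named character vectors.
df_classes <- sapply(DF, class)
dt_classes <- sapply(DT, class)

# Compare the two; both X columns should now be character.
print(identical(df_classes, dt_classes))

# If a factor is wanted later, convert the column explicitly.
DF$X <- factor(DF$X)
DT[, X := factor(X)]
```

Being explicit in both places means the code behaves the same whichever package default happens to be in force.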


Will this inconsistency cause problems down the road?

Thanks for all your help,

Damian


Damian Betebenner
Center for Assessment
PO Box 351
Dover, NH   03821-0351
 
Phone (office): (603) 516-7900
Phone (cell): (857) 234-2474
Fax: (603) 516-7910

dbetebenner at nciea.org
www.nciea.org




-----Original Message-----
From: Matthew Dowle [mailto:mdowlenoreply at virginmedia.com] On Behalf Of Matthew Dowle
Sent: Thursday, April 12, 2012 5:50 PM
To: Damian Betebenner
Cc: datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] Coercion to character

It shouldn't coerce. What makes you think it does?

> DT = data.table(a=factor(c("a","b","b","c")),b=1:4)
> DT[,sum(b),by=a]
     a V1
[1,] a  1
[2,] b  5
[3,] c  4
> str(DT[,sum(b),by=a])
Classes ‘data.table’ and 'data.frame':	3 obs. of  2 variables:
 $ a : Factor w/ 3 levels "a","b","c": 1 2 3
 $ V1: int  1 5 4
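
A quick way to check this on a given installation is to group once by a factor column and once by a character column and inspect the class of the grouping column in each result. This is a sketch only: the column names f and s and the result names by_factor and by_char are hypothetical, and it assumes a data.table version that allows character columns in by.

```r
library(data.table)

# The same grouping values stored once as a factor and once as character.
DT <- data.table(f = factor(c("a", "b", "b", "c")),
                 s = c("a", "b", "b", "c"),
                 b = 1:4)

# Group by each column in turn.
by_factor <- DT[, sum(b), by = f]
by_char   <- DT[, sum(b), by = s]

# Inspect the class of the grouping column in each result.
print(class(by_factor$f))
print(class(by_char$s))
```

If either class differs from the class of the input column, that would point to coercion happening somewhere in the by step.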



On Thu, 2012-04-12 at 14:57 -0500, Damian Betebenner wrote:
> Data tablers
>
> Does data.table now coerce factors to character variables when doing
> by summaries?
>
> If so, is there any way to not allow this coercion?
>
> Thanks,
>
> Damian Betebenner



