[datatable-help] Apparent loss of decimals when reading a numeric column with fread() !

Bacou, Melanie mel at mbacou.com
Tue Aug 4 14:54:28 CEST 2015


Starting to wonder if something else is going on with my R install. 
Using the same CSV file, I'm not getting what I expect with `.I` (I 
would expect an index by group).

```
pcn08 <- fread("./data/PovCalServlet_15.08.03.csv")

# and then grouping by 2 fields
pcn08[, test := .I, by=list(country, povLine)]

pcn08$test
#  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
# [28] 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
# [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
# [82] 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98

# but there are 86 groups
dim(pcn08[, .N, by=list(country, povLine)])
# [1] 86  3

```
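
A note on the result above: in data.table, `.I` holds each row's location in the full table, so it does not restart within groups even when `by` is given. The sketch below contrasts it with `seq_len(.N)` (a counter within each group) and `.GRP` (a counter of the groups themselves). The toy table and the column names `rowInTable`, `rowInGroup` and `groupId` are made up for illustration and are not taken from the CSV.

```r
library(data.table)

# Toy table standing in for the CSV (values are invented for illustration)
dt <- data.table(
  country = c("A", "A", "A", "B", "B"),
  povLine = c(1.25, 1.25, 2, 1.25, 1.25)
)

# .I is the row location in the full table, so grouping does not restart it
dt[, rowInTable := .I, by = list(country, povLine)]

# seq_len(.N) counts rows within each group
dt[, rowInGroup := seq_len(.N), by = list(country, povLine)]

# .GRP numbers the groups themselves
dt[, groupId := .GRP, by = list(country, povLine)]

print(dt)
```

If an index by group is what is wanted, one of the last two forms is probably the one to use, depending on whether rows or groups should be numbered.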

On 8/4/2015 8:06 AM, Bacou, Melanie wrote:
> In case this happens to others: it seems related to R's global `digits` 
> option; maybe something changed in R 3.2.1.
> --Mel.
>
> ```
> options(digits=3)
> 1000-0.5
> # [1] 1000
> options(digits=5)
> 1000-0.5
> # [1] 999.5
> ```
>
> On 8/4/2015 7:54 AM, Bacou, Melanie wrote:
>> Hi,
>> Thx, I see I have another problem, not related to data.table (sorry). 
>> R seems to truncate numbers in the console. Not sure what's going on.
>>
>> --Mel.
>>
>> ```{r}
>> > 1-0.5
>> [1] 0.5
>> > 2008-0.05
>> [1] 2008
>> > 45-0.5
>> [1] 44.5
>> > 100-0.5
>> [1] 99.5
>> > 1000-0.5
>> [1] 1000
>> > 10000-0.5
>> [1] 10000
>> ```
>>
>> On 8/4/2015 7:16 AM, nachti wrote:
>>> Hi,
>>> copying your code, everything works as expected for me.
>>> Maybe you just referenced the wrong object (pcn08)?
>>>
>>> ```
>>> > library(data.table)
>>> data.table 1.9.4  For help type: ?data.table
>>> *** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
>>> > pcn <- fread("PovCalServlet_15.08.03.csv")
>>> > sapply(pcn, class)
>>>      country     povLine        mean         hcr         gap         sev
>>>  "character"   "numeric"   "numeric"   "numeric"   "numeric"   "numeric"
>>>        watts        popM     yearNum
>>>    "numeric"   "numeric"   "numeric"
>>> > pcn08$yearNum
>>> Error: object 'pcn08' not found
>>> > pcn$yearNum
>>>  [1] 2008.50 2011.50 2009.25 2009.00 2006.00 2007.50 2007.00 2008.00 2011.00
>>> [10] 2004.00 2005.50 2011.00 2008.00 2010.50 2005.00 2003.00 2005.50 2008.00
>>> [19] 2007.00 2012.00 2002.00 2005.40 2010.00 2007.00 2010.00 2010.23 2010.00
>>> [28] 2008.00 2008.00 2012.00 2006.00 2008.64 2009.50 2011.00 2009.83 2010.83
>>> [37] 2010.00 2011.00 2006.50 2011.00 2010.67 2009.00 2009.50 2011.80 2011.00
>>> [46] 2008.00 2012.50 2009.30 2010.00 2008.50 2011.50 2009.25 2009.00 2006.00
>>> [55] 2007.50 2007.00 2008.00 2011.00 2004.00 2005.50 2011.00 2008.00 2010.50
>>> [64] 2005.00 2003.00 2005.50 2008.00 2007.00 2012.00 2002.00 2005.40 2010.00
>>> [73] 2007.00 2010.00 2010.23 2010.00 2008.00 2008.00 2012.00 2006.00 2008.64
>>> [82] 2009.50 2011.00 2009.83 2010.83 2010.00 2011.00 2006.50 2011.00 2010.67
>>> [91] 2009.00 2009.50 2011.80 2011.00 2008.00 2012.50 2009.30 2010.00
>>> > sessionInfo()
>>> R version 3.2.1 (2015-06-18)
>>> Platform: x86_64-suse-linux-gnu (64-bit)
>>> Running under: openSUSE 13.1 (Bottle) (x86_64)
>>>
>>> locale:
>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods base
>>>
>>> other attached packages:
>>> [1] data.table_1.9.4
>>>
>>> loaded via a namespace (and not attached):
>>> [1] compiler_3.2.1 magrittr_1.5   plyr_1.8.3     tools_3.2.1    reshape2_1.4.1
>>> [6] Rcpp_0.11.6    stringi_0.5-5  stringr_1.0.0  chron_2.3-4
>>> ```
>>>
>>> ~g
>>
>
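
On the `digits` observation in the quoted messages above: `options(digits)` controls how many significant digits R uses when printing, not what is stored, so the value read by `fread()` should be intact even when the console shows it rounded. The sketch below is a minimal illustration of that under base R only; the variable names `x`, `old`, `shown3` and `shown7` are made up for the example.

```r
x <- 1000 - 0.5              # the stored value is 999.5

old <- options(digits = 3)   # same setting as in the quoted message
shown3 <- format(x)          # text as displayed with 3 significant digits
options(old)                 # restore the previous setting

shown7 <- format(x, digits = 7)  # 7 is R's default for `digits`

# The stored value is unchanged; only its displayed form differs
stopifnot(x == 999.5)
print(c(digits3 = shown3, digits7 = shown7))
```

Checking `getOption("digits")` in the affected session, or calling `print(x, digits = 7)` on a single value, may be a quick way to confirm whether a lowered setting is the cause.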


