[datatable-help] Apparent loss of decimals when reading a numeric column with fread() !

Frank Erickson fperickson at wisc.edu
Tue Aug 4 15:02:43 CEST 2015


`.I` has never been an index by group: it holds row numbers within the whole
table, even when `by` is used. You'd have to build a per-group index manually,
e.g. with `1:.N`.
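
A minimal sketch of the difference, using a made-up toy table rather than the PovCal file (the column names `rowInTable`, `rowInGroup` and `groupId` are only illustrative):

```r
library(data.table)

# Toy data standing in for the real table (values are invented)
dt <- data.table(
  country = c("A", "A", "A", "B", "B", "C"),
  povLine = c(1.25, 1.25, 2, 1.25, 1.25, 2)
)

# .I gives row numbers in the whole table, even inside `by`
dt[, rowInTable := .I, by = list(country, povLine)]

# seq_len(.N) (same as 1:.N here, since every group has .N >= 1)
# restarts at 1 within each group
dt[, rowInGroup := seq_len(.N), by = list(country, povLine)]

# .GRP numbers the groups themselves
dt[, groupId := .GRP, by = list(country, povLine)]

print(dt)
```

Here `rowInTable` runs 1 to 6, `rowInGroup` restarts for each (country, povLine) pair, and `groupId` is probably closest to what a "group index" was expected to be.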

On Tue, Aug 4, 2015 at 8:54 AM, Bacou, Melanie <mel at mbacou.com> wrote:

> Starting to wonder if something else is going on with my R install. Using
> the same CSV file, I'm not getting what I expect with `.I` (I would expect
> an index by group).
>
> ```
> pcn08 <- fread("./data/PovCalServlet_15.08.03.csv")
>
> # and then grouping by 2 fields
> pcn08[, test := .I, by=list(country, povLine)]
>
> pcn08$test
> #  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
> # [28] 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
> # [55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
> # [82] 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98
>
> # but there are 86 groups
> dim(pcn08[, .N, by=list(country, povLine)])
> # [1] 86  3
>
> ```
>
>
> On 8/4/2015 8:06 AM, Bacou, Melanie wrote:
>
>> In case this happens to others: it seems related to R's global `digits`
>> option; maybe something changed in R 3.2.1.
>> --Mel.
>>
>> ```
>> options(digits=3)
>> 1000-0.5
>> # [1] 1000
>> options(digits=5)
>> 1000-0.5
>> # [1] 999.5
>>
>> ```
>>
>> On 8/4/2015 7:54 AM, Bacou, Melanie wrote:
>>
>>> Hi,
>>> Thx, I see I have another problem, not related to data.table (sorry). R
>>> seems to truncate numbers in the console. Not sure what's going on.
>>>
>>> --Mel.
>>>
>>> ```{r}
>>> > 1-0.5
>>> [1] 0.5
>>> > 2008-0.05
>>> [1] 2008
>>> > 45-0.5
>>> [1] 44.5
>>> > 100-0.5
>>> [1] 99.5
>>> > 1000-0.5
>>> [1] 1000
>>> > 10000-0.5
>>> [1] 10000
>>> ```
>>>
>>> On 8/4/2015 7:16 AM, nachti wrote:
>>>
>>>> Hi,
>>>> copying your code, everything works as expected for me.
>>>> Maybe you just referred to the wrong object (pcn08)?
>>>>
>>>> ```
>>>> > library(data.table)
>>>> data.table 1.9.4  For help type: ?data.table
>>>> *** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
>>>> > pcn <- fread("PovCalServlet_15.08.03.csv")
>>>> > sapply(pcn, class)
>>>>     country     povLine        mean         hcr         gap         sev
>>>> "character"   "numeric"   "numeric"   "numeric"   "numeric"   "numeric"
>>>>       watts        popM     yearNum
>>>>   "numeric"   "numeric"   "numeric"
>>>>
>>>> > pcn08$yearNum
>>>> Error: object 'pcn08' not found
>>>> > pcn$yearNum
>>>>  [1] 2008.50 2011.50 2009.25 2009.00 2006.00 2007.50 2007.00 2008.00 2011.00
>>>> [10] 2004.00 2005.50 2011.00 2008.00 2010.50 2005.00 2003.00 2005.50 2008.00
>>>> [19] 2007.00 2012.00 2002.00 2005.40 2010.00 2007.00 2010.00 2010.23 2010.00
>>>> [28] 2008.00 2008.00 2012.00 2006.00 2008.64 2009.50 2011.00 2009.83 2010.83
>>>> [37] 2010.00 2011.00 2006.50 2011.00 2010.67 2009.00 2009.50 2011.80 2011.00
>>>> [46] 2008.00 2012.50 2009.30 2010.00 2008.50 2011.50 2009.25 2009.00 2006.00
>>>> [55] 2007.50 2007.00 2008.00 2011.00 2004.00 2005.50 2011.00 2008.00 2010.50
>>>> [64] 2005.00 2003.00 2005.50 2008.00 2007.00 2012.00 2002.00 2005.40 2010.00
>>>> [73] 2007.00 2010.00 2010.23 2010.00 2008.00 2008.00 2012.00 2006.00 2008.64
>>>> [82] 2009.50 2011.00 2009.83 2010.83 2010.00 2011.00 2006.50 2011.00 2010.67
>>>> [91] 2009.00 2009.50 2011.80 2011.00 2008.00 2012.50 2009.30 2010.00
>>>>
>>>> > sessionInfo()
>>>> R version 3.2.1 (2015-06-18)
>>>> Platform: x86_64-suse-linux-gnu (64-bit)
>>>> Running under: openSUSE 13.1 (Bottle) (x86_64)
>>>>
>>>> locale:
>>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> other attached packages:
>>>> [1] data.table_1.9.4
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] compiler_3.2.1 magrittr_1.5   plyr_1.8.3     tools_3.2.1    reshape2_1.4.1
>>>> [6] Rcpp_0.11.6    stringi_0.5-5  stringr_1.0.0  chron_2.3-4
>>>> ```
>>>>
>>>> ~g
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://r.789695.n4.nabble.com/Apparent-loss-of-decimals-when-reading-a-numeric-column-with-fread-tp4710722p4710727.html
>>>> Sent from the datatable-help mailing list archive at Nabble.com.
>>>> _______________________________________________
>>>> datatable-help mailing list
>>>> datatable-help at lists.r-forge.r-project.org
>>>>
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>>>
>>>
>>>
>>
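
On the `digits` point quoted above: a small sketch that separates the printed representation from the stored value. It uses `format(x, digits = ...)` rather than changing the global option, so it makes no assumption about what `getOption("digits")` is in any given session (R's default is 7). The variable names are only illustrative.

```r
x <- 1000 - 0.5
y <- 2008 - 0.05

# Printing rounds to a number of significant digits; the value is unchanged
x3 <- format(x, digits = 3)   # "1000"
x5 <- format(x, digits = 5)   # "999.5"
y3 <- format(y, digits = 3)   # "2008"
y7 <- format(y, digits = 7)   # "2007.95"

# The stored number still carries its decimals
x_is_intact <- (x == 999.5)   # TRUE

# The current session setting can be inspected without modifying it
getOption("digits")
```

So a column read by `fread()` can look like it lost its decimals when `digits` is set low, while the underlying numbers are intact.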