[datatable-help] fread crashes reading R when reading csv
Ben Weinstein
benweinstein2010 at gmail.com
Thu May 8 16:39:40 CEST 2014
Data table crashes
I am having a similar issue to this post:
http://r.789695.n4.nabble.com/fread-crash-td4683394.html
Please see the markdown script at http://rpubs.com/bw4sz0511/16766, or the text below:
The file is about 550MB; I'm unsure how many rows it actually has (several
million).
When I try to run fread, RStudio just crashes with no error. I can read in
up to about 15 rows:
require(data.table)
## Loading required package: data.table
# env dist table
env <- fread("EnvData.csv", nrows = 15, verbose = TRUE)
## Input contains no \n. Taking this to be a filename to open
## File opened, filesize is 0.543B
## File is opened and mapped ok
## Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
## Using line 30 to detect sep (the last non blank line in the first 'autostart') ... sep=','
## Found 4 columns
## First row with 4 fields occurs on line 2 (either column names or first row of data)
## Some fields on line 2 are not type character (or are empty). Treating as a data row and using default column names.
## Count of eol after first data row: 15989212
## Subtracted 0 for last eol and any trailing empty lines, leaving 15989212 data rows
## nrow limited to nrows passed in (15)
## Type codes: 4113 (first 5 rows)
## Type codes: 4113 (after applying colClasses and integer64)
## Type codes: 4113 (after applying drop or select (if supplied)
## Allocating 4 column slots (4 - 0 NULL)
## 0.000s ( 0%) Memory map (rerun may be quicker)
## 0.000s ( 0%) sep and header detection
## 0.702s (100%) Count rows (wc -l)
## 0.000s ( 0%) Column type detection (first, middle and last 5 rows)
## 0.000s ( 0%) Allocation of 15x4 result (xMB) in RAM
## 0.000s ( 0%) Reading data
## 0.000s ( 0%) Allocation for type bumps (if any), including gc time if triggered
## 0.000s ( 0%) Coercing data already read in type bumps (if any)
## 0.000s ( 0%) Changing na.strings to NA
## 0.702s Total
head(env)
## V1 V2 V3 V4
## 1: 1 2 1 249.3
## 2: 2 3 1 536.9
## 3: 3 4 1 1161.8
## 4: 4 5 1 1234.0
## 5: 5 6 1 1513.4
## 6: 6 7 1 1757.1
However, when I run fread with more than 20 rows, it crashes RStudio.
# not run
env <- fread("EnvData.csv", nrows = 25, verbose = TRUE)
With verbose on, the output printed before the crash reads:
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.543B
File is opened and mapped ok
Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
Using line 30 to detect sep (the last non blank line in the first 'autostart') ... sep=','
Found 4 columns
First row with 4 fields occurs on line 2 (either column names or first row of data)
Some fields on line 2 are not type character (or are empty). Treating as a data row and using default column names.
Count of eol after first data row: 15989212
Subtracted 0 for last eol and any trailing empty lines, leaving 15989212 data rows
nrow limited to nrows passed in (25)
Type codes: 4113 (first 5 rows)
Type codes: 4113 (+middle 5 rows)
Looking at the file with read.csv, nothing seems wrong:
env <- read.csv("EnvData.csv", nrows = 25)
env
## V1 V2 V3
## 1 2 1 249.3
## 2 3 1 536.9
## 3 4 1 1161.8
## 4 5 1 1234.0
## 5 6 1 1513.4
## 6 7 1 1757.1
## 7 8 1 2176.7
## 8 9 1 2644.0
## 9 10 1 3033.3
## 10 11 1 3721.2
## 11 12 1 4432.8
## 12 13 1 4609.6
## 13 14 1 5378.8
## 14 15 1 5953.6
## 15 16 1 5913.9
## 16 17 1 6281.3
## 17 18 1 6669.7
## 18 19 1 6449.7
## 19 20 1 6218.4
## 20 21 1 6493.4
## 21 22 1 6056.6
## 22 23 1 5275.8
## 23 24 1 4605.2
## 24 25 1 3153.9
## 25 26 1 2532.1
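One thing that may be worth checking is whether the first line has a different number of fields from the data lines, since fread reports 4 columns starting on line 2 while read.csv shows only V1 to V3. Below is a minimal sketch using base R's count.fields. The small temporary file is a made-up stand-in for EnvData.csv and assumes a 3-name header above 4-field rows; that layout is only a guess from the output above, not something confirmed about the real file.

```r
# Hypothetical stand-in for EnvData.csv: a 3-name header over 4-field rows.
# (Assumed layout, guessed from the fread and read.csv output; not the real data.)
tmp <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2","V3"',
             '"1",2,1,249.3',
             '"2",3,1,536.9',
             '"3",4,1,1161.8'), tmp)

# count.fields() (utils, attached by default) returns the field count per line.
n_fields <- count.fields(tmp, sep = ",")
print(n_fields)

# Report the lines whose field count differs from the most common count.
usual <- as.integer(names(which.max(table(n_fields))))
odd_lines <- which(n_fields != usual)
print(odd_lines)
```

On the real file, the same check could be limited to the first few lines by passing textConnection(readLines("EnvData.csv", n = 30)) to count.fields instead of the file name, which avoids scanning all rows.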
Thanks for your help,
Ben Weinstein
--
Ben Weinstein
PhD Candidate
Ecology and Evolution
Stony Brook University
http://benweinstein.weebly.com/