<div dir="ltr"><div><div>Data table crashes</div><div><br></div><div>I am having a similar issue to this post: <a href="http://r.789695.n4.nabble.com/fread-crash-td4683394.html">http://r.789695.n4.nabble.com/fread-crash-td4683394.html</a></div>
<div><br></div><div>Please see the markdown script: <a href="http://rpubs.com/bw4sz0511/16766">http://rpubs.com/bw4sz0511/16766</a> or the text below:</div><div><br></div><div>The file is about 550MB; I'm unsure how many rows it actually has (several million).</div>
<div><br></div><div>When I try to run fread, RStudio just crashes with no error. I can read in up to about 15 rows:</div><div><br></div><div><br></div><div>require(data.table)</div><div>## Loading required package: data.table</div>
<div><br></div><div># env dist table<br></div><div><br></div><div>env <- fread("EnvData.csv", nrows = 15, verbose = TRUE)</div><div>## Input contains no \n. Taking this to be a filename to open</div><div>## File opened, filesize is 0.543B</div>
<div>## File is opened and mapped ok</div><div>## Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.</div><div>## Using line 30 to detect sep (the last non blank line in the first 'autostart') ... sep=','</div>
<div>## Found 4 columns</div><div>## First row with 4 fields occurs on line 2 (either column names or first row of data)</div><div>## Some fields on line 2 are not type character (or are empty). Treating as a data row and using default column names.</div>
<div>## Count of eol after first data row: 15989212</div><div>## Subtracted 0 for last eol and any trailing empty lines, leaving 15989212 data rows</div><div>## nrow limited to nrows passed in (15)</div><div>## Type codes: 4113 (first 5 rows)</div>
<div>## Type codes: 4113 (after applying colClasses and integer64)</div><div>## Type codes: 4113 (after applying drop or select (if supplied)</div><div>## Allocating 4 column slots (4 - 0 NULL)</div><div>## 0.000s ( 0%) Memory map (rerun may be quicker)</div>
<div>## 0.000s ( 0%) sep and header detection</div><div>## 0.702s (100%) Count rows (wc -l)</div><div>## 0.000s ( 0%) Column type detection (first, middle and last 5 rows)</div><div>## 0.000s ( 0%) Allocation of 15x4 result (xMB) in RAM</div>
<div>## 0.000s ( 0%) Reading data</div><div>## 0.000s ( 0%) Allocation for type bumps (if any), including gc time if triggered</div><div>## 0.000s ( 0%) Coercing data already read in type bumps (if any)</div><div>
## 0.000s ( 0%) Changing na.strings to NA</div><div>## 0.702s Total</div><div><br></div><div>head(env)</div><div>## V1 V2 V3 V4</div><div>## 1: 1 2 1 249.3</div><div>## 2: 2 3 1 536.9</div><div>
## 3: 3 4 1 1161.8</div><div>## 4: 4 5 1 1234.0</div><div>## 5: 5 6 1 1513.4</div><div>## 6: 6 7 1 1757.1</div><div><br></div><div>However, when I run fread with more than 20 rows, it crashes RStudio.</div><div><br></div><div>
# not run</div><div>env <- fread("EnvData.csv", nrows = 25, verbose = TRUE)</div><div>The verbose output before the crash reads:</div><div><br></div><div>Input contains no \n. Taking this to be a filename to open</div>
<div><br></div><div>File opened, filesize is 0.543B</div><div><br></div><div>File is opened and mapped ok</div><div><br></div><div>Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.</div><div><br></div>
<div>Using line 30 to detect sep (the last non blank line in the first 'autostart') … sep=','</div><div><br></div><div>Found 4 columns</div><div><br></div><div>First row with 4 fields occurs on line 2 (either column names or first row of data)</div>
<div><br></div><div>Some fields on line 2 are not type character (or are empty). Treating as a data row and using default column names.</div><div><br></div><div>Count of eol after first data row: 15989212</div><div><br></div>
<div>Subtracted 0 for last eol and any trailing empty lines, leaving 15989212 data rows</div><div><br></div><div>nrow limited to nrows passed in (25)</div><div><br></div><div>Type codes: 4113 (first 5 rows)</div><div><br>
</div><div>Type codes: 4113 (+middle 5 rows)</div><div><br></div><div>Looking at the file with read.csv, nothing seems wrong:</div><div><br></div><div><br></div><div>env <- read.csv("EnvData.csv", nrows = 25)</div><div><br></div>
<div>env</div><div>## V1 V2 V3</div><div>## 1 2 1 249.3</div><div>## 2 3 1 536.9</div><div>## 3 4 1 1161.8</div><div>## 4 5 1 1234.0</div><div>## 5 6 1 1513.4</div><div>## 6 7 1 1757.1</div><div>
## 7 8 1 2176.7</div><div>## 8 9 1 2644.0</div><div>## 9 10 1 3033.3</div><div>## 10 11 1 3721.2</div><div>## 11 12 1 4432.8</div><div>## 12 13 1 4609.6</div><div>## 13 14 1 5378.8</div><div>## 14 15 1 5953.6</div>
<div>## 15 16 1 5913.9</div><div>## 16 17 1 6281.3</div><div>## 17 18 1 6669.7</div><div>## 18 19 1 6449.7</div><div>## 19 20 1 6218.4</div><div>## 20 21 1 6493.4</div><div>## 21 22 1 6056.6</div><div>## 22 23 1 5275.8</div>
<div>## 23 24 1 4605.2</div><div>## 24 25 1 3153.9</div><div>## 25 26 1 2532.1</div></div><div><br></div><div><br></div><div>Thanks for your help,</div><div><br></div><div>Ben Weinstein</div>-- <br>Ben Weinstein<br>PhD Candidate <br>
Ecology and Evolution<br>Stony Brook University<br><br><a href="http://benweinstein.weebly.com/">http://benweinstein.weebly.com/</a><br><br>
</div>