[datatable-help] Reading character strings into fread

Matthew Dowle mdowle at mdowle.plus.com
Sun Feb 3 10:34:04 CET 2013


Thanks, this looks like a bug when detecting types at the end of the file.
The verbose output suggests the final line isn't \n terminated. There are
many tests showing that a missing final \n should be ok, but perhaps
something peculiar happens when it is combined with this format. Does adding
a newline at the end fix it as a temporary workaround?
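
In case it helps, here is a minimal sketch of that workaround. The helper
name ensure_trailing_newline is made up for illustration (it is not part of
data.table), and csv_text is just a small stand-in for content(response), so
treat it as an untested-on-your-data suggestion rather than a confirmed fix.

```r
library(data.table)

# Hypothetical helper (not a data.table function): append "\n" only if the
# text does not already end with one.
ensure_trailing_newline <- function(txt) {
  if (endsWith(txt, "\n")) txt else paste0(txt, "\n")
}

# Stand-in for content(response): a short csv string whose final line is
# deliberately NOT newline terminated.
csv_text <- paste(
  '"Date.and.Time","Open","High","Low","Close","Volume"',
  '"2007/01/02 01:46:00",20083,20088,20071,20075,212',
  '"2007/01/02 01:47:00",20075,20120,20075,20106,328',
  sep = "\n"
)

# The input contains a \n, so fread takes it as text rather than a filename.
x <- fread(ensure_trailing_newline(csv_text), sep = ",")
```

With the real data that would be something like
fread(ensure_trailing_newline(content(response)), sep=","), assuming
content(response) returns a single character string as in your str() output.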

> I am having a bit of trouble reading character string tables into fread.
>
> The character string was generated as part of the output from an API GET
> request from dropbox. (the API link to dropbox was done via the httr
> package)
>
> Basically I would like fread to do something similar to read.csv, but it
> seems not to work. Please see the example below, in which I download a csv
> file, with the content of the response as a character string.
>
> I would rather not write a csv file out first and then get fread to read
> it in, as that seems a roundabout way of doing it compared with reading
> the table directly from the character string.
>
> any help would be much appreciated.
>
> thanks
>
> p.s. @MD: I thought I had responded to your previous email, but clearly I
> didn't, so I thought I might as well email the list as well.
>
>> db.app <- oauth_app("db",key=getOption("DropboxKey"),
>> secret=getOption("DropboxSecret"))
>> db.sig <- sign_oauth1.0(db.app, token=getOption("DropboxOAuthKey"),
>> token_secret=getOption("DropboxOAuthSecret"))
>>
>> response <- GET(
>>   url=paste0("https://api-content.dropbox.com/1/files/dropbox/",
>>              gsub("%2F","/",curlEscape("!! test folder/new file.csv"))),
>>   config=c(db.sig,add_headers(Accept="x-dropbox-metadata")))
>> response
> Response
> [https://api-content.dropbox.com/1/files/dropbox/%21%21%20test%20folder/new%20file.csv]
>   Status: 200
>   Content-type: text/csv; charset=ascii
> "Date.and.Time","Open","High","Low","Close","Volume"
> "2007/01/02 01:46:00",20083,20088,20071,20075,212
> "2007/01/02 01:47:00",20075,20120,20075,20106,328
> "2007/01/02 01:48:00",20105,20110,20094,20096,256
> "2007/01/02 01:49:00",20096,20106,20085,20099,177
> "2007/01/02 01:50:00",20098,20100,20081,20092,184
> "2007/01/02 01:51:00",20091,20094,20087,20093,48
> "2007/01/02 01:52:00",20093,20095,20085,20088,147
> "2007/01/02 01:53:00",20088,20090,20086,20089,26
> "2007/01/02 01:54:00",20089,20100,20089,20091,116 ...
>> require(data.table)
> Loading required package: data.table
> data.table 1.8.7  For help type: help("data.table")
>> x <- fread(content(response),sep=",",verbose=TRUE)
> Input contains a \n (or is ""), taking this to be text input (not a
> filename)
> Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
> Looking for supplied sep ',' on line 30 (the last non blank line in the
> first 30) ... found
> Found 6 columns
> First row with 6 fields occurs on line 1 (either column names or first row
> of data)
> All the fields on line 1 are character fields. Treating as the column
> names.
> Count of eol after first data row: 82
> Subtracted 0 for last eol and any trailing empty lines, leaving 82 data
> rows
> Type codes: 300000 (first 5 rows)
> Type codes: 300000 (+middle 5 rows)
> Error in fread(content(response), sep = ",", verbose = TRUE) :
>   Expected sep (',') but '
>> x <- read.csv(text=content(response),header=TRUE,stringsAsFactors=FALSE)
>> head(x)
>         Date.and.Time  Open  High   Low Close Volume
> 1 2007/01/02 01:46:00 20083 20088 20071 20075    212
> 2 2007/01/02 01:47:00 20075 20120 20075 20106    328
> 3 2007/01/02 01:48:00 20105 20110 20094 20096    256
> 4 2007/01/02 01:49:00 20096 20106 20085 20099    177
> 5 2007/01/02 01:50:00 20098 20100 20081 20092    184
> 6 2007/01/02 01:51:00 20091 20094 20087 20093     48
>> str(content(response))
>  chr
> "\"Date.and.Time\",\"Open\",\"High\",\"Low\",\"Close\",\"Volume\"\n\"2007/01/02
> 01:46:00\",20083,20088,20071,20075,212\n\"2007/0"| __truncated__
>> str(x)
> 'data.frame':	100 obs. of  6 variables:
>  $ Date.and.Time: chr  "2007/01/02 01:46:00" "2007/01/02 01:47:00"
> "2007/01/02 01:48:00" "2007/01/02 01:49:00" ...
>  $ Open         : int  20083 20075 20105 20096 20098 20091 20093 20088
> 20089 20090 ...
>  $ High         : int  20088 20120 20110 20106 20100 20094 20095 20090
> 20100 20093 ...
>  $ Low          : int  20071 20075 20094 20085 20081 20087 20085 20086
> 20089 20083 ...
>  $ Close        : int  20075 20106 20096 20099 20092 20093 20088 20089
> 20091 20093 ...
>  $ Volume       : int  212 328 256 177 184 48 147 26 116 47 ...
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help



