<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
      It seems the comments are really a banner at the start of the
      file, which fread already handles. But the banner in the example
      is 34 rows long, so the default of autostart=30 lands inside the
      banner rather than in the data. Try:<br>
<br>
fread("03217500.exsa.rsb", autostart=40)<br>
<br>
      That should read the file in one shot, column names included.
      All I've done is increase autostart so that it falls within the
      data block. See ?fread for a detailed description of autostart
      and the detection procedure.<br>
<br>
      By the way, if a single file contains more than one table, you
      can read each one by setting autostart to a line within that
      table. And provided there is no footer, you can also set
      autostart to a very large value; the downside is the time it
      takes to search back from the end of the file to find the column
      names.<br>
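      <br>
      A minimal sketch of picking such a line number in base R,
      assuming every banner line starts with "#" as in the USGS
      example. The sample text and the names sample_path, banner_rows,
      header_line and data_line are hypothetical, made up for
      illustration:<br>
<pre>
# Hypothetical stand-in for a file with a "#" banner above the table.
sample_path <- tempfile(fileext = ".txt")
writeLines(c("# banner line 1",
             "# banner line 2",
             "# banner line 3",
             "a\tb",
             "1\t2",
             "3\t4"), sample_path)

# Count only the leading comment lines: cumprod() drops to 0 at the
# first line that does not start with "#".
first_lines <- readLines(sample_path, n = 100)
banner_rows <- sum(cumprod(grepl("^#", first_lines)))

header_line <- banner_rows + 1   # line holding the column names
data_line   <- banner_rows + 2   # first line below the column names
</pre>
      Under that assumption, any line from data_line up to the end of
      the table lies within the data block, which is where autostart
      needs to point.<br>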
<br>
Matthew<br>
<br>
On 05/08/13 20:52, jim holtman wrote:<br>
</div>
<blockquote
cite="mid:CAAxdm-7mYKibepgzH3YVvoPqpyNqRnVdyUoRhy4WydjJmVngFg@mail.gmail.com"
type="cite">
<div dir="ltr">Here is what I would do. Read in the file, delete
the comments, write it back out and then process it.
<div><br>
</div>
<div>
<div>> myFile <- tempfile() # temp file</div>
<div>> input <- readLines('/temp/dv.txt') # this is a
copy of the data you posted</div>
<div>> # remove comments</div>
<div>> input <- input[!grepl("^#", input)]</div>
<div>> require(data.table)</div>
<div>Loading required package: data.table</div>
<div>data.table 1.8.8 For help type: help("data.table")</div>
<div>> writeLines(input, myFile)</div>
<div>> dv <- fread(myFile)</div>
<div><br>
</div>
<div>> str(dv)</div>
<div>Classes ‘data.table’ and 'data.frame': 367 obs. of 21
variables:</div>
<div> $ agency_cd : chr "5s" "USGS" "USGS" "USGS" ...</div>
<div> $ site_no : chr "15s" "02169570" "02169570"
"02169570" ...</div>
<div> $ datetime : chr "20d" "2012-08-04"
"2012-08-05" "2012-08-06" ...</div>
<div> $ 04_00095_00001 : chr "14n" "" "" "" ...</div>
<div> $ 04_00095_00001_cd: chr "10s" "" "" "" ...</div>
<div> $ 04_00095_00002 : chr "14n" "" "" "" ...</div>
<div> $ 04_00095_00002_cd: chr "10s" "" "" "" ...</div>
<div> $ 04_00095_00003 : chr "14n" "" "" "" ...</div>
<div> $ 04_00095_00003_cd: chr "10s" "" "" "" ...</div>
<div> $ 05_00065_00001 : chr "14n" "2.10" "1.71" "1.77" ...</div>
<div> $ 05_00065_00001_cd: chr "10s" "A" "A" "A" ...</div>
<div> $ 05_00065_00002 : chr "14n" "1.71" "1.56" "1.57" ...</div>
<div> $ 05_00065_00002_cd: chr "10s" "A" "A" "A" ...</div>
<div> $ 05_00065_00003 : chr "14n" "1.89" "1.62" "1.63" ...</div>
<div> $ 05_00065_00003_cd: chr "10s" "A" "A" "A" ...</div>
<div> $ 15_00060_00001 : chr "14n" "52" "33" "36" ...</div>
<div> $ 15_00060_00001_cd: chr "10s" "A" "A" "A" ...</div>
<div> $ 15_00060_00002 : chr "14n" "33" "27" "27" ...</div>
<div> $ 15_00060_00002_cd: chr "10s" "A" "A" "A" ...</div>
<div> $ 15_00060_00003 : chr "14n" "42" "29" "30" ...</div>
<div> $ 15_00060_00003_cd: chr "10s" "A" "A" "A" ...</div>
<div> - attr(*, ".internal.selfref")=<externalptr> </div>
</div>
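        <div>In the str() output above every column is chr, apparently
          because the row directly below the column names ("5s", "15s",
          "20d", ...) is read as data. A minimal base R sketch of one
          way to handle that, assuming tab-separated fields; the sample
          rows, the column name gage_ht and the variable names are
          hypothetical:</div>
<pre>
# Hypothetical miniature of an RDB-style file: "#" banner, column
# names, a row of format codes, then the data rows.
rdb_text <- c(
  "# banner line 1",
  "# banner line 2",
  "agency_cd\tsite_no\tdatetime\tgage_ht",
  "5s\t15s\t20d\t14n",
  "USGS\t02169570\t2012-08-04\t2.10",
  "USGS\t02169570\t2012-08-05\t1.71"
)
rdb_path <- tempfile(fileext = ".rdb")
writeLines(rdb_text, rdb_path)

# read.delim() skips the "#" lines itself; read every field as
# character so the format-code row cannot distort the column types.
all_chr <- read.delim(rdb_path, comment.char = "#",
                      colClasses = "character")

# Drop the format-code row, then convert the measurement explicitly.
dv <- all_chr[-1, ]
dv$gage_ht <- as.numeric(dv$gage_ht)
str(dv)
</pre>
        <div>Reading everything as character first also keeps the
          leading zero in site_no.</div>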
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Aug 5, 2013 at 3:38 PM, iembry
<span dir="ltr"><<a moz-do-not-send="true"
href="mailto:iruckaE@mail2world.com" target="_blank">iruckaE@mail2world.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi
            Matthew, the data at this link is in a format similar to
            that of the files I'm processing now:<br>
<a moz-do-not-send="true"
href="http://waterdata.usgs.gov/nwis/dv?cb_00095=on&cb_00065=on&cb_00060=on&format=rdb&period=&begin_date=2012-08-04&end_date=2013-08-04&site_no=02169570&referred_module=sw"
target="_blank">http://waterdata.usgs.gov/nwis/dv?cb_00095=on&cb_00065=on&cb_00060=on&format=rdb&period=&begin_date=2012-08-04&end_date=2013-08-04&site_no=02169570&referred_module=sw</a><br>
<br>
            Both file formats begin with the comments, followed by the
            column names, then the agency code information, and then
            the actual data.<br>
<br>
            The .rdb text files vary in length, from a few hundred
            lines to over 20,000 lines. I am given the files that I am
            processing.<br>
<br>
Thank you.<br>
<br>
Irucka<br>
<br>
--<br>
View this message in context: <a moz-do-not-send="true"
href="http://r.789695.n4.nabble.com/data-table-on-existing-data-frame-list-tp4673142p4673181.html"
target="_blank">http://r.789695.n4.nabble.com/data-table-on-existing-data-frame-list-tp4673142p4673181.html</a><br>
Sent from the datatable-help mailing list archive at
Nabble.com.<br>
_______________________________________________<br>
datatable-help mailing list<br>
<a moz-do-not-send="true"
href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
<a moz-do-not-send="true"
href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help"
target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Jim Holtman<br>
Data Munger Guru<br>
<br>
What is the problem that you are trying to solve?<br>
Tell me what you want to do, not how you want to do it.
</div>
</blockquote>
<br>
</body>
</html>