Judging from the unusual silence I'm guessing that this doesn't have an obvious solution. I can't provide the data, and in my simulated data I don't get the same error. <div><br></div><div>I'll do some more testing today and see if I can isolate the problem. </div>
<div><br></div><div>I have a suspicion about the cause, and I'll test that. The problem seems related to one particularly messy text field. I'd bet that some combination of characters is causing the problem, or that some of the values are too long. </div>
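<div><br></div><div>A minimal sketch of how that check might look, assuming the messy field is a plain character column. The helper name <font face="courier new, monospace">check_text_field</font> and the 1000-byte cutoff are placeholders for illustration, not something from the session quoted below:</div>

```r
# Hypothetical helper (not part of the original session): summarise the
# properties of a character vector that might plausibly cause trouble on save.
check_text_field <- function(x, max_bytes = 1000L) {
  x <- as.character(x)
  missing <- is.na(x)

  # Length in bytes, which is what matters for the size on disk.
  n_bytes <- nchar(x, type = "bytes")
  n_bytes[missing] <- NA_integer_

  # TRUE where a string contains at least one byte outside 0x01-0x7F.
  non_ascii <- grepl("[^\\x01-\\x7F]", x, perl = TRUE, useBytes = TRUE)
  non_ascii[missing] <- FALSE

  list(
    n           = length(x),
    n_missing   = sum(missing),
    n_non_ascii = sum(non_ascii),
    n_too_long  = sum(n_bytes > max_bytes, na.rm = TRUE),
    longest     = if (all(missing)) NA_integer_ else max(n_bytes, na.rm = TRUE),
    encodings   = table(Encoding(x))
  )
}
```

<div>Something like <font face="courier new, monospace">check_text_field(Large$char3)</font> would then show whether the field holds non-ASCII bytes or unusually long strings, which is only a first guess at what might matter here.</div>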
<div><br></div><div>I could split the file and see if each part blows up in size when I save, in order to isolate the problem. </div><div><br></div><div>Thanks, and my apologies for being unable to send a good example. <span></span><br>
<br>On Thursday, October 18, 2012, Gene Leynes wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Ok, here is my very lengthy reply with lots of diagnostics. <div>
<br></div><div><br></div><div><div><font face="courier new, monospace">> </font></div><div><font face="courier new, monospace">> ## Clear the workspace</font></div>
<div><font face="courier new, monospace">> rm(list=ls())</font></div><div><font face="courier new, monospace">> </font></div><div><font face="courier new, monospace">> ## I use a function called "loader" to load single data objects</font></div>
<div><font face="courier new, monospace">> if(!require('geneorama')){</font></div><div><font face="courier new, monospace">+ source('https://raw.github.com/geneorama/geneorama/master/R/loader.R')</font></div>
<div><font face="courier new, monospace">+ cat('loading function \"loader\"')</font></div><div><font face="courier new, monospace">+ }</font></div><div><font face="courier new, monospace">> </font></div>
<div><font face="courier new, monospace">> ## Load the data</font></div><div><font face="courier new, monospace">> Small = loader('test0')</font></div><div><font face="courier new, monospace">> Large = loader('test1')</font></div>
<div><font face="courier new, monospace">> </font></div><div><font face="courier new, monospace">> ## The two files will be different because their order is different</font></div><div><font face="courier new, monospace">> str(Small)</font></div>
<div><font face="courier new, monospace">Classes ‘data.table’ and 'data.frame':<span style="white-space:pre-wrap"> </span>3103314 obs. of 42 variables:</font></div><div><font face="courier new, monospace"> $ index : int 1 2 3 4 5 6 7 8 9 10 ...</font></div>
<div><font face="courier new, monospace"> $ char1 : chr "http://conradhotels3.hilton.com" "http://conradhotels3.hilton.com" "http://conradhotels3.hilton.com" "http://conradhotels3.hilton.com" ...</font></div>
<div><font face="courier new, monospace"> $ char2 : chr "/en/index.html" "/en/index.html" "/en/index.html" "/en/index.html" ...</font></div><div><font face="courier new, monospace"> $ char3 : chr "" "" "" "" ...</font></div>
<div><font face="courier new, monospace"> $ int1 : int 44903 44903 44903 44903 44903 44903 44903 44903 44903 44903 ...</font></div><div><font face="courier new, monospace"> $ int2 : int 411 411 254 254 336 336 118 118 386 386 ...</font></div>
<div><font face="courier new, monospace"> $ char4 : chr "2012-05-09 20:17:40.587" "2012-05-09 21:17:54.427" "2012-05-09 20:10:49.560" "2012-05-09 21:11:05.107" ...</font></div><div>
<font face="courier new, monospace"> $ int3 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int4 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int5 : int 69 69 69 69 69 69 69 68 68 68 ...</font></div>
<div><font face="courier new, monospace"> $ int6 : int 68 68 68 68 68 68 68 67 67 67 ...</font></div><div><font face="courier new, monospace"> $ int7 : int 35 35 37 35 35 35 33 38 38 40 ...</font></div><div><font face="courier new, monospace"> $ int8 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int9 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int10 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int11 : int 1 1 1 1 1 1 1 1 1 1 ...</font></div>
<div><font face="courier new, monospace"> $ int12 : int 334830 334847 335102 334838 334836 342687 334521 318626 318578 326800 ...</font></div><div><font face="courier new, monospace"> $ int13 : int 36 36 37 36 36 36 35 38 37 39 ...</font></div>
<div><font face="courier new, monospace"> $ int14 : int 44 44 49 47 45 45 45 46 45 48 ...</font></div><div><font face="courier new, monospace"> $ char5 : chr "" "" "" "" ...</font></div>
<div><font face="courier new, monospace"> $ int15 : int NA NA NA NA NA NA NA NA NA NA ...</font></div><div><font face="courier new, monospace"> $ int16 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int17 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int18 : int 2 2 2 2 2 2 2 2 2 2 ...</font></div><div><font face="courier new, monospace"> $ int19 : int 1381 1152 424 3728 1772 921 385 725 401 314 ...</font></div><div><font face="courier new, monospace"> $ int20 : int 36 36 37 36 36 36 35 38 37 39 ...</font></div>
<div><font face="courier new, monospace"> $ int21 : int 2199 2201 1492 1448 2559 2529 1084 1432 1876 1984 ...</font></div><div><font face="courier new, monospace"> $ int22 : int 44 44 49 47 45 45 45 46 45 48 ...</font></div>
<div><font face="courier new, monospace"> $ int23 : int 2203 2188 1199 1162 2324 2346 821 897 1386 1189 ...</font></div><div><font face="courier new, monospace"> $ int24 : int 13 13 14 13 13 13 12 13 13 14 ...</font></div>
<div><font face="courier new, monospace"> $ int25 : int 5166 5761 3755 3794 5614 7779 2830 3971 4637 5871 ...</font></div><div><font face="courier new, monospace"> $ int26 : int 103 103 105 103 103 103 101 105 105 107 ...</font></div>
<div><font face="courier new, monospace"> $ int27 : int 70 183 159 197 217 165 153 232 92 102 ...</font></div><div><font face="courier new, monospace"> $ int28 : int 103 103 105 103 103 103 101 105 105 107 ...</font></div>
<div><font face="courier new, monospace"> $ int29 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int30 : int 161 146 200 158 150 160 190 161 163 169 ...</font></div><div><font face="courier new, monospace"> $ char6 : chr "Limelight" "Limelight" "Fusepoint/Savvis" "Fusepoint/Savvis" ...</font></div>
<div><font face="courier new, monospace"> $ char7 : chr "Paris" "Paris" "Toronto" "Toronto" ...</font></div><div><font face="courier new, monospace"> $ char8 : chr "-1" "-1" "-1" "-1" ...</font></div>
<div><font face="courier new, monospace"> $ char9 : chr "FRANCE" "FRANCE" "CANADA" "CANADA" ...</font></div><div><font face="courier new, monospace"> $ char10: chr "FR" "FR" "CA" "CA" ...</font></div>
<div><font face="courier new, monospace"></font></div><div><font face="courier new, monospace">> str(Large)</font></div><div><font face="courier new, monospace">Classes ‘data.table’ and 'data.frame':<span style="white-space:pre-wrap"> </span>3103314 obs. of 42 variables:</font></div>
<div><font face="courier new, monospace"> $ index : int 716234 716235 1007651 2679944 1550732 1932010 2879445 1007670 1736006 666363 ...</font></div><div><font face="courier new, monospace"> $ char1 : chr "http://go.compuware.com" "http://go.compuware.com" "http://www.achmeacollectief.nl" "https://db3.notify.windows.com" ...</font></div>
<div><font face="courier new, monospace"> $ char2 : chr "/default.aspx" "/dynaTraceMonitor" "/unilever/" "/ping" ...</font></div><div><font face="courier new, monospace"> $ char3 : chr "?rurl=http://frontline.compuware.com//products/BU/default.aspx" "?url=http%3A%2F%2Fgo.compuware.com%2Fdefault.aspx%3Frurl%3Dhttp%3A%2F%2Ffrontline.compuware.com%2F%2Fproducts%2FBU%2Fdefault.as"| __truncated__ "" "" ...</font></div>
<div><font face="courier new, monospace"> $ int1 : int 2812881 2812881 3149757 4286896 3618836 3861870 4315803 3149760 3779387 2754629 ...</font></div><div><font face="courier new, monospace"> $ int2 : int 133 133 133 133 340 340 326 133 133 340 ...</font></div>
<div><font face="courier new, monospace"> $ char4 : chr "2012-05-09 20:00:00.000" "2012-05-09 20:00:00.000" "2012-05-09 20:00:00.000" "2012-05-09 20:00:00.000" ...</font></div><div>
<font face="courier new, monospace"> $ int3 : int 0 1 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int4 : int 2264 2496 1782 461 1953 1418 641 1207 167 278 ...</font></div><div><font face="courier new, monospace"> $ int5 : int 26 20 6 1 71 64 1 6 1 15 ...</font></div>
<div><font face="courier new, monospace"> $ int6 : int 26 20 6 1 69 64 1 6 1 15 ...</font></div><div><font face="courier new, monospace"> $ int7 : int 2 2 4 0 2 12 0 2 0 0 ...</font></div><div><font face="courier new, monospace"> $ int8 : int 0 0 0 0 2 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int9 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int10 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int11 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int12 : int 392752 417195 43107 0 1419015 1031349 187344 62969 43 428189 ...</font></div><div><font face="courier new, monospace"> $ int13 : int 4 4 5 1 8 22 1 3 1 1 ...</font></div>
<div><font face="courier new, monospace"> $ int14 : int 9 11 8 1 17 38 1 6 1 15 ...</font></div><div><font face="courier new, monospace"> $ char5 : chr "" "" "" "" ...</font></div>
<div><font face="courier new, monospace"> $ int15 : int NA NA NA NA 0 NA NA NA NA 0 ...</font></div><div><font face="courier new, monospace"> $ int16 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int17 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int18 : int 2 28 3 0 0 1 0 1 0 0 ...</font></div><div><font face="courier new, monospace"> $ int19 : int 137 0 136 298 277 255 147 141 137 209 ...</font></div><div><font face="courier new, monospace"> $ int20 : int 4 0 5 1 8 22 1 3 1 1 ...</font></div>
<div><font face="courier new, monospace"> $ int21 : int 945 612 59 22 689 1153 54 29 13 59 ...</font></div><div><font face="courier new, monospace"> $ int22 : int 9 5 8 1 17 38 1 6 1 15 ...</font></div><div><font face="courier new, monospace"> $ int23 : int 0 0 0 118 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int24 : int 0 0 0 1 0 0 0 0 0 0 ...</font></div><div><font face="courier new, monospace"> $ int25 : int 3243 2653 1585 22 3292 3076 64 1043 13 81 ...</font></div><div><font face="courier new, monospace"> $ int26 : int 28 22 10 1 73 76 1 8 1 15 ...</font></div>
<div><font face="courier new, monospace"> $ int27 : int 2060 3365 257 1 3304 1038 376 258 4 80 ...</font></div><div><font face="courier new, monospace"> $ int28 : int 28 22 10 1 73 76 1 8 1 15 ...</font></div><div><font face="courier new, monospace"> $ int29 : int 0 0 0 0 0 0 0 0 0 0 ...</font></div>
<div><font face="courier new, monospace"> $ int30 : int 921 750 203 578 609 1078 234 187 31 140 ...</font></div><div><font face="courier new, monospace"> $ char6 : chr "Interoute" "Interoute" "Interoute" "Interoute" ...</font></div>
<div><font face="courier new, monospace"> $ char7 : chr "Amsterdam" "Amsterdam" "Amsterdam" "Amsterdam" ...</font></div><div><font face="courier new, monospace"> $ char8 : chr "-1" "-1" "-1" "-1" ...</font></div>
<div><font face="courier new, monospace"> $ char9 : chr "NETHERLANDS" "NETHERLANDS" "NETHERLANDS" "NETHERLANDS" ...</font></div><div><font face="courier new, monospace"> $ char10: chr "NL" "NL" "NL" "NL" ...</font></div>
<div><font face="courier new, monospace"> $ char11: chr "NETHERLANDS" "NETHERLANDS" "NETHERLANDS" "NETHERLANDS" ...</font></div><div><font face="courier new, monospace"> - attr(*, ".internal.selfref")=&lt;externalptr&gt; </font></div>
<div><font face="courier new, monospace"> - attr(*, "sorted")= chr "char4"</font></div><div><font face="courier new, monospace">> </font></div><div><font face="courier new, monospace">> ## The difference is shown here</font></div>
<div><font face="courier new, monospace">> mapply(identical, Small, Large)</font></div><div><font face="courier new, monospace"> index char1 char2 char3 int1 int2 char4 int3 int4 int5 int6 int7 int8 int9 </font></div>
<div><font face="courier new, monospace"> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE </font></div><div><font face="courier new, monospace"> int10 int11 int12 int13 int14 char5 int15 int16 int17 int18 int19 int20 int21 int22 </font></div>
<div><font face="courier new, monospace"> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE </font></div><div><font face="courier new, monospace"> int23 int24 int25 int26 int27 int28 int29 int30 char6 char7 char8 char9 char10 char11 </font></div>
<div><font face="courier new, monospace"> FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE </font></div><div><font face="courier new, monospace">> mapply(all.equal, Small, Large)</font></div>
<div><font face="courier new, monospace"> index </font></div><div><font face="courier new, monospace"> "Mean relative difference: 0.6660698" </font></div>
<div><font face="courier new, monospace"> char1 </font></div><div><font face="courier new, monospace"> "3100674 string mismatches" </font></div>
<div><font face="courier new, monospace"> char2 </font></div><div><font face="courier new, monospace"> "2961621 string mismatches" </font></div>
<div><font face="courier new, monospace"> char3 </font></div><div><font face="courier new, monospace"> "1753352 string mismatches" </font></div>
<div><font face="courier new, monospace"> int1 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 0.2945024" </font></div>
<div><font face="courier new, monospace"> int2 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 0.4866954" </font></div>
<div><font face="courier new, monospace"> char4 </font></div><div><font face="courier new, monospace"> "3103308 string mismatches" </font></div>
<div><font face="courier new, monospace"> int3 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.759713" </font></div>
<div><font face="courier new, monospace"> int4 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.408616" </font></div>
<div><font face="courier new, monospace"> int5 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.411817" </font></div>
<div><font face="courier new, monospace"> int6 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.415648" </font></div>
<div><font face="courier new, monospace"> int7 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.705137" </font></div>
<div><font face="courier new, monospace"> int8 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.954795" </font></div>
<div><font face="courier new, monospace"> int9 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.99701" </font></div>
<div><font face="courier new, monospace"> int10 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.995529" </font></div>
<div><font face="courier new, monospace"> int11 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 2" </font></div>
<div><font face="courier new, monospace"> int12 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.479043" </font></div>
<div><font face="courier new, monospace"> int13 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.323619" </font></div>
<div><font face="courier new, monospace"> int14 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.360022" </font></div>
<div><font face="courier new, monospace"> char5 </font></div><div><font face="courier new, monospace"> "1454309 string mismatches" </font></div>
<div><font face="courier new, monospace"> int15 </font></div><div><font face="courier new, monospace">"'is.NA' value mismatch: 2260789 in current 2260789 in target" </font></div>
<div><font face="courier new, monospace"> int16 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.997195" </font></div>
<div><font face="courier new, monospace"> int17 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 2" </font></div>
<div><font face="courier new, monospace"> int18 </font></div><div><font face="courier new, monospace"> "Mean relative difference: 1.799441" </font></div>
<div><font face="courier new, monospace"> int19 </font></div><div><font face="courier new, monospace"> </font></div></div></blockquote></div>