<div dir="ltr"><div class="gmail_default"><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Hey,</div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Thanks for the suggestion, but this didn't work.</div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Method 1: use of data.table / sample</div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><div class="gmail_default">> set.seed(1); size <- 100000000; dt <- data.table::data.table("a"=c(1:size),"b"=rep(letters[1:10],size/10));head(dt);system.time(</div><div class="gmail_default">dt[,c("a","b"):=list(sample(a),sample(b))]</div><div class="gmail_default">);head(dt)</div><div class="gmail_default"> a b</div><div class="gmail_default">1: 1 a</div><div class="gmail_default">2: 2 b</div><div class="gmail_default">3: 3 c</div><div class="gmail_default">4: 4 d</div><div class="gmail_default">5: 5 e</div><div class="gmail_default">6: 6 f</div><div class="gmail_default">user system elapsed </div><div class="gmail_default"> 10.190 0.252 10.456 </div><div class="gmail_default"> a b</div><div class="gmail_default">1: 26550867 a</div><div class="gmail_default">2: 37212390 b</div><div class="gmail_default">3: 57285336 c</div><div class="gmail_default">4: 90820777 e</div><div class="gmail_default">5: 20168193 a</div><div class="gmail_default">6: 89838965 h</div><div><br></div></div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Method 2: use of factor / data.table / sample</div><div class="gmail_default" style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><div class="gmail_default">> set.seed(1); size <- 100000000; dt <- 
data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)));head(dt);system.time(</div><div class="gmail_default"> dt[,c("a","b"):=list(sample(a),sample(b))]</div><div class="gmail_default">);head(dt)</div><div class="gmail_default"> a b</div><div class="gmail_default">1: 1 a</div><div class="gmail_default">2: 2 b</div><div class="gmail_default">3: 3 c</div><div class="gmail_default">4: 4 d</div><div class="gmail_default">5: 5 e</div><div class="gmail_default">6: 6 f</div><div class="gmail_default">user system elapsed </div><div class="gmail_default"> 9.271 0.276 9.559 </div><div class="gmail_default"> a b</div><div class="gmail_default">1: 26550867 a</div><div class="gmail_default">2: 37212390 b</div><div class="gmail_default">3: 57285336 c</div><div class="gmail_default">4: 90820777 e</div><div class="gmail_default">5: 20168193 a</div><div class="gmail_default">6: 89838965 h</div><div><br></div></div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Method 3: use of .Internal / data.table / factor</div><div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">> set.seed(1); size <- 100000000; dt <- data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)));head(dt);system.time(</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"> dt[,c("a","b"):=list(a[.Internal(sample(size, size, FALSE, NULL))],b[.Internal(sample(size, size, FALSE, NULL))])]</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">);head(dt)</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"> a b</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">1: 1 a</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">2: 2 b</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">3: 3 c</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">4: 4 d</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">5: 5 e</div><div 
style="color:rgb(0,51,51);font-family:tahoma,sans-serif">6: 6 f</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">user system elapsed </div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"> 8.786 0.137 8.935 </div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"> a b</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">1: 26550867 a</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">2: 37212390 b</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">3: 57285336 c</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">4: 90820777 e</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">5: 20168193 a</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">6: 89838965 h</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><br></div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Method 4 (thanks for pointing it out, banded08): set / factor / sample</div><div><div><font color="#003333" face="tahoma, sans-serif">> set.seed(1); size <- 100000000; dt <- data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)));head(dt);system.time({</font></div><div><font color="#003333" face="tahoma, sans-serif">set(dt,j="a",value=sample(dt$a));</font></div><div><font color="#003333" face="tahoma, sans-serif">set(dt,j="b",value=sample(dt$b))}</font></div><div><font color="#003333" face="tahoma, sans-serif">);head(dt);</font></div><div><font color="#003333" face="tahoma, sans-serif"> a b</font></div><div><font color="#003333" face="tahoma, sans-serif">1: 1 a</font></div><div><font color="#003333" face="tahoma, sans-serif">2: 2 b</font></div><div><font color="#003333" face="tahoma, sans-serif">3: 3 c</font></div><div><font color="#003333" face="tahoma, sans-serif">4: 4 d</font></div><div><font color="#003333" face="tahoma, sans-serif">5: 5 e</font></div><div><font color="#003333" face="tahoma, sans-serif">6: 6 
f</font></div><div><font color="#003333" face="tahoma, sans-serif">user system elapsed </font></div><div><font color="#003333" face="tahoma, sans-serif"> 8.790 0.204 9.006 </font></div><div><font color="#003333" face="tahoma, sans-serif"> a b</font></div><div><font color="#003333" face="tahoma, sans-serif">1: 26550867 a</font></div><div><font color="#003333" face="tahoma, sans-serif">2: 37212390 b</font></div><div><font color="#003333" face="tahoma, sans-serif">3: 57285336 c</font></div><div><font color="#003333" face="tahoma, sans-serif">4: 90820777 e</font></div><div><font color="#003333" face="tahoma, sans-serif">5: 20168193 a</font></div><div><font color="#003333" face="tahoma, sans-serif">6: 89838965 h</font></div></div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><br></div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Method 5: use of a data.frame</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><div>> set.seed(1); size <- 100000000; dt <- data.frame("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)));head(dt);system.time({</div><div>dt$a <- sample(dt$a);dt$b <- sample(dt$b)</div><div>});head(dt);</div><div> a b</div><div>1 1 a</div><div>2 2 b</div><div>3 3 c</div><div>4 4 d</div><div>5 5 e</div><div>6 6 f</div><div>user system elapsed </div><div> 8.755 0.152 8.921 </div><div> a b</div><div>1 26550867 a</div><div>2 37212390 b</div><div>3 57285336 c</div><div>4 90820777 e</div><div>5 20168193 a</div><div>6 89838965 h</div><div><br></div><div><br></div></div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif">Sadly, data.table does not improve the timing: 
sample is the bottleneck.</div><div style="color:rgb(0,51,51);font-family:tahoma,sans-serif"><br></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-01-05 14:20 GMT+01:00 banded08 <span dir="ltr"><<a href="mailto:david.awam.jansen@gmail.com" target="_blank">david.awam.jansen@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Maybe not the fastest or most efficient, but this should work<br>
<br>
for(ii in 1:dim(dt1)[1]) set(dt1, ii, 1:dim(dt1)[2] ,sample(dt1[ii]))<br>
<br>
<br>
<br>
--<br>
View this message in context: <a href="http://r.789695.n4.nabble.com/Shuffle-row-wise-column-independently-tp4727865p4727871.html" rel="noreferrer" target="_blank">http://r.789695.n4.nabble.com/<wbr>Shuffle-row-wise-column-<wbr>independently-<wbr>tp4727865p4727871.html</a><br>
</blockquote></div><br></div>
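<div dir="ltr">One way to check the conclusion in the reply above (that sample is the bottleneck) is to time the permutation step and the assignment step separately. The following is only a minimal sketch in base R, under these assumptions: <code>n</code>, <code>idx_a</code>, <code>idx_b</code>, <code>t_perm</code> and <code>t_apply</code> are illustrative names that do not appear in the thread, and <code>n</code> is deliberately much smaller than the 1e8 rows used in the benchmarks so that it runs quickly. It relies on the fact that, for a vector longer than one element, <code>sample(x)</code> is equivalent to <code>x[sample.int(length(x))]</code>.</div>

```r
# Minimal sketch (base R only): split a column shuffle into
# (1) generating the permutation and (2) applying it.
# `n` is an assumed, reduced size; the thread used 100000000 rows.
set.seed(1)
n <- 1000000L
df <- data.frame(a = seq_len(n), b = factor(rep(letters[1:10], n / 10)))

# Step 1: draw two independent permutations of the row indices.
t_perm <- system.time({
  idx_a <- sample.int(n)
  idx_b <- sample.int(n)
})

# Step 2: apply each permutation to its own column.
t_apply <- system.time({
  df$a <- df$a[idx_a]
  df$b <- df$b[idx_b]
})

# Report elapsed seconds for each step.
cat("permutation:", t_perm[["elapsed"]], "s\n")
cat("apply:      ", t_apply[["elapsed"]], "s\n")
```

<div dir="ltr">If the first figure dominates, swapping the container (data.frame, data.table with := or set) is unlikely to change the total much, since every method still has to draw the same permutations.</div>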