<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px; border-collapse: collapse; ">Hello,<br><br>I have a data table called dt (created using data.table) in which each student can have multiple<br>records.<br>
<br>coursecode student_id<br>---------------- ----------------<br>NA 1<br>NA 1<br>NA 1<br>.... 1<br>.... 1<br>NA 2<br>101 2<br>
102 2<br>NA 2<br>103 2<br><br>I am trying to group by student id and concatenate the coursecode<br>strings across each student's records. The value is mostly NA, but it can also be a real<br>
course code<br>(because the real-life data is messy, coursecode was not always entered).<br>There are 999999 records.<br><br>So, I thought I would get results like<br><br>1 NA NA NA .....<br>2 NA 101 102 NA 103 ....<br><br>However, as seen below, it returns a result with 999999 rows<br>
and it fails to concatenate the coursecode values.<br><br>> codes <- dt[,paste(coursecode),by=student_id]<br>> codes<br> student_id V1<br> [1,] 1 NA<br> [2,] 1 NA<br> [3,] 1 NA<br> [4,] 1 NA<br>
[5,] 1 NA<br> [6,] 1 NA<br> [7,] 1 NA<br> [8,] 1 NA<br> [9,] 1 NA<br>[10,] 1 NA<br>First 10 rows of 999999 printed.<br><br>If I repeat the same example for a numeric attribute and use some math<br>
aggregation functions such as sum, mean, etc., then the number of rows<br>returned is correct: it equals the number of students.<br><br>I was wondering if the problem is with the NAs or with the use of paste<br>
as the aggregation function. I can alternatively use RMySQL with MySQL<br>to concatenate those strings but I would like to use data.table if<br>possible.<br><br>Thanks in advance,<br><font color="#888888"><br>Steve</font></span>
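
For reference, the behaviour described above can be reproduced on a small table. This is a minimal sketch, not a confirmed diagnosis: the toy values and the result column name `courses` are assumptions made for illustration. It relies on the fact that base R's `paste()` is vectorised and returns one string per input element unless its `collapse` argument is given, which would explain why every record comes back as its own row.

```r
library(data.table)

# Toy data mirroring the table in the post (values are illustrative only)
dt <- data.table(
  student_id = c(1, 1, 1, 2, 2, 2, 2, 2),
  coursecode = c(NA, NA, NA, NA, 101, 102, NA, 103)
)

# Without `collapse`, paste() returns one string per element, so each
# group still contributes as many rows as it has records.
per_row <- dt[, paste(coursecode), by = student_id]

# With `collapse`, paste() joins a group's values into a single string,
# giving one row per student. NA values appear as the text "NA".
codes <- dt[, list(courses = paste(coursecode, collapse = " ")), by = student_id]
print(codes)
```

If the NA entries should be left out of the concatenated string, one option is to filter them inside the call, for example `paste(coursecode[!is.na(coursecode)], collapse = " ")`.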