[datatable-help] Using data.table to run a function on every row

Ricardo Saporta saporta at scarletmail.rutgers.edu
Fri Sep 27 08:37:28 CEST 2013


Hi there,

Try adding `by=id`, as in

   a <- db[(has_url), getUrls(text, id), by=id]

Also, there is no need for "has_url == T".
Instead, use
  (has_url)
if the variable is already logical.  (Otherwise, you are just slowing things
down ;)
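
For illustration, here is a minimal sketch of both suggestions on a toy table. The data, the URL pattern, and this simplified getUrls() are assumptions made up for the example (it uses base R regmatches() rather than str_match_all(), and drops the id argument because by=id already adds that column to the result); it is not your actual code.

```r
library(data.table)

# Hypothetical toy data standing in for the forum-post table `db`
db <- data.table(
  id      = 1:3,
  text    = c("see http://a.example and http://b.example",
              "no links here",
              "visit http://c.example"),
  has_url = c(TRUE, FALSE, TRUE)
)

# Assumed pattern, for illustration only
url_pattern <- "http://[^[:space:]]+"

# Simplified stand-in for getUrls(): returns one row per URL found in `text`
getUrls <- function(text) {
  data.table(urls = unlist(regmatches(text, gregexpr(url_pattern, text))))
}

# (has_url) subsets on the logical column directly.
# by = id evaluates j once per post, so each call may return a different
# number of rows; the per-group results are stacked and tagged with id.
a <- db[(has_url), getUrls(text), by = id]
print(a)
```

Since j runs once per group, the row counts coming back from each call no longer have to line up with the number of rows going in, which should avoid the "replacement has ... rows" error.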



Ricardo Saporta
Graduate Student, Data Analytics
Rutgers University, New Jersey
e: saporta at rutgers.edu



On Thu, Sep 26, 2013 at 11:16 PM, Stian Håklev <shaklev at gmail.com> wrote:

> I'm trying to run a function on every row fulfilling a certain criterion,
> which returns a data frame - the idea is then to take the list of data
> frames and rbindlist them together for a totally separate data.table. (I'm
> extracting several URL links from each forum post, and tagging them with
> the forum post they came from).
>
> I tried doing this with a data.table
>
> a <- db[has_url == T, getUrls(text, id)]
>
> and get the message
>
> Error in `$<-.data.frame`(`*tmp*`, "id", value = c(1L, 6L, 1L, 2L, 4L,  :
>   replacement has 11007 rows, data has 29787
>
> Because some rows have several URLs... However, I don't care that these
> rowlengths don't match, I still want these rows :) I thought J would just
> let me execute arbitrary R code in the context of the rows as variable
> names, etc.
>
> Here's the function it's running, but that shouldn't be relevant
>
> getUrls <- function(text, id) {
>   matches <- str_match_all(text, url_pattern)
>   a <- data.frame(urls=unlist(matches))
>   a$id <- id
>   a
> }
>
>
> Thanks, and thanks for an amazing package - data.table has made my life so
> much easier. It should be part of base, I think.
> Stian Haklev, University of Toronto
>
> --
> http://reganmian.net/blog -- Random Stuff that Matters
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>