<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html><body>
<p> </p>
<p>Thanks. I have added that (the potential 1970 issue) to statquant's feature request to follow up:</p>
<p>https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2582&group_id=240&atid=978</p>
<p> </p>
<p>On 26.02.2013 00:46, Alexander Chernyakov wrote:</p>
<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px; width:100%">
<p>Regarding fasttime: my understanding is that it only works for dates after 1970.<br /><br /></p>
<div class="gmail_quote">On Mon, Feb 25, 2013 at 7:41 PM, <span><<a href="mailto:datatable-help-request@lists.r-forge.r-project.org">datatable-help-request@lists.r-forge.r-project.org</a>></span> wrote:<br />
<blockquote class="gmail_quote" style="margin: 0 0 0 .8ex; border-left: 1px #ccc solid; padding-left: 1ex;">Today's Topics:<br /><br /> 1. About adding fastmatch and fasttime to data.table (stat quant)<br /> 2. Potential bug with sorting/summarizing by POSIXct and logical<br /> column (Victor Kryukov)<br /> 3. Re: About adding fastmatch and fasttime to data.table<br /> (Matthew Dowle)<br /> 4. 
Re: Potential bug with sorting/summarizing by POSIXct and<br /> logical column (Michael Nelson)<br /><br /><br /> ----------------------------------------------------------------------<br /><br /> Message: 1<br /> Date: Mon, 25 Feb 2013 19:40:35 +0100<br /> From: stat quant <<a href="mailto:statquant@outlook.com">statquant@outlook.com</a>><br /> To: <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br /> Subject: [datatable-help] About adding fastmatch and fasttime to<br /> data.table<br /> Message-ID:<br /> <<a href="mailto:CAJJHHA9qL8hURXF0%2B8OnPaD1t7Y5csoOLX7qDKNUqXc1XpmGCA@mail.gmail.com">CAJJHHA9qL8hURXF0+8OnPaD1t7Y5csoOLX7qDKNUqXc1XpmGCA@mail.gmail.com</a>><br /> Content-Type: text/plain; charset="iso-8859-1"<br /><br /> Hello list,<br /><br /> Looking at fastmatch and fasttime, I realized that each of those packages consists<br /> solely of one C file.<br /> We spoke about the possibility of adding those to data.table. I tried to<br /> contact S. Urbanek without success, so I do not have feedback from his<br /> side.<br /> Using fastPOSIXct provides a huge gain when one has to load files with<br /> datetimes. On my laptop, using data.table:::fread, I realized that most of<br /> the time is spent casting datetimes to POSIXct (I have several columns).<br /><br /> Looking at fasttime, you can see a pretty good improvement (a factor of 15):<br /><br /> R) ts R) system.time(a<br /> user system elapsed<br /> 6.49 0.04 6.57<br /> R) system.time(b<br /> user system elapsed<br /> 0.40 0.00 0.41<br /><br /> When colClasses is implemented in fread, may I suggest allowing<br /> fasttime as an option?<br /> Concerning fastmatch, the vignette already shows some nice benchmarks. I<br /> tend to do a lot of selects based on string columns; I am not sure whether this is<br /> the case for most of us.<br /><br /> My 0.002 cents<br /> Cheers<br /> -------------- next part --------------<br /> An HTML attachment was scrubbed...<br
/> URL: <<a href="http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130225/f45e5d57/attachment-0001.html">http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130225/f45e5d57/attachment-0001.html</a>><br /><br /> ------------------------------<br /><br /> Message: 2<br /> Date: Mon, 25 Feb 2013 14:26:28 -0800<br /> From: Victor Kryukov <<a href="mailto:victor.kryukov@gmail.com">victor.kryukov@gmail.com</a>><br /> To: <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br /> Subject: [datatable-help] Potential bug with sorting/summarizing by<br /> POSIXct and logical column<br /> Message-ID:<br /> 1X+n5suowA@mail.gmail.com><br /> Content-Type: text/plain; charset="iso-8859-1"<br /><br /> Hello,<br /><br /> I've encountered what looks like a bug while sorting by POSIXct and logical<br /> column, which may or may not be related to the following bug:<br /><br /><a href="https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975">https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975</a><br /><br /> Here are all the details:<br /><a href="http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns">http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns</a><br /><br /> Here is the test case:<br /><br /> # First some data<br /> data &lt;- data.table(structure(list(month = structure(c(1356998400, 1356998400, 1356998400,<br /> 1359676800, 1354320000, 1359676800, 1359676800,<br /> 1356998400, 1356998400,<br /> 1354320000, 1354320000, 1354320000, 1359676800,<br /> 1359676800, 1359676800,<br /> 1356998400, 1359676800, 1359676800, 1356998400,<br /> 1359676800, 1359676800,<br /> 1359676800, 1359676800, 1354320000, 1354320000),<br /> class = c("POSIXct",<br /><br /> "POSIXt"), tzone = "UTC"),<br /> portal = c(TRUE, TRUE, FALSE, TRUE,<br /> TRUE, TRUE, TRUE, TRUE, TRUE, 
FALSE, TRUE, FALSE, TRUE,<br /> FALSE,<br /> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,<br /> TRUE, TRUE<br /> ),<br /> satisfaction = c(10L, 10L, 10L, 9L, 10L, 10L, 9L, 10L, 10L,<br /> 9L, 2L, 8L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, 10L,<br /> 9L, 10L, 9L,<br /> 10L, 10L)),<br /> .Names = c("month", "portal", "satisfaction"),<br /> row.names = c(NA, -25L), class = "data.frame"))<br /><br /> # Summarizing by month, portal with tapply works:<br /><br /> > tapply(data$satisfaction, list(data$month, data$portal), mean)<br /> FALSE TRUE<br /> 2012-12-01 8.5 8.000000<br /> 2013-01-01 10.0 10.000000<br /> 2013-02-01 9.0 9.545455<br /><br /> # Summarizing with 'by' argument of data.table does not:<br /><br /> > data[, mean(satisfaction), by = 'month,portal']<br /> > data[, mean(satisfaction), by = list(month, portal)]<br /> month portal V1<br /> 1: 2013-01-01 FALSE 10.000000<br /> 2: 2013-02-01 TRUE 9.000000<br /> 3: 2013-01-01 TRUE 10.000000<br /> 4: 2012-12-01 FALSE 8.500000<br /> 5: 2012-12-01 TRUE 7.333333<br /> 6: 2013-02-01 TRUE 9.666667<br /> 7: 2013-02-01 FALSE 9.000000<br /> 8: 2012-12-01 TRUE 10.000000<br /><br /> # Summarizing only this year's data works:<br /> data[month >= ymd(20130101), mean(satisfaction), by = 'month,portal']<br /> month portal V1<br /> 1: 2013-01-01 TRUE 10.000000<br /> 2: 2013-01-01 FALSE 10.000000<br /> 3: 2013-02-01 TRUE 9.545455<br /> 4: 2013-02-01 FALSE 9.000000<br /><br /> Yours Sincerely,<br /> Victor Kryukov<br /> -------------- next part --------------<br /> An HTML attachment was scrubbed...<br /> URL: <<a href="http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130225/45b99e3e/attachment-0001.html">http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130225/45b99e3e/attachment-0001.html</a>><br /><br /> ------------------------------<br /><br /> Message: 3<br /> Date: Tue, 26 Feb 2013 00:39:09 +0000<br /> From: Matthew Dowle <<a
href="mailto:mdowle@mdowle.plus.com">mdowle@mdowle.plus.com</a>><br /> To: <<a href="mailto:statquant@outlook.com">statquant@outlook.com</a>><br /> Cc: <a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br /> Subject: Re: [datatable-help] About adding fastmatch and fasttime to<br /> data.table<br /> Message-ID: <<a href="mailto:aed96221d7d28ff8d77ea8823135b49a@imap.plus.net">aed96221d7d28ff8d77ea8823135b49a@imap.plus.net</a>><br /> Content-Type: text/plain; charset="utf-8"<br /><br /><br /><br /> Hi,<br /><br /> This sounds like a great idea. I don't know why Simon U didn't<br /> reply; that may depend on the way you asked,<br /> whether he is on holiday at the moment, his reaction to the precise<br /> wording of the email you wrote, or some other factor. It is difficult to<br /> tell! But we don't need to wait for him or for you: this is open<br /> source. You have got much further than I have, so if you'd like to add<br /> this, please go ahead and make progress. You're very welcome to join the<br /> project and commit directly. 
Or if you can't for some reason please file<br /> as a feature request so it doesn't get forgotten.<br /><br /> Matthew<br /><br /> -------------- next part --------------<br /> An HTML attachment was scrubbed...<br /> URL: <<a href="http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130226/643480c3/attachment-0001.html">http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130226/643480c3/attachment-0001.html</a>><br /><br /> ------------------------------<br /><br /> Message: 4<br /> Date: Tue, 26 Feb 2013 00:40:02 +0000<br /> From: Michael Nelson <<a href="mailto:michael.nelson@sydney.edu.au">michael.nelson@sydney.edu.au</a>><br /> To: "<a
href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a>"<br /> <<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a>><br /> Subject: Re: [datatable-help] Potential bug with sorting/summarizing<br /> by POSIXct and logical column<br /> Message-ID:<br /> <<a href="mailto:6FB5193A6CDCDF499486A833B7AFBDCD5827D4E4@EX-MBX-PRO-04.mcs.usyd.edu.au">6FB5193A6CDCDF499486A833B7AFBDCD5827D4E4@EX-MBX-PRO-04.mcs.usyd.edu.au</a>><br /><br /> Content-Type: text/plain; charset="iso-8859-1"<br /><br /> I can't replicate this problem using data.table 1.8.7 (installed about 3 weeks ago) on<br /> R version 2.15.2 (2012-10-26)<br /> Platform: i386-w64-mingw32/i386 (32-bit)<br /><br /> Michael<br /> -------------- next part --------------<br /> An HTML attachment was scrubbed...<br /> URL: <<a href="http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130226/c1945761/attachment.html">http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20130226/c1945761/attachment.html</a>><br /><br /> ------------------------------<br /><br /> _______________________________________________<br /> datatable-help mailing list<br /><a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br /><a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br /><br /> End of datatable-help Digest, Vol 36, Issue 8<br /> *********************************************</blockquote>
</div>
</blockquote>
<p> </p>
<div> </div>
</body></html>