[Blotter-commits] r910 - in pkg/FinancialInstrument: . R man

noreply at r-forge.r-project.org
Thu Jan 12 22:04:34 CET 2012


Author: gsee
Date: 2012-01-12 22:04:33 +0100 (Thu, 12 Jan 2012)
New Revision: 910

Modified:
   pkg/FinancialInstrument/DESCRIPTION
   pkg/FinancialInstrument/R/Tick2Sec.R
   pkg/FinancialInstrument/man/to_secBATV.Rd
Log:
 to_secBATV was sometimes printing duplicate rownames. Thanks Mike Rothman!


Modified: pkg/FinancialInstrument/DESCRIPTION
===================================================================
--- pkg/FinancialInstrument/DESCRIPTION	2012-01-10 19:21:27 UTC (rev 909)
+++ pkg/FinancialInstrument/DESCRIPTION	2012-01-12 21:04:33 UTC (rev 910)
@@ -11,7 +11,7 @@
     meta-data and relationships. Provides support for
     multi-asset class and multi-currency portfolios. Still
     in heavy development.
-Version: 0.10.4
+Version: 0.10.5
 URL: https://r-forge.r-project.org/projects/blotter/
 Date: $Date$
 Depends:

Modified: pkg/FinancialInstrument/R/Tick2Sec.R
===================================================================
--- pkg/FinancialInstrument/R/Tick2Sec.R	2012-01-10 19:21:27 UTC (rev 909)
+++ pkg/FinancialInstrument/R/Tick2Sec.R	2012-01-12 21:04:33 UTC (rev 910)
@@ -14,6 +14,9 @@
 
 #' convert tick data to one-second data
 #'
+#' This is like taking a snapshot of the market at the end of every second, except
+#' the volume over the second is summed.
+#' 
 #' From tick data with columns: \dQuote{Price}, \dQuote{Volume}, \dQuote{Bid.Price},
 #' \dQuote{Bid.Size}, \dQuote{Ask.Price}, \dQuote{Ask.Size}, to data of one second frequency
 #' with columns \dQuote{Bid.Price}, \dQuote{Bid.Size}, \dQuote{Ask.Price}, \dQuote{Ask.Size},
@@ -25,6 +28,11 @@
 #' If there are no trades or bid/ask price updates in a given second, we will not make
 #' a row for that timestamp.  If there were no trades, but the bid or ask
 #' price changed, then we _will_ have a row but the Volume and Trade.Price will be NA.  
+#'
+#' If there are multiple trades in the same second, Volume will be the sum of the volume, 
+#' but only the last trade price in that second will be printed. Similarly, if there
+#' is a trade, and then later in the same second, there is a bid/ask update, the last
+#' Bid/Ask Price/Size will be used.
 #' 
 #' @param x the xts series to convert to 1 minute BATMV
 #' @return an xts object of 1 second frequency
@@ -62,16 +70,41 @@
     xx <- cbind(Bi(xx), As(xx), ClVo, all=TRUE)
     xx[, 1:4] <- na.locf(xx[, 1:4])
     colnames(xx) <- c("Bid.Price", "Bid.Size", "Ask.Price", "Ask.Size", "Trade.Price", "Volume")
+
     #if volume is zero, and all other rows are unchanged, delete that row 
-    out <- xx
-    v <- out[, 6]
+    xxx <- xx
+    v <- xxx[, 6]
     v[is.na(v)] <- 0
-    dout <- cbind(diff(out[,c(1, 3)]), v)
-    align.time(out[index(dout[!rowSums(dout) == 0])], 1)
-}
+    dxxx <- cbind(diff(xxx[,c(1, 3)]), v)
+    xxx <- align.time(xxx[index(dxxx[!rowSums(dxxx) == 0])], 1)
 
+    # if, during a second, there is a trade, and then later in the same second there is a quote update,
+    # then we'll have a duplicate timestamp;  Something like
+    #                      Bid.Price Bid.Size Ask.Price Ask.Size Trade.Price Volume
+    #2011-12-06 07:00:02   1249.75       13      1250       40        1250      7
+    #2011-12-06 07:00:02   1249.75       14      1250       15          NA     NA
+    #
+    #We're going to use the non-NA Trade.Price/Volume and the last Bid.Price/Size Ask.Price/Size
+    # A duplicate index should only have 2 rows, and the second row should always be the one that has NAs
+    # so, a simple na.locf should be fine
+    dupidx <- index(xxx)[duplicated(index(xxx))] # indexes of duplicates
+    tmp <- xxx[dupidx] # data at duplicate rownames
 
+    # make sure that first row of each duplicate is not NA
+    firstDupes <- tmp[seq(1, nrow(tmp), 2),] 
+    if (any(is.na(firstDupes)))
+        warning(paste("NA in first row of dupe is unexpected; First offense at ", head(firstDupes[is.na(firstDupes)], 1)))
+    
+    # Fill forward Trade.Price and Volume, then remove first row of duplicate
+    tmp <- na.locf(tmp)
+    tmp <- tmp[seq(2, nrow(tmp), 2),] # only use 2nd row of each duplicate
+    nodupe <- xxx[!index(xxx) %in% dupidx]
+    out <- rbind(nodupe, tmp)
+    out
+} 
 
+
+
 #' Convert several files from tick to 1 second
 #'
 #' @param getdir directory to get tick data from
@@ -114,6 +147,8 @@
                             sfl <- paste(sdir, fl, sep="/")
                             save(list = xsym, file = sfl, envir = tmpenv)
                             rm(xsym, pos=tmpenv)
+                            rm(list='x')
+                            gc()
                             fl
                         }
                     }

Modified: pkg/FinancialInstrument/man/to_secBATV.Rd
===================================================================
--- pkg/FinancialInstrument/man/to_secBATV.Rd	2012-01-10 19:21:27 UTC (rev 909)
+++ pkg/FinancialInstrument/man/to_secBATV.Rd	2012-01-12 21:04:33 UTC (rev 910)
@@ -11,14 +11,18 @@
   an xts object of 1 second frequency
 }
 \description{
+  This is like taking a snapshot of the market at the end
+  of every second, except the volume over the second is
+  summed.
+}
+\details{
   From tick data with columns: \dQuote{Price},
   \dQuote{Volume}, \dQuote{Bid.Price}, \dQuote{Bid.Size},
   \dQuote{Ask.Price}, \dQuote{Ask.Size}, to data of one
   second frequency with columns \dQuote{Bid.Price},
   \dQuote{Bid.Size}, \dQuote{Ask.Price}, \dQuote{Ask.Size},
   \dQuote{Trade.Price}, and \dQuote{Volume}
-}
-\details{
+
   The primary purpose of this function is to reduce the
   amount of data on disk so that it will take less time to
   load the data into memory.
@@ -28,6 +32,12 @@
   If there were no trades, but the bid or ask price
   changed, then we _will_ have a row but the Volume and
   Trade.Price will be NA.
+
+  If there are multiple trades in the same second, Volume
+  will be the sum of the volume, but only the last trade
+  price in that second will be printed. Similarly, if there
+  is a trade, and then later in the same second, there is a
+  bid/ask update, the last Bid/Ask Price/Size will be used.
 }
 \examples{
 \dontrun{

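The per-second rule described in the documentation above (volume summed over the second, only the last trade price kept) can be illustrated with a minimal base-R sketch. This is not the code of to_secBATV, which operates on xts objects; the `ticks` data frame, its values, and the `sec` bucket variable are hypothetical and exist only for illustration.

```r
# Hypothetical tick data: five trades spread over two clock seconds.
t0 <- as.POSIXct("2011-12-06 07:00:00", tz = "UTC")
ticks <- data.frame(
  time   = t0 + c(0.2, 0.7, 1.1, 1.5, 1.9),
  Price  = c(1250.00, 1250.25, 1250.00, 1249.75, 1250.00),
  Volume = c(2, 5, 1, 3, 4)
)

# Bucket each tick by the whole second it falls in.
sec <- floor(as.numeric(ticks$time))

# Volume is summed within each second ...
volume <- tapply(ticks$Volume, sec, sum)
# ... while only the last trade price of the second is kept.
trade_price <- tapply(ticks$Price, sec, function(p) p[length(p)])

per_second <- data.frame(
  time        = as.POSIXct(as.numeric(names(volume)),
                           origin = "1970-01-01", tz = "UTC"),
  Trade.Price = as.numeric(trade_price),
  Volume      = as.numeric(volume)
)
print(per_second)
```

The sketch covers trades only. Carrying the last bid/ask forward, as the documentation describes, would be a separate step.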

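The duplicate-timestamp case from the comment in Tick2Sec.R (a trade row followed by a quote-only row in the same second) can be reproduced on a tiny series. The following is a sketch under stated assumptions, not the committed code: it assumes the xts package is installed, the sample rows and the names `dup_times` and `in_dup` are hypothetical, it selects rows with a logical vector rather than by timestamp, and the guard for the no-duplicates case is an addition of this sketch.

```r
library(xts)  # attaches zoo as well, which supplies the na.locf() generic

# Hypothetical data: the second 07:00:02 appears twice, first with a trade,
# then with a quote update that has NA Trade.Price and Volume.
idx <- as.POSIXct(c("2011-12-06 07:00:01",
                    "2011-12-06 07:00:02",
                    "2011-12-06 07:00:02"), tz = "UTC")
xxx <- xts(cbind(Bid.Price   = c(1249.50, 1249.75, 1249.75),
                 Bid.Size    = c(10, 13, 14),
                 Ask.Price   = c(1250, 1250, 1250),
                 Ask.Size    = c(20, 40, 15),
                 Trade.Price = c(1249.75, 1250, NA),
                 Volume      = c(3, 7, NA)),
           order.by = idx)

dup_times <- unique(index(xxx)[duplicated(index(xxx))])
in_dup <- index(xxx) %in% dup_times   # TRUE for every row of a duplicated second

out <- xxx
if (any(in_dup)) {                    # guard: nothing to do without duplicates
  tmp <- na.locf(xxx[in_dup])         # fill Trade.Price/Volume from the trade row
  tmp <- tmp[seq(2, nrow(tmp), 2), ]  # keep the 2nd row of each pair (latest quote)
  out <- rbind(xxx[!in_dup], tmp)     # rbind on xts objects re-sorts by time
}
print(out)
```

Like the logic in the diff, this assumes each duplicated timestamp has exactly two rows and that the first of them carries the trade. Without the guard, `seq(2, 0, 2)` would raise an error in R when no duplicates are present.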
