[Rcpp-commits] r3728 - in pkg/RcppCNPy: . demo inst man src vignettes

noreply at r-forge.r-project.org
Tue Jul 31 04:20:59 CEST 2012


Author: edd
Date: 2012-07-31 04:20:59 +0200 (Tue, 31 Jul 2012)
New Revision: 3728

Modified:
   pkg/RcppCNPy/ChangeLog
   pkg/RcppCNPy/DESCRIPTION
   pkg/RcppCNPy/cleanup
   pkg/RcppCNPy/demo/timings.R
   pkg/RcppCNPy/inst/NEWS.Rd
   pkg/RcppCNPy/man/RcppCNPy-package.Rd
   pkg/RcppCNPy/src/cnpy.h
   pkg/RcppCNPy/src/cnpyMod.cpp
   pkg/RcppCNPy/vignettes/RcppCNPy-intro.Rnw
   pkg/RcppCNPy/vignettes/RcppCNPy-intro.pdf
Log:
 o Release 0.2.0
 o Support for writing .npy.gz
 o Support for not transposing on read
 o Plugged a memory hole in reading .npy{.gz}
 o Documentation updates


Modified: pkg/RcppCNPy/ChangeLog
===================================================================
--- pkg/RcppCNPy/ChangeLog	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/ChangeLog	2012-07-31 02:20:59 UTC (rev 3728)
@@ -1,13 +1,33 @@
+2012-07-30  Dirk Eddelbuettel  <edd at debian.org>
+
+	* DESCRIPTION: Version 0.2.0
+
+	* src/cnpyMod.cpp:
+	  (npySave): Support gzip'ed files for saving
+	  (npyLoad): Support new argument 'dotranspose', plug a memory leak
+
+	* src/cnpy.h (cnpy): Added npy_gzsave() function
+
+	* man/RcppCNPy-package.Rd: Document new gzip-compression
+
+	* demo/timings.R: Use new gzip-compression feature
+
+	* vignettes/RcppCNPy-intro.Rnw: Updated as well
+
+	* inst/NEWS.Rd: Updated as well
+
 2012-07-07  Dirk Eddelbuettel  <edd at debian.org>
 
+	* DESCRIPTION: Version 0.1.0
+
 	* vignettes/RcppCNPy-intro.Rnw: Added vignette documentation
 
 	* demo/timings.R: Added simple timing benchmark demo
 
 2012-07-06  Dirk Eddelbuettel  <edd at debian.org>
 
-	* src/cnpy.h: Include cstdint for int64_t if C++11 has been enabled
-	* src/cnpyMod.cpp: Support integer types if C++11 available
+	* src/cnpy.h: Include cstdint for int64_t if C++0x has been enabled
+	* src/cnpyMod.cpp: Support integer types if C++0x available
 
 	* tests/: Simple set of regression tests added
 

Modified: pkg/RcppCNPy/DESCRIPTION
===================================================================
--- pkg/RcppCNPy/DESCRIPTION	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/DESCRIPTION	2012-07-31 02:20:59 UTC (rev 3728)
@@ -1,16 +1,16 @@
 Package: RcppCNPy
 Type: Package
 Title: Rcpp bindings for NumPy files
-Version: 0.1.0
+Version: 0.2.0
 Date: $Date$
 Author: Dirk Eddelbuettel
 Maintainer: Dirk Eddelbuettel <edd at debian.org>
 Description: This package provides R with access to the cnpy library written
  by Carl Rogers which provides read and write facilities for files created
  with (or for) the NumPy extension for Python.  Vectors and matrices of
- numeric types can be read or written; compressed files can be read as well.
- Support for integer files is available if the package (and Rcpp) are
- compiled with -std=c++11.
+ numeric types can be read from and written to plain as well as compressed
+ files. Support for integer files is available if the package (and Rcpp) have
+ been compiled with -std=c++0x.
 License: GPL (>= 2)
 LazyLoad: yes
 Depends: methods, Rcpp (>= 0.9.13)

Modified: pkg/RcppCNPy/cleanup
===================================================================
--- pkg/RcppCNPy/cleanup	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/cleanup	2012-07-31 02:20:59 UTC (rev 3728)
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-rm -f src/*.o src/*.so 
+rm -f src/*.o src/*.so src/symbols.rds
 
 rm -rf vignettes/auto/ vignettes/*.log vignettes/*.aux vignettes/*.out vignettes/*.tex
 

Modified: pkg/RcppCNPy/demo/timings.R
===================================================================
--- pkg/RcppCNPy/demo/timings.R	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/demo/timings.R	2012-07-31 02:20:59 UTC (rev 3728)
@@ -12,13 +12,11 @@
 txtfile <- tempfile(fileext=".txt")
 write.table(M, file=txtfile)
 
-pyfile <- tempfile(fileext=".py")
+pyfile <- tempfile(fileext=".npy")
 npySave(pyfile, M)
 
-pygzfile <- tempfile(fileext=".py")
+pygzfile <- tempfile(fileext=".npy.gz")
 npySave(pygzfile, M)
-system(paste("gzip -9", pygzfile))
-pygzfile <- paste(pygzfile, ".gz", sep="")
 
 print(do.call(rbind, (lapply(c(txtfile, pyfile, pygzfile),
                              function(f) file.info(f)["size"]))))

Modified: pkg/RcppCNPy/inst/NEWS.Rd
===================================================================
--- pkg/RcppCNPy/inst/NEWS.Rd	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/inst/NEWS.Rd	2012-07-31 02:20:59 UTC (rev 3728)
@@ -2,6 +2,16 @@
 \title{News for Package \pkg{RcppCNPy}}
 \newcommand{\cpkg}{\href{http://CRAN.R-project.org/package=#1}{\pkg{#1}}}
 
+\section{Changes in version 0.2.0 (2012-07-30)}{
+  \itemize{
+    \item Support for writing of \code{gzip}-ed \code{npy} files has
+    been added.
+    \item A new option \code{dotranspose} has been added to
+    \code{npyLoad()} to support data sets that do not need to be
+    transposed to be used in R.
+    \item A memory leak in reading files has been corrected.
+  }
+}
 \section{Changes in version 0.1.0 (2012-07-07)}{
   \itemize{
     \item Added automatic use of transpose to automagically account for

Modified: pkg/RcppCNPy/man/RcppCNPy-package.Rd
===================================================================
--- pkg/RcppCNPy/man/RcppCNPy-package.Rd	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/man/RcppCNPy-package.Rd	2012-07-31 02:20:59 UTC (rev 3728)
@@ -17,19 +17,30 @@
   package, as well \pkg{Rcpp} are recompiled using the \code{-std=c++11}
   flag.
 
-  Files with \code{gzip} compression can be transparently read as well.
+  Files with \code{gzip} compression can be transparently read and
+  written as well.
 }
 \usage{
-  npyLoad(filename, type="numeric")
+  npyLoad(filename, type="numeric", dotranspose=TRUE)
   npySave(filename, object, mode="w")
 }
 \arguments{
-  \item{filename}{string with (path and) filename for a \code{npy} object file}
-  \item{type}{string with type 'numeric' (default) or 'integer'}
+  \item{filename}{string with (path and) filename for a \code{npy}
+    object file. If the string ends with \code{.gz}, compressed files
+    can be read or written.}
+  \item{type}{string with type 'numeric' (default) or 'integer'. Integer
+    support is available only if Rcpp and RcppCNPy have been compiled
+    with the \code{-std=c++0x} option as the required \code{int64_t}
+    types are not available otherwise.}
   \item{object}{an R object, currently limited to a vector or matrix of
     either integer or numeric type}
+  \item{dotranspose}{a boolean variable indicating whether a
+  two-dimensional object should be transposed after reading; the default is \code{TRUE}}
   \item{mode}{a one-character string indicating whether files are
-    appended to ("a") or written ("w", the default)}
+    appended to ("a") or written ("w", the default). When writing a
+    \code{gzip}-ed file, this option is not supported, as such files can
+    only be (over-)written, not appended to.
+  }
 }
 \details{
   \tabular{ll}{
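
The `dotranspose` argument documented above exists because NumPy stores arrays in C (row-major) order by default, while R matrices are column-major. The following is a minimal sketch of the reordering that the default `dotranspose=TRUE` path amounts to; `rowMajorToColMajor` is a hypothetical helper written for illustration and is not part of RcppCNPy or cnpy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustration only, not RcppCNPy code): copy a
// row-major buffer of logical shape rows x cols into column-major order,
// which is the layout R uses for its matrices.
std::vector<double> rowMajorToColMajor(const std::vector<double>& src,
                                       std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);   // buffer must match the shape
    std::vector<double> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // element (r, c): row-major index r*cols + c,
            // column-major index c*rows + r
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

With `dotranspose=FALSE` the raw buffer is instead taken as is, which presumably only yields the intended matrix when the file was already written in column-major (Fortran) order.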

Modified: pkg/RcppCNPy/src/cnpy.h
===================================================================
--- pkg/RcppCNPy/src/cnpy.h	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/src/cnpy.h	2012-07-31 02:20:59 UTC (rev 3728)
@@ -2,6 +2,10 @@
 //Released under MIT License
 //license available in LICENSE file, or at http://www.opensource.org/licenses/mit-license.php
 
+// Changes for RcppCNPy are 
+// Copyright (C) 2012  Dirk Eddelbuettel
+// and licensed under GNU GPL (>= 2) 
+
 #ifndef LIBCNPY_H_
 #define LIBCNPY_H_
 
@@ -120,6 +124,21 @@
         fclose(fp);
     }
 
+    template<typename T> void npy_gzsave(std::string fname, const T* data, const unsigned int* shape, const unsigned int ndims) {
+        gzFile fp = gzopen(fname.c_str(),"wb");
+        if (!fp) {
+            Rf_error("npy_gzsave: Error! Unable to open file %s!\n", fname.c_str());
+        }
+        std::vector<char> header = create_npy_header(data,shape,ndims);
+        gzwrite(fp, &header[0], sizeof(char) * header.size());
+
+        unsigned int nels = 1;
+        for (unsigned int i = 0;i < ndims;i++) nels *= shape[i];
+
+        gzwrite(fp, data, sizeof(T)*nels);
+        gzclose(fp);
+    }
+
     template<typename T> void npz_save(std::string zipname, std::string fname, const T* data, const unsigned int* shape, const unsigned int ndims, std::string mode = "w")
     {
         //first, append a .npy to the fname
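
The new `npy_gzsave()` writes whatever `create_npy_header()` returns, followed by the raw data. As a rough sketch of what such a header contains, the block below builds a version 1.0 `.npy` header for little-endian doubles in C order. `makeNpyHeaderSketch` is a hypothetical name, the fixed `'<f8'` descriptor and the padding to a multiple of 16 bytes are assumptions based on my reading of the NumPy file format notes, and the code is not the cnpy implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (illustration only): build an NPY 1.0 header for
// little-endian float64 data stored in C order with the given shape.
std::vector<char> makeNpyHeaderSketch(const std::vector<unsigned int>& shape) {
    // Python-literal dictionary describing dtype, ordering and shape.
    std::string dict = "{'descr': '<f8', 'fortran_order': False, 'shape': (";
    for (std::size_t i = 0; i < shape.size(); ++i) {
        dict += std::to_string(shape[i]);
        if (i + 1 < shape.size()) dict += ", ";
    }
    if (shape.size() == 1) dict += ",";   // one-element tuple needs a comma
    dict += "), }";

    // Pad with spaces so that the 10-byte preamble plus the dictionary is a
    // multiple of 16 bytes; the final padding byte becomes a newline.
    const std::size_t preamble = 10;
    const std::size_t pad = 16 - (preamble + dict.size()) % 16;  // 1..16
    dict.append(pad, ' ');
    dict[dict.size() - 1] = '\n';

    std::vector<char> header;
    header.push_back(static_cast<char>(0x93));            // magic byte
    const std::string magic = "NUMPY";
    header.insert(header.end(), magic.begin(), magic.end());
    header.push_back(static_cast<char>(0x01));            // major version
    header.push_back(static_cast<char>(0x00));            // minor version
    header.push_back(static_cast<char>(dict.size() & 0xFF));         // length, low byte
    header.push_back(static_cast<char>((dict.size() >> 8) & 0xFF));  // length, high byte
    header.insert(header.end(), dict.begin(), dict.end());
    return header;
}
```

Because the header is plain bytes, the same buffer can be handed either to `fwrite()` or to `gzwrite()`, which is what lets the gzip path reuse the existing header code.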

Modified: pkg/RcppCNPy/src/cnpyMod.cpp
===================================================================
--- pkg/RcppCNPy/src/cnpyMod.cpp	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/src/cnpyMod.cpp	2012-07-31 02:20:59 UTC (rev 3728)
@@ -45,7 +45,7 @@
     }
 }
 
-Rcpp::RObject npyLoad(const std::string & filename, const std::string & type) { 
+Rcpp::RObject npyLoad(const std::string & filename, const std::string & type, const bool dotranspose) { 
 
     cnpy::NpyArray arr;
 
@@ -73,20 +73,29 @@
     } else if (shape.size() == 2) {
         if (type == "numeric") {
             // invert dimension for creation, and then tranpose to correct Fortran-vs-C storage
-            ret = transpose(Rcpp::NumericMatrix(shape[1], shape[0], reinterpret_cast<double*>(arr.data)));
+            if (dotranspose) {
+                ret = transpose(Rcpp::NumericMatrix(shape[1], shape[0], reinterpret_cast<double*>(arr.data)));
+            } else {
+                ret = Rcpp::NumericMatrix(shape[0], shape[1], reinterpret_cast<double*>(arr.data));
+            }
 #ifdef RCPP_HAS_LONG_LONG_TYPES
         } else if (type == "integer") {
             // invert dimension for creation, and then tranpose to correct Fortran-vs-C storage
-            ret = transpose(Rcpp::IntegerMatrix(shape[1], shape[0], reinterpret_cast<int64_t*>(arr.data)));
+            if (dotranspose) {
+                ret = transpose(Rcpp::IntegerMatrix(shape[1], shape[0], reinterpret_cast<int64_t*>(arr.data)));
+            } else {
+                ret = Rcpp::IntegerMatrix(shape[0], shape[1], reinterpret_cast<int64_t*>(arr.data));
+            }
 #endif
         } else {
             arr.destruct();
             Rf_error("Unsupported type in npyLoad");
         }
     } else {
+        arr.destruct();
         Rf_error("Unsupported dimension in npyLoad");
-        arr.destruct();
     }
+    arr.destruct();
     return ret;
 }
 
@@ -96,13 +105,22 @@
             Rcpp::NumericMatrix mat = transpose(Rcpp::NumericMatrix(x));
             std::vector<unsigned int> shape = 
                 Rcpp::as<std::vector<unsigned int> >(Rcpp::IntegerVector::create(mat.ncol(), mat.nrow()));
-            cnpy::npy_save(filename, mat.begin(), &(shape[0]), 2, mode);
+            
+            if (hasEnding(filename, ".gz")) {
+                cnpy::npy_gzsave(filename, mat.begin(), &(shape[0]), 2); 	// no mode, overwrite only
+            } else {
+                cnpy::npy_save(filename, mat.begin(), &(shape[0]), 2, mode);
+            }
 #ifdef RCPP_HAS_LONG_LONG_TYPES
         } else if (::Rf_isInteger(x)) {
             Rcpp::IntegerMatrix mat = transpose(Rcpp::IntegerMatrix(x));
             std::vector<unsigned int> shape = 
                 Rcpp::as<std::vector<unsigned int> >(Rcpp::IntegerVector::create(mat.ncol(), mat.nrow()));
-            cnpy::npy_save(filename, mat.begin(), &(shape[0]), 2, mode);
+            if (hasEnding(filename, ".gz")) {
+                cnpy::npy_gzsave(filename, mat.begin(), &(shape[0]), 2); 	// no mode, overwrite only
+            } else {
+                cnpy::npy_save(filename, mat.begin(), &(shape[0]), 2, mode);
+            }
 #endif
         } else {
             Rf_error("Unsupported matrix type\n");
@@ -112,13 +130,21 @@
             Rcpp::NumericVector vec(x);
             std::vector<unsigned int> shape = 
                 Rcpp::as<std::vector<unsigned int> >(Rcpp::IntegerVector::create(vec.length()));
-            cnpy::npy_save(filename, vec.begin(), &(shape[0]), 1, mode);
+            if (hasEnding(filename, ".gz")) {
+                cnpy::npy_gzsave(filename, vec.begin(), &(shape[0]), 1); 	// no mode, overwrite only
+            } else {
+                cnpy::npy_save(filename, vec.begin(), &(shape[0]), 1, mode);
+            }
 #ifdef RCPP_HAS_LONG_LONG_TYPES
         } else if (::Rf_isInteger(x)) {
             Rcpp::IntegerVector vec(x);
             std::vector<unsigned int> shape = 
                 Rcpp::as<std::vector<unsigned int> >(Rcpp::IntegerVector::create(vec.length()));
-            cnpy::npy_save(filename, vec.begin(), &(shape[0]), 1, mode);
+            if (hasEnding(filename, ".gz")) {
+                cnpy::npy_gzsave(filename, vec.begin(), &(shape[0]), 1);	// no mode, overwrite only
+            } else {
+                cnpy::npy_save(filename, vec.begin(), &(shape[0]), 1, mode);
+            }
 #endif
         } else {
             Rf_error("Unsupported vector type\n");
@@ -135,7 +161,8 @@
     function("npyLoad",         		// name of the identifier at the R level
              &npyLoad,          		// function pointer to helper function defined above
              List::create( Named("filename"),   // function arguments including default value
-                           Named("type") = "numeric"),
+                           Named("type") = "numeric",
+                           Named("dotranspose") = true),
              "read an npy file into a numeric or integer vector or matrix");
 
     function("npySave",         		// name of the identifier at the R level
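
The save paths above dispatch on `hasEnding(filename, ".gz")`, whose definition lies outside the hunks shown in this diff. A suffix test of that kind could look like the sketch below; `hasEndingSketch` is a hypothetical stand-in, and the actual helper in `cnpyMod.cpp` may differ.

```cpp
#include <string>

// Hypothetical stand-in for the hasEnding() helper referenced in the diff:
// returns true if 'full' ends with the suffix 'ending'.
bool hasEndingSketch(const std::string& full, const std::string& ending) {
    if (full.size() < ending.size()) {
        return false;                       // too short to contain the suffix
    }
    // Compare the tail of 'full' against 'ending'.
    return full.compare(full.size() - ending.size(), ending.size(), ending) == 0;
}
```

Under this sketch a name such as `data.npy.gz` would take the `npy_gzsave()` branch and `data.npy` the `npy_save()` branch. The test is case-sensitive, so `data.npy.GZ` would be written uncompressed.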

Modified: pkg/RcppCNPy/vignettes/RcppCNPy-intro.Rnw
===================================================================
--- pkg/RcppCNPy/vignettes/RcppCNPy-intro.Rnw	2012-07-30 20:28:34 UTC (rev 3727)
+++ pkg/RcppCNPy/vignettes/RcppCNPy-intro.Rnw	2012-07-31 02:20:59 UTC (rev 3728)
@@ -3,61 +3,16 @@
 %\VignetteKeywords{Python, NumPy, R, data transfer}
 %\VignettePackage{RcppCNPy}
 
-%\usepackage[sf,bf,compact,small]{titlesec}
-\usepackage[sf,bf,compact,small]{titlesec}
+\usepackage[bf,sf,compact,small]{titlesec}
 
-\usepackage[USletter]{vmargin}
-\setmargrb{0.75in}{0.75in}{0.75in}{0.75in}
+\usepackage[margin=0.85in,paper=letterpaper]{geometry}
 
-\usepackage{utopia} % not bad
-%\usepackage{newcent} % not bad
-%\usepackage{palatino} % not bad
-%\usepackage[bitstream-charter]{mathdesign}
-%\usepackage[T1]{fontenc}
+\usepackage[T1]{fontenc}
+\usepackage{pslatex}        % just like RJournal
+\usepackage{palatino,mathpazo}
 
+\usepackage{color,url,booktabs}
 
-\usepackage{color,alltt,url,booktabs}
-\usepackage[authoryear,round,longnamesfirst]{natbib}
-\usepackage[colorlinks]{hyperref}
-\definecolor{link}{rgb}{0,0,0.3}	%% next few lines courtesy of RJournal.sty
-\hypersetup{
-    colorlinks,%
-    citecolor=link,%
-    filecolor=link,%
-    linkcolor=link,%
-    urlcolor=link
-}
-
-\usepackage{listings}           % code examples
-\definecolor{darkgray}{rgb}{0.975,0.975,0.975}
-\lstset{backgroundcolor=\color{darkgray}}
-\lstset{numbers=left, numberstyle=\tiny, stepnumber=2, numbersep=5pt}
-\lstset{keywordstyle=\color{black}\bfseries\tt}
-\lstset{ %
-  %language=Octave,                % the language of the code
-  %basicstyle=\footnotesize,       % the size of the fonts that are used for the code
-  basicstyle=\small,              % the size of the fonts that are used for the code
-  numbers=left,                   % where to put the line-numbers
-  %numberstyle=\footnotesize,      % the size of the fonts that are used for the line-numbers
-  stepnumber=2,                   % the step between two line-numbers. If it's 1, each line
-                                  % will be numbered
-  numbersep=5pt,                  % how far the line-numbers are from the code
-  %backgroundcolor=\color{white},  % choose the background color. You must add \usepackage{color}
-  showspaces=false,               % show spaces adding particular underscores
-  showstringspaces=false,         % underline spaces within strings
-  showtabs=false,                 % show tabs within strings adding particular underscores
-  %frame=single,                   % adds a frame around the code
-  tabsize=2,                      % sets default tabsize to 2 spaces
-  captionpos=b,                   % sets the caption-position to bottom
-  breaklines=true,                % sets automatic line breaking
-  breakatwhitespace=false         % sets if automatic breaks should only happen at whitespace
-  %title=\lstname,                 % show the filename of files included with \lstinputlisting;
-                                  % also try caption instead of title
-  %escapeinside={\%*}{*)},         % if you want to add a comment within your code
-  %morekeywords={*,...}            % if you want to add more keywords to the set
-}
-
-
 \newcommand{\proglang}[1]{\textsf{#1}}
 \newcommand{\pkg}[1]{{\fontseries{b}\selectfont #1}}
 \newcommand{\code}[1]{\texttt{#1}}
@@ -280,10 +235,8 @@
   \normalsize
 \end{quote}
 
-Support for compressed file is currently limited to reading, but could be
-implemented for writing as well.
+Support for writing compressed files has been added in version 0.2.0.
 
-
 \subsection{Data writing in \R}
 
 Matrices and vectors can be written to files using the \code{npySave()}
@@ -377,7 +330,8 @@
 % \end{enumerate}
 i) ascii format using \code{write.table()};
 ii) \code{NumPy} format using \code{npySave()}; and
-iii) \code{NumPy} format using \code{npySave()} followed by a call to \code{gzip}.
+iii) \code{NumPy} format using \code{npySave()} with compression via
+the \code{zlib} library (used also by \code{gzip}).
 
 Table~\ref{tab:benchmark} shows some timing comparisons for a matrix with
 five million elements.  Reading the \code{npy} is clearly fastest as it
@@ -415,10 +369,10 @@
 
 \subsection{Integer support}
 
-Support for integer data types is conditional on use of the \code{-std=c++11}
+Support for integer data types is conditional on use of the \code{-std=c++0x}
 compiler extension. Only the newer standard supports the \code{long long int}
 type needed to represent \code{int64} data on a 32-bit OS.  So until \R
-switches to allowing \code{-std=c++11} on CRAN packages, users will need to
+switches to allowing \code{-std=c++0x} on CRAN packages, users will need to
 rebuild both \pkg{Rcpp} and \pkg{RcppCNPy} with the switch enabled. As shown
 in the previous examples, integers also transparently convert to float types.
 

Modified: pkg/RcppCNPy/vignettes/RcppCNPy-intro.pdf
===================================================================
(Binary files differ)


