[Rspatial-devel] new("SpatialGridDataFrame") time issue

Rainer M Krug r.m.krug at gmail.com
Mon Jan 23 10:58:22 CET 2012


OK - I have done the following:

  library(profr)
  plot(rg <- parse_rprof("readGDAL.out"))
  x11()
  plot(rg[(rg$start>0 & rg$end <20), ])
  x11()
  plot(rg[(rg$start>0 & rg$end <2), ])
  x11()
  plot(rg1 <- parse_rprof("readGDAL1.out"))
  x11()
  plot(rg1[(rg1$start>0 & rg1$end <2), ])

I do not know the inner workings of readGDAL(), but it strikes me
that SpatialGridDataFrame is called very often (or is that the reason
you chose readGDAL() for this profiling?).

This looks similar to the aspect you are looking at in readRAST6 - if
a more basic format could be used internally, this problem would
probably go away for readGDAL().
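
Roger's quoted profile further down puts most of the time under
validObject(). As a minimal sketch of why construction can be costly, the
toy S4 class below counts how often its validity function runs. "ToyGrid"
and the counter are made up for illustration and are not part of sp; the
sketch only assumes the standard behaviour of the methods package, where
new() with slot arguments triggers the validity check but a plain slot
assignment does not.

```r
library(methods)

# Counter so we can see how often the validity function runs.
counter <- new.env()
counter$n <- 0L

# "ToyGrid" is a hypothetical class for illustration; it is not an sp class.
setClass("ToyGrid",
  representation(cellsize = "numeric"),
  validity = function(object) {
    counter$n <- counter$n + 1L
    if (any(object@cellsize <= 0)) "cellsize must be positive" else TRUE
  })

g <- new("ToyGrid", cellsize = c(30, 30))  # validity runs here
g@cellsize <- c(10, 10)                    # slot assignment: no validity call
validObject(g)                             # explicit check runs it again

cat("validity calls:", counter$n, "\n")
```

If a reader function builds several intermediate S4 objects per call, each
new() pays for its own check, which is one way a cheap-looking validity
test can add up.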

But there is still the question of why SpatialGridDataFrame is relatively
slow. I profiled readRAST6(), whose running time is almost completely
caused by SpatialGridDataFrame. I profiled reading the "aspect" raster
from the spearfish dataset, with R started inside GRASS 6.4.1:


  library(spgrass6)  # provides readRAST6()
  # Profile the same read with and without GDAL.
  Rprof("withGDAL.out")
  x <- readRAST6("aspect", useGDAL=TRUE, plugin=FALSE)
  Rprof(NULL)

  Rprof("noGDAL.out")
  x <- readRAST6("aspect", useGDAL=FALSE, plugin=FALSE)
  Rprof(NULL)
  xnoGDAL <- parse_rprof("noGDAL.out")
  xwithGDAL <- parse_rprof("withGDAL.out")

  plot(xwithGDAL, main="With GDAL")
  plot(xnoGDAL, main="No GDAL")

I have attached the profile data and the graphs generated by profr as
above.
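
For readers without profr, roughly the same ranking can be read from base R
with summaryRprof(). The sketch below is self-contained, so it profiles a
stand-in workload (repeated sorting) rather than readRAST6(); the temporary
file name and the workload are assumptions for illustration only.

```r
# Profile a stand-in workload for about one second.
out <- tempfile(fileext = ".out")
Rprof(out, interval = 0.01)
t0 <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - t0 < 1) {
  x <- sort(runif(1e5))
}
Rprof(NULL)

# by.self ranks functions by time spent in their own code,
# by.total by time including everything they call.
prof <- summaryRprof(out)
print(head(prof$by.self, 5))
print(head(prof$by.total, 5))
```

Pointing summaryRprof() at "withGDAL.out" or "noGDAL.out" instead of the
temporary file should give the equivalent tables for the runs above.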

For SpatialGridDataFrame, the function getGridDataFrame uses most of
the time, particularly the "-" (minus) function, and for "noGDAL" the
"<=" comparison (not visible on the full plot, but visible when plotting
only that region).
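
One plausible reading of why "-" and comparison operators show up: mapping
every coordinate to a grid cell needs an elementwise subtraction of the
grid offset and a range check, both over all cells. The sketch below shows
that pattern with a hypothetical helper, toy_grid_index(). It is not sp's
getGridIndex() and its cell ordering is an arbitrary choice; it only
illustrates where such vectorised operations would appear in a profile.

```r
# Hypothetical helper, NOT sp's implementation: map coordinates to linear
# cell indices on a regular grid given cell-centre offset, cell size and
# grid dimensions.
toy_grid_index <- function(coords, offset, cellsize, dims) {
  cell <- matrix(0L, nrow(coords), ncol(coords))
  for (d in seq_len(ncol(coords))) {
    # Elementwise "-" over every coordinate in this dimension.
    pos <- round((coords[, d] - offset[d]) / cellsize[d])
    # Elementwise comparisons over every coordinate as a range check.
    if (any(pos < 0 | pos >= dims[d])) stop("coordinate outside grid")
    cell[, d] <- as.integer(pos)
  }
  # 1-based linear index, first dimension varying fastest.
  as.integer(cell[, 1] + cell[, 2] * dims[1] + 1L)
}

offset   <- c(0.5, 0.5)
cellsize <- c(1, 1)
dims     <- c(3L, 2L)
# Cell centres of a 3 x 2 grid, x varying fastest.
coords <- as.matrix(expand.grid(x = offset[1] + 0:2, y = offset[2] + 0:1))

idx <- toy_grid_index(coords, offset, cellsize, dims)
print(idx)
```

For a full raster the coordinates already lie on the grid by construction,
so this work is redundant there, which fits the idea of using a more basic
internal format.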


Profiling was done with: R 2.14.0, Linux 32 bit, spgrass6_0.7-4,
rgdal_0.7-5, sp_0.9-91

Hope this helps,

Rainer


On 22/01/12 23:32, Roger Bivand wrote:
> With revision 1200, the same test is:
> 
>    user  system elapsed
>   7.763   0.346   8.119
> 
> but no validity checks are done. As may be sensible, something may
> be done about validity checking, but an order of magnitude speedup
> is worth having, I think.
> 
> Roger
> 
> On Sun, 22 Jan 2012, Roger Bivand wrote:
> 
>> Hi,
>> 
>> Working with Rainer Krug on making readRAST6() in spgrass6
>> faster, we've identified an issue with time wasting in creating
>> SGDF. For 2.14.1, sp 0.9-93, rgdal 0.7-8:
>> 
>> > Rprof("readGDAL.out")
>> > system.time(for (i in 1:50) SP27GTIF <-
>> +   readGDAL(system.file("pictures/SP27GTIF.TIF", package = "rgdal")[1]))
>>    user  system elapsed
>> 122.961   4.678 131.383
>> > Rprof("")
>> > summaryRprof("readGDAL.out")
>> $by.self
>>                  self.time self.pct total.time total.pct
>> "getGridIndex"       32.32    25.31      45.44     35.59
>> "asMethod"           10.60     8.30     119.04     93.23
>> "slot<-"              8.50     6.66       8.64      6.77
>> "apply"               7.52     5.89      15.12     11.84
>> "validityMethod"      7.24     5.67     120.30     94.22
>> ".local"              6.40     5.01      22.92     17.95
>> "structure"           5.56     4.35       5.58      4.37
>> "expand.grid"         5.14     4.03      12.72      9.96
>> "FUN"                 4.86     3.81       4.86      3.81
>> "initialize"          4.34     3.40     123.08     96.40
>> "SpatialPixels"       3.06     2.40      96.18     75.33
>> "aperm.default"       2.70     2.11       2.70      2.11
>> "SpatialPoints"       2.54     1.99      49.16     38.50
>> "gc"                  2.52     1.97       2.52      1.97
>> ....
>>
>> $by.total
>>                        total.time total.pct self.time self.pct
>> "system.time"              127.68    100.00      0.00     0.00
>> "readGDAL"                 127.64     99.97      0.00     0.00
>> "new"                      123.10     96.41      0.02     0.02
>> "initialize"               123.08     96.40      4.34     3.40
>> "SpatialGridDataFrame"     123.04     96.37      0.00     0.00
>> "validObject"              120.42     94.31      0.00     0.00
>> "validityMethod"           120.30     94.22      7.24     5.67
>> "anyStrings"               120.30     94.22      0.00     0.00
>> "as"                       119.08     93.26      0.00     0.00
>> "asMethod"                 119.04     93.23     10.60     8.30
>> "SpatialPixels"             96.18     75.33      3.06     2.40
>> "nrow"                      96.18     75.33      0.00     0.00
>> "SpatialGrid"               59.22     46.38      0.00     0.00
>> "SpatialPoints"             49.16     38.50      2.54     1.99
>> "getGridIndex"              45.44     35.59     32.32    25.31
>> "standardGeneric"           27.94     21.88      0.00     0.00
>> "is"                        26.28     20.58      0.02     0.02
>> ".local"                    22.92     17.95      6.40     5.01
>> "coordinates"               22.92     17.95      0.00     0.00
>> "do.call"                   16.48     12.91      0.00     0.00
>> ".bboxCoords"               15.16     11.87      0.02     0.02
>> "t"                         15.14     11.86      0.02     0.02
>> "apply"                     15.12     11.84      7.52     5.89
>> "expand.grid"               12.72      9.96      5.14     4.03
>> "slot<-"                     8.64      6.77      8.50     6.66
>> "structure"                  5.58      4.37      5.56     4.35
>> 
>> 
>> The getGridIndex() call is particularly odd, taking 25% of
>> execution time. I've put the Rprof file on rspatial/misc. This
>> looks like an S4 oddity to me. I can't see where the validity
>> checks might recurse to trip calls to other classes. The validity
>> check ought to be a no-brainer, really. I don't know when this
>> happened, it may be caused by recent methods.
>> 
>> Any ideas?
>> 
>> Roger
>> 
>> 
> 


-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      Rainer at krugs.de

Skype:      RMkrug
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: noGDAL.out
URL: <http://lists.r-forge.r-project.org/pipermail/rspatial-devel/attachments/20120123/92d534a4/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: withGDAL.out
URL: <http://lists.r-forge.r-project.org/pipermail/rspatial-devel/attachments/20120123/92d534a4/attachment.asc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 6915 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/rspatial-devel/attachments/20120123/92d534a4/attachment.pdf>