[Rspatial-devel] new("SpatialGridDataFrame") time issue

Edzer Pebesma edzer.pebesma at uni-muenster.de
Mon Jan 23 11:03:50 CET 2012


Thanks for the fix, although you probably dislike it as much as I do.

The problem seems to be in the only setIs() remaining in sp; if I modify
it to spitting debug prints by:

setIs("SpatialGrid", "SpatialPixels",
    coerce = function(from) {
        print("coercing by setIs() on SpatialGrid")
        SpatialPixels(SpatialPoints(coordinates(from), from at proj4string),
        grid = from at grid)
        },
    replace = function(from, value) stop("replace-coercion not allowed")
)

then testing on test.ag in inst/external gives:

> x = readGDAL("test.ag")
test.ag has GDAL driver AAIGrid
and has 115 rows and 80 columns
[1] "coercing by setIs() on SpatialGrid"
[1] "coercing by setIs() on SpatialGrid"

So, apparently, setIs() causes the validity checking of SpatialGrid or
SpatialGridDataFrame to ALSO coerce to SpatialPixels[DataFrame], not
only once but twice, to check validity there. Indeed, a pretty stupid
thing to do.

maybe it should be setIs("SpatialPixels", "SpatialGrid", ...) ?

will look into this; would be good to have this settled before 1.0, it
has given me a few headaches before, see

http://lists.r-forge.r-project.org/pipermail/rspatial-devel/2011-September/000023.html

On 01/22/2012 11:32 PM, Roger Bivand wrote:
> With revision 1200, the same test is:
> 
>    user  system elapsed
>   7.763   0.346   8.119
> 
> but no validity checks are done. As may be sensible, something may be
> done about validity checking, but an order of magnitude speedup is worth
> having, I think.
> 
> Roger
> 
> On Sun, 22 Jan 2012, Roger Bivand wrote:
> 
>> Hi,
>>
>> Working with Rainer Krug on making readRAST6() in spgrass6 faster,
>> we've identified an issue with time wasting in creating SGDF. For
>> 2.14.1, sp 0.9-93, rgdal 0.7-8:
>>
>>> Rprof("readGDAL.out")
>>> system.time(for (i in 1:50) SP27GTIF <- 
>> readGDAL(system.file("pictures/SP27GTIF.TIF", package = "rgdal")[1]))
>>   user  system elapsed
>> 122.961   4.678 131.383
>>> Rprof("")
>>> summaryRprof("readGDAL.out")
>> $by.self
>>                        self.time self.pct total.time total.pct
>> "getGridIndex"              32.32    25.31      45.44     35.59
>> "asMethod"                  10.60     8.30     119.04     93.23
>> "slot<-"                     8.50     6.66       8.64      6.77
>> "apply"                      7.52     5.89      15.12     11.84
>> "validityMethod"             7.24     5.67     120.30     94.22
>> ".local"                     6.40     5.01      22.92     17.95
>> "structure"                  5.56     4.35       5.58      4.37
>> "expand.grid"                5.14     4.03      12.72      9.96
>> "FUN"                        4.86     3.81       4.86      3.81
>> "initialize"                 4.34     3.40     123.08     96.40
>> "SpatialPixels"              3.06     2.40      96.18     75.33
>> "aperm.default"              2.70     2.11       2.70      2.11
>> "SpatialPoints"              2.54     1.99      49.16     38.50
>> "gc"                         2.52     1.97       2.52      1.97
>> ....
>> $by.total
>>                        total.time total.pct self.time self.pct
>> "system.time"               127.68    100.00      0.00     0.00
>> "readGDAL"                  127.64     99.97      0.00     0.00
>> "new"                       123.10     96.41      0.02     0.02
>> "initialize"                123.08     96.40      4.34     3.40
>> "SpatialGridDataFrame"      123.04     96.37      0.00     0.00
>> "validObject"               120.42     94.31      0.00     0.00
>> "validityMethod"            120.30     94.22      7.24     5.67
>> "anyStrings"                120.30     94.22      0.00     0.00
>> "as"                        119.08     93.26      0.00     0.00
>> "asMethod"                  119.04     93.23     10.60     8.30
>> "SpatialPixels"              96.18     75.33      3.06     2.40
>> "nrow"                       96.18     75.33      0.00     0.00
>> "SpatialGrid"                59.22     46.38      0.00     0.00
>> "SpatialPoints"              49.16     38.50      2.54     1.99
>> "getGridIndex"               45.44     35.59     32.32    25.31
>> "standardGeneric"            27.94     21.88      0.00     0.00
>> "is"                         26.28     20.58      0.02     0.02
>> ".local"                     22.92     17.95      6.40     5.01
>> "coordinates"                22.92     17.95      0.00     0.00
>> "do.call"                    16.48     12.91      0.00     0.00
>> ".bboxCoords"                15.16     11.87      0.02     0.02
>> "t"                          15.14     11.86      0.02     0.02
>> "apply"                      15.12     11.84      7.52     5.89
>> "expand.grid"                12.72      9.96      5.14     4.03
>> "slot<-"                      8.64      6.77      8.50     6.66
>> "structure"                   5.58      4.37      5.56     4.35
>>
>>
>> The getGridIndex() call is particularly odd, taking 25% of execution
>> time. I've put the Rprof file on rspatial/misc. This looks like an S4
>> oddity to me. I can't see where the validity checks might recurse to
>> trip calls to other classes. The validity check ought to be a
>> no-brainer, really. I don't know when this happened, it may be caused
>> by recent methods.
>>
>> Any ideas?
>>
>> Roger
>>
>>
> 

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de


More information about the Rspatial-devel mailing list