[Subgroup-commits] r18 - in pkg: . subgroup subgroup/R subgroup/inst/java subgroup/man subgroup/tests

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Sun Aug 3 22:18:52 CEST 2014


Author: atzmueller
Date: 2014-08-03 22:18:51 +0200 (Sun, 03 Aug 2014)
New Revision: 18

Added:
   pkg/subgroup/
   pkg/subgroup/DESCRIPTION
   pkg/subgroup/NAMESPACE
   pkg/subgroup/R/
   pkg/subgroup/R/AAAonLoad.R
   pkg/subgroup/R/classes.R
   pkg/subgroup/R/subgroup.R
   pkg/subgroup/data/
   pkg/subgroup/inst/
   pkg/subgroup/inst/java/subgroup.jar
   pkg/subgroup/man/
   pkg/subgroup/man/DiscoverSubgroups.Rd
   pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
   pkg/subgroup/man/Pattern-class.Rd
   pkg/subgroup/man/credit.data.Rd
   pkg/subgroup/man/subgroup-package.Rd
   pkg/subgroup/tests/
   pkg/subgroup/tests/test.R
Removed:
   pkg/DESCRIPTION
   pkg/NAMESPACE
   pkg/data/
   pkg/subgroup/R/AAAconfigurePackage.R
   pkg/subgroup/R/AAAonLoad.R
   pkg/subgroup/R/classes.R
   pkg/subgroup/R/subgroup.R
   pkg/subgroup/inst/java/subgroup.jar
   pkg/subgroup/man/DiscoverSubgroups.Rd
   pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
   pkg/subgroup/man/subgroup-package.Rd
   pkg/subgroup/tests/test.R
Log:
reorganize package in pkg/<subdirectory>

Deleted: pkg/DESCRIPTION
===================================================================
--- pkg/DESCRIPTION	2014-08-03 17:55:21 UTC (rev 17)
+++ pkg/DESCRIPTION	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,18 +0,0 @@
-Package: subgroup.discovery
-Type: Package
-Title: Subgroup Discovery
-Version: 0.4
-Date: 2014-08-01
-Author: Martin Atzmueller
-Maintainer: Martin Atzmueller <martin at atzmueller.net>
-Description: A collection of efficient and effective tools and
-	algorithms for subgroup discovery and analytics. The package
-	integrates an R interface to the org.vikamine.kernel library
-	of the VIKAMINE system (http://www.vikamine.org) implementing
-	subgroup discovery, pattern mining and analytics in Java.
-Classification/ACM: G.4, H.2.8, I.5.1
-License: GPL (>= 3)
-Depends: R (>= 2.6.0), methods, rJava (>= 0.6-3), foreign (>= 0.8-40)
-SystemRequirements: Java (>= 6.0)
-Collate: 'AAAonLoad.R' 'randomSeed.R' 'classes.R' 'subgroup.R'
-URL: http://www.subgroup.cc
\ No newline at end of file

Deleted: pkg/NAMESPACE
===================================================================
--- pkg/NAMESPACE	2014-08-03 17:55:21 UTC (rev 17)
+++ pkg/NAMESPACE	2014-08-03 20:18:51 UTC (rev 18)
@@ -1 +0,0 @@
-exportPattern("^[^\\.]")
\ No newline at end of file

Copied: pkg/subgroup/DESCRIPTION (from rev 17, pkg/DESCRIPTION)
===================================================================
--- pkg/subgroup/DESCRIPTION	                        (rev 0)
+++ pkg/subgroup/DESCRIPTION	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,18 @@
+Package: subgroup.discovery
+Type: Package
+Title: Subgroup Discovery
+Version: 0.4
+Date: 2014-08-01
+Author: Martin Atzmueller
+Maintainer: Martin Atzmueller <martin at atzmueller.net>
+Description: A collection of efficient and effective tools and
+	algorithms for subgroup discovery and analytics. The package
+	integrates an R interface to the org.vikamine.kernel library
+	of the VIKAMINE system (http://www.vikamine.org) implementing
+	subgroup discovery, pattern mining and analytics in Java.
+Classification/ACM: G.4, H.2.8, I.5.1
+License: GPL (>= 3)
+Depends: R (>= 2.6.0), methods, rJava (>= 0.6-3), foreign (>= 0.8-40)
+SystemRequirements: Java (>= 6.0)
+Collate: 'AAAonLoad.R' 'randomSeed.R' 'classes.R' 'subgroup.R'
+URL: http://www.subgroup.cc
\ No newline at end of file

Copied: pkg/subgroup/NAMESPACE (from rev 15, pkg/NAMESPACE)
===================================================================
--- pkg/subgroup/NAMESPACE	                        (rev 0)
+++ pkg/subgroup/NAMESPACE	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1 @@
+exportPattern("^[^\\.]")
\ No newline at end of file

Deleted: pkg/subgroup/R/AAAconfigurePackage.R
===================================================================
--- pkg/R/AAAconfigurePackage.R	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/AAAconfigurePackage.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,7 +0,0 @@
-###############################################################################
-#
-# Some configuration options
-#
-# Author: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-#options(java.parameters = "-Xmx1024m")
\ No newline at end of file

Deleted: pkg/subgroup/R/AAAonLoad.R
===================================================================
--- pkg/R/AAAonLoad.R	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/AAAonLoad.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,6 +0,0 @@
-require(rJava)
-
-.First.lib <- function(libname, pkgname) {
-	   .jpackage(pkgname)
-	   .jengine(TRUE)
-}

Copied: pkg/subgroup/R/AAAonLoad.R (from rev 17, pkg/R/AAAonLoad.R)
===================================================================
--- pkg/subgroup/R/AAAonLoad.R	                        (rev 0)
+++ pkg/subgroup/R/AAAonLoad.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,28 @@
+###############################################################################
+#    subgroup.discovery package R classes
+# 
+#    This file is part of the R subgroup.discovery package.
+#    Copyright (C) 2011-2014 by Martin Atzmueller
+#    
+#    This program is free software: you can redistribute it and/or modify
+#    it under the terms of the GNU General Public License as published by
+#    the Free Software Foundation, either version 3 of the License, or
+#    (at your option) any later version.
+
+#    This program is distributed in the hope that it will be useful,
+#    but WITHOUT ANY WARRANTY; without even the implied warranty of
+#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#    GNU General Public License for more details.
+#    
+#    You should have received a copy of the GNU General Public License
+#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+#    Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+require(rJava)
+
+.onLoad <- function(libname, pkgname) {
+	   rJava::.jpackage(pkgname, lib.loc=libname)
+	   #.jengine(TRUE)
+}

Deleted: pkg/subgroup/R/classes.R
===================================================================
--- pkg/R/classes.R	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/classes.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,57 +0,0 @@
-###############################################################################
-#    Subgroup package R classes
-# 
-#    This file is part of the R subgroup package.
-#    Copyright (C) 2011-2012  by Martin Atzmueller
-#    
-#    This program is free software: you can redistribute it and/or modify
-#    it under the terms of the GNU General Public License as published by
-#    the Free Software Foundation, either version 3 of the License, or
-#    (at your option) any later version.
-
-#    This program is distributed in the hope that it will be useful,
-#    but WITHOUT ANY WARRANTY; without even the implied warranty of
-#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-#    GNU General Public License for more details.
-#    
-#    You should have received a copy of the GNU General Public License
-#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
-#
-#    Contact: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-
-setClass("SDOptions",
-    representation(
-        verbose = "logical",
-        debug = "logical"
-    ),
-    prototype(verbose = FALSE, debug = FALSE)
-)
-
-
-setClass("SDTaskConfig",
-    representation(
-        qf          = "character",
-        method      = "character",
-        k           = "numeric",
-        minqual     = "numeric",
-        minsize     = "numeric",
-        maxlen      = "numeric",
-        nodefaults  = "logical",
-        relfilter   = "logical",
-        postfilter  = "character",
-        attributes  = "character"
-    ),
-    prototype(qf="ps", method="sdmap", k = as.integer(20), minqual = as.integer(0), minsize = as.integer(0),
-        maxlen = as.integer(7), nodefaults = FALSE, relfilter = FALSE, postfilter = "", attributes = character(0))
-)
-
-
-setClass("Pattern",
-    representation(
-        description="character",
-        quality="numeric",
-        size="numeric",
-        parameters="list"
-    )
-)
\ No newline at end of file

Copied: pkg/subgroup/R/classes.R (from rev 17, pkg/R/classes.R)
===================================================================
--- pkg/subgroup/R/classes.R	                        (rev 0)
+++ pkg/subgroup/R/classes.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,49 @@
+###############################################################################
+#    subgroup.discovery package R classes
+# 
+#    This file is part of the R subgroup.discovery package.
+#    Copyright (C) 2011-2014 by Martin Atzmueller
+#    
+#    This program is free software: you can redistribute it and/or modify
+#    it under the terms of the GNU General Public License as published by
+#    the Free Software Foundation, either version 3 of the License, or
+#    (at your option) any later version.
+
+#    This program is distributed in the hope that it will be useful,
+#    but WITHOUT ANY WARRANTY; without even the implied warranty of
+#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#    GNU General Public License for more details.
+#    
+#    You should have received a copy of the GNU General Public License
+#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+#    Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+
+setClass("SDTaskConfig",
+    representation(
+        qf          = "character",
+        method      = "character",
+        k           = "numeric",
+        minqual     = "numeric",
+        minsize     = "numeric",
+        maxlen      = "numeric",
+        nodefaults  = "logical",
+        relfilter   = "logical",
+        postfilter  = "character",
+        attributes  = "character"
+    ),
+    prototype(qf="ps", method="sdmap", k = as.integer(20), minqual = as.integer(0), minsize = as.integer(0),
+        maxlen = as.integer(7), nodefaults = FALSE, relfilter = FALSE, postfilter = "", attributes = character(0))
+)
+
+
+setClass("Pattern",
+    representation(
+        description="character",
+        quality="numeric",
+        size="numeric",
+        parameters="list"
+    )
+)
\ No newline at end of file

Deleted: pkg/subgroup/R/subgroup.R
===================================================================
--- pkg/R/subgroup.R	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/subgroup.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,250 +0,0 @@
-###############################################################################
-#    Subgroup package R core support code
-# 
-#    This file is part of the R subgroup package.
-#    Copyright (C) 2011-2012  by Martin Atzmueller
-#    
-#    This program is free software: you can redistribute it and/or modify
-#    it under the terms of the GNU General Public License as published by
-#    the Free Software Foundation, either version 3 of the License, or
-#    (at your option) any later version.
-    
-#    This program is distributed in the hope that it will be useful,
-#    but WITHOUT ANY WARRANTY; without even the implied warranty of
-#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-#    GNU General Public License for more details.
-#    
-#    You should have received a copy of the GNU General Public License
-#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
-#
-#    Contact: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-
-library(foreign)
-library(rJava)
-
-setGeneric("CreateARFFProvider",
-    function(source, name, ...) {
-      standardGeneric("CreateARFFProvider")
-    }
-)
-
-setMethod("CreateARFFProvider", signature(source = "data.frame", name = "character"),
-    function(source, name, ...) {
-    # Creates a dataset provider (converting the dataframe)
-      con <- textConnection("arffVector", "w")
-      write.arff(source, con)
-      flush(con)
-      close(con)
-      arff <- paste(arffVector, "", collapse="\n")
-      provider <- .jnew("org/vikamine/kernel/xpdl/ARFFAsStringDatasetProvider", arff, name)     
-      return(provider)
-    }
-)
-
-setMethod("CreateARFFProvider", signature(source = "character", name = "character"),
-    function(source, name, ...) {
-      # Creates a dataset provider given a file name
-      provider <- .jnew("org/vikamine/kernel/xpdl/FileDatasetProvider", source)
-      return(provider)
-    }
-)
-
-CreateOntologyForData <- function(provider, dataset) {
-  # Creates the ontology object for the respective dataset
-  ontology <- J(provider, "getDataset", dataset)
-  return(ontology)
-}
-
-CreateSimpleSDTask <- function(ontology, target) {
-  # Creates a simple subgroup discovery task
-  simpleTask <- new(J("org/vikamine/kernel/subgroup/search/SDSimpleTask"), ontology)
-  if (!is.null(target$value)) {
-    selector <- new(J("org/vikamine/kernel/subgroup/selectors/DefaultSGSelector"), ontology, target$attribute, target$value)
-    target <- new(J("org/vikamine/kernel/subgroup/target/SelectorTarget"), selector)
-  } else {
-    attribute <- J(ontology, "getAttribute", target$attribute)
-    target <- new(J("org/vikamine/kernel/subgroup/target/NumericTarget"), attribute)
-  }
-  J(simpleTask, "setTarget", target)
-  return(simpleTask)
-}
-
-CreateSDTask <- function(source, target, config = new("SDTaskConfig")) {
-  # Creates a subgroup discovery task
-  #
-  # Args:
-  #   source: A data source, i.e., dataframe or file (name)
-  #   target: The target variable
-  #   config: A SDTaskConfig
-  #
-  # Returns:
-  #   A subgroup discovery task
-  provider <- CreateARFFProvider(source, "data")
-  ontology <- CreateOntologyForData(provider, "data")
-  task <- CreateSimpleSDTask(ontology, target)
-  J(task, "setQualityFunction", config at qf)
-  J(task, "setSDMethod", config at method)
-  J(task, "setMaxSGCount", as.integer(config at k))
-  J(task, "setMinQualityLimit", as.double(config at minqual))
-  J(task, "setMinSubgroupSize", as.double(config at minsize))
-  J(task, "setMaxSGDSize", as.integer(config at maxlen))
-  J(task, "setSuppressStrictlyIrrelevantSubgroups", config at relfilter)
-  J(task, "setIgnoreDefaultValues", config at nodefaults)
-  if (config at postfilter != "") {
-    J(task, "setPostFilter", config at postfilter)
-  }
-  if ((!is.null(config at attributes)) && (length(config at attributes) > 0)) {
-    J(task, "setAttributes", .jarray(config at attributes))
-  }
-  return(task)
-}
-
-as.target <- function(attribute=NULL, value=NULL) {
-  # Creates a target variable object given attribute and value (for nominals)
-  #
-  # Args:
-  #   attribute: The respective attribute
-  #   value: For nominals, the respective value; for numeric NULL
-  #
-  # Returns:
-  #   A target object representation
-  if (!is.null(attribute) && !is.null(value))
-    return(list(attribute=attribute, value=value))
-  else if (!is.null(attribute))
-    return(list(attribute=attribute))
-  else
-    return(NULL)
-}
-
-GetParameters <- function(task, sg) {
-  target <- J(task, "getTarget")
-  if (J(target, "isBoolean")) {
-    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
-    p <- J(J(sg, "getStatistics"), "getP")
-    p0 <- J(J(sg, "getStatistics"), "getP0")
-    return(list(p = p, p0 = p0, size = size))
-  } else if (J(target, "isNumeric")) {
-    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
-    mean <- J(J(sg, "getStatistics"), "getSGMean")
-    popMean <- J(J(sg, "getStatistics"), "getPopulationMean")
-    return(list(mean = mean, populationMean = popMean, size = size))
-  } else {
-    stop("Unknown target")
-  }
-}
-
-ConvertDescription <- function(sgDescription) {
-  # Internal function for converting a (Java) SGDescription consisting
-  # of a set of selection expressions into a character vector of strings
-  # representing these
-  # Args:
-  #   sgDescription: A (Java) SGDescription object
-  #
-  # Returns:
-  #   A character vector
-  sgSelectorArray <-
-      J("org.vikamine.kernel.subgroup.search.SDSimpleTask")$getSimpleDescription(sgDescription)
-  return(as.character(sgSelectorArray))
-}
-
-DiscoverSubgroupsByTask <- function(task) {
-  # Internal function for setting up and performing subgroup discovery
-  # Args:
-  #   task: A subgroup discovery task
-  #
-  # Returns:
-  #   A list of subgroup patterns
-  sgSet <- J(task, "performSubgroupDiscovery")
-  sgList <- J(sgSet, "toSortedList", FALSE)
-  sgArray <- .jevalArray(J(sgList, "toArray"))
-  
-  patterns = list()
-  for (sg in sgArray) {
-    #description <- as.character(J(J(sg, "getSGDescription"), "getDescription"))
-    sgDescription <- J(sg, "getSGDescription")
-    description <- ConvertDescription(sgDescription)
-    quality <- J(sg, "getQuality")
-    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
-    parameters = GetParameters(task, sg)    
-    pattern <- new("Pattern", description=description, quality=quality, size=size, parameters=parameters)
-    patterns = append(patterns, pattern)
-  }
-  
-  return(patterns)
-}
-
-DiscoverSubgroups <- function(source, target, config=new("SDTaskConfig")) {
-  # Performs subgroup discovery according to target and config on data
-  #
-  # Args:
-  #   data: A dataframe
-  #   target: A target variable (constructed by as.target)
-  #   config: a SDTaskConfig configuration for the algorithm
-  #
-  # Returns:
-  #   A list of subgroup patterns
-  task <- CreateSDTask(source, target, config)
-  result <- DiscoverSubgroupsByTask(task)
-  return(result)
-}
-
-
-FormatDoubleSignificantDigits <- function(double, ndigits=2) {
-  # Internal function: Prints double according to
-  if (is.numeric(ndigits)) {
-    sprintf(paste("%.", ndigits, "f", sep=""), double)
-  } else {
-    double
-  }
-}
-
-
-
-ToDataFrame <- function(patterns, ndigits=2) {
-  # Transforms a list/vector of patterns into a dataframe
-  #
-  # Args:
-  #   patterns: List of patterns
-  #   ndigits: Number of significant digits for floats
-  #
-  # Returns:
-  #   The dataframe containing the pattern information
-  isNumeric = FALSE  
-  descriptions <- list()
-  length(descriptions) = length(patterns)
-  qualities <-list()
-  length(qualities) = length(patterns)
-  sizes <- list()
-  length(sizes) <- length(patterns)
-  ps <- list()
-  
-  i = 1
-  for (pattern in patterns) {
-    descriptions[i] = paste(pattern at description, collapse=", ")
-    qualities[i] = FormatDoubleSignificantDigits(pattern at quality, ndigits)
-    sizes[i] = pattern at size
-    if (!is.null(pattern at parameters$mean)) {
-      ps[i] = FormatDoubleSignificantDigits(pattern at parameters$mean, ndigits)
-      isNumeric = TRUE
-    } else {
-      ps[i] = FormatDoubleSignificantDigits(pattern at parameters$p, ndigits)
-      isNumeric = FALSE
-    }
-    i = i + 1
-  }
-  if (isNumeric) {
-    dataframe <- data.frame(
-        quality=as.vector(qualities, "numeric"),
-        mean=as.vector(ps, "numeric"), 
-        size=as.vector(sizes, "numeric"),
-        description=as.vector(descriptions, "character"))
-  } else {
-    dataframe <- data.frame(
-        quality=as.vector(qualities, "numeric"),
-        p=as.vector(ps, "numeric"), 
-        size=as.vector(sizes, "numeric"),
-        description=as.vector(descriptions, "character"))
-  }
-  return(dataframe)
-}
\ No newline at end of file

Copied: pkg/subgroup/R/subgroup.R (from rev 17, pkg/R/subgroup.R)
===================================================================
--- pkg/subgroup/R/subgroup.R	                        (rev 0)
+++ pkg/subgroup/R/subgroup.R	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,266 @@
+###############################################################################
+#    subgroup.discovery package R classes
+# 
+#    This file is part of the R subgroup.discovery package.
+#    Copyright (C) 2011-2014 by Martin Atzmueller
+#    
+#    This program is free software: you can redistribute it and/or modify
+#    it under the terms of the GNU General Public License as published by
+#    the Free Software Foundation, either version 3 of the License, or
+#    (at your option) any later version.
+    
+#    This program is distributed in the hope that it will be useful,
+#    but WITHOUT ANY WARRANTY; without even the implied warranty of
+#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#    GNU General Public License for more details.
+#    
+#    You should have received a copy of the GNU General Public License
+#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+#    Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+library(foreign)
+library(rJava)
+
+setGeneric(".CreateARFFProvider",
+    function(source, name, ...) {
+      standardGeneric(".CreateARFFProvider")
+    }
+)
+
+setMethod(".CreateARFFProvider", signature(source = "data.frame", name = "character"),
+    function(source, name, ...) {
+      # Creates a dataset provider (converting the dataframe)
+      con <- textConnection("arffVector", "w")
+      write.arff(source, con)
+      flush(con)
+      close(con)
+      rm(con)
+      arff <- paste(arffVector, "", collapse="\n")
+      provider <- .jnew("org/vikamine/kernel/xpdl/ARFFAsStringDatasetProvider", arff, name)
+      return(provider)
+    }
+)
+
+setMethod(".CreateARFFProvider", signature(source = "character", name = "character"),
+    function(source, name, ...) {
+      # Creates a dataset provider given a file name
+      provider <- .jnew("org/vikamine/kernel/xpdl/FileDatasetProvider", source)
+      return(provider)
+    }
+)
+
+.CreateOntologyForData <- function(provider, dataset) {
+  # Creates the ontology object for the respective dataset
+  ontology <- .jcall(provider, "Lorg/vikamine/kernel/data/Ontology;","getDataset", dataset)
+  return(ontology)
+}
+
+.CreateSimpleSDTask <- function(ontology, target) {
+  # Creates a simple subgroup discovery task
+  .FreeMemory()
+  simpleTask <- new(J("org/vikamine/kernel/subgroup/search/SDSimpleTask"), ontology)
+  if (!is.null(target$value)) {
+    selector <- new(J("org/vikamine/kernel/subgroup/selectors/DefaultSGSelector"), ontology, target$attribute, target$value)
+    target <- new(J("org/vikamine/kernel/subgroup/target/SelectorTarget"), selector)
+  } else {
+    attribute <- J(ontology, "getAttribute", target$attribute)
+    target <- new(J("org/vikamine/kernel/subgroup/target/NumericTarget"), attribute)
+  }
+  J(simpleTask, "setTarget", target)
+  return(simpleTask)
+}
+
+CreateSDTask <- function(source, target, config = new("SDTaskConfig")) {
+  # Creates a subgroup discovery task
+  #
+  # Args:
+  #   source: A data source, i.e., dataframe or file (name)
+  #   target: The target variable
+  #   config: A SDTaskConfig
+  #
+  # Returns:
+  #   A subgroup discovery task
+  .FreeMemory()
+  provider <- .CreateARFFProvider(source, "data")
+  ontology <- .CreateOntologyForData(provider, "data")
+  task <- .CreateSimpleSDTask(ontology, target)
+  J(task, "setQualityFunction", config at qf)
+  J(task, "setSDMethod", config at method)
+  J(task, "setMaxSGCount", as.integer(config at k))
+  J(task, "setMinQualityLimit", as.double(config at minqual))
+  J(task, "setMinSubgroupSize", as.double(config at minsize))
+  J(task, "setMaxSGDSize", as.integer(config at maxlen))
+  J(task, "setSuppressStrictlyIrrelevantSubgroups", config at relfilter)
+  J(task, "setIgnoreDefaultValues", config at nodefaults)
+  if (config at postfilter != "") {
+    J(task, "setPostFilter", config at postfilter)
+  }
+  if ((!is.null(config at attributes)) && (length(config at attributes) > 0)) {
+    J(task, "setAttributes", .jarray(config at attributes))
+  }
+  return(task)
+}
+
+as.target <- function(attribute=NULL, value=NULL) {
+  # Creates a target variable object given attribute and value (for nominals)
+  #
+  # Args:
+  #   attribute: The respective attribute
+  #   value: For nominals, the respective value; for numeric NULL
+  #
+  # Returns:
+  #   A target object representation
+  if (!is.null(attribute) && !is.null(value))
+    return(list(attribute=attribute, value=value))
+  else if (!is.null(attribute))
+    return(list(attribute=attribute))
+  else
+    return(NULL)
+}
+
+.GetParameters <- function(task, sg) {
+  target <- J(task, "getTarget")
+  if (J(target, "isBoolean")) {
+    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+    p <- J(J(sg, "getStatistics"), "getP")
+    p0 <- J(J(sg, "getStatistics"), "getP0")
+    return(list(p = p, p0 = p0, size = size))
+  } else if (J(target, "isNumeric")) {
+    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+    mean <- J(J(sg, "getStatistics"), "getSGMean")
+    popMean <- J(J(sg, "getStatistics"), "getPopulationMean")
+    return(list(mean = mean, populationMean = popMean, size = size))
+  } else {
+    stop("Unknown target")
+  }
+}
+
+.ConvertDescription <- function(sgDescription) {
+  # Internal function for converting a (Java) SGDescription consisting
+  # of a set of selection expressions into a character vector of strings
+  # representing these
+  # Args:
+  #   sgDescription: A (Java) SGDescription object
+  #
+  # Returns:
+  #   A character vector
+  sgSelectorArray <-
+      J("org.vikamine.kernel.subgroup.search.SDSimpleTask")$getSimpleDescription(sgDescription)
+  return(as.character(sgSelectorArray))
+}
+
+DiscoverSubgroupsByTask <- function(task, as.df=FALSE) {
+  # Internal function for setting up and performing subgroup discovery
+  # Args:
+  #   task: A subgroup discovery task
+  #
+  # Returns:
+  #   A list of subgroup patterns
+  sgSet <- J(task, "performSubgroupDiscovery")
+  sgList <- J(sgSet, "toSortedList", FALSE)
+  sgArray <- .jevalArray(J(sgList, "toArray"))
+  
+  patterns = list()
+  for (sg in sgArray) {
+    #description <- as.character(J(J(sg, "getSGDescription"), "getDescription"))
+    sgDescription <- J(sg, "getSGDescription")
+    description <- .ConvertDescription(sgDescription)
+    quality <- J(sg, "getQuality")
+    size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+    parameters = .GetParameters(task, sg)    
+    pattern <- new("Pattern", description=description, quality=quality, size=size, parameters=parameters)
+    patterns = append(patterns, pattern)
+  }
+  
+  if (as.df) {
+    dataFrameRules <- ToDataFrame(patterns)
+    return(dataFrameRules)
+  } else {
+    return(patterns)
+  }
+}
+
+DiscoverSubgroups <- function(source, target, config=new("SDTaskConfig"), as.df=FALSE) {
+  # Performs subgroup discovery according to target and config on data
+  #
+  # Args:
+  #   data: A dataframe
+  #   target: A target variable (constructed by as.target)
+  #   config: a SDTaskConfig configuration for the algorithm
+  #
+  # Returns:
+  #   A list of subgroup patterns
+  task <- CreateSDTask(source, target, config)
+  result <- DiscoverSubgroupsByTask(task, as.df)
+  return(result)
+}
+
+
+.FormatDoubleSignificantDigits <- function(double, ndigits=2) {
+  # Internal function: Prints double according to
+  if (is.numeric(ndigits)) {
+    sprintf(paste("%.", ndigits, "f", sep=""), double)
+  } else {
+    double
+  }
+}
+
+
+
+ToDataFrame <- function(patterns, ndigits=2) {
+  # Transforms a list/vector of patterns into a dataframe
+  #
+  # Args:
+  #   patterns: List of patterns
+  #   ndigits: Number of significant digits for floats
+  #
+  # Returns:
+  #   The dataframe containing the pattern information
+  isNumeric = FALSE  
+  descriptions <- list()
+  length(descriptions) = length(patterns)
+  qualities <-list()
+  length(qualities) = length(patterns)
+  sizes <- list()
+  length(sizes) <- length(patterns)
+  ps <- list()
+  
+  i = 1
+  for (pattern in patterns) {
+    descriptions[i] = paste(pattern at description, collapse=", ")
+    qualities[i] = .FormatDoubleSignificantDigits(pattern at quality, ndigits)
+    sizes[i] = pattern at size
+    if (!is.null(pattern at parameters$mean)) {
+      ps[i] = .FormatDoubleSignificantDigits(pattern at parameters$mean, ndigits)
+      isNumeric = TRUE
+    } else {
+      ps[i] = .FormatDoubleSignificantDigits(pattern at parameters$p, ndigits)
+      isNumeric = FALSE
+    }
+    i = i + 1
+  }
+  if (isNumeric) {
+    dataframe <- data.frame(
+        quality=as.vector(qualities, "numeric"),
+        mean=as.vector(ps, "numeric"), 
+        size=as.vector(sizes, "numeric"),
+        description=as.vector(descriptions, "character"))
+  } else {
+    dataframe <- data.frame(
+        quality=as.vector(qualities, "numeric"),
+        p=as.vector(ps, "numeric"), 
+        size=as.vector(sizes, "numeric"),
+        description=as.vector(descriptions, "character"))
+  }
+  return(dataframe)
+}
+
+.FreeMemory <- function(...) {
+  # Call the R garbage collection
+  # Then call Java garbage collection
+  gc(...)
+  .jcall("java/lang/System", method = "gc")
+  invisible()
+}
\ No newline at end of file

Deleted: pkg/subgroup/inst/java/subgroup.jar
===================================================================
(Binary files differ)

Copied: pkg/subgroup/inst/java/subgroup.jar (from rev 10, pkg/inst/java/subgroup.jar)
===================================================================
(Binary files differ)

Deleted: pkg/subgroup/man/DiscoverSubgroups.Rd
===================================================================
--- pkg/man/DiscoverSubgroups.Rd	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/man/DiscoverSubgroups.Rd	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,40 +0,0 @@
-\name{DiscoverSubgroups}
-\alias{DiscoverSubgroups}
-\title{Performs Subgroup Discovery}
-\description{
-Performs subgroup discovery according to the given target and
-the configuration on the data.
-}
-\usage{
-DiscoverSubgroups(source, target, config=new("SDTaskConfig"))
-}
-\arguments{
-\item{source}{a data.frame or the a character string giving
-the filename of an ARFF file to use.}
-\item{target}{the target variable (constructed by as.target)
-to consider for subgroup discovery.}
-\item{config}{an instance of SDTaskConfig providing various
-parameters for subgroup discovery.}
-}
-\seealso{
-\code{\link{DiscoverSubgroupsByTask}}.
-\code{\link{as.target}}
-\code{\link{CreateSDTask}}
-\code{\link{SDTaskConfig}}
-}
-\examples{
-# subgroup discovery on a data.frame, for binary target
-data(credit.data)
-result1 <- DiscoverSubgroups(credit.data, as.target("class", "good"))
-result2 <- DiscoverSubgroups(credit.data, as.target("class", "good"), new("SDTaskConfig", attributes=c("checking_status", "employment")))
-
-ToDataFrame(result1)
-ToDataFrame(result2)
-
-# subgroup discovery for numeric target variable
-result3 <- DiscoverSubgroups(credit.data, as.target("credit_amount"))
-
-ToDataFrame(result3)
-}
-\keyword{subgroup discovery}
-

Copied: pkg/subgroup/man/DiscoverSubgroups.Rd (from rev 16, pkg/man/DiscoverSubgroups.Rd)
===================================================================
--- pkg/subgroup/man/DiscoverSubgroups.Rd	                        (rev 0)
+++ pkg/subgroup/man/DiscoverSubgroups.Rd	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,44 @@
+\name{DiscoverSubgroups}
+\alias{DiscoverSubgroups}
+\title{Performs Subgroup Discovery}
+\description{
+Performs subgroup discovery according to the given target and
+the configuration on the data.
+}
+\usage{
+DiscoverSubgroups(source, target, config=new("SDTaskConfig"), as.df=FALSE)
+}
+\arguments{
+\item{source}{a data.frame or the a character string giving
+the filename of an ARFF file to use.}
+\item{target}{the target variable (constructed by as.target)
+to consider for subgroup discovery.}
+\item{config}{an instance of SDTaskConfig providing various
+parameters for subgroup discovery.}
+\item{as.df}{TRUE, if the result patterns should be returned as
+a data.frame using \code{\link{ToDataFrame}}}
+}
+\seealso{
+\code{\link{DiscoverSubgroupsByTask}}.
+\code{\link{as.target}}
+\code{\link{CreateSDTask}}
+\code{\link{SDTaskConfig}}
+}
+\examples{
+# subgroup discovery on a data.frame, for binary target
+data(credit.data)
+result1 <- DiscoverSubgroups(credit.data, as.target("class", "good"))
+result2 <- DiscoverSubgroups(
+    credit.data, as.target("class", "good"), new("SDTaskConfig",
+    attributes=c("checking_status", "employment")))
+
+ToDataFrame(result1)
+ToDataFrame(result2)
+
+# subgroup discovery for numeric target variable
+result3 <- DiscoverSubgroups(credit.data, as.target("credit_amount"))
+
+ToDataFrame(result3)
+}
+\keyword{subgroup discovery}
+

Deleted: pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
===================================================================
--- pkg/man/DiscoverSubgroupsByTask.Rd	2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/man/DiscoverSubgroupsByTask.Rd	2014-08-03 20:18:51 UTC (rev 18)
@@ -1,29 +0,0 @@
-\name{DiscoverSubgroupsByTask}
-\alias{DiscoverSubgroupsByTask}
-\title{Performs Subgroup Discovery for a given Task}
-\description{
-Performs subgroup discovery according to the given task.
-}
-\usage{
-DiscoverSubgroupsByTask(task)
-}
-\arguments{
-\item{task}{a subgroup discovery task constructed by
- CreateSDTask.}
-}
-\seealso{
-\code{\link{DiscoverSubgroups}}.
-\code{\link{CreateSDTask}}
-}
-\examples{
-# creating a task
-data(credit.data)
-task <- CreateSDTask(credit.data, as.target("class", "bad"))
-taskNum <- CreateSDTask(credit.data, as.target("credit_amount"))
-
-# running the tasks
-DiscoverSubgroupsByTask(task)
-DiscoverSubgroupsByTask(taskNum)
-}
-\keyword{subgroup task}
-

Copied: pkg/subgroup/man/DiscoverSubgroupsByTask.Rd (from rev 16, pkg/man/DiscoverSubgroupsByTask.Rd)
===================================================================
--- pkg/subgroup/man/DiscoverSubgroupsByTask.Rd	                        (rev 0)
+++ pkg/subgroup/man/DiscoverSubgroupsByTask.Rd	2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,31 @@
+\name{DiscoverSubgroupsByTask}
+\alias{DiscoverSubgroupsByTask}
+\title{Performs Subgroup Discovery for a given Task}
+\description{
+Performs subgroup discovery according to the given task.
+}
+\usage{
+DiscoverSubgroupsByTask(task, as.df=FALSE)
+}
+\arguments{
+\item{task}{a subgroup discovery task constructed by
+ CreateSDTask.}
+ \item{as.df}{TRUE, if the result patterns should be returned
+as a data.frame using \code{\link{ToDataFrame}}}
+}
+\seealso{
+\code{\link{DiscoverSubgroups}}.
+\code{\link{CreateSDTask}}
+}
+\examples{
+# creating a task
+data(credit.data)
+task <- CreateSDTask(credit.data, as.target("class", "bad"))
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/subgroup -r 18


More information about the Subgroup-commits mailing list