[Subgroup-commits] r18 - in pkg: . subgroup subgroup/R subgroup/inst/java subgroup/man subgroup/tests
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Sun Aug 3 22:18:52 CEST 2014
Author: atzmueller
Date: 2014-08-03 22:18:51 +0200 (Sun, 03 Aug 2014)
New Revision: 18
Added:
pkg/subgroup/
pkg/subgroup/DESCRIPTION
pkg/subgroup/NAMESPACE
pkg/subgroup/R/
pkg/subgroup/R/AAAonLoad.R
pkg/subgroup/R/classes.R
pkg/subgroup/R/subgroup.R
pkg/subgroup/data/
pkg/subgroup/inst/
pkg/subgroup/inst/java/subgroup.jar
pkg/subgroup/man/
pkg/subgroup/man/DiscoverSubgroups.Rd
pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
pkg/subgroup/man/Pattern-class.Rd
pkg/subgroup/man/credit.data.Rd
pkg/subgroup/man/subgroup-package.Rd
pkg/subgroup/tests/
pkg/subgroup/tests/test.R
Removed:
pkg/DESCRIPTION
pkg/NAMESPACE
pkg/data/
pkg/subgroup/R/AAAconfigurePackage.R
pkg/subgroup/R/AAAonLoad.R
pkg/subgroup/R/classes.R
pkg/subgroup/R/subgroup.R
pkg/subgroup/inst/java/subgroup.jar
pkg/subgroup/man/DiscoverSubgroups.Rd
pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
pkg/subgroup/man/subgroup-package.Rd
pkg/subgroup/tests/test.R
Log:
reorganize package in pkg/<subdirectory>
Deleted: pkg/DESCRIPTION
===================================================================
--- pkg/DESCRIPTION 2014-08-03 17:55:21 UTC (rev 17)
+++ pkg/DESCRIPTION 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,18 +0,0 @@
-Package: subgroup.discovery
-Type: Package
-Title: Subgroup Discovery
-Version: 0.4
-Date: 2014-08-01
-Author: Martin Atzmueller
-Maintainer: Martin Atzmueller <martin at atzmueller.net>
-Description: A collection of efficient and effective tools and
- algorithms for subgroup discovery and analytics. The package
- integrates an R interface to the org.vikamine.kernel library
- of the VIKAMINE system (http://www.vikamine.org) implementing
- subgroup discovery, pattern mining and analytics in Java.
-Classification/ACM: G.4, H.2.8, I.5.1
-License: GPL (>= 3)
-Depends: R (>= 2.6.0), methods, rJava (>= 0.6-3), foreign (>= 0.8-40)
-SystemRequirements: Java (>= 6.0)
-Collate: 'AAAonLoad.R' 'randomSeed.R' 'classes.R' 'subgroup.R'
-URL: http://www.subgroup.cc
\ No newline at end of file
Deleted: pkg/NAMESPACE
===================================================================
--- pkg/NAMESPACE 2014-08-03 17:55:21 UTC (rev 17)
+++ pkg/NAMESPACE 2014-08-03 20:18:51 UTC (rev 18)
@@ -1 +0,0 @@
-exportPattern("^[^\\.]")
\ No newline at end of file
Copied: pkg/subgroup/DESCRIPTION (from rev 17, pkg/DESCRIPTION)
===================================================================
--- pkg/subgroup/DESCRIPTION (rev 0)
+++ pkg/subgroup/DESCRIPTION 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,18 @@
+Package: subgroup.discovery
+Type: Package
+Title: Subgroup Discovery
+Version: 0.4
+Date: 2014-08-01
+Author: Martin Atzmueller
+Maintainer: Martin Atzmueller <martin at atzmueller.net>
+Description: A collection of efficient and effective tools and
+ algorithms for subgroup discovery and analytics. The package
+ integrates an R interface to the org.vikamine.kernel library
+ of the VIKAMINE system (http://www.vikamine.org) implementing
+ subgroup discovery, pattern mining and analytics in Java.
+Classification/ACM: G.4, H.2.8, I.5.1
+License: GPL (>= 3)
+Depends: R (>= 2.6.0), methods, rJava (>= 0.6-3), foreign (>= 0.8-40)
+SystemRequirements: Java (>= 6.0)
+Collate: 'AAAonLoad.R' 'randomSeed.R' 'classes.R' 'subgroup.R'
+URL: http://www.subgroup.cc
\ No newline at end of file
Copied: pkg/subgroup/NAMESPACE (from rev 15, pkg/NAMESPACE)
===================================================================
--- pkg/subgroup/NAMESPACE (rev 0)
+++ pkg/subgroup/NAMESPACE 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1 @@
+exportPattern("^[^\\.]")
\ No newline at end of file
Deleted: pkg/subgroup/R/AAAconfigurePackage.R
===================================================================
--- pkg/R/AAAconfigurePackage.R 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/AAAconfigurePackage.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,7 +0,0 @@
-###############################################################################
-#
-# Some configuration options
-#
-# Author: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-#options(java.parameters = "-Xmx1024m")
\ No newline at end of file
Deleted: pkg/subgroup/R/AAAonLoad.R
===================================================================
--- pkg/R/AAAonLoad.R 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/AAAonLoad.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,6 +0,0 @@
-require(rJava)
-
-.First.lib <- function(libname, pkgname) {
- .jpackage(pkgname)
- .jengine(TRUE)
-}
Copied: pkg/subgroup/R/AAAonLoad.R (from rev 17, pkg/R/AAAonLoad.R)
===================================================================
--- pkg/subgroup/R/AAAonLoad.R (rev 0)
+++ pkg/subgroup/R/AAAonLoad.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,28 @@
+###############################################################################
+# subgroup.discovery package R classes
+#
+# This file is part of the R subgroup.discovery package.
+# Copyright (C) 2011-2014 by Martin Atzmueller
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+# Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+require(rJava)
+
+.onLoad <- function(libname, pkgname) {
+ rJava::.jpackage(pkgname, lib.loc=libname)
+ #.jengine(TRUE)
+}
Deleted: pkg/subgroup/R/classes.R
===================================================================
--- pkg/R/classes.R 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/classes.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,57 +0,0 @@
-###############################################################################
-# Subgroup package R classes
-#
-# This file is part of the R subgroup package.
-# Copyright (C) 2011-2012 by Martin Atzmueller
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# Contact: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-
-setClass("SDOptions",
- representation(
- verbose = "logical",
- debug = "logical"
- ),
- prototype(verbose = FALSE, debug = FALSE)
-)
-
-
-setClass("SDTaskConfig",
- representation(
- qf = "character",
- method = "character",
- k = "numeric",
- minqual = "numeric",
- minsize = "numeric",
- maxlen = "numeric",
- nodefaults = "logical",
- relfilter = "logical",
- postfilter = "character",
- attributes = "character"
- ),
- prototype(qf="ps", method="sdmap", k = as.integer(20), minqual = as.integer(0), minsize = as.integer(0),
- maxlen = as.integer(7), nodefaults = FALSE, relfilter = FALSE, postfilter = "", attributes = character(0))
-)
-
-
-setClass("Pattern",
- representation(
- description="character",
- quality="numeric",
- size="numeric",
- parameters="list"
- )
-)
\ No newline at end of file
Copied: pkg/subgroup/R/classes.R (from rev 17, pkg/R/classes.R)
===================================================================
--- pkg/subgroup/R/classes.R (rev 0)
+++ pkg/subgroup/R/classes.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,49 @@
+###############################################################################
+# subgroup.discovery package R classes
+#
+# This file is part of the R subgroup.discovery package.
+# Copyright (C) 2011-2014 by Martin Atzmueller
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+# Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+
+setClass("SDTaskConfig",
+ representation(
+ qf = "character",
+ method = "character",
+ k = "numeric",
+ minqual = "numeric",
+ minsize = "numeric",
+ maxlen = "numeric",
+ nodefaults = "logical",
+ relfilter = "logical",
+ postfilter = "character",
+ attributes = "character"
+ ),
+ prototype(qf="ps", method="sdmap", k = as.integer(20), minqual = as.integer(0), minsize = as.integer(0),
+ maxlen = as.integer(7), nodefaults = FALSE, relfilter = FALSE, postfilter = "", attributes = character(0))
+)
+
+
+setClass("Pattern",
+ representation(
+ description="character",
+ quality="numeric",
+ size="numeric",
+ parameters="list"
+ )
+)
\ No newline at end of file
Deleted: pkg/subgroup/R/subgroup.R
===================================================================
--- pkg/R/subgroup.R 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/R/subgroup.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,250 +0,0 @@
-###############################################################################
-# Subgroup package R core support code
-#
-# This file is part of the R subgroup package.
-# Copyright (C) 2011-2012 by Martin Atzmueller
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# Contact: Martin Atzmueller (martin at atzmueller.net)
-###############################################################################
-
-library(foreign)
-library(rJava)
-
-setGeneric("CreateARFFProvider",
- function(source, name, ...) {
- standardGeneric("CreateARFFProvider")
- }
-)
-
-setMethod("CreateARFFProvider", signature(source = "data.frame", name = "character"),
- function(source, name, ...) {
- # Creates a dataset provider (converting the dataframe)
- con <- textConnection("arffVector", "w")
- write.arff(source, con)
- flush(con)
- close(con)
- arff <- paste(arffVector, "", collapse="\n")
- provider <- .jnew("org/vikamine/kernel/xpdl/ARFFAsStringDatasetProvider", arff, name)
- return(provider)
- }
-)
-
-setMethod("CreateARFFProvider", signature(source = "character", name = "character"),
- function(source, name, ...) {
- # Creates a dataset provider given a file name
- provider <- .jnew("org/vikamine/kernel/xpdl/FileDatasetProvider", source)
- return(provider)
- }
-)
-
-CreateOntologyForData <- function(provider, dataset) {
- # Creates the ontology object for the respective dataset
- ontology <- J(provider, "getDataset", dataset)
- return(ontology)
-}
-
-CreateSimpleSDTask <- function(ontology, target) {
- # Creates a simple subgroup discovery task
- simpleTask <- new(J("org/vikamine/kernel/subgroup/search/SDSimpleTask"), ontology)
- if (!is.null(target$value)) {
- selector <- new(J("org/vikamine/kernel/subgroup/selectors/DefaultSGSelector"), ontology, target$attribute, target$value)
- target <- new(J("org/vikamine/kernel/subgroup/target/SelectorTarget"), selector)
- } else {
- attribute <- J(ontology, "getAttribute", target$attribute)
- target <- new(J("org/vikamine/kernel/subgroup/target/NumericTarget"), attribute)
- }
- J(simpleTask, "setTarget", target)
- return(simpleTask)
-}
-
-CreateSDTask <- function(source, target, config = new("SDTaskConfig")) {
- # Creates a subgroup discovery task
- #
- # Args:
- # source: A data source, i.e., dataframe or file (name)
- # target: The target variable
- # config: A SDTaskConfig
- #
- # Returns:
- # A subgroup discovery task
- provider <- CreateARFFProvider(source, "data")
- ontology <- CreateOntologyForData(provider, "data")
- task <- CreateSimpleSDTask(ontology, target)
- J(task, "setQualityFunction", config at qf)
- J(task, "setSDMethod", config at method)
- J(task, "setMaxSGCount", as.integer(config at k))
- J(task, "setMinQualityLimit", as.double(config at minqual))
- J(task, "setMinSubgroupSize", as.double(config at minsize))
- J(task, "setMaxSGDSize", as.integer(config at maxlen))
- J(task, "setSuppressStrictlyIrrelevantSubgroups", config at relfilter)
- J(task, "setIgnoreDefaultValues", config at nodefaults)
- if (config at postfilter != "") {
- J(task, "setPostFilter", config at postfilter)
- }
- if ((!is.null(config at attributes)) && (length(config at attributes) > 0)) {
- J(task, "setAttributes", .jarray(config at attributes))
- }
- return(task)
-}
-
-as.target <- function(attribute=NULL, value=NULL) {
- # Creates a target variable object given attribute and value (for nominals)
- #
- # Args:
- # attribute: The respective attribute
- # value: For nominals, the respective value; for numeric NULL
- #
- # Returns:
- # A target object representation
- if (!is.null(attribute) && !is.null(value))
- return(list(attribute=attribute, value=value))
- else if (!is.null(attribute))
- return(list(attribute=attribute))
- else
- return(NULL)
-}
-
-GetParameters <- function(task, sg) {
- target <- J(task, "getTarget")
- if (J(target, "isBoolean")) {
- size <- J(J(sg, "getStatistics"), "getSubgroupSize")
- p <- J(J(sg, "getStatistics"), "getP")
- p0 <- J(J(sg, "getStatistics"), "getP0")
- return(list(p = p, p0 = p0, size = size))
- } else if (J(target, "isNumeric")) {
- size <- J(J(sg, "getStatistics"), "getSubgroupSize")
- mean <- J(J(sg, "getStatistics"), "getSGMean")
- popMean <- J(J(sg, "getStatistics"), "getPopulationMean")
- return(list(mean = mean, populationMean = popMean, size = size))
- } else {
- stop("Unknown target")
- }
-}
-
-ConvertDescription <- function(sgDescription) {
- # Internal function for converting a (Java) SGDescription consisting
- # of a set of selection expressions into a character vector of strings
- # representing these
- # Args:
- # sgDescription: A (Java) SGDescription object
- #
- # Returns:
- # A character vector
- sgSelectorArray <-
- J("org.vikamine.kernel.subgroup.search.SDSimpleTask")$getSimpleDescription(sgDescription)
- return(as.character(sgSelectorArray))
-}
-
-DiscoverSubgroupsByTask <- function(task) {
- # Internal function for setting up and performing subgroup discovery
- # Args:
- # task: A subgroup discovery task
- #
- # Returns:
- # A list of subgroup patterns
- sgSet <- J(task, "performSubgroupDiscovery")
- sgList <- J(sgSet, "toSortedList", FALSE)
- sgArray <- .jevalArray(J(sgList, "toArray"))
-
- patterns = list()
- for (sg in sgArray) {
- #description <- as.character(J(J(sg, "getSGDescription"), "getDescription"))
- sgDescription <- J(sg, "getSGDescription")
- description <- ConvertDescription(sgDescription)
- quality <- J(sg, "getQuality")
- size <- J(J(sg, "getStatistics"), "getSubgroupSize")
- parameters = GetParameters(task, sg)
- pattern <- new("Pattern", description=description, quality=quality, size=size, parameters=parameters)
- patterns = append(patterns, pattern)
- }
-
- return(patterns)
-}
-
-DiscoverSubgroups <- function(source, target, config=new("SDTaskConfig")) {
- # Performs subgroup discovery according to target and config on data
- #
- # Args:
- # data: A dataframe
- # target: A target variable (constructed by as.target)
- # config: a SDTaskConfig configuration for the algorithm
- #
- # Returns:
- # A list of subgroup patterns
- task <- CreateSDTask(source, target, config)
- result <- DiscoverSubgroupsByTask(task)
- return(result)
-}
-
-
-FormatDoubleSignificantDigits <- function(double, ndigits=2) {
- # Internal function: Prints double according to
- if (is.numeric(ndigits)) {
- sprintf(paste("%.", ndigits, "f", sep=""), double)
- } else {
- double
- }
-}
-
-
-
-ToDataFrame <- function(patterns, ndigits=2) {
- # Transforms a list/vector of patterns into a dataframe
- #
- # Args:
- # patterns: List of patterns
- # ndigits: Number of significant digits for floats
- #
- # Returns:
- # The dataframe containing the pattern information
- isNumeric = FALSE
- descriptions <- list()
- length(descriptions) = length(patterns)
- qualities <-list()
- length(qualities) = length(patterns)
- sizes <- list()
- length(sizes) <- length(patterns)
- ps <- list()
-
- i = 1
- for (pattern in patterns) {
- descriptions[i] = paste(pattern at description, collapse=", ")
- qualities[i] = FormatDoubleSignificantDigits(pattern at quality, ndigits)
- sizes[i] = pattern at size
- if (!is.null(pattern at parameters$mean)) {
- ps[i] = FormatDoubleSignificantDigits(pattern at parameters$mean, ndigits)
- isNumeric = TRUE
- } else {
- ps[i] = FormatDoubleSignificantDigits(pattern at parameters$p, ndigits)
- isNumeric = FALSE
- }
- i = i + 1
- }
- if (isNumeric) {
- dataframe <- data.frame(
- quality=as.vector(qualities, "numeric"),
- mean=as.vector(ps, "numeric"),
- size=as.vector(sizes, "numeric"),
- description=as.vector(descriptions, "character"))
- } else {
- dataframe <- data.frame(
- quality=as.vector(qualities, "numeric"),
- p=as.vector(ps, "numeric"),
- size=as.vector(sizes, "numeric"),
- description=as.vector(descriptions, "character"))
- }
- return(dataframe)
-}
\ No newline at end of file
Copied: pkg/subgroup/R/subgroup.R (from rev 17, pkg/R/subgroup.R)
===================================================================
--- pkg/subgroup/R/subgroup.R (rev 0)
+++ pkg/subgroup/R/subgroup.R 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,266 @@
+###############################################################################
+# subgroup.discovery package R classes
+#
+# This file is part of the R subgroup.discovery package.
+# Copyright (C) 2011-2014 by Martin Atzmueller
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+# Contact: Martin Atzmueller (martin at atzmueller.net)
+###############################################################################
+
+library(foreign)
+library(rJava)
+
+setGeneric(".CreateARFFProvider",
+ function(source, name, ...) {
+ standardGeneric(".CreateARFFProvider")
+ }
+)
+
+setMethod(".CreateARFFProvider", signature(source = "data.frame", name = "character"),
+ function(source, name, ...) {
+ # Creates a dataset provider (converting the dataframe)
+ con <- textConnection("arffVector", "w")
+ write.arff(source, con)
+ flush(con)
+ close(con)
+ rm(con)
+ arff <- paste(arffVector, "", collapse="\n")
+ provider <- .jnew("org/vikamine/kernel/xpdl/ARFFAsStringDatasetProvider", arff, name)
+ return(provider)
+ }
+)
+
+setMethod(".CreateARFFProvider", signature(source = "character", name = "character"),
+ function(source, name, ...) {
+ # Creates a dataset provider given a file name
+ provider <- .jnew("org/vikamine/kernel/xpdl/FileDatasetProvider", source)
+ return(provider)
+ }
+)
+
+.CreateOntologyForData <- function(provider, dataset) {
+ # Creates the ontology object for the respective dataset
+ ontology <- .jcall(provider, "Lorg/vikamine/kernel/data/Ontology;","getDataset", dataset)
+ return(ontology)
+}
+
+.CreateSimpleSDTask <- function(ontology, target) {
+ # Creates a simple subgroup discovery task
+ .FreeMemory()
+ simpleTask <- new(J("org/vikamine/kernel/subgroup/search/SDSimpleTask"), ontology)
+ if (!is.null(target$value)) {
+ selector <- new(J("org/vikamine/kernel/subgroup/selectors/DefaultSGSelector"), ontology, target$attribute, target$value)
+ target <- new(J("org/vikamine/kernel/subgroup/target/SelectorTarget"), selector)
+ } else {
+ attribute <- J(ontology, "getAttribute", target$attribute)
+ target <- new(J("org/vikamine/kernel/subgroup/target/NumericTarget"), attribute)
+ }
+ J(simpleTask, "setTarget", target)
+ return(simpleTask)
+}
+
+CreateSDTask <- function(source, target, config = new("SDTaskConfig")) {
+ # Creates a subgroup discovery task
+ #
+ # Args:
+ # source: A data source, i.e., dataframe or file (name)
+ # target: The target variable
+ # config: A SDTaskConfig
+ #
+ # Returns:
+ # A subgroup discovery task
+ .FreeMemory()
+ provider <- .CreateARFFProvider(source, "data")
+ ontology <- .CreateOntologyForData(provider, "data")
+ task <- .CreateSimpleSDTask(ontology, target)
+ J(task, "setQualityFunction", config at qf)
+ J(task, "setSDMethod", config at method)
+ J(task, "setMaxSGCount", as.integer(config at k))
+ J(task, "setMinQualityLimit", as.double(config at minqual))
+ J(task, "setMinSubgroupSize", as.double(config at minsize))
+ J(task, "setMaxSGDSize", as.integer(config at maxlen))
+ J(task, "setSuppressStrictlyIrrelevantSubgroups", config at relfilter)
+ J(task, "setIgnoreDefaultValues", config at nodefaults)
+ if (config at postfilter != "") {
+ J(task, "setPostFilter", config at postfilter)
+ }
+ if ((!is.null(config at attributes)) && (length(config at attributes) > 0)) {
+ J(task, "setAttributes", .jarray(config at attributes))
+ }
+ return(task)
+}
+
+as.target <- function(attribute=NULL, value=NULL) {
+ # Creates a target variable object given attribute and value (for nominals)
+ #
+ # Args:
+ # attribute: The respective attribute
+ # value: For nominals, the respective value; for numeric NULL
+ #
+ # Returns:
+ # A target object representation
+ if (!is.null(attribute) && !is.null(value))
+ return(list(attribute=attribute, value=value))
+ else if (!is.null(attribute))
+ return(list(attribute=attribute))
+ else
+ return(NULL)
+}
+
+.GetParameters <- function(task, sg) {
+ target <- J(task, "getTarget")
+ if (J(target, "isBoolean")) {
+ size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+ p <- J(J(sg, "getStatistics"), "getP")
+ p0 <- J(J(sg, "getStatistics"), "getP0")
+ return(list(p = p, p0 = p0, size = size))
+ } else if (J(target, "isNumeric")) {
+ size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+ mean <- J(J(sg, "getStatistics"), "getSGMean")
+ popMean <- J(J(sg, "getStatistics"), "getPopulationMean")
+ return(list(mean = mean, populationMean = popMean, size = size))
+ } else {
+ stop("Unknown target")
+ }
+}
+
+.ConvertDescription <- function(sgDescription) {
+ # Internal function for converting a (Java) SGDescription consisting
+ # of a set of selection expressions into a character vector of strings
+ # representing these
+ # Args:
+ # sgDescription: A (Java) SGDescription object
+ #
+ # Returns:
+ # A character vector
+ sgSelectorArray <-
+ J("org.vikamine.kernel.subgroup.search.SDSimpleTask")$getSimpleDescription(sgDescription)
+ return(as.character(sgSelectorArray))
+}
+
+DiscoverSubgroupsByTask <- function(task, as.df=FALSE) {
+ # Internal function for setting up and performing subgroup discovery
+ # Args:
+ # task: A subgroup discovery task
+ #
+ # Returns:
+ # A list of subgroup patterns
+ sgSet <- J(task, "performSubgroupDiscovery")
+ sgList <- J(sgSet, "toSortedList", FALSE)
+ sgArray <- .jevalArray(J(sgList, "toArray"))
+
+ patterns = list()
+ for (sg in sgArray) {
+ #description <- as.character(J(J(sg, "getSGDescription"), "getDescription"))
+ sgDescription <- J(sg, "getSGDescription")
+ description <- .ConvertDescription(sgDescription)
+ quality <- J(sg, "getQuality")
+ size <- J(J(sg, "getStatistics"), "getSubgroupSize")
+ parameters = .GetParameters(task, sg)
+ pattern <- new("Pattern", description=description, quality=quality, size=size, parameters=parameters)
+ patterns = append(patterns, pattern)
+ }
+
+ if (as.df) {
+ dataFrameRules <- ToDataFrame(patterns)
+ return(dataFrameRules)
+ } else {
+ return(patterns)
+ }
+}
+
+DiscoverSubgroups <- function(source, target, config=new("SDTaskConfig"), as.df=FALSE) {
+ # Performs subgroup discovery according to target and config on data
+ #
+ # Args:
+ # data: A dataframe
+ # target: A target variable (constructed by as.target)
+ # config: a SDTaskConfig configuration for the algorithm
+ #
+ # Returns:
+ # A list of subgroup patterns
+ task <- CreateSDTask(source, target, config)
+ result <- DiscoverSubgroupsByTask(task, as.df)
+ return(result)
+}
+
+
+.FormatDoubleSignificantDigits <- function(double, ndigits=2) {
+ # Internal function: Prints double according to
+ if (is.numeric(ndigits)) {
+ sprintf(paste("%.", ndigits, "f", sep=""), double)
+ } else {
+ double
+ }
+}
+
+
+
+ToDataFrame <- function(patterns, ndigits=2) {
+ # Transforms a list/vector of patterns into a dataframe
+ #
+ # Args:
+ # patterns: List of patterns
+ # ndigits: Number of significant digits for floats
+ #
+ # Returns:
+ # The dataframe containing the pattern information
+ isNumeric = FALSE
+ descriptions <- list()
+ length(descriptions) = length(patterns)
+ qualities <-list()
+ length(qualities) = length(patterns)
+ sizes <- list()
+ length(sizes) <- length(patterns)
+ ps <- list()
+
+ i = 1
+ for (pattern in patterns) {
+ descriptions[i] = paste(pattern at description, collapse=", ")
+ qualities[i] = .FormatDoubleSignificantDigits(pattern at quality, ndigits)
+ sizes[i] = pattern at size
+ if (!is.null(pattern at parameters$mean)) {
+ ps[i] = .FormatDoubleSignificantDigits(pattern at parameters$mean, ndigits)
+ isNumeric = TRUE
+ } else {
+ ps[i] = .FormatDoubleSignificantDigits(pattern at parameters$p, ndigits)
+ isNumeric = FALSE
+ }
+ i = i + 1
+ }
+ if (isNumeric) {
+ dataframe <- data.frame(
+ quality=as.vector(qualities, "numeric"),
+ mean=as.vector(ps, "numeric"),
+ size=as.vector(sizes, "numeric"),
+ description=as.vector(descriptions, "character"))
+ } else {
+ dataframe <- data.frame(
+ quality=as.vector(qualities, "numeric"),
+ p=as.vector(ps, "numeric"),
+ size=as.vector(sizes, "numeric"),
+ description=as.vector(descriptions, "character"))
+ }
+ return(dataframe)
+}
+
+.FreeMemory <- function(...) {
+ # Call the R garbage collection
+ # Then call Java garbage collection
+ gc(...)
+ .jcall("java/lang/System", method = "gc")
+ invisible()
+}
\ No newline at end of file
Deleted: pkg/subgroup/inst/java/subgroup.jar
===================================================================
(Binary files differ)
Copied: pkg/subgroup/inst/java/subgroup.jar (from rev 10, pkg/inst/java/subgroup.jar)
===================================================================
(Binary files differ)
Deleted: pkg/subgroup/man/DiscoverSubgroups.Rd
===================================================================
--- pkg/man/DiscoverSubgroups.Rd 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/man/DiscoverSubgroups.Rd 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,40 +0,0 @@
-\name{DiscoverSubgroups}
-\alias{DiscoverSubgroups}
-\title{Performs Subgroup Discovery}
-\description{
-Performs subgroup discovery according to the given target and
-the configuration on the data.
-}
-\usage{
-DiscoverSubgroups(source, target, config=new("SDTaskConfig"))
-}
-\arguments{
-\item{source}{a data.frame or the a character string giving
-the filename of an ARFF file to use.}
-\item{target}{the target variable (constructed by as.target)
-to consider for subgroup discovery.}
-\item{config}{an instance of SDTaskConfig providing various
-parameters for subgroup discovery.}
-}
-\seealso{
-\code{\link{DiscoverSubgroupsByTask}}.
-\code{\link{as.target}}
-\code{\link{CreateSDTask}}
-\code{\link{SDTaskConfig}}
-}
-\examples{
-# subgroup discovery on a data.frame, for binary target
-data(credit.data)
-result1 <- DiscoverSubgroups(credit.data, as.target("class", "good"))
-result2 <- DiscoverSubgroups(credit.data, as.target("class", "good"), new("SDTaskConfig", attributes=c("checking_status", "employment")))
-
-ToDataFrame(result1)
-ToDataFrame(result2)
-
-# subgroup discovery for numeric target variable
-result3 <- DiscoverSubgroups(credit.data, as.target("credit_amount"))
-
-ToDataFrame(result3)
-}
-\keyword{subgroup discovery}
-
Copied: pkg/subgroup/man/DiscoverSubgroups.Rd (from rev 16, pkg/man/DiscoverSubgroups.Rd)
===================================================================
--- pkg/subgroup/man/DiscoverSubgroups.Rd (rev 0)
+++ pkg/subgroup/man/DiscoverSubgroups.Rd 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,44 @@
+\name{DiscoverSubgroups}
+\alias{DiscoverSubgroups}
+\title{Performs Subgroup Discovery}
+\description{
+Performs subgroup discovery according to the given target and
+the configuration on the data.
+}
+\usage{
+DiscoverSubgroups(source, target, config=new("SDTaskConfig"), as.df=FALSE)
+}
+\arguments{
+\item{source}{a data.frame or the a character string giving
+the filename of an ARFF file to use.}
+\item{target}{the target variable (constructed by as.target)
+to consider for subgroup discovery.}
+\item{config}{an instance of SDTaskConfig providing various
+parameters for subgroup discovery.}
+\item{as.df}{TRUE, if the result patterns should be returned as
+a data.frame using \code{\link{ToDataFrame}}}
+}
+\seealso{
+\code{\link{DiscoverSubgroupsByTask}}.
+\code{\link{as.target}}
+\code{\link{CreateSDTask}}
+\code{\link{SDTaskConfig}}
+}
+\examples{
+# subgroup discovery on a data.frame, for binary target
+data(credit.data)
+result1 <- DiscoverSubgroups(credit.data, as.target("class", "good"))
+result2 <- DiscoverSubgroups(
+ credit.data, as.target("class", "good"), new("SDTaskConfig",
+ attributes=c("checking_status", "employment")))
+
+ToDataFrame(result1)
+ToDataFrame(result2)
+
+# subgroup discovery for numeric target variable
+result3 <- DiscoverSubgroups(credit.data, as.target("credit_amount"))
+
+ToDataFrame(result3)
+}
+\keyword{subgroup discovery}
+
Deleted: pkg/subgroup/man/DiscoverSubgroupsByTask.Rd
===================================================================
--- pkg/man/DiscoverSubgroupsByTask.Rd 2012-05-27 13:18:02 UTC (rev 7)
+++ pkg/subgroup/man/DiscoverSubgroupsByTask.Rd 2014-08-03 20:18:51 UTC (rev 18)
@@ -1,29 +0,0 @@
-\name{DiscoverSubgroupsByTask}
-\alias{DiscoverSubgroupsByTask}
-\title{Performs Subgroup Discovery for a given Task}
-\description{
-Performs subgroup discovery according to the given task.
-}
-\usage{
-DiscoverSubgroupsByTask(task)
-}
-\arguments{
-\item{task}{a subgroup discovery task constructed by
- CreateSDTask.}
-}
-\seealso{
-\code{\link{DiscoverSubgroups}}.
-\code{\link{CreateSDTask}}
-}
-\examples{
-# creating a task
-data(credit.data)
-task <- CreateSDTask(credit.data, as.target("class", "bad"))
-taskNum <- CreateSDTask(credit.data, as.target("credit_amount"))
-
-# running the tasks
-DiscoverSubgroupsByTask(task)
-DiscoverSubgroupsByTask(taskNum)
-}
-\keyword{subgroup task}
-
Copied: pkg/subgroup/man/DiscoverSubgroupsByTask.Rd (from rev 16, pkg/man/DiscoverSubgroupsByTask.Rd)
===================================================================
--- pkg/subgroup/man/DiscoverSubgroupsByTask.Rd (rev 0)
+++ pkg/subgroup/man/DiscoverSubgroupsByTask.Rd 2014-08-03 20:18:51 UTC (rev 18)
@@ -0,0 +1,31 @@
+\name{DiscoverSubgroupsByTask}
+\alias{DiscoverSubgroupsByTask}
+\title{Performs Subgroup Discovery for a given Task}
+\description{
+Performs subgroup discovery according to the given task.
+}
+\usage{
+DiscoverSubgroupsByTask(task, as.df=FALSE)
+}
+\arguments{
+\item{task}{a subgroup discovery task constructed by
+ CreateSDTask.}
+ \item{as.df}{TRUE, if the result patterns should be returned
+as a data.frame using \code{\link{ToDataFrame}}}
+}
+\seealso{
+\code{\link{DiscoverSubgroups}}.
+\code{\link{CreateSDTask}}
+}
+\examples{
+# creating a task
+data(credit.data)
+task <- CreateSDTask(credit.data, as.target("class", "bad"))
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/subgroup -r 18
More information about the Subgroup-commits
mailing list