Create a FeatureObject, which will be used as input for all the feature computations.

createFeatureObject(init, X, y, fun, minimize, lower, upper, blocks, objective,
  force = FALSE)

Arguments

init

[data.frame] A data.frame that serves as the initial design. If not provided, it is created either from the initial sample X and the objective values y, or from X and the function definition fun.

X

[data.frame or matrix] A data.frame or matrix containing the initial sample. If not provided, it will be extracted from init.

y

[numeric or integer] A vector containing the objective values of the initial design. If not provided, it will be extracted from init.

fun

[function] A function that computes the objective values. If it is not provided, features that require additional function evaluations cannot be computed.

minimize

[logical(1)] Should the objective function be minimized? The default is TRUE.

lower

[numeric or integer] The lower limits per dimension.

upper

[numeric or integer] The upper limits per dimension.

blocks

[integer] The number of blocks per dimension.

objective

[character(1)] The name of the feature that contains the objective values. The default is "y".

force

[logical(1)] Only change this parameter IF YOU KNOW WHAT YOU ARE DOING! By default (force = FALSE), the function checks whether the total number of cells that you are trying to generate is below the (hard-coded) internal maximum of 25,000 cells. Setting this parameter to TRUE confirms that you want to exceed that internal limit. Note: Exploratory Landscape Analysis (ELA) is only useful when you are limited to a small budget (i.e., a small number of function evaluations), and in such scenarios the number of cells should also be kept low!
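The arithmetic behind blocks and the cell limit can be sketched in base R. This is only an illustration, not the package's internal code: the helper names count_cells and cell_sizes are hypothetical, the limit of 25,000 is taken from the description of force above, and the behavior when the count equals the limit exactly is not covered here.

```r
# Hypothetical helpers (not part of the package) that mirror the arithmetic
# described for `blocks` and `force`.
max_cells = 25000  # internal maximum as stated in the description of `force`

count_cells = function(blocks, dim = length(blocks)) {
  # assumption: a single value applies to every dimension
  blocks = rep_len(blocks, dim)
  prod(blocks)
}

cell_sizes = function(lower, upper, blocks) {
  dim = length(blocks)
  # width of one cell per dimension: box range divided by number of blocks
  (rep_len(upper, dim) - rep_len(lower, dim)) / blocks
}

blocks = c(5, 10, 4)
total = count_cells(blocks)            # 5 * 10 * 4 = 200
sizes = cell_sizes(-10, 10, blocks)    # 4, 2, 5
exceeds_limit = total > max_cells      # FALSE

# 30 blocks in each of 4 dimensions would give 30^4 = 810,000 cells
too_many = count_cells(30, dim = 4) > max_cells  # TRUE
```

With blocks = c(5, 10, 4) on the box [-10, 10]^3, this gives 200 cells of size 4, 2 and 5 per dimension, well below the limit, so the default force = FALSE is sufficient.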

Value

[FeatureObject].

Examples

# (1a) create a feature object using X and y:
X = createInitialSample(n.obs = 500, dim = 3,
  control = list(init_sample.lower = -10, init_sample.upper = 10))
y = apply(X, 1, function(x) sum(x^2))
feat.object1 = createFeatureObject(X = X, y = y,
  lower = -10, upper = 10, blocks = c(5, 10, 4))

# (1b) create a feature object using X and fun:
feat.object2 = createFeatureObject(X = X,
  fun = function(x) sum(sin(x) * x^2),
  lower = -10, upper = 10, blocks = c(5, 10, 4))

# (1c) create a feature object using a data.frame:
feat.object3 = createFeatureObject(iris[,-5], blocks = 5,
  objective = "Petal.Length")

# (2) have a look at the feature objects:
feat.object1
#> Feature Object:
#> - Number of Observations: 500
#> - Number of Variables: 3
#> - Lower Boundaries: -1.00e+01, -1.00e+01, -1.00e+01
#> - Upper Boundaries: 1.00e+01, 1.00e+01, 1.00e+01
#> - Name of Variables: x1, x2, x3
#> - Optimization Problem: minimize y
#> - Number of Cells per Dimension: 5, 10, 4
#> - Size of Cells per Dimension: 4.00, 2.00, 5.00
#> - Number of Cells:
#>   - total: 200
#>   - non-empty: 182 (91.00%)
#>   - empty: 18 (9.00%)
#> - Average Number of Observations per Cell:
#>   - total: 2.50
#>   - non-empty: 2.75
feat.object2
#> Feature Object:
#> - Number of Observations: 500
#> - Number of Variables: 3
#> - Lower Boundaries: -1.00e+01, -1.00e+01, -1.00e+01
#> - Upper Boundaries: 1.00e+01, 1.00e+01, 1.00e+01
#> - Name of Variables: x1, x2, x3
#> - Optimization Problem: minimize y
#> - Function to be Optimized: function (x) sum(sin(x) * x^2)
#> - Number of Cells per Dimension: 5, 10, 4
#> - Size of Cells per Dimension: 4.00, 2.00, 5.00
#> - Number of Cells:
#>   - total: 200
#>   - non-empty: 182 (91.00%)
#>   - empty: 18 (9.00%)
#> - Average Number of Observations per Cell:
#>   - total: 2.50
#>   - non-empty: 2.75
feat.object3
#> Feature Object:
#> - Number of Observations: 150
#> - Number of Variables: 3
#> - Lower Boundaries: 4.30e+00, 2.00e+00, 1.00e-01
#> - Upper Boundaries: 7.90e+00, 4.40e+00, 2.50e+00
#> - Name of Variables: Sepal.Length, Sepal.Width, Petal.Width
#> - Optimization Problem: minimize Petal.Length
#> - Number of Cells per Dimension: 5, 5, 5
#> - Size of Cells per Dimension: 0.72, 0.48, 0.48
#> - Number of Cells:
#>   - total: 125
#>   - non-empty: 37 (29.60%)
#>   - empty: 88 (70.40%)
#> - Average Number of Observations per Cell:
#>   - total: 1.20
#>   - non-empty: 4.05
# (3) now, one could calculate features
calculateFeatureSet(feat.object1, "ela_meta")
#> $ela_meta.lin_simple.adj_r2
#> [1] -0.005733815
#>
#> $ela_meta.lin_simple.intercept
#> [1] 96.88311
#>
#> $ela_meta.lin_simple.coef.min
#> [1] 0.1943708
#>
#> $ela_meta.lin_simple.coef.max
#> [1] 0.3058714
#>
#> $ela_meta.lin_simple.coef.max_by_min
#> [1] 1.573649
#>
#> $ela_meta.lin_w_interact.adj_r2
#> [1] 0.006650439
#>
#> $ela_meta.quad_simple.adj_r2
#> [1] 1
#>
#> $ela_meta.quad_simple.cond
#> [1] 1
#>
#> $ela_meta.quad_w_interact.adj_r2
#> [1] 1
#>
#> $ela_meta.costs_fun_evals
#> [1] 0
#>
#> $ela_meta.costs_runtime
#> [1] 0.017
#>
calculateFeatureSet(feat.object2, "cm_grad")
#> $cm_grad.mean
#> [1] 0.655673
#>
#> $cm_grad.sd
#> [1] 0.1791901
#>
#> $cm_grad.costs_fun_evals
#> [1] 0
#>
#> $cm_grad.costs_runtime
#> [1] 0.056
#>
library(plyr)
calculateFeatureSet(feat.object3, "cm_angle",
  control = list(cm_angle.show_warnings = FALSE))
#> $cm_angle.dist_ctr2best.mean
#> [1] 1.886064
#>
#> $cm_angle.dist_ctr2best.sd
#> [1] 0.7072052
#>
#> $cm_angle.dist_ctr2worst.mean
#> [1] 1.909764
#>
#> $cm_angle.dist_ctr2worst.sd
#> [1] 0.6994935
#>
#> $cm_angle.angle.mean
#> [1] 6.260766
#>
#> $cm_angle.angle.sd
#> [1] 7.44275
#>
#> $cm_angle.y_ratio_best2worst.mean
#> [1] 0.0650481
#>
#> $cm_angle.y_ratio_best2worst.sd
#> [1] 0.07084862
#>
#> $cm_angle.costs_fun_evals
#> [1] 0
#>
#> $cm_angle.costs_runtime
#> [1] 0.147
#>
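
The cell statistics printed for the feature objects in (2) follow from simple arithmetic on the observation and cell counts. The base R sketch below recomputes them from numbers copied out of the printed output; summarize_cells is a hypothetical helper for illustration and not a function of the package.

```r
# Hypothetical helper: recompute the printed cell statistics from raw counts.
summarize_cells = function(n_obs, n_cells, n_nonempty) {
  c(
    pct_nonempty = 100 * n_nonempty / n_cells,
    pct_empty    = 100 * (n_cells - n_nonempty) / n_cells,
    avg_total    = n_obs / n_cells,
    avg_nonempty = n_obs / n_nonempty
  )
}

# feat.object1: 500 observations, 200 cells, 182 of them non-empty
s1 = round(summarize_cells(500, 200, 182), 2)

# feat.object3: 150 observations, 125 cells, 37 of them non-empty
s3 = round(summarize_cells(150, 125, 37), 2)

print(s1)
print(s3)
```

For feat.object1 this reproduces 91% non-empty cells, 9% empty cells, and averages of 2.50 and 2.75 observations per cell; for feat.object3 it reproduces 29.6%, 70.4%, 1.20 and 4.05.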