Input Features as Output Objectives

This notebook demonstrates how to place objectives on input features or on combinations of input features. Possible use cases are favoring lower or higher amounts of an ingredient, or taking a known (linear) cost function into account. For categorical inputs, it can be used to penalize the optimizer for choosing specific categories.

Imports

import numpy as np

import bofire.strategies.api as strategies
import bofire.surrogates.api as surrogates
from bofire.benchmarks.api import Himmelblau
from bofire.data_models.features.api import CategoricalInput, ContinuousOutput
from bofire.data_models.objectives.api import (
    MaximizeObjective,
    MaximizeSigmoidObjective,
)
from bofire.data_models.strategies.api import MultiplicativeSoboStrategy
from bofire.data_models.surrogates.api import (
    BotorchSurrogates,
    CategoricalDeterministicSurrogate,
    LinearDeterministicSurrogate,
)

Set up an Example

We use Himmelblau as an example, with an additional objective on x_2 that pushes it to be larger than 3 during the optimization. In addition, we introduce a categorical feature called x_cat which is mapped by a CategoricalDeterministicSurrogate to a continuous output called y_cat.

bench = Himmelblau()
experiments = bench.f(bench.domain.inputs.sample(10), return_complete=True)

domain = bench.domain

# setup extra feature `y_x2` that is the same as `x_2` and is taken into account in the optimization by a sigmoid objective
domain.outputs.features.append(
    ContinuousOutput(key="y_x2", objective=MaximizeSigmoidObjective(tp=3, steepness=10))
)
experiments["y_x2"] = experiments.x_2


# add extra categorical input feature and corresponding output feature
domain.inputs.features.append(CategoricalInput(key="x_cat", categories=["a", "b", "c"]))
domain.outputs.features.append(
    ContinuousOutput(key="y_cat", objective=MaximizeObjective())
)

# generate random values for the new categorical feature
experiments["x_cat"] = np.random.choice(["a", "b", "c"], size=experiments.shape[0])
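The MaximizeSigmoidObjective rewards values above the turning point tp. A minimal sketch of the sigmoid desirability it is based on, assuming the common logistic form 1 / (1 + exp(-steepness * (x - tp))) (BoFire's exact parametrization may differ in detail):

```python
import numpy as np


def sigmoid_desirability(x: np.ndarray, tp: float = 3.0, steepness: float = 10.0) -> np.ndarray:
    """Sigmoid reward: close to 0 well below tp, exactly 0.5 at tp, close to 1 well above tp."""
    return 1.0 / (1.0 + np.exp(-steepness * (x - tp)))


# with tp=3 and steepness=10, x_2 values above ~3.5 are rewarded almost fully
print(sigmoid_desirability(np.array([0.0, 3.0, 6.0])))
```

With steepness=10 the transition from 0 to 1 happens in a narrow band around tp=3, which is what effectively pushes x_2 above 3 during optimization.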

The LinearDeterministicSurrogate can be used to model that y_x2 = x_2.

surrogate_data = LinearDeterministicSurrogate(
    inputs=domain.inputs.get_by_keys(["x_2"]),
    outputs=domain.outputs.get_by_keys(["y_x2"]),
    coefficients={"x_2": 1},
    intercept=0,
)
surrogate = surrogates.map(surrogate_data)
surrogate.predict(experiments[domain.inputs.get_keys()].copy())
   y_x2_pred  y_x2_sd
0  -1.138489      0.0
1   1.987826      0.0
2  -3.720015      0.0
3  -0.180767      0.0
4   0.587606      0.0
5  -2.216124      0.0
6  -1.008282      0.0
7  -4.447336      0.0
8   1.088959      0.0
9  -2.562236      0.0
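The deterministic prediction above is just the linear form y = intercept + Σ coefficients · x, with zero predictive standard deviation. A minimal numpy sketch (illustrative only, not the BoFire implementation):

```python
import numpy as np


def linear_predict(X: np.ndarray, coefficients: np.ndarray, intercept: float) -> np.ndarray:
    """Deterministic linear model: y = intercept + X @ coefficients (no predictive uncertainty)."""
    return intercept + X @ coefficients


# with a single coefficient of 1 and intercept 0, the prediction equals the input
x_2 = np.array([-1.138489, 1.987826, -3.720015])
print(linear_predict(x_2.reshape(-1, 1), np.array([1.0]), 0.0))
```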

The CategoricalDeterministicSurrogate can be used to map categories to specific continuous values.

categorical_surrogate_data = CategoricalDeterministicSurrogate(
    inputs=domain.inputs.get_by_keys(["x_cat"]),
    outputs=domain.outputs.get_by_keys(["y_cat"]),
    mapping={"a": 1, "b": 0.2, "c": 0.3},
)

surrogate = surrogates.map(categorical_surrogate_data)

surrogate.predict(experiments[domain.inputs.get_keys()].copy())

experiments["y_cat"] = surrogate.predict(experiments[domain.inputs.get_keys()].copy())[
    "y_cat_pred"
]

experiments
        x_1       x_2           y  valid_y      y_x2 x_cat  y_cat
0  0.896675 -1.138489  151.578920        1 -1.138489     b    0.2
1 -1.865327  1.987826   54.757235        1  1.987826     a    1.0
2  2.546576 -3.720015  155.894551        1 -3.720015     c    0.3
3  5.409597 -0.180767  329.420359        1 -0.180767     a    1.0
4 -2.146421  0.587606  111.161263        1  0.587606     b    0.2
5  4.593425 -2.216124   68.421625        1 -2.216124     c    0.3
6 -1.503030 -1.008282  151.092705        1 -1.008282     a    1.0
7  0.830973 -4.447336  402.989549        1 -4.447336     b    0.2
8  3.283702  1.088959    7.163048        1  1.088959     c    0.3
9 -2.463271 -2.562236   64.567658        1 -2.562236     c    0.3
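Conceptually, the categorical deterministic surrogate is a lookup table from categories to fixed continuous values. A minimal sketch of that behavior (illustrative only, not BoFire internals):

```python
# the same mapping used for the CategoricalDeterministicSurrogate above
mapping = {"a": 1.0, "b": 0.2, "c": 0.3}


def categorical_predict(categories: list[str]) -> list[float]:
    """Look up the deterministic value for each category (zero predictive uncertainty)."""
    return [mapping[c] for c in categories]


print(categorical_predict(["b", "a", "c"]))  # [0.2, 1.0, 0.3]
```

Since y_cat carries a MaximizeObjective, category "a" (mapped to 1.0) is rewarded most and "b" (mapped to 0.2) least.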

Next we set up a SoboStrategy that uses the custom surrogates for the outputs y_x2 and y_cat and ask for candidates. Note that the surrogate spec for output y is generated automatically and defaults to a SingleTaskGPSurrogate.

strategy_data = MultiplicativeSoboStrategy(
    domain=domain,
    surrogate_specs=BotorchSurrogates(
        surrogates=[surrogate_data, categorical_surrogate_data]
    ),
)
strategy = strategies.map(strategy_data)
strategy.tell(experiments)
strategy.ask(4)
        x_1       x_2 x_cat      y_pred  y_cat_pred  y_x2_pred        y_sd  y_cat_sd  y_x2_sd       y_des  y_x2_des  y_cat_des
0 -0.871188  3.386148     c -124.009135         0.3   3.386148   63.756700       0.0      0.0  124.009135  0.979397        0.3
1 -4.069133  3.562314     c  -55.534698         0.3   3.562314   94.468448       0.0      0.0   55.534698  0.996400        0.3
2  0.470655  3.459160     c -104.181274         0.3   3.459160   77.581543       0.0      0.0  104.181274  0.989965        0.3
3 -5.656718  5.351850     a  204.323964         1.0   5.351850  131.981789       0.0      0.0 -204.323964  1.000000        1.0
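The MultiplicativeSoboStrategy combines the per-output desirabilities into a single scalar objective by taking their product. A sketch of that scalarization using the desirability columns of the first proposed candidate above (illustrative arithmetic, not the strategy's internal acquisition code):

```python
# per-output desirabilities of the first candidate (values taken from the table above)
y_des, y_x2_des, y_cat_des = 124.009135, 0.979397, 0.3

# multiplicative scalarization: every objective must score well for a large overall value,
# so a single near-zero desirability pulls the whole product toward zero
overall = y_des * y_x2_des * y_cat_des
print(overall)
```

This explains why candidates with x_2 well above 3 (y_x2_des close to 1) are favored: any candidate with x_2 below the turning point would have a y_x2_des near zero and thus a near-zero overall objective.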