Classification Surrogate Tests

We are interested in testing whether or not a surrogate model can correctly identify unknown constraints based on categorical criteria with classification surrogates. Essentially, we want to account for scenarios where specialists can look at a set of experiments and label outcomes as ‘acceptable’, ‘unacceptable’, ‘ideal’, etc.

This involves new models that produce CategoricalOutput’s rather than continuous outputs. Mathematically, if \(g_{\theta}:\mathbb{R}^d\to[0,1]^c\) represents the function governed by learnable parameters \(\theta\) which outputs a probability vector over \(c\) potential classes (i.e. for input \(x\in\mathbb{R}^d\), \(g_{\theta}(x)^\top\mathbf{1}=1\) where \(\mathbf{1}\) is the vector of all 1’s) and we have acceptibility criteria for the corresponding classes given by \(a\in\{0,1\}^c\), we can compute the scalar output \(g_{\theta}(x)^\top a\in[0,1]\) which represents the expected value of acceptance as an objective value to be passed in as a constrained function.

In this script, we look at the Rosenbrock function constrained to a disk which attains a global minima at \((x_0^*,x_1^*)=(1.0, 1.0)\). To facilitate testing the functionality offered by BoFire, we label all points inside of the circle \(x_0^2+x_1^2\le2\) as ‘acceptable’ and further label anything inside of the intersection of this circle and the circle \((x_0-1)^2+(x_1-1)^2\le1.0\) as ‘ideal’; points lying outside of these two locations are labeled as “unacceptable.”

# Import packages
import pandas as pd

import bofire.strategies.api as strategies
from bofire.data_models.api import Domain, Inputs, Outputs
from bofire.data_models.features.api import (
    CategoricalInput,
    CategoricalOutput,
    ContinuousInput,
    ContinuousOutput,
)
from bofire.data_models.objectives.api import (
    ConstrainedCategoricalObjective,
    MinimizeObjective,
)

Manual setup of the optimization domain

The following cells show how to manually setup the optimization problem in BoFire for didactic purposes.

# Write helper functions which give the objective and the constraints
def rosenbrock(x: pd.Series) -> pd.Series:
    assert "x_0" in x.columns
    assert "x_1" in x.columns
    return (1 - x["x_0"]) ** 2 + 100 * (x["x_1"] - x["x_0"] ** 2) ** 2


def constraints(x: pd.Series) -> pd.Series:
    assert "x_0" in x.columns
    assert "x_1" in x.columns
    feasiblity_vector = []
    for _, row in x.iterrows():
        if (row["x_0"] ** 2 + row["x_1"] ** 2 <= 2.0) and (
            (row["x_0"] - 1.0) ** 2 + (row["x_1"] - 1.0) ** 2 <= 1.0
        ):
            feasiblity_vector.append("ideal")
        elif row["x_0"] ** 2 + row["x_1"] ** 2 <= 2.0:
            feasiblity_vector.append("acceptable")
        else:
            feasiblity_vector.append("unacceptable")
    return feasiblity_vector
# Set-up the inputs and outputs, use categorical domain just as an example
input_features = Inputs(
    features=[ContinuousInput(key=f"x_{i}", bounds=(-1.75, 1.75)) for i in range(2)]
    + [CategoricalInput(key="x_3", categories=["0", "1"], allowed=[True, True])],
)

# here the minimize objective is used, if you want to maximize you have to use the maximize objective.
output_features = Outputs(
    features=[
        ContinuousOutput(key="f_0", objective=MinimizeObjective(w=1.0)),
        CategoricalOutput(
            key="f_1",
            categories=["unacceptable", "acceptable", "ideal"],
            objective=ConstrainedCategoricalObjective(
                categories=["unacceptable", "acceptable", "ideal"],
                desirability=[False, True, True],
            ),
        ),  # This function will be associated with learning the categories
    ],
)

# Create domain
domain1 = Domain(inputs=input_features, outputs=output_features)

# Sample random points
sample_df = domain1.inputs.sample(100)

# Write a function which outputs one continuous variable and another discrete based on some logic
sample_df["f_0"] = rosenbrock(x=sample_df)
sample_df["f_1"] = constraints(x=sample_df)

sample_df.head(5)
x_0 x_1 x_3 f_0 f_1
0 1.088766 1.259982 0 0.563971 unacceptable
1 1.070676 0.762891 0 14.708753 ideal
2 0.757029 -0.406970 0 96.111197 acceptable
3 0.180336 0.663406 0 40.473414 ideal
4 0.208489 -1.347722 0 194.167397 acceptable
# Plot the sample df
import math

import plotly.express as px


fig = px.scatter(
    sample_df,
    x="x_0",
    y="x_1",
    color="f_1",
    width=550,
    height=525,
    title="Samples with labels",
)
fig.add_shape(
    type="circle",
    xref="x",
    yref="y",
    opacity=0.1,
    fillcolor="red",
    x0=-math.sqrt(2),
    y0=-math.sqrt(2),
    x1=math.sqrt(2),
    y1=math.sqrt(2),
    line_color="red",
)
fig.add_shape(
    type="circle",
    xref="x",
    yref="y",
    opacity=0.2,
    fillcolor="LightSeaGreen",
    x0=0,
    y0=0,
    x1=2,
    y1=2,
    line_color="LightSeaGreen",
)
fig.show()

Evaluate the classification model performance (outside of the optimization procedure)

# Import packages
import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.api import ClassificationMLPEnsemble
from bofire.surrogates.diagnostics import ClassificationMetricsEnum


# Instantiate the surrogate data model
surrogate_data = ClassificationMLPEnsemble(
    inputs=domain1.inputs,
    outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]),
    lr=0.03,
    n_epochs=100,
    hidden_layer_sizes=(
        4,
        2,
    ),
    weight_decay=0.0,
    batch_size=10,
    activation="tanh",
)
surrogate = surrogates.map(surrogate_data)

# Fit the surrogate to the classification data
cv_df = sample_df.drop(["f_0"], axis=1)
cv_df["valid_f_1"] = 1
cv_train, cv_test, _ = surrogate.cross_validate(cv_df, folds=3)
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.
# Print training performance
cv_train.get_metrics(
    metrics=ClassificationMetricsEnum,
    combine_folds=True,
)
ACCURACY F1
0 0.73 0.73
# Print test performance
cv_test.get_metrics(
    metrics=ClassificationMetricsEnum,
    combine_folds=True,
)
ACCURACY F1
0 0.62 0.62

Setup strategy and ask for candidates

Now we setup a SoboStrategy for generating candidates, the categorical output is modelled using the surrogate from above. The categorical output is modelled as an output constraint in the acquistion function optimization (constrained expected improvement). For more details have a look at this notebook: https://github.com/pytorch/botorch/blob/main/notebooks_community/clf_constrained_bo.ipynb and/or this paper: https://arxiv.org/abs/2402.07692.

from bofire.data_models.acquisition_functions.api import qLogEI
from bofire.data_models.strategies.api import SoboStrategy
from bofire.data_models.surrogates.api import BotorchSurrogates


strategy_data = SoboStrategy(
    domain=domain1,
    acquisition_function=qLogEI(),
    surrogate_specs=BotorchSurrogates(
        surrogates=[surrogate_data],
    ),
)

strategy = strategies.map(strategy_data)

strategy.tell(sample_df)
candidates = strategy.ask(10)
candidates
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed in `gen_candidates_scipy` with the following warning(s):
[RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), OptimizationWarning('Optimization failed within `scipy.optimize.minimize` with status 2 and message ABNORMAL: .'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.')]
Trying again with a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed on the second try, after generating a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed in `gen_candidates_scipy` with the following warning(s):
[RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), NumericalWarning('A not p.d., added jitter of 1.0e-08 to the diagonal'), NumericalWarning('A not p.d., added jitter of 1.0e-07 to the diagonal'), NumericalWarning('A not p.d., added jitter of 1.0e-06 to the diagonal'), OptimizationWarning('Optimization failed within `scipy.optimize.minimize` with status 2 and message ABNORMAL: .'), OptimizationWarning('Optimization failed within `scipy.optimize.minimize` with status 2 and message ABNORMAL: .'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.')]
Trying again with a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed on the second try, after generating a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed in `gen_candidates_scipy` with the following warning(s):
[RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), OptimizationWarning('Optimization failed within `scipy.optimize.minimize` with status 2 and message ABNORMAL: .'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.')]
Trying again with a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/optim/optimize.py:789: RuntimeWarning:

Optimization failed in `gen_candidates_scipy` with the following warning(s):
[RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), OptimizationWarning('Optimization failed within `scipy.optimize.minimize` with status 2 and message ABNORMAL: .'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.'), RuntimeWarning('Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.')]
Trying again with a new set of initial conditions.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/botorch/models/ensemble.py:82: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/torch/nn/modules/module.py:2910: RuntimeWarning:

Could not update `train_inputs` with transformed inputs since _MLPEnsemble does not have a `train_inputs` attribute. Make sure that the `input_transform` is applied to both the train inputs and test inputs.
x_0 x_1 x_3 f_1_pred f_1_sd f_1_unacceptable_prob f_1_acceptable_prob f_1_ideal_prob f_0_pred f_1_unacceptable_sd f_1_acceptable_sd f_1_ideal_sd f_0_sd f_0_des f_1_des
0 0.166739 0.017691 1 ideal 0.0 0.230724 0.027416 0.972054 0.000530 3.936681 0.056340 0.056376 0.000433 -0.000530 0.999470
1 0.884821 0.766709 0 unacceptable 0.0 0.491461 0.223080 0.178551 0.598368 4.045405 0.436509 0.395845 0.545883 -0.598368 0.401632
2 0.318212 0.092365 0 ideal 0.0 0.430845 0.001651 0.989957 0.008392 3.946216 0.002943 0.015328 0.016028 -0.008392 0.991608
3 -1.316013 1.750000 0 unacceptable 0.0 5.100925 0.625732 0.373670 0.000598 6.084102 0.507432 0.507643 0.000475 -0.000598 0.999402
4 0.764939 0.777603 0 unacceptable 0.0 5.069102 0.210446 0.191209 0.598345 4.078997 0.441367 0.424099 0.545917 -0.598345 0.401655
5 0.427104 0.173584 1 ideal 0.0 0.276923 0.203299 0.398143 0.398557 3.955939 0.444491 0.541523 0.545475 -0.398557 0.601443
6 -0.627216 -1.203720 0 unacceptable 0.0 257.223197 0.003269 0.996432 0.000299 4.199537 0.006407 0.006487 0.000176 -0.000299 0.999701
7 -0.882740 0.794671 0 unacceptable 0.0 4.913346 0.090660 0.802018 0.107322 3.987153 0.196855 0.435521 0.238679 -0.107322 0.892678
8 0.896795 1.153983 1 unacceptable 0.0 13.109439 0.997869 0.001556 0.000576 4.151717 0.002048 0.002394 0.000579 -0.000576 0.999424
9 0.950642 0.906093 0 acceptable 0.0 0.295298 0.410543 0.010968 0.578489 4.064524 0.517112 0.020948 0.529179 -0.578489 0.421511

Check classification of proposed candidates

Use the logic from above to verify the classification values

# Append to the candidates
candidates["f_1_true"] = constraints(x=candidates)
# Print results
candidates[["x_0", "x_1", "f_1_pred", "f_1_true"]]
x_0 x_1 f_1_pred f_1_true
0 0.166739 0.017691 ideal acceptable
1 0.884821 0.766709 unacceptable ideal
2 0.318212 0.092365 ideal acceptable
3 -1.316013 1.750000 unacceptable unacceptable
4 0.764939 0.777603 unacceptable ideal
5 0.427104 0.173584 ideal acceptable
6 -0.627216 -1.203720 unacceptable acceptable
7 -0.882740 0.794671 unacceptable acceptable
8 0.896795 1.153983 unacceptable unacceptable
9 0.950642 0.906093 acceptable ideal