How to Configure OptunaSearchCV¶

This guide shows you how to configure OptunaSearchCV for common search scenarios: changing the sampler, adding callbacks, persisting studies, and customizing cross-validation.

Prerequisites¶

Yohou-Optuna installed (Getting Started)
Familiarity with OptunaSearchCV.fit() basics

Choose a Sampler¶

The sampler parameter controls the optimization strategy. The default (TPE) works well for most cases. Wrap any Optuna sampler with the Sampler class to make it compatible with get_params() / set_params() / clone().

import optuna
from yohou_optuna import OptunaSearchCV, Sampler

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=50,
    sampler=Sampler(sampler=optuna.samplers.TPESampler, seed=42),
)

For other strategies, swap the sampler class:

# CMA-ES: effective for continuous spaces with correlated parameters
sampler=Sampler(sampler=optuna.samplers.CmaEsSampler, seed=42)

# Gaussian Process: best for very small budgets (< 20 trials)
sampler=Sampler(sampler=optuna.samplers.GPSampler)

# Random: useful as a baseline or for reproducible ablations
sampler=Sampler(sampler=optuna.samplers.RandomSampler, seed=42)

Pass seed for reproducible results when n_jobs=1. Always use the Sampler wrapper rather than a raw Optuna sampler object because raw Optuna objects are not compatible with clone().

Tip

Start with TPE. Switch to CMA-ES only when you have a large all-continuous search space and notice slow convergence.

Add Callbacks¶

Callbacks run after each completed trial. Use them for early stopping, logging, or custom logic. Pass a dictionary mapping callback names to Callback instances:

from optuna.study import MaxTrialsCallback
from yohou_optuna import Callback

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=200,
    callbacks={
        "stop": Callback(callback=MaxTrialsCallback, n_trials=50),
    },
)

MaxTrialsCallback stops the study once the specified number of trials completes, regardless of the n_trials setting on OptunaSearchCV. This is useful when you want to set a generous upper bound on trials but stop early once you have enough results.

Always use the Callback wrapper instead of a raw Optuna callback for the same cloneability reasons as Sampler.

Write a Custom Callback¶

Any callable class that accepts study and trial arguments works as a callback:

class EarlyStoppingCallback:
    def __init__(self, patience: int = 10):
        self.patience = patience

    def __call__(self, study, trial):
        if trial.number >= self.patience:
            best = study.best_trial.number
            if trial.number - best >= self.patience:
                study.stop()

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=200,
    callbacks={
        "early_stop": Callback(callback=EarlyStoppingCallback, patience=10),
    },
)

Persist and Resume Studies¶

For long-running or distributed searches, save the study to a storage backend:

import optuna
from yohou_optuna import Storage

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=50,
    storage=Storage(storage=optuna.storages.RDBStorage, url="sqlite:///my_study.db"),
)

search.fit(y_train, forecasting_horizon=12)

To add more trials later, pass the existing study back to fit():

search.n_trials = 50  # 50 additional trials
search.fit(y_train, forecasting_horizon=12, study=search.study_)

To name a study for easier identification, create it externally and pass it via fit():

study = optuna.create_study(
    study_name="ridge_air_passengers",
    direction="maximize",
    storage="sqlite:///my_study.db",
)
search.fit(y_train, forecasting_horizon=12, study=study)

Use a Custom CV Splitter¶

By default, OptunaSearchCV uses a 5-fold expanding window. Pass any Yohou splitter to change this:

from yohou.model_selection import SlidingWindowSplitter

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,    scoring=scorer,    n_trials=30,
    cv=SlidingWindowSplitter(n_splits=5, train_size=24),
)

Use ExpandingWindowSplitter (default) for growing training windows. Use SlidingWindowSplitter for a fixed-size training window.

Handle Fitting Errors¶

By default, error_score=np.nan catches fitting errors during cross-validation folds and records NaN for that trial. To stop the search immediately on the first error instead, set error_score="raise":

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=50,
    error_score="raise",  # stop on first error (useful during development)
)

Use error_score="raise" during development to catch bad parameter combinations early. In production, keep the default (np.nan) so the search continues past occasional failures.

Filter Failed Trials¶

After fitting, inspect which trials failed:

import polars as pl

results = pl.DataFrame(search.cv_results_)
failed = results.filter(pl.col("mean_test_score").is_nan())
print(f"{len(failed)} trials failed out of {len(results)}")

Collect Training Scores¶

To compute scores on the training folds in addition to the validation folds, set return_train_score=True:

search = OptunaSearchCV(
    forecaster=forecaster,
    param_distributions=distributions,
    scoring=scorer,
    n_trials=30,
    return_train_score=True,
)

search.fit(y_train, forecasting_horizon=12)

# cv_results_ now contains train_score columns alongside test_score columns
import polars as pl
results = pl.DataFrame(search.cv_results_)
print(results.select(["params", "mean_test_score", "mean_train_score"]))

Large gaps between training and test scores suggest overfitting.