Skip to content

6. Optuna Hyperparameter Tuning

rusket comes with built-in Bayesian hyperparameter optimization via Optuna. Instead of doing a manual grid search, you can use rusket.optuna_optimize() to intelligently and efficiently explore the parameter space. It leverages the Rust-native cross-validation engine, making each trial evaluation extremely fast.

Basic Usage

The easiest way to start is to use the default search space provided by rusket for your model class (e.g., ALS, eALS, or BPR).

import pandas as pd
import rusket

# Load your interaction data
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": [10, 20, 10, 30, 20],
    "clicks": [5, 1, 3, 2, 1]
})

# Run the optimization
result = rusket.optuna_optimize(
    rusket.ALS,
    df,
    user_col="user_id",
    item_col="item_id",
    rating_col="clicks",  # optional
    n_trials=30,          # number of parameter combinations to try
    n_folds=3,            # 3-fold cross validation per trial
    metric="ndcg",        # metric to optimize (e.g., ndcg, precision, recall)
    k=10                  # cut-off for the ranking metric
)

print(f"Best score (NDCG@10): {result.best_score:.4f}")
print(f"Best parameters: {result.best_params}")

Customizing the Search Space

If you want to fine-tune the parameter boundaries, you can define a custom search_space using OptunaSearchSpace:

from rusket import eALS, OptunaSearchSpace, optuna_optimize

search_space = [
    OptunaSearchSpace.int("factors", 16, 256, log=True),
    OptunaSearchSpace.float("alpha", 1.0, 100.0, log=True),
    OptunaSearchSpace.float("regularization", 1e-4, 1.0, log=True),
    OptunaSearchSpace.int("iterations", 5, 30),
]

result = optuna_optimize(
    eALS,
    df,
    user_col="user_id",
    item_col="item_id",
    search_space=search_space,
    n_trials=50,
    metric="precision",
    refit_best=True  # Automatically retrains the best model on the full dataset!
)

# Because we used refit_best=True, we can use the best model immediately:
items, scores = result.best_model.recommend_items(user_id=1, n=5)

MLflow Integration

You can easily log all Optuna trials directly into MLflow:

import mlflow
import rusket

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("rusket-tuning")

# Setting mlflow_tracking=True takes care of everything automatically
result = rusket.optuna_optimize(
    rusket.ALS, 
    df,
    user_col="user_id", 
    item_col="item_id",
    n_trials=50, 
    mlflow_tracking=True
)