susuky/optimized-rounder

Optimized Rounder

License: MIT

Optimized Rounder finds optimal thresholds for ordinal classification problems, turning continuous model outputs into discrete classes. The package uses Optuna to search for thresholds and supports cross-validation and multiple evaluation metrics.

Installation

pip install optimized-rounder

Features

  • Threshold Optimization: Find optimal thresholds to convert continuous predictions to discrete classes
  • Multiple Metrics: Support for various evaluation metrics including quadratic kappa, linear kappa, RMSE, accuracy, and F1 scores
  • Cross-Validation: Built-in support for K-fold and stratified cross-validation
  • Efficient Search: Uses Optuna to search the threshold space with Bayesian optimization
  • Comprehensive Evaluation: Evaluate models using multiple metrics simultaneously
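
To make the threshold idea concrete, here is a minimal standard-library sketch of how a sorted list of cut points can map continuous values to class indices. The function name `to_classes` is hypothetical and chosen for illustration; it is not claimed to be how the package implements this step.

```python
from bisect import bisect_right


def to_classes(values, thresholds):
    """Map each continuous value to a class index using sorted cut points.

    Hypothetical helper for illustration only. A value below the first cut
    point maps to class 0; a value at or above the last cut point maps to
    class len(thresholds).
    """
    cuts = sorted(thresholds)
    return [bisect_right(cuts, v) for v in values]


# Three cut points separate four classes (0, 1, 2, 3).
print(to_classes([-0.3, 0.7, 1.5, 2.2, 3.9], [0.5, 1.5, 2.5]))  # [0, 1, 2, 2, 3]
```

Shifting a cut point changes which class a borderline value receives, and that is the degree of freedom a threshold optimizer tunes.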

Quick Start

from oprounder import OptimizedRounder
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Generate synthetic data
np.random.seed(42)
n_classes = 4
n_samples = 1000
y_true = np.random.randint(0, n_classes, size=n_samples)
output = y_true + np.random.normal(0, 0.9, size=n_samples) # dummy model output

# Initialize and fit the optimizer
rounder = OptimizedRounder(n_classes=n_classes, n_trials=100)
rounder.fit(output, y_true)

# Get the optimal thresholds
print(f'Optimal thresholds: {rounder.thresholds}')

# Make predictions
y_pred = rounder.predict(output)
kappa = cohen_kappa_score(y_true, y_pred, weights='quadratic')
print(f'Optimal Quadratic kappa: {kappa:.4f}')
y_pred_default = rounder.apply_thresholds(output, rounder.default_thresholds) # 4 classes need 3 cut points, e.g. [0.5, 1.5, 2.5]
kappa_default = cohen_kappa_score(y_true, y_pred_default, weights='quadratic')
print(f'Default Quadratic kappa: {kappa_default:.4f}')
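
For readers unfamiliar with the metric used above, the following is a small from-scratch sketch of quadratic weighted kappa. It is meant only to show what `cohen_kappa_score(..., weights='quadratic')` measures; the function name `quadratic_kappa` and the handling of the degenerate single-class case are assumptions made for this illustration.

```python
def quadratic_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic disagreement weights (illustrative sketch)."""
    n = len(y_true)
    # Observed confusion counts: rows are true classes, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Penalty grows with the squared distance between classes.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Counts expected if true and predicted labels were independent.
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n
    if weighted_expected == 0:
        # Assumption: every label and prediction is the same single class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# One sample is off by a single class, so the penalty is small.
print(f"{quadratic_kappa([0, 1, 2, 3], [0, 1, 2, 2], n_classes=4):.4f}")  # 0.8750
```

Because the penalty is quadratic in the class distance, predictions that miss by one class cost far less than predictions that miss by several, which is why this metric suits ordinal targets.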

Advanced Usage

With Cross-Validation

# Use 5-fold stratified cross-validation (reuses output and y_true from Quick Start)
rounder = OptimizedRounder(
    n_classes=4,
    n_trials=200,
    cv=5,
    stratified=True,
    metric='quadratic_kappa',
    verbose=True
)

rounder.fit(output, y_true)
print(f'CV Results: {rounder.cv_results_}')
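
The README does not show how the folds are built, so the sketch below only illustrates what "stratified" means: each fold receives a near-equal share of every class. The helper `stratified_folds` is hypothetical and uses only the standard library; the package may well rely on a different splitter.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds, seed=42):
    """Assign a fold index to every sample, balancing classes across folds.

    Hypothetical helper for illustration; not the package's implementation.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [0] * len(labels)
    position = 0  # running counter so leftovers rotate across folds
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin into the folds.
        for idx in indices:
            folds[idx] = position % n_folds
            position += 1
    return folds


labels = [0] * 10 + [1] * 5
folds = stratified_folds(labels, n_folds=5)
for fold in range(5):
    members = [labels[i] for i, f in enumerate(folds) if f == fold]
    print(fold, dict(Counter(members)))
```

With ten samples of class 0 and five of class 1, every fold ends up with two samples of class 0 and one of class 1, so a score computed on any fold sees the same class mix.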

Using Different Metrics

# Optimize for F1 weighted score (reuses output and y_true from Quick Start)
rounder = OptimizedRounder(
    n_classes=4,
    n_trials=200,
    metric='f1_weighted'
)

rounder.fit(output, y_true)

# Comprehensive evaluation
output_val = y_true + np.random.normal(0, 0.8, size=n_samples)
metrics = rounder.evaluate(output_val, y_true)
for metric_name, value in metrics.items():
    print(f"{metric_name}: {value:.4f}")
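
The three F1 variants accepted as metrics differ only in how per-class results are averaged. The sketch below computes all three from scratch to show the difference; the function name `f1_variants` is hypothetical, and returning 0.0 for a class with no true or predicted samples is a convention assumed here.

```python
def f1_variants(y_true, y_pred, n_classes):
    """Return macro, weighted, and micro F1 for integer labels (illustrative sketch)."""
    per_class, supports = [], []
    tp_total = fp_total = fn_total = 0
    for c in range(n_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)  # number of true samples in class c
        tp_total, fp_total, fn_total = tp_total + tp, fp_total + fp, fn_total + fn

    micro_denom = 2 * tp_total + fp_total + fn_total
    return {
        # Macro: plain mean, every class counts equally.
        "f1_macro": sum(per_class) / n_classes,
        # Weighted: mean weighted by how many true samples each class has.
        "f1_weighted": sum(f * s for f, s in zip(per_class, supports)) / sum(supports),
        # Micro: pool all counts first, then compute a single F1.
        "f1_micro": 2 * tp_total / micro_denom if micro_denom else 0.0,
    }


scores = f1_variants([0, 0, 0, 1, 1, 2], [0, 0, 1, 1, 1, 1], n_classes=3)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In this example class 2 is never predicted, which pulls the macro score well below the weighted and micro scores; which variant to optimize depends on how much rare classes matter.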

API Reference

OptimizedRounder

OptimizedRounder(
    n_classes=None,
    n_trials=200,
    cv=None,
    stratified=True,
    metric='quadratic_kappa',
    verbose=False,
    random_state=42
)

Parameters

  • n_classes: Number of target classes; labels are expected to be the integers 0, 1, ..., n_classes-1
  • n_trials: Number of optimization trials for Optuna
  • cv: Number of cross-validation folds or a CV splitter object
  • stratified: Whether to use stratified CV (applies only when cv is an integer)
  • metric: Metric to optimize ('quadratic_kappa', 'linear_kappa', 'rmse', 'accuracy', 'f1_macro', 'f1_weighted', 'f1_micro')
  • verbose: Whether to display Optuna's optimization progress
  • random_state: Random seed for reproducibility

Methods

  • fit(X, y): Find optimal thresholds using Optuna optimization
  • predict(X): Convert continuous predictions to discrete classes using optimal thresholds
  • fit_predict(X, y): Train the optimizer and return predictions in one step
  • coefficients(): Get the optimal thresholds found during training
  • evaluate(X, y): Evaluate the model on multiple metrics
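
As a rough picture of what a threshold search does during `fit`, the sketch below replaces Optuna with plain uniform random search and scores candidates by accuracy, purely to stay dependency-free. All names here (`to_classes`, `accuracy`, `random_search_thresholds`) are hypothetical, and the search ranges are assumptions; none of this is the package's actual implementation.

```python
import random
from bisect import bisect_right


def to_classes(values, thresholds):
    """Map values to class indices using sorted cut points (hypothetical helper)."""
    return [bisect_right(thresholds, v) for v in values]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def random_search_thresholds(values, y_true, n_classes, n_trials=200, seed=42):
    """Random-search stand-in for an Optuna study (illustrative sketch)."""
    rng = random.Random(seed)
    # Baseline: midpoints between consecutive class labels.
    defaults = [k + 0.5 for k in range(n_classes - 1)]
    best, best_score = defaults, accuracy(y_true, to_classes(values, defaults))
    for _ in range(n_trials):
        # Sample each cut point within +/- 0.5 of its default, then sort
        # so the thresholds stay in increasing order.
        candidate = sorted(d + rng.uniform(-0.5, 0.5) for d in defaults)
        score = accuracy(y_true, to_classes(values, candidate))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Model outputs that run slightly high, so the default midpoints misclassify some.
y_true = [0, 0, 1, 1, 2, 2]
values = [0.3, 0.6, 1.2, 1.7, 2.4, 2.8]
thresholds, score = random_search_thresholds(values, y_true, n_classes=3)
print(thresholds, score)
```

A sampler such as the one Optuna provides typically needs fewer trials than uniform sampling because it concentrates on promising regions, but the loop structure (propose thresholds, score them, keep the best) is the same.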

Use Cases

  • Regression to Classification: Convert regression outputs to discrete classes
  • Ordinal Classification: Optimize thresholds for ordinal targets
  • Ensemble Calibration: Calibrate probability outputs from ensemble models
  • Competition Metrics: Optimize directly for competition metrics like quadratic kappa

License

This project is licensed under the MIT License - see the LICENSE file for details.
