CLEAR

A Python library for “CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk”, a method that combines aleatoric and epistemic uncertainty components to improve the conditional coverage of predictive intervals in regression tasks.

Authors

Ilia Azizi

Juraj Bodik

Jakob Heiss

Bin Yu

Overview

CLEAR (Calibrated Learning for Epistemic and Aleatoric Risk) is a calibration method that addresses both aleatoric uncertainty (measurement noise) and epistemic uncertainty (limited data) in regression tasks. Unlike existing methods that typically focus on one type of uncertainty, CLEAR uses two distinct parameters (γ1 and γ2) to combine both uncertainty components and improve the conditional coverage of predictive intervals.
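
Concretely, the calibrated interval can be thought of as a central prediction widened by a weighted combination of the aleatoric and epistemic half-widths. The sketch below is only an illustration of that idea in plain Python, not the exact formula implemented in the library; the weights g1 and g2 stand in for the calibrated γ1 and γ2.

def clear_style_interval(median, al_lower, al_upper, ep_lower, ep_upper, g1, g2):
    # Half-widths of the aleatoric and epistemic bands around the median
    al_low, al_up = median - al_lower, al_upper - median
    ep_low, ep_up = median - ep_lower, ep_upper - median
    # Combine the two uncertainty components with the calibrated weights
    lower = median - (g1 * al_low + g2 * ep_low)
    upper = median + (g1 * al_up + g2 * ep_up)
    return lower, upper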

Key Features

  • Dual Uncertainty Handling: Simultaneously addresses aleatoric and epistemic uncertainty
  • Flexible Integration: Compatible with any pair of aleatoric and epistemic estimators (see the epistemic-interval sketch after this list)
  • Superior Performance: Achieves a 28.2% average improvement in interval width over individually calibrated baselines while maintaining nominal coverage
  • Validated on Real Data: Tested across 17 diverse real-world regression datasets
  • Multiple Backends: Works with quantile regression, PCS ensembles, Deep Ensembles, and Simultaneous Quantile Regression
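
An epistemic interval for CLEAR can come from any ensemble of point predictors. The sketch below is illustrative only: a toy bootstrap of linear models stands in for a Deep Ensemble or PCS ensemble, and its median/lower/upper arrays are the kind of external epistemic predictions used in the Minimal Example below.

import numpy as np
from sklearn.linear_model import LinearRegression

def ensemble_epistemic(models, X, alpha=0.05):
    # Stack per-member predictions: shape (n_models, n_samples)
    preds = np.stack([m.predict(X) for m in models])
    # Spread across ensemble members serves as the epistemic interval
    return (np.median(preds, axis=0),
            np.quantile(preds, alpha / 2, axis=0),
            np.quantile(preds, 1 - alpha / 2, axis=0))

# Toy ensemble: linear models fit on bootstrap resamples of the training data
rng = np.random.default_rng(0)
X_tr = rng.normal(size=(200, 1))
y_tr = 3 * X_tr[:, 0] + rng.normal(size=200)
models = []
for _ in range(20):
    idx = rng.integers(0, len(X_tr), len(X_tr))  # bootstrap resample
    models.append(LinearRegression().fit(X_tr[idx], y_tr[idx]))
ep_median, ep_lower, ep_upper = ensemble_epistemic(models, X_tr)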

Installation

Quick Installation (PyPI)

Requirements: Python 3.11+

pip install clear-uq

Development Installation

# From GitHub
pip install git+https://github.com/Unco3892/clear.git

# Local editable install (after cloning the repository)
git clone https://github.com/Unco3892/clear.git
cd clear
pip install -e .
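
To check that the package imports correctly (import path taken from the Minimal Example below):

python -c "from clear.clear import CLEAR; print(CLEAR)"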

Minimal Example

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from clear.clear import CLEAR

# Generate data
X, y = make_regression(n_samples=200, n_features=1, noise=10, random_state=42)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.5, random_state=42)
X_calib, X_test, y_calib, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# External epistemic predictions (toy stand-in for any epistemic estimator:
# a crude slope estimate for the median with fixed +/- 2 bands)
val_ep_median = X_calib[:,0] * np.mean(y_train/X_train[:,0])
val_ep_lower = val_ep_median - 2
val_ep_upper = val_ep_median + 2

test_ep_median = X_test[:,0] * np.mean(y_train/X_train[:,0])
test_ep_lower = test_ep_median - 2
test_ep_upper = test_ep_median + 2

# 1. Initialize CLEAR
clear_model = CLEAR(desired_coverage=0.95, n_bootstraps=10, random_state=777)

# 2. Fit CLEAR's aleatoric component (here the 'qrf' quantile-regression-forest
#    backend, fit on residuals relative to the supplied epistemic predictions)
clear_model.fit_aleatoric(
    X=X_train,
    y=y_train,
    quantile_model='qrf',
    fit_on_residuals=True,
    epistemic_preds=X_train[:,0] * np.mean(y_train/X_train[:,0])
)

# 3. Get Aleatoric Predictions for Calibration
al_median_calib, al_lower_calib, al_upper_calib = \
    clear_model.predict_aleatoric(X=X_calib, epistemic_preds=val_ep_median)

# 4. Calibrate CLEAR
clear_model.calibrate(
    y_calib=y_calib,
    median_epistemic=val_ep_median,
    aleatoric_median=al_median_calib, 
    aleatoric_lower=al_lower_calib, 
    aleatoric_upper=al_upper_calib,
    epistemic_lower=val_ep_lower, 
    epistemic_upper=val_ep_upper
)

print(f"Optimal Lambda: {clear_model.optimal_lambda:.3f}, Gamma: {clear_model.gamma:.3f}")

# 5. Predict with Calibrated CLEAR
clear_lower, clear_upper = clear_model.predict(
    X=X_test,
    external_epistemic={
        'median': test_ep_median, 
        'lower': test_ep_lower, 
        'upper': test_ep_upper
    }
)
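
As a quick sanity check (plain NumPy, not part of the library API, and assuming predict returns arrays aligned with y_test), the empirical coverage and average width of the returned intervals can be computed directly:

# 6. Evaluate the calibrated intervals on the test set
coverage = np.mean((y_test >= clear_lower) & (y_test <= clear_upper))
avg_width = np.mean(clear_upper - clear_lower)
print(f"Empirical coverage: {coverage:.3f}, average width: {avg_width:.3f}")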

Citation

@article{azizi2025clear,
  title={CLEAR: Calibrated Learning for Epistemic and Aleatoric Risk},
  author={Azizi, Ilia and Bodik, Juraj and Heiss, Jakob and Yu, Bin},
  journal={arXiv preprint arXiv:2507.08150},
  year={2025}
}

Documentation

For more detailed examples, reproducibility instructions, and the full experimental setup, visit the GitHub repository.