Transductive model selection under prior probability shift

AUTHORS: Lorenzo Volpi, Alejandro Moreo, and Fabrizio Sebastiani

WORK PACKAGE: WP8

URL: https://arxiv.org/abs/2512.04759

KEYWORDS: Model selection, Hyperparameter optimisation, Classifier accuracy prediction, Dataset shift, Prior probability shift, Transductive learning

Abstract

Transductive learning is a supervised machine learning task in which, unlike in traditional inductive learning,
the unlabelled data that require labelling are a finite set and are available at training time. Similarly to inductive
learning contexts, transductive learning contexts may be affected by dataset shift, i.e., may be such that the
assumption according to which the training data and the unlabelled data are independently and identically
distributed (IID) does not hold. We here propose a method, tailored to transductive classification contexts,
for performing model selection (i.e., hyperparameter optimisation) when the data exhibit prior probability
shift, an important type of dataset shift typical of anti-causal learning problems. In our proposed method
the hyperparameters can be optimised directly on the unlabelled data to which the trained classifier must be
applied; this is unlike traditional model selection methods, which are based on performing cross-validation on the
labelled training data. By tailoring model selection to the actual test distribution, our approach contributes to
the trustworthiness of AI systems, as it enables more reliable and robust classifier deployment under changed
conditions. We provide experimental results that show the benefits brought about by our method.
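
The abstract does not spell out the selection procedure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes binary labels, assumes that prior probability shift means the class prevalence P(Y) changes while P(X|Y) stays fixed, and swaps in a standard quantification technique (Adjusted Classify and Count) to estimate the prevalence of the positive class in the unlabelled set. All function names and the toy data are hypothetical.

```python
import numpy as np

def rates(y_true, y_pred):
    """True-positive and false-positive rates measured on labelled validation data.
    Assumes both classes occur in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tpr = np.mean(y_pred[y_true == 1] == 1)
    fpr = np.mean(y_pred[y_true == 0] == 1)
    return tpr, fpr

def acc_prevalence(test_pred, tpr, fpr):
    """Adjusted Classify and Count: correct the raw fraction of predicted
    positives on the unlabelled set using the validation tpr and fpr."""
    cc = np.mean(np.asarray(test_pred) == 1)
    if np.isclose(tpr, fpr):  # correction undefined, fall back to the raw count
        return float(np.clip(cc, 0.0, 1.0))
    return float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))

def estimated_test_accuracy(y_val, val_pred, test_pred):
    """Estimate accuracy on the unlabelled set by reweighting the per-class
    validation accuracies with the estimated test prevalence."""
    tpr, fpr = rates(y_val, val_pred)
    p = acc_prevalence(test_pred, tpr, fpr)
    return p * tpr + (1.0 - p) * (1.0 - fpr)

def select_config(candidates, y_val):
    """candidates maps a configuration name to (validation predictions,
    predictions on the unlabelled set). Returns the best name and all scores."""
    scores = {name: estimated_test_accuracy(y_val, vp, tp)
              for name, (vp, tp) in candidates.items()}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): both configurations reach 0.75 validation accuracy,
# but the unlabelled set contains far fewer positives than the validation set.
y_val = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = {
    "A": ([1, 1, 1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]),
    "B": ([1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
}
best, scores = select_config(candidates, y_val)
print(best, scores)
```

In this toy case the two configurations tie on the labelled validation data, yet the prevalence-aware estimate prefers the one that is more accurate on the negative class, which dominates the unlabelled set. This is the kind of difference that cross-validation on the training data alone cannot detect.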
