DeCAB: Debiased Semi-supervised Learning for Imbalanced Open-Set Data

Published in Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2023

Semi-supervised learning (SSL) has received significant attention due to its ability to use limited labeled data and various unlabeled data to train models with high generalization performance. However, traditional SSL approaches assume a balanced class distribution, which limits their use in a wide range of real applications where the training data exhibits long-tailed distributions. As a consequence, the model is biased towards head classes and disregards tail classes, leading to severe class-aware bias. Additionally, since the unlabeled data may contain out-of-distribution (OOD) samples when it has not been manually filtered, the model is inclined to assign OOD samples to non-tail classes with high confidence, which further overwhelms the tail classes. To alleviate this class-aware bias, we propose an end-to-end semi-supervised method, Debias Class-Aware Bias (DeCAB). DeCAB introduces positive-pair scores for contrastive learning instead of positive-negative pairs based on unreliable pseudo-labels, preventing false negative pairs from negatively impacting the feature space. At the same time, DeCAB uses class-aware thresholds to select more tail samples and selective sample reweighting for feature learning, preventing OOD samples from being misclassified as head classes and accelerating the convergence of the model. Experimental results demonstrate that DeCAB is robust across various semi-supervised benchmarks and achieves state-of-the-art performance. Our code is temporarily available at https://github.com/xlhuang132/decab.
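
To make the idea of class-aware thresholds concrete, the following is a minimal sketch, not the rule used in the paper. It assumes a hypothetical linear scaling in which the confidence threshold of each class grows with its estimated frequency, so tail classes are accepted at lower confidence than head classes. The function names, the `base` and `floor` parameters, and the example numbers are all illustrative assumptions.

```python
def class_aware_thresholds(class_counts, base=0.95, floor=0.5):
    """Hypothetical rule: interpolate between `floor` and `base`
    according to each class's frequency relative to the largest class."""
    largest = max(class_counts)
    return [floor + (base - floor) * (n / largest) for n in class_counts]


def select_pseudo_labels(probs, thresholds):
    """Return (sample_index, predicted_class) for every sample whose top
    confidence reaches the threshold of its predicted class."""
    selected = []
    for i, p in enumerate(probs):
        c = max(range(len(p)), key=p.__getitem__)  # argmax over classes
        if p[c] >= thresholds[c]:
            selected.append((i, c))
    return selected


# Illustrative long-tailed class counts: head, medium, tail.
counts = [500, 100, 10]
thresholds = class_aware_thresholds(counts)

# Illustrative softmax outputs for three unlabeled samples.
probs = [
    [0.90, 0.06, 0.04],  # predicted head class, below the strict head threshold
    [0.10, 0.15, 0.75],  # predicted tail class, above the relaxed tail threshold
    [0.40, 0.35, 0.25],  # low confidence everywhere, rejected
]
selected = select_pseudo_labels(probs, thresholds)
```

In this sketch, a moderately confident tail prediction is kept while an equally plausible head prediction must clear a stricter bar, which is the general effect the abstract attributes to class-aware thresholds. The actual threshold schedule and the selective reweighting used by DeCAB are defined in the paper and the linked repository.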

@inproceedings{huang2023decab,
  title={DeCAB: Debiased Semi-supervised Learning for Imbalanced Open-Set Data},
  author={Huang, Xiaolin and Li, Mengke and Lu, Yang and Wang, Hanzi},
  booktitle={Chinese Conference on Pattern Recognition and Computer Vision (PRCV)},
  pages={104--119},
  year={2023},
  organization={Springer}
}

Recommended citation: Huang, X., Li, M., Lu, Y., & Wang, H. (2023). "DeCAB: Debiased Semi-supervised Learning for Imbalanced Open-Set Data." Chinese Conference on Pattern Recognition and Computer Vision (PRCV), pp. 104-119. https://keke921.github.io/files/2023-11-26-XLHuang-DeCAB.pdf