One of the significant challenges in deploying machine learning (ML) systems in the wild is distribution shift: a mismatch between the data distribution seen during training and the one encountered at test time. To address this, in a recent paper, researchers from Stanford University, the University of California, Berkeley, Cornell University, the California Institute of Technology, and Microsoft present “WILDS,” an ambitious benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications.
Here is a quick read: WILDS: Benchmarking Distribution Shifts in 7 Societally-Important Datasets
The paper WILDS: A Benchmark of in-the-Wild Distribution Shifts is on arXiv. The WILDS Python package and additional information are available on the Stanford University website, and the project's code is on GitHub.