AI reliability tools.

An overview.

Brought to you by the MIDRC AI Reliability Working Group.

Last updated March 23, 2026

Selected literature

Selected code

  • NIH’s NCATS challenged participants to create solutions that detect bias in AI/ML models used in clinical decisions. The submitted solutions are not necessarily specific to medical imaging.

    Available at https://www.expeditionhacks.com/nih-bias-detection-gallery

MIDRC-developed code

  • MIDRC REACT (Representativeness Exploration and Comparison Tool) compares the demographic representativeness of biomedical datasets using the Jensen-Shannon distance (JSD). It can also monitor how a dataset's representativeness changes over time by assessing historical data. MIDRC developed the tool and uses it to assess how well the data in its open data commons represent the US population. Users can also generalize it to other representativeness needs, such as assessing the similarity of demographic distributions across multiple attributes in different biomedical datasets.

    Available at https://github.com/MIDRC/MIDRC_Diversity_Calculator

  • The generalized stratified sampling tool, available on GitHub, provides a framework for stratified sampling in medical imaging studies. Stratified sampling helps ensure that a sample is representative of the subgroups within a dataset. By improving the distribution and representativeness of the sampled data, the tool supports the development of more robust and generalizable models and makes complex imaging datasets easier to analyze and interpret.

    Read more about the tool in this peer-reviewed publication.

  • MIDRC-MELODY (Model EvaLuation across subgroups for cOnsistent Decision accuracY) is a free, open-source tool for assessing the performance, subgroup-level reliability, and robustness of AI models developed for medical imaging analysis tasks, such as the estimation of disease severity. It enables consistent evaluation of models across predefined subgroups (e.g., manufacturer, race, scanner type) by computing intergroup performance metrics and corresponding confidence intervals.
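
To make the Jensen-Shannon distance mentioned in the MIDRC REACT entry more concrete, the following is a minimal sketch of how the distance between two demographic distributions could be computed. It is not taken from the MIDRC REACT code base. The function name, the category order, and the counts are hypothetical and used for illustration only.

```python
import math


def jensen_shannon_distance(p, q, base=2.0):
    """Jensen-Shannon distance between two discrete distributions.

    p and q are sequences of non-negative counts or proportions over the
    same categories in the same order. With base 2 the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same categories")
    if any(x < 0 for x in p) or any(x < 0 for x in q):
        raise ValueError("counts must be non-negative")
    total_p, total_q = sum(p), sum(q)
    if total_p == 0 or total_q == 0:
        raise ValueError("each distribution needs a positive total")

    # Normalize so that each distribution sums to 1.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # The distance is the square root of the divergence.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical counts per age group (illustrative values, not real data).
dataset_counts = [120, 340, 410, 130]
reference_counts = [250, 260, 250, 240]
distance = jensen_shannon_distance(dataset_counts, reference_counts)
```

A distance of 0 means the two distributions are identical, and larger values indicate a greater mismatch between the dataset and the reference population.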
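
The idea behind stratified sampling, as described in the generalized stratified sampling tool entry, can be sketched as follows. This is a simplified illustration with proportional allocation and is not the MIDRC tool's implementation. The function name, its parameters, and the record fields are assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum defined by key(record).

    key may return a tuple to stratify on several attributes at once.
    Every non-empty stratum contributes at least one record.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group the records by stratum label.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for label in sorted(strata):  # sorted for reproducible ordering
        members = strata[label]
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical records: 80 with sex "F" and 20 with sex "M".
records = [{"id": i, "sex": "F" if i < 80 else "M"} for i in range(100)]
subset = stratified_sample(records, key=lambda r: r["sex"], fraction=0.1)
```

Because each stratum is sampled at the same rate, the subgroup proportions in the sample match those of the full dataset up to rounding.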
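
The kind of subgroup-level evaluation described in the MIDRC-MELODY entry can be illustrated with a short sketch that computes a metric per subgroup together with a percentile bootstrap confidence interval. This does not use the MIDRC-MELODY API. Accuracy is a stand-in metric, the bootstrap is one of several possible interval methods, and all function names and data below are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a metric."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def evaluate_by_subgroup(y_true, y_pred, groups, metric=accuracy, **ci_kwargs):
    """Return {subgroup: (point_estimate, ci_lower, ci_upper)}."""
    results = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        yt = [y_true[i] for i in idx]
        yp = [y_pred[i] for i in idx]
        results[group] = (metric(yt, yp), *bootstrap_ci(yt, yp, metric, **ci_kwargs))
    return results


# Hypothetical labels, predictions, and scanner-type subgroups.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
results = evaluate_by_subgroup(y_true, y_pred, groups, n_boot=200)
```

Comparing the point estimates and intervals across subgroups gives a first indication of whether a model performs consistently. Wide intervals for a small subgroup signal that more data are needed before drawing conclusions.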