Understanding MIDRC data
MIDRC central data versus distributed data
Our central MIDRC data commons, powered by the open-source Gen3 platform, supports findable, accessible, interoperable, and reusable (FAIR) data. Users can discover datasets, build cohorts, and access data through integrated analysis tools, notebooks, and workspaces. You can find information on how to get started in our Data Access QuickStart Guide.
MIDRC also supports cohort building over distributed data by interoperating with other data commons in the BDF Imaging Hub, which indexes external imaging datasets to enable cross-commons cohort discovery. After defining a cohort, users download the data directly from the respective data commons.
Building cohorts
MIDRC has developed several Jupyter notebooks with example use cases for data available on the MIDRC data commons, such as building specific cohorts using the API (application programming interface) rather than the data explorer. You can find the example notebooks on our resource page on the MIDRC data commons and in the notebook repository on the MIDRC GitHub. Note that you must be logged in and have valid credentials to run any of the notebooks; otherwise you will get an error message (e.g., “Either you weren't authenticated successfully or you don't have read-storage permission on authorization resource: […]. Please try again!”).
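To give a rough idea of what building a cohort through the API can look like, here is a minimal sketch in Python. It is not taken from the MIDRC notebooks. It assumes the Gen3 Python SDK (the gen3 package) and a Gen3-style nested filter with "AND", "IN", and "GTE" operators. The helper names build_cohort_filter and fetch_cohort, the field names (sex, covid19_positive, age_at_index), and the credentials file path are illustrative assumptions; check the data dictionary and the example notebooks for the names that actually apply.

```python
import json


def build_cohort_filter(categorical=None, minimums=None):
    """Combine simple criteria into one nested filter (hypothetical helper).

    categorical: field name -> list of accepted values
    minimums:    field name -> inclusive lower bound
    """
    clauses = []
    for field, values in (categorical or {}).items():
        clauses.append({"IN": {field: list(values)}})
    for field, lower in (minimums or {}).items():
        clauses.append({"GTE": {field: lower}})
    return {"AND": clauses}


def fetch_cohort(endpoint, credentials_path, cohort_filter, data_type="case"):
    """Send the filter to a data commons (requires login credentials).

    The gen3 package is imported here so that the filter can be built
    and inspected without the SDK installed.
    """
    from gen3.auth import Gen3Auth
    from gen3.query import Gen3Query

    # credentials_path is assumed to point to an API key file
    # downloaded from your profile page on the data commons.
    auth = Gen3Auth(endpoint, refresh_file=credentials_path)
    return Gen3Query(auth).raw_data_download(
        data_type=data_type,
        fields=None,
        filter_object=cohort_filter,
    )


# Field names and values below are assumptions for illustration only.
cohort_filter = build_cohort_filter(
    categorical={"sex": ["Female"], "covid19_positive": ["Yes"]},
    minimums={"age_at_index": 50},
)
print(json.dumps(cohort_filter, indent=2))
```

Only the filter is built when this runs. Calling fetch_cohort with the data commons URL and your credentials file would submit the query, and without valid credentials it would be expected to fail with an authentication error like the one quoted above.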
Multiple MIDRC seminars also directly discuss how to build cohorts and download data.
For more programming applications, ranging from data harmonization to data representativeness assessment and including Python code and Jupyter notebooks, please see our downloadable tools page.
Last updated February 4, 2026
Questions? Check out our answers to frequently asked questions!
How to acknowledge 1) MIDRC funded research and 2) use of data downloaded from the MIDRC Data Commons