NCI Imaging Data Commons: Curated data and cloud-based reproducible AI workflows

A MICCAI 2023 Tutorial


Date: Sunday, Oct 8, 2023

Time: 8am-12:30pm PDT

NCI Imaging Data Commons (IDC) is a cloud-based repository of publicly available, real-world cancer imaging data co-located with the analysis and exploration tools and resources. IDC is a node within the broader NCI Cancer Research Data Commons (CRDC) infrastructure that provides secure access to comprehensive, diverse, and expanding multi-modality collections of cancer research data, including genomics, proteomics, and clinical trial data. CRDC’s cloud-based infrastructure offers a number of advantages to researchers, including ubiquitous access to data and tools, elastic storage and compute (particularly valuable for large datasets), access to readily available cloud-native AI/ML tools, and enhanced orchestration and automation of workflows. As of October 2023, IDC hosts over 40 TB of public radiology and digital pathology images and image-derived data, all in standard DICOM representation, co-located with the tools to support search, visualization, and analysis of the data.

IDC public data is available for cloud-based analysis or download in both Google Cloud Platform (GCP) and Amazon Web Services (AWS) environments. This tutorial combines lectures and hands-on exercises to introduce attendees to IDC and the main principles behind its development, and to teach the basic skills needed to search, visualize, and download image data with IDC and to develop IDC-related AI workflows.



This is the first time we are teaching a tutorial about IDC at MICCAI, and for this first edition our goal is to give you highlights of the capabilities of IDC and of the tools that you can use with it. We approach this with a series of brief lectures and presentations from the developers and maintainers of IDC, as well as from collaborators and maintainers of the tools that can be used with IDC. Along with the lectures, we highlight some of the tutorials and interactive materials you can use to start using IDC in your work. Most importantly, we look to your feedback to improve both IDC and this tutorial for the future. Please raise your questions and concerns at the discussion sessions that are spread throughout the program!

8:00-8:55 Part 1: Introduction to IDC

In this session you will learn what IDC is and about the infrastructure and architecture that underpin it.

8:55-10:00 Part 2: Applications

In this session you will learn about some of the tools you can use with IDC, and examples of applications and analyses enabled by IDC.

10:00-10:30 Coffee break

10:30-12:00 Part 3: Hands-on tutorials and demonstrations

In this session we will introduce you to getting started with IDC and demonstrate some of the results for selected use cases. Each of the three components of this session will consist of 25 minutes of hands-on activities or demonstrations and 5 minutes for questions. The three components are followed by a 30-minute discussion session.