Proteomic Data Commons: Facilitating Proteogenomics to Revolutionize Precision Medicine

About the Proteomic Data Commons

The objectives of the National Cancer Institute’s Proteomic Data Commons (PDC) are (1) to make cancer-related proteomic datasets easily accessible to the public and (2) to facilitate direct multiomics integration in support of precision medicine through interoperability with accompanying data resources (genomic and medical image datasets).

The PDC was developed to advance our understanding of how proteins help to shape the risk, diagnosis, development, progression, and treatment of cancer. In-depth analysis of proteomic data allows us to study both how and why cancer develops and to devise ways of personalizing treatment for patients using precision medicine.

The PDC is one of several repositories within the NCI Cancer Research Data Commons (CRDC), a secure cloud-based infrastructure featuring diverse datasets and innovative analytic tools – all designed to advance data-driven scientific discovery. The CRDC enables researchers to link proteomic data with other datasets (e.g., genomic and imaging data) and to submit, collect, analyze, store, and share data throughout the cancer data ecosystem.

Benefits of the PDC

  • Access to highly curated and standardized biospecimen, clinical and proteomic data with direct integration of accompanying data resources (genomic and medical image datasets).
  • Intuitive interface to filter, query, search, visualize and download the data and metadata.
  • A common data harmonization pipeline to uniformly analyze all PDC data and provide advanced visualization of the quantitative information.
  • Cloud-based infrastructure (Amazon Web Services) that interoperates natively with AWS-based data analysis tools and platforms.
  • Application programming interface (API) provides cloud-agnostic data access and allows third parties to extend the functionality beyond the PDC.
  • A highly structured workspace that serves as both a private user data store and a data submission portal.
  • Distribution of controlled-access data, such as patient-specific protein FASTA sequence databases, subject to dbGaP authorization and eRA Commons authentication.