The Center for Technical Operations Support primarily develops systems and software in support of data sharing, semantics, informatics and scientific operations for the National Cancer Institute and other National Institutes such as the National Institute of Allergy and Infectious Diseases.  
 
Our work is driven by the problems we are asked to solve. Because those problems are diverse, we do not commit to a single solution technology; we focus instead on providing the best solution for our government clients.

This work falls into three broad categories: application development, data and semantics management, and application support.

Creating powerful resources for impactful research

The applications we help develop and support include various data commons, event registration websites, and other resources such as an informatics site for the Serological Sciences Network. We also provide testing support for the configuration of administration tools and commercial off-the-shelf (COTS) tools used within the Frederick National Laboratory.

To accomplish our work, we have a core group that works closely with staff from our many subcontractor teams. Our extensive use of subcontracts, from both academic and commercial groups, allows us to rapidly extend the perspectives and skills we need to solve the problems we are tasked with by the government.

Our projects range from Drupal-based information sites, tier 1 and tier 2 application support, oversight of a subcontractor's data generation activities, and maintenance of legacy Oracle clinical systems to complex multimillion-dollar initiatives with many subprojects, such as the Cancer Research Data Commons and the Childhood Cancer Data Initiative.

We also provide development support for legacy systems on Java, JavaScript, Drupal, Google Cloud Platform, Amazon Web Services (AWS), and relational and non-relational databases.

 

Data Commons
We built and implemented various systems and databases for cancer researchers in partnership with the National Cancer Institute, including: 
Applications
We created multiple applications for various collaborative initiatives across the National Institutes of Health, including:
Project support
For many years, the group has been an active collaborator in the development and support of projects, including:

Advancing disease research through our state-of-the-art tools 

Data sharing has always been an important resource for the research community and with the data sharing policies adopted by the National Cancer Institute and others, the availability of data for the community will increase even more.

We play a critical role in helping build NCI’s Cancer Research Data Commons, which serves as a central data provider for genomic, proteomic, imaging, population science, immuno-oncology, comparative, and other data types. As part of this project, we developed the BENTO Framework, a state-of-the-art, cloud-based microservices platform designed around the FAIR (findable, accessible, interoperable, reusable) principles.

DevOps technologies 

Given the data and functionality heterogeneity in our projects, each project is treated separately to determine the optimal technology stack. As a result, we use both relational and non-relational databases and have used OpenSearch to aid with query performance on the very large cloud-based data sets. All these technologies are leveraged within a DataOps process we developed to consistently track data through all processing steps and maintain robustness, integrity and reproducibility.
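
As an illustration of what tracking data through every processing step can look like, here is a minimal Python sketch. It is not our production DataOps tooling: the `TrackedRun` helper, the two example steps, and the sample records are all hypothetical and exist only to show the idea of recording a content digest before and after each step, so that a rerun on the same input can be checked for identical results.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, List


def fingerprint(records: list) -> str:
    """Return a deterministic SHA-256 digest of JSON-serializable records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class TrackedRun:
    """Hypothetical helper: logs input and output digests for each step."""
    log: List[dict] = field(default_factory=list)

    def apply(self, name: str, step: Callable[[list], list], records: list) -> list:
        # Digest the input first, in case a step mutates its argument.
        input_digest = fingerprint(records)
        rows_in = len(records)
        result = step(records)
        self.log.append({
            "step": name,
            "input_digest": input_digest,
            "output_digest": fingerprint(result),
            "rows_in": rows_in,
            "rows_out": len(result),
        })
        return result


def drop_incomplete(records: list) -> list:
    """Example curation step: remove records with no diagnosis value."""
    return [r for r in records if r.get("diagnosis")]


def normalize_case(records: list) -> list:
    """Example transformation step: trim and lowercase the diagnosis."""
    return [{**r, "diagnosis": r["diagnosis"].strip().lower()} for r in records]


def run_pipeline(records: list):
    """Run both steps and return the final data plus the audit log."""
    run = TrackedRun()
    data = run.apply("drop_incomplete", drop_incomplete, records)
    data = run.apply("normalize_case", normalize_case, data)
    return data, run.log


if __name__ == "__main__":
    # Made-up sample records, not real study data.
    raw = [{"id": 1, "diagnosis": " Glioma "}, {"id": 2, "diagnosis": ""}]
    cleaned, audit = run_pipeline(raw)
    for entry in audit:
        print(entry["step"], entry["rows_in"], "->", entry["rows_out"])
```

Because each step's output digest must equal the next step's input digest, a gap or mismatch in the log points to data that changed outside the tracked process.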
 
The first step in data sharing is understanding the data, so we have a dedicated data sciences team that applies lessons learned across all our projects to keep every data project focused on the data itself. In addition to more classical data activities, such as data curation and transformation, the data team follows a DataOps model to ensure data is managed appropriately through all stages of the application development process. Moreover, the data science staff are part of the application development teams to ensure we use the best technologies to deliver data to the relevant community.

Our DevOps processes consist of provisioning cloud-based environments; developing pipelines; and deploying, testing, and monitoring our applications in a highly secure and repeatable fashion to advance our mission of supporting cancer research.

Jenkins

We use Jenkins as an orchestrator for most of our DevOps workflows. Development and QA Team members kick off the build, deploy, and data load pipelines across all our applications. 

GitHub

We use GitHub to host the source code repositories for our Application, Data Operations, Infrastructure Provisioning, and Configuration Management assets. It serves as the source of truth for most of our execution activities.

Docker

Docker is a containerization technology that we use to encapsulate a working environment that runs on various infrastructure platforms on AWS. The source code, along with its associated dependencies, is packaged into a Docker image and stored in a centralized Docker repository hosted on AWS.

Terraform

Terraform is an open-source infrastructure as code (IaC) tool that allows users to define and provision infrastructure using a declarative configuration language. With Terraform, you describe the components of your infrastructure, such as servers, networks, and databases, in a configuration file.
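
To make the declarative idea concrete, the following Python sketch compares a description of the desired infrastructure with the current state and works out which changes are needed. It is only an analogy: it does not use Terraform, its configuration language, or any Terraform API, and the resource names and attributes are invented for illustration.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, object]
State = Dict[str, Resource]


def plan(current: State, desired: State) -> List[Tuple[str, str]]:
    """List the (action, resource name) pairs that turn current into desired."""
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(current: State, desired: State) -> State:
    """Return a new state with every planned action carried out."""
    new_state = dict(current)
    for action, name in plan(current, desired):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = dict(desired[name])
    return new_state


if __name__ == "__main__":
    # Hypothetical resources; the names and attributes are made up.
    current = {
        "web_server": {"size": "small"},
        "old_queue": {"retention_days": 7},
    }
    desired = {
        "web_server": {"size": "medium"},
        "database": {"engine": "postgres"},
    }
    for action, name in plan(current, desired):
        print(action, name)
    # Applying the plan leaves nothing further to change.
    assert plan(apply(current, desired), desired) == []
```

The point of the declarative style is that the configuration states what should exist, and the tool derives the steps; running the same configuration a second time produces no further changes.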

Our capabilities and specializations


Cloud-based technologies 

We use various Amazon Web Services (AWS) cloud technologies to develop powerful cloud-based platforms that make data easily accessible and computable, so researchers can rapidly test hypotheses against the very large data sets available. We also use AWS to serve our Drupal-based project. Applications are typically architected with serverless managed services, and we operate at up to the FISMA Moderate level. Examples include the Index of NCI Studies and the CCDI Molecular Targets Program.

  • AWS RDS

  • AWS Lambda  

  • AWS OpenSearch

  • AWS Fargate

  • AWS ECS

  • Terraform  


Database technologies 

In addition to our Cancer Research Data Commons and Childhood Cancer Data Catalog data repository projects, we developed other systems for data sharing. These include the NCI Metathesaurus, a comprehensive biomedical terminology database providing broad, concept-based mapping of terms from more than 101 biomedical terminologies, with 7,500,000 terms mapped to 3,200,000 concepts representing their shared meanings.
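
Concept-based mapping means that terms from different terminologies which share a meaning all point to one concept. The Python sketch below shows that idea on a handful of made-up rows. The source names, concept identifiers, and the `build_index` and `synonyms` functions are hypothetical and do not reflect the actual NCI Metathesaurus schema or API.

```python
from collections import defaultdict

# Made-up rows: (source terminology, term, concept identifier).
MAPPINGS = [
    ("SOURCE_A", "Heart Attack", "C0001"),
    ("SOURCE_B", "Myocardial Infarction", "C0001"),
    ("SOURCE_C", "MI", "C0001"),
    ("SOURCE_A", "Hypertension", "C0002"),
    ("SOURCE_B", "High Blood Pressure", "C0002"),
]


def build_index(rows):
    """Build term -> concepts and concept -> (source, term) lookups."""
    term_to_concepts = defaultdict(set)
    concept_to_terms = defaultdict(set)
    for source, term, concept in rows:
        term_to_concepts[term.casefold()].add(concept)
        concept_to_terms[concept].add((source, term))
    return term_to_concepts, concept_to_terms


def synonyms(term, term_to_concepts, concept_to_terms):
    """Return other terms that share a concept with `term`, sorted."""
    key = term.casefold()
    found = set()
    for concept in term_to_concepts.get(key, ()):
        for _source, other in concept_to_terms[concept]:
            if other.casefold() != key:
                found.add(other)
    return sorted(found)


if __name__ == "__main__":
    t2c, c2t = build_index(MAPPINGS)
    print(synonyms("heart attack", t2c, c2t))
```

Lookups are case-insensitive here, so a query such as "heart attack" reaches the same concept as "Heart Attack" and returns the terms other sources use for it.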

Additionally, we developed EVS-SIP, which permits search and retrieval of terms contained in or across the data dictionaries or data models of repositories participating in the Cancer Research Data Commons and beyond.  

  • AWS Neptune

  • Neo4j Graph DB

  • AWS RDS

  • Oracle

  • MongoDB/AWS DocumentDB

  • PostgreSQL

  • MSSQL

  • MySQL 


Imaging and informatics for precision medicine

We oversee the Cancer Research Data Commons' Imaging Data Commons, a cloud-based repository of publicly available cancer imaging data co-located with analysis and exploration tools. Data includes radiology collections from The Cancer Imaging Archive and major NCI initiatives, such as The Cancer Genome Atlas Program, Clinical Proteomic Tumor Analysis Consortium, National Lung Screening Trial, and Human Tumor Atlas Network.

We also provide programmatic support for the National Biomedical Imaging Archive, supporting the interoperability between images and genomic data. 

  • MedICI Challenge Management System for image analysis algorithm development and validation 

  • Standards such as BRIDG, CDISC, and DICOM