What is Kaapana?
Kaapana (from the Hawaiian word kaʻāpana, meaning “distributor” or “part”) is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. Its applications comprise AI-based workflows and federated learning scenarios, with a focus on radiological and radiotherapeutic imaging.
Obtaining the large amounts of medical data necessary for developing and training modern machine learning methods is extremely challenging and often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach, in which the data remains under the authority of the individual institutions and is only processed on-site, is a promising way to overcome these difficulties.
Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution, and for distributed method development. This facilitates compliant data analysis and enables researchers and clinicians to perform large-scale multi-center studies.
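To make the federated idea concrete, the following is a minimal, generic sketch of federated averaging in plain Python. It is not Kaapana's API or its actual implementation: the function names (local_update, federated_average, run_rounds), the toy linear model and the example data are all assumptions made purely for illustration. The point it shows is that each site computes an update on its own data, and only model weights, never the raw data, are sent to the coordinator.

```python
from typing import List, Tuple

Weights = List[float]             # [slope, intercept] of a toy model y = w*x + b
Sample = Tuple[float, float]      # (x, y) pair that never leaves its site


def local_update(weights: Weights, data: List[Sample], lr: float) -> Weights:
    """Hypothetical on-site step: one gradient-descent step on private data."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Coordinator step: average the site models, weighted by local sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(sites: List[List[Sample]], rounds: int = 1000, lr: float = 0.1) -> Weights:
    """Repeat: broadcast global weights, update locally, aggregate weights only."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data, lr) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


# Two illustrative "institutions" whose private data follow y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0)]
site_b = [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
model = run_rounds([site_a, site_b])
```

In this toy setting the coordinator only ever sees the two weight vectors per round, which mirrors the principle described above: data stays under the authority of each institution while a shared model is still developed.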
By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.
Core components of Kaapana:
Workflows: Large-scale image processing with state-of-the-art deep learning algorithms, such as nnU-Net image segmentation and TotalSegmentator
Datasets: Exploration, visualization and curation of medical images
Extensions: Simple integration of new, customized algorithms and applications
Store: An integrated PACS for DICOM images and MinIO for other types of data
Prometheus and Grafana: Extensive resource and system monitoring for administrators
Core technologies used in Kaapana:
Kubernetes: Container orchestration system
Airflow: Workflow management system enabling complex and flexible data processing workflows
OpenSearch: Search engine for searches based on DICOM metadata
dcm4chee: Open-source PACS serving as the central DICOM data storage
Prometheus: Collecting metrics for system monitoring
Grafana: Visualization for monitoring metrics
Keycloak: User authentication
The most widely used platform realized with Kaapana is currently the Joint Imaging Platform (JIP) of the German Cancer Consortium (DKTK). The JIP is being deployed at all 36 German university hospitals with the objective of distributed radiological image analysis and quantification.
For more information, please also take a look at our recent publication of the Kaapana-based Joint Imaging Platform in JCO Clinical Cancer Informatics.