Glossary

application

In the scope of Kaapana, an application is a tool or service that can be installed as an extension into a running platform. An application can be started and deleted on demand and, once started, runs statically like a service. An example of an application is JupyterLab.

build

A build describes the process of building container images from files like Dockerfiles. Images for Kaapana can be built using tools like Docker or Podman.

chart

A Helm chart is a collection of Kubernetes configuration files. The kaapana-admin-chart contains all the configuration required for the kaapana-platform. Moreover, each extension and service is packaged within a Helm chart.

container

A container is a self-contained virtual environment that runs a piece of software along with its code and all of its dependencies. This way, it can run quickly and reliably in any environment. We utilize containerd to run containers. In Kaapana, every service and job runs within a container. Containers are specified by container images, which are built according to a definition file, e.g. a Dockerfile.

containerd

We use containerd as the runtime environment for the containers in the Kubernetes cluster. It manages the lifecycle of a container.

dag

A DAG (Directed Acyclic Graph) is an Airflow pipeline that is defined in a Python script. It links multiple operators (output to input) to realize a multi-step processing workflow, typically starting with an operator that collects the data and ending with an operator that pushes the processing results back to some data storage. An instance of a running DAG is called a DAG-run.
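
To make this concrete, the following is a minimal sketch of a DAG in plain Airflow 2.x. The DAG id, task ids, and bash commands are illustrative only; real Kaapana DAGs follow the same pattern but use Kaapana's own operator classes.

  from datetime import datetime

  from airflow import DAG
  from airflow.operators.bash import BashOperator

  with DAG(
      dag_id="example-processing",
      start_date=datetime(2024, 1, 1),
      schedule_interval=None,  # triggered manually rather than on a schedule
  ) as dag:
      # First step: an operator that collects the input data.
      get_input = BashOperator(task_id="get-input", bash_command="echo 'collect data'")
      # Final step: an operator that pushes results back to storage.
      put_output = BashOperator(task_id="put-output", bash_command="echo 'push results'")

      # Linking output to input defines the directed, acyclic structure.
      get_input >> put_output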

data-curation-tool

The data curation tool is the place to view, curate and manage your datasets. You can access it via the Datasets tab in the workflow-management-system.

data-upload

Data can be uploaded at the Data Upload tab of the workflow-management-system. After the upload has finished, you can directly trigger special workflows on this data, e.g. to convert NIfTI data to DICOM or to import the data into the internal PACS.

dataset

A dataset is a list of DICOM identifiers. Most workflows are executed on a dataset. Datasets can be managed in the data-curation-tool.
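
As a purely illustrative sketch, a dataset can be pictured as a named list of DICOM Series Instance UIDs; the name and UIDs below are made up:

  # Illustrative only: the dataset name and the Series Instance UIDs are made up.
  dataset = {
      "name": "example-dataset",
      "identifiers": [
          "1.2.826.0.1.3680043.8.498.1234567890",
          "1.2.826.0.1.3680043.8.498.0987654321",
      ],
  }
  print(f"{dataset['name']} contains {len(dataset['identifiers'])} series")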

deploy-platform-script

This script is used to deploy a kaapana-platform into a Kubernetes cluster or to undeploy a platform. It basically installs the kaapana-admin-chart using Helm. After building the platform you can find the script at ./kaapana/build/kaapana-admin-chart/deploy_platform.sh.

deployment

A Kaapana deployment is a kaapana-platform that was deployed on a server using the deploy-platform-script. This is not the same as a deployment in the scope of Kubernetes, where a deployment is an object that is used to manage multiple pods. In fact, a Kaapana deployment consists of multiple Kubernetes deployments.

DNS

The Domain Name System (DNS) is the phonebook of the Internet. Humans access information online through domain names, e.g. www.dkfz.de, while web browsers interact through Internet Protocol (IP) addresses. DNS translates domain names to IP addresses so browsers can load Internet resources.
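
This translation can be observed directly with Python's standard library (the resolved address depends on your network and the current DNS records):

  import socket

  # Resolve a domain name to an IPv4 address via the system's DNS resolver.
  print(socket.gethostbyname("www.dkfz.de"))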

extension

Extensions are either workflows or applications that can be installed on the platform under the tab Extensions of the main menu.

helm

We use Helm to distribute and manage our Kubernetes configuration files. This way, we only need one Helm chart that contains the whole platform, i.e. the kaapana-admin-chart.

job

A job belongs to a workflow and is associated with a unique Airflow DAG-run.

kaapana-admin-chart

This Helm chart consists of all the Kubernetes configuration required for the kaapana-platform. It contains the fundamental features of the platform such as the reverse proxy, authentication, and the kube-helm backend. It has the kaapana-platform-chart as a sub-chart.

kaapana-platform

The kaapana-platform is a platform that comes with all required base components, like a reverse-proxy and an authentication provider, as well as many useful services like Airflow, MinIO and the workflow-management-system. You can utilize this platform as a starting point to derive a customized platform for your specific project.

kaapana-platform-chart

This Helm chart consists of most of the interactive components of Kaapana, such as Airflow, PACS, MinIO, the landing page and the Kaapana backend.

kubernetes

Kubernetes is an open-source container-orchestration system that we use to manage all the containers required for Kaapana.

microk8s

MicroK8s is a lightweight, single-package Kubernetes distribution that we utilize to set up our Kubernetes cluster.

operator

An Airflow operator is a Python class that represents a single task within a DAG. This allows operators to be reused as building blocks across multiple DAG definitions. Operators can also execute their task by running a container, which makes the execution of operators highly scalable.
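
As an illustration, the following is a minimal custom operator for plain Airflow 2.x; the class name and its logic are hypothetical and not part of Kaapana:

  from airflow.models.baseoperator import BaseOperator

  class PrintMessageOperator(BaseOperator):
      """A reusable building block: a single task that logs a configurable message."""

      def __init__(self, message: str, **kwargs):
          super().__init__(**kwargs)
          self.message = message

      def execute(self, context):
          # execute() is called once per task instance within a DAG-run.
          self.log.info("message: %s", self.message)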

pipeline

See workflow

platform

A platform describes a system that runs on a remote server and is accessible via the browser. The kaapana-platform is an example of a platform. Kaapana empowers you to construct a customized platform by integrating the services and extensions you require, tailoring it precisely to your needs.

registry

A registry is a storage and content delivery system holding container images and Helm charts, available in different tagged versions. A registry can be private or public. Examples of such registries are Docker Hub and the Elastic Container Registry (ECR) provided by Amazon’s AWS. GitLab offers free, private registries.
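
To illustrate how tagged versions are addressed, an image reference combines the registry host, the repository path, and a tag; all names below are made up:

  # Illustrative only: registry host, repository path, and tag are made up.
  image_reference = "registry.example.com/kaapana/example-image:0.1.0"

  registry, repository_and_tag = image_reference.split("/", 1)
  repository, tag = repository_and_tag.rsplit(":", 1)
  print(registry)    # registry.example.com -> where the image is stored
  print(repository)  # kaapana/example-image -> which image
  print(tag)         # 0.1.0 -> which tagged version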

runner-instance

In the scope of federated processing, a runner-instance is the kaapana-platform on which a job is executed. This is not necessarily the same platform on which the workflow that the job belongs to was started. You can add runner-instances under the tab Instance Overview of the workflow-management-system.

server

A dedicated physical or virtual machine with a supported operating system on which a platform can run.

server-installation-script

This script is used to install all required dependencies on the server. It can be found within the Kaapana-repository: ./kaapana/server-installation/server_installation.sh. It will execute the following steps:

  1. Configure a proxy (if needed)

  2. Install packages if not present: snap, nano, jq, curl, net-tools, core18, helm

  3. Install, configure and start microk8s

  4. Add alias for kubectl to .bashrc file and enable auto-completion

  5. (opt) Enable GPU for microk8s

  6. (opt) Change the SSL-certificates

Currently supported operating systems:

  • Ubuntu 22.04

  • Ubuntu 20.04

  • Ubuntu Server 20.04

service

Every container that runs statically inside a kaapana-platform is a service. Examples of services are MinIO, OHIF, Airflow, etc.

single file and batch processing

The difference between single and batch processing is that single file processing creates a separate job for every image. Therefore, each operator within the DAG only obtains a single image at a time. When selecting batch processing, a single job is created for all selected images and every operator obtains all images in the batch. In general, batch processing is recommended. Single file processing is only necessary if an operator within the DAG can handle only one image at a time.
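
The following Python sketch contrasts the two modes; create_job is a hypothetical stand-in for Kaapana's internal job creation and only illustrates how many jobs result:

  def create_job(images):
      """Hypothetical stand-in for Kaapana's job creation."""
      print(f"created 1 job processing {len(images)} image(s)")

  selected_images = ["series-a", "series-b", "series-c"]

  # Single file processing: one job per image; each operator sees one image.
  for image in selected_images:
      create_job([image])

  # Batch processing: one job for all images; each operator sees the whole batch.
  create_job(selected_images)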

workflow

Workflows semantically bind together multiple jobs, their processing data, and the orchestrating, triggering, and runner-instances of those jobs. Workflows can be started via the tab Workflow Execution in the workflow-management-system. In the Workflow List tab, you can view information about workflows and their jobs. Some workflows are preinstalled in the platform; others can be installed as extensions.

workflow-management-system

The workflow management system is the central environment for processing your data. You can access it via the Workflows tab in the main menu. Here you can upload data, use the data-curation-tool, start a workflow, get information about started workflows, and register runner-instances.