Glossary

access-information-interface
  • Database: Stores user, project, and permission data.

  • REST API: Enables Keycloak token mappers to fetch permissions and manage stored information.

Main database objects:

  • Rights: Key-value claims for access tokens.

  • Projects: Projects bundle the information.

  • Roles: Collections of rights mapped to users and projects.

  • UsersProjectsRoles: Links users, roles, and projects. A user can have only one UsersProjectsRoles mapping per project, but can be mapped to the same role in multiple projects.

For example, if a role that contains the right {"claim_key": "opensearch", "claim_value": "admin_project"} is mapped to user A in project foo, the access token of user A will contain the claim "opensearch": ["admin_project_foo"]. OpenSearch is configured to look for backend roles in the opensearch claim of the access token and to know which permissions to grant to users with the respective roles.

Initial rights, roles, projects, and their mappings can be configured in the access-information-interface-config. By default, version 0.5.0 comes with only one project role, admin. This role grants access to the project bucket in MinIO and the project index in OpenSearch.
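The claim derivation in the example above can be sketched in Python. This is only an illustration, not the actual implementation: the function name build_claims and the in-memory data layout are hypothetical, and the rule "claim value, underscore, project name" is inferred solely from the admin_project_foo example.

```python
from collections import defaultdict


def build_claims(mappings, roles):
    """Derive token claims for one user from (project, role) mappings.

    mappings: list of (project_name, role_name) tuples (hypothetical layout).
    roles: dict mapping a role name to its list of rights.
    """
    claims = defaultdict(list)
    for project, role in mappings:
        for right in roles[role]:
            # Assumption inferred from the example above: the project name
            # is appended to the claim value, separated by an underscore.
            value = f"{right['claim_value']}_{project}"
            if value not in claims[right["claim_key"]]:
                claims[right["claim_key"]].append(value)
    return dict(claims)


roles = {"admin": [{"claim_key": "opensearch", "claim_value": "admin_project"}]}
claims = build_claims([("foo", "admin")], roles)
print(claims)  # {'opensearch': ['admin_project_foo']}
```

Mapping the same role to the user in a second project would simply add a second entry to the list under the same claim key.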

application

In the scope of Kaapana, an application is a service that can be installed as an extension into a running platform. Moreover, an application can be started and deleted and runs statically. An example of an application is JupyterLab.

chart

A Helm chart is a collection of Kubernetes configuration files. The kaapana-admin-chart contains all the configuration required for the kaapana-platform. Moreover, each extension and service is packaged within a Helm chart.

container

A container is a self-contained virtual environment that runs software along with its code and all of its dependencies. In this way, it can run quickly and reliably in any environment. We use containerd to run containers. In Kaapana, every service runs within a container. Containers are specified by container images that are built according to a file, e.g. a Dockerfile.

containerd

We use containerd as runtime environment for the containers in the Kubernetes cluster. It manages the lifecycle of a container.

dag

A DAG (Directed Acyclic Graph) is an Airflow pipeline that is defined in a Python script. It links multiple operators (output to input) to realize a multi-step processing workflow, typically starting with an operator that collects the data and ending with an operator that pushes the processing results back to some data storage. An instance of a running DAG is called a DAG-run.
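To illustrate what "directed acyclic" means for the execution order, here is a minimal sketch that uses only Python's standard graphlib module. It deliberately does not use the Airflow API, and the operator names are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical operator names. Each operator maps to the set of operators
# whose output it consumes (its upstream dependencies).
dependencies = {
    "get_input": set(),
    "process": {"get_input"},
    "store_results": {"process"},
}


def execution_order(deps):
    """Return one valid execution order; raises CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


print(execution_order(dependencies))

# A cycle makes the graph invalid, because no operator could run first.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("not a DAG: the dependencies contain a cycle")
```

In an actual Airflow DAG, the scheduler performs this ordering itself based on the dependencies declared between operators.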

data-curation-tool

The data curation tool is the place to view, curate and manage your datasets. You can access it via the Datasets tab in the workflow-management-system.

data-upload

Data can be uploaded in the Data Upload tab of the workflow-management-system. After the upload has finished, you can directly trigger special workflows on this data, e.g. to convert NIfTI data to DICOM or to import the data into the internal PACS.

dataset

A dataset is a list of DICOM identifiers. Most workflows are executed on a dataset. Datasets can be managed in the data-curation-tool.

deploy-platform-script

Deprecated since Kaapana 0.6.0, replaced by kaapanactl. This script is used to deploy a kaapana-platform into a Kubernetes cluster or to undeploy a platform. It essentially installs the kaapana-admin-chart using Helm. After building the platform, you can find the script at kaapana/build/kaapana-admin-chart/deploy_platform.sh.

deployment

A Kaapana deployment is a kaapana-platform that was deployed on a server using the deploy-platform-script. This is not the same as a deployment in the scope of Kubernetes, where a deployment is an object that is used to manage multiple pods. In fact, a Kaapana deployment consists of many Kubernetes deployments and other resources.

dicom-web-filter

The DicomWebFilter enables series-level access control to DICOM data via a DicomWeb API. For example, two users can access the same DICOM study but only specific series within it. This is required in scenarios where a user generates segmentations for certain series of a study and does not want to share these segmentations. The DicomWebFilter operates as a database storing access information and a REST API supporting the DicomWeb standard. Acting as an intermediary layer, it filters data received from a PACS based on the client’s access token and stored access rules. A management API is also provided for updating access information.
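The filtering step can be pictured with a small Python sketch. It is a simplified stand-in, not the DicomWebFilter's code: the function name, the in-memory access rules keyed by a client name, and the series UIDs are all hypothetical, whereas the real component reads its rules from a database and identifies the client from the access token.

```python
def filter_series(study_series, allowed_series):
    """Return only those series UIDs of a study that the client may access."""
    return [uid for uid in study_series if uid in allowed_series]


# Made-up series UIDs of a single study, as a PACS might return them.
study = ["1.2.3.1", "1.2.3.2", "1.2.3.3"]

# Hypothetical stored access rules: client -> accessible series UIDs.
access_rules = {
    "user_a": {"1.2.3.1", "1.2.3.2"},
    "user_b": {"1.2.3.1", "1.2.3.3"},
}

print(filter_series(study, access_rules["user_a"]))  # ['1.2.3.1', '1.2.3.2']
print(filter_series(study, access_rules["user_b"]))  # ['1.2.3.1', '1.2.3.3']
```

Both clients see the same study, but each one only receives the series covered by its own access rules.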

DNS

The Domain Name System (DNS) is the phonebook of the Internet. Humans access information online through domain names, e.g. www.dkfz.de. Web browsers interact through Internet Protocol (IP) addresses. A DNS translates domain names to IP addresses so browsers can load Internet resources.

extension

Extensions are either workflows or applications that can be installed on the platform under the tab Extensions of the main menu.

helm

We use Helm to distribute and manage our Kubernetes configuration files. This way, we only need one Helm chart that contains the whole platform, i.e. the kaapana-admin-chart.

job

A job belongs to a workflow and is associated with a unique Airflow DAG-run.

kaapana-admin-chart

This Helm chart consists of all Kubernetes configuration required for the kaapana-platform. It contains the fundamental features of the platform such as reverse proxy, authentication, and kube-helm backend. It contains kaapana-platform-chart as a sub-chart.

kaapana-platform

The kaapana-platform is a platform that comes with all required base components, like a reverse proxy and an authentication provider, as well as many useful services like Airflow, MinIO and the workflow-management-system. You can use this platform as a starting point to derive a customized platform for your specific project.

kaapana-platform-chart

This Helm chart consists of most of the interactive components of Kaapana, such as Airflow, the PACS, MinIO, the landing-page and the Kaapana backend.

kaapanactl

A tool used to manage a Kaapana installation on a server. It is the unified successor of the deploy-platform-script and the server-installation-script, both of which were deprecated in Kaapana 0.6.0. Administrators of a Kaapana installation can use this tool to:

  1. Prepare the server to deploy Kaapana by installing all required dependencies.

  2. Deploy or undeploy a kaapana-platform into a Kubernetes cluster by installing and configuring the kaapana-admin-chart using Helm.

  3. Generate a report that collects diagnostic information about the installation, which is useful when troubleshooting issues.

kubernetes

Kubernetes is an open-source container-orchestration system that we use to manage all the containers required for Kaapana.

local-operator

A local operator is an Airflow operator that runs Python code and is executed in the container of the Airflow-Scheduler service. Local operators are not scalable, but they are fast to execute and do not require a separate container to run.

microk8s

MicroK8s is a lightweight, single-package Kubernetes distribution that we use to set up our Kubernetes cluster.

operator

An Airflow operator is a Python class that represents a single task within a DAG. This allows for the reuse of operators as building blocks across multiple DAG definitions. Operators can also run tasks by running a container. This makes the execution of operators highly scalable. In Kaapana we differentiate between two types of operators:

platform

A platform describes a system that runs on a remote server and is accessible via the browser. The kaapana-platform is an example of a platform. Kaapana empowers you to construct a customized platform by integrating the services and extensions you require, tailoring it precisely to your needs.

processing-container

A processing-container can refer to two things:

  1. A container image that is built for data processing.

  2. The runtime of a container image that processes data.

In Kaapana processing-containers are executed as tasks of a workflow.

project

A project is a logical grouping of data, workflows, and other resources within the kaapana-platform. Projects can be used to separate different use cases or research projects. You can create and manage projects under System > Projects. When a project is created, multiple objects are created along with it:

  • A dedicated namespace is created in the Kubernetes cluster.

  • In Keycloak a project-system-user is created that has access to all data associated with the project. Within any processing-container this project-system-user can make authenticated requests to data storages such as the PACS, MinIO and OpenSearch.

  • In the access-information-interface a project is created and the project-system-user is mapped to the project role admin.

  • In MinIO a dedicated bucket and dedicated access policies are created.

  • In OpenSearch a dedicated index is created together with related roles and role-mappings.

registry

A registry is a storage and content delivery system that holds container images and Helm charts in different tagged versions. A registry can be private or public. Examples of such registries are DockerHub and the Elastic Container Registry (ECR) provided by Amazon’s AWS. GitLab offers free, private registries.

runner-instance

In the scope of federated processing, a runner-instance is associated with the kaapana-platform on which a job is executed. This does not have to be the same platform on which the workflow that the job belongs to was executed. You can add runner-instances under the tab Instance Overview of the workflow-management-system.

server

A dedicated physical or virtual machine with a supported operating system on which a platform can run.

server-installation-script

Deprecated since Kaapana 0.6.0, replaced by kaapanactl. This script is used to install all required dependencies on the server. It can be found within the Kaapana repository: ./kaapana/server-installation/server_installation.sh. It executes the following steps:

  1. Configure a proxy (if needed)

  2. Install packages if not present: snap, nano, jq, curl, net-tools, core20, core24, helm

  3. Install, configure and start microk8s

  4. Add alias for kubectl to .bashrc file and enable auto-completion

  5. (opt) Change the SSL-certificates

Currently supported operating systems are listed in Requirements.

service

Kaapana services are specific components of the platform that include one or more web servers, such as a web API or a web interface. Examples of services are MinIO, OHIF, and Airflow.

single file and batch processing

The difference between single file and batch processing is that in single file processing a separate job is created for every image. Therefore, each operator within the DAG only receives a single image at a time. In batch processing, a single job is created for all selected images and every operator receives all images in the batch. In general, batch processing is recommended. Single file processing is only necessary if an operator within the DAG can only handle one image at a time.
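The difference can be made concrete with a short Python sketch. The function plan_jobs and its single_execution parameter are hypothetical names chosen for this illustration; a job is represented simply as the list of images it receives.

```python
def plan_jobs(images, single_execution):
    """Group the selected images into jobs.

    Returns a list of jobs, each job being the list of images it processes.
    """
    if single_execution:
        # Single file processing: one job per image.
        return [[image] for image in images]
    # Batch processing: one job that receives all selected images.
    return [list(images)] if images else []


selected = ["img_1", "img_2", "img_3"]
print(plan_jobs(selected, single_execution=True))   # three jobs, one image each
print(plan_jobs(selected, single_execution=False))  # one job with three images
```

With three selected images, single file processing yields three jobs and therefore three DAG-runs, while batch processing yields one.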

workflow

Workflows semantically bind together multiple jobs, their processing data, and the orchestration/triggering and runner-instances of those jobs. Workflows can be started via the tab Workflow Execution in the workflow-management-system. In the Workflow List tab you can view information about workflows and their jobs. Some workflows are preinstalled in the platform, others can be installed as extensions.

workflow-extension

A workflow extension is an installable Helm chart that contains one or multiple Airflow DAGs and operators. After installing a workflow extension, you can see the DAGs available under the Workflow Execution menu.

workflow-management-system

The workflow management system is the new environment for processing your data. You can access it via the Workflows tab in the main menu. Here you can upload data, use the data-curation-tool, start a workflow, get information about started workflows, and register runner-instances.