Testing Artificial Intelligence (AI) and Machine Learning (ML)

Adapting Testing Standards for the World of Data Science: Best Practices for a Reliable AI Testing Strategy


Let’s face the new challenges in the testing world together with solid knowledge from data science and data management – from pre-processing to corner cases.

Develop a robust AI testing strategy focusing on precise data analysis. Optimize your AI implementation through comprehensive testing to ensure your models perform reliably in challenging scenarios.

We advise you on developing a comprehensive testing strategy for AI and machine learning-based systems:

  • Collaboratively shaping AI testing strategies
  • Identifying potential exceptions (Corner Cases)
  • Excellent testing experts for exploratory practical testing
  • Comprehensive tool evaluation

How can you test machine learning?

QualityHeroes Podcast Episode 36: Our guests Namrata Gurung and Michael Mlynarski discuss various methods of testing AI and ML-based systems.

Listen now 

What data can be used to test and secure an AI?

QualityHeroes Podcast Episode 18: Bettina Stühle-Stein reports on a major IT project for autonomous driving.

Listen now 

AI Safety Project

The complexity of testing Artificial Intelligence is especially evident in our largest AI project, a federal government-funded project aimed at developing a universally applicable safety methodology for AI in autonomous vehicles.

We are responsible for the systematic treatment of so-called Corner Cases: critical exceptional situations that arise both in road traffic and within the AI system and its sensors.

More information about the project can be found here:

Conclusion of the project: AI Safety – Safety Argumentation for Autonomous Driving in Berlin.

Our Services in Testing Artificial Intelligence


The testing strategy is the most crucial component for testing AI and ML-based systems effectively and efficiently. While the testing of “traditional systems” (Web, Mobile, API, etc.) can be based on various existing test strategies, this is not the case for machine learning-based systems. The focus must be placed more on the data, its structure, semantics, etc., as well as on data generation techniques.

Test concepts form the basis for cross-project standards in quality and testing. They provide the foundation for the project-specific derivation of test strategies and thus form one of the cornerstones of testing activities. We advise product teams on how such a test strategy can be designed and how data preparation and validation pipelines can be created. Our consultancy is agile, using methods like the OnePager, the 10-minute test strategy, and risk storming.

Corner Cases


In the testing of AI and ML-based systems, the systematic identification of potential exceptions, or “Corner Cases,” is extremely relevant. We advise product teams on the identification, design, and testing of Corner Cases and are involved both in the training and testing process as well as during the application phase.
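As a minimal sketch of what systematic identification can look like in code, the example below treats a corner case as an input on which the model's error is unusually high compared with the rest of an evaluation set. Everything here is an assumption made for illustration: the sample IDs, the error values, the function names, and the choice of a simple nearest-rank empirical quantile as the threshold. It is a deliberately simple stand-in, not the method used in any of the projects or publications mentioned on this page.

```python
import math


def empirical_quantile(values, q):
    """Nearest-rank empirical quantile of a non-empty list, for 0 < q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))  # 1-based rank of the quantile value
    return ordered[rank - 1]


def flag_corner_cases(errors, q=0.9):
    """Return the IDs of samples whose error lies above the q-quantile.

    `errors` maps a (hypothetical) sample ID to the model's error on it.
    """
    threshold = empirical_quantile(list(errors.values()), q)
    return sorted(sample_id for sample_id, err in errors.items() if err > threshold)


# Hypothetical per-sample errors from an evaluation run.
sample_errors = {
    "img_001": 0.02,
    "img_002": 0.05,
    "img_003": 0.03,
    "img_004": 0.91,
    "img_005": 0.04,
}

candidates = flag_corner_cases(sample_errors, q=0.8)
print(candidates)
```

The flagged samples are only candidates. Whether a high-error input is a genuine corner case, a labelling mistake, or noise still has to be decided by someone who knows the domain.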


When testing ML-based systems, the focus is significantly more on data than with “traditional” testing approaches. Assessing the quality of the training, testing, and validation data requires a solid understanding of both data science and data management.

We offer our clients a combination of both worlds by providing methods and tools for designing and building a pipeline for test data in projects.
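To make this more concrete, here is a small sketch of two checks that a test data pipeline might run: whether the same record appears in more than one of the training, validation, and test splits, and how the labels are distributed. The split names, record IDs, labels, and function names are hypothetical, and a real pipeline would need many more checks (schema, value ranges, semantics).

```python
from collections import Counter


def find_split_leakage(splits):
    """Return {record_id: [split names]} for records found in several splits."""
    first_seen = {}
    leaked = {}
    for split_name, record_ids in splits.items():
        for record_id in record_ids:
            if record_id in first_seen and first_seen[record_id] != split_name:
                # Record already belongs to another split: remember both.
                leaked.setdefault(record_id, {first_seen[record_id]}).add(split_name)
            else:
                first_seen[record_id] = split_name
    return {record_id: sorted(names) for record_id, names in leaked.items()}


def label_shares(labels):
    """Return the share of each label in a non-empty list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical data set splits, identified by record ID.
splits = {
    "train": ["a", "b", "c"],
    "validation": ["d", "b"],
    "test": ["e", "a"],
}

leaks = find_split_leakage(splits)
shares = label_shares(["car", "car", "bike", "car"])
print(leaks)
print(shares)
```

Records that leak between splits make evaluation results look better than they are, and a heavily skewed label distribution can hide poor performance on rare classes.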


Quality of Test Data
Hands-on Testing


Every successful testing strategy is a mix of meticulously crafted test cases, test automation, and exploratory testing. The latter, in particular, requires a very good understanding of the domain and excellent testing skills.

We offer our clients a range of excellent testing experts for exploratory practical testing to identify the most critical exception situations for ML-based systems.


Techniques known from data science and machine learning can also be used to improve the testing process.

The market for ML-based testing tools is slowly developing. We provide our clients with an overview and a tool evaluation, based on their requirements and our knowledge of this market.

ML-Based Testing

Workshop: AI Testing Strategy Learning Journey for Businesses

The learning content is refined in an initial Discovery Workshop and can be supplemented and adjusted as needed throughout the journey. To this end, we involve all relevant roles of the development process across disciplines.

The following contents will be covered:

  • Process Derivation: From the requirement through test data to the production system
  • Risk analysis along the derived process
  • A cross-cutting aspect will be the creation and use of the Operational Design Domain (ODD) of the application, especially in terms of content, metrics and monitoring, risks and mitigation strategies, shared knowledge, and the construction of the test pyramid
  • Individual development of an AI testing concept for the project
  • Derivation of a modular AI testing strategy
    • Support in the introduction of need-based metrics and monitoring
    • Implementation of Shift Left approaches, e.g., CI/CD pipelines and unit tests in the AI field
    • Mitigation strategies for the risks
    • Schematic construction of an individually adapted test pyramid
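As a rough illustration of the Shift Left item in the list above, unit tests for an ML component can act as a quality gate in a CI/CD pipeline. The sketch below uses Python's standard unittest module. The `predict` function is a hypothetical stand-in for a trained model, and the samples and the 0.9 accuracy threshold are assumptions chosen only for this example.

```python
import unittest


def predict(features):
    """Hypothetical stand-in for a trained model: classify by mean brightness."""
    return "bright" if sum(features) / len(features) >= 0.5 else "dark"


# Hypothetical labelled samples kept under version control with the tests.
LABELLED_SAMPLES = [
    ([0.9, 0.8, 0.7], "bright"),
    ([0.1, 0.2, 0.1], "dark"),
    ([0.6, 0.7, 0.5], "bright"),
    ([0.3, 0.2, 0.4], "dark"),
]


def accuracy(samples):
    """Share of samples for which the prediction matches the label."""
    hits = sum(1 for features, label in samples if predict(features) == label)
    return hits / len(samples)


class ModelQualityGate(unittest.TestCase):
    def test_accuracy_meets_threshold(self):
        # Fails the build if the model drops below the agreed minimum.
        self.assertGreaterEqual(accuracy(LABELLED_SAMPLES), 0.9)

    def test_prediction_is_invariant_to_feature_order(self):
        # Reordering the input values should not change the prediction.
        for features, _ in LABELLED_SAMPLES:
            self.assertEqual(predict(features), predict(list(reversed(features))))
```

Saved as a test module, this could be run in a pipeline step with `python -m unittest`. The second test is a simple invariance check, one way of testing model behaviour for which no exact expected output exists.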

References in the Field of AI Testing

KI Absicherung (AI Safety) – Corner Cases

AI Safety in Autonomous Driving Project

This is a research project funded by the Federal Ministry for Economic Affairs and Climate Action. It develops a methodology for AI safety in the context of automated driving. QualityMinds led the work package on the topic of Corner Cases.

More information

ATTENTION Funding Project

ATTENTION is a research project funded by the German government. Together with a consortium of five German project partners (OEMs, suppliers, technology partners, research institutes), we are developing methods for autonomous driving that predict the risk of injury in real time and mitigate damage.

More information 

Rainforest Connection

Quality Assurance of a Distributed System for Audio Detection in the Rainforest. Using deep learning methods, chainsaw noises are identified to prevent deforestation.

More information 

Test ML – Automotive

Consultation and development of test planning software using Supervised Machine Learning.

More information

Publications in the field of AI Testing


Research Survey "Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety"

The use of deep neural networks (DNNs) in safety-critical applications like mobile health and autonomous driving is challenging due to numerous model-inherent shortcomings. 

Read now!


German Testing Magazine – Dangers Lurk Around Every Corner

In this article, we present specific examples of AI-specific Corner Cases, a taxonomy as part of this methodology, and key insights for testing Computer Vision.

Read now!


Highly Automated Corner Cases Extraction: Using Gradient Boost Quantile Regression for AI Quality Assurance

This work introduces a method for Quality Assurance of Artificial Intelligence (AI) Systems, which identifies and characterizes “corner cases”. Here, corner cases are intuitively defined as “inputs yielding an unexpectedly bad AI performance”.

Read now!


WSAM: Visual Explanations from Style Augmentation as Adversarial Attacker and Their Influence in Image Classification

This paper outlines a style augmentation algorithm using stochastic-based sampling with noise addition for randomization improvement on a general linear transformation for style transfer.

Read now!


Context-Based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion Forecasting

In this work, we present a Context-based Interpretable Spatio-Temporal Graph Convolutional Network (CIST-GCN), as an efficient 3D human pose forecasting model based on GCNs that encompasses specific layers, aiding model interpretability and providing information that might be useful when analyzing motion distribution and body behavior.

Read now!

Industrial Podcast: Quality Assurance in Artificial Intelligence with Michael Mlynarski

Podcast: Brief AI Quality Assurance and Testing for Artificial Intelligence

Michael Mlynarski is the CEO of QualityMinds. Together with his team, he ensures that quality remains high in AI applications, and testing is the key to that. Since AI learns from data rather than following clearly predefined algorithms written by humans, testing it requires entirely new approaches. He explains how this is done in the podcast conversation.

Read now!

Ready to take off with us?


Tobias Varlemann


Lead R&D

Dr. Namrata Gurung


Data Scientist

Bettina Stühle-Stein


Senior Test Expert

Dr. Michael Mlynarski



Anything else you need to know?

Let’s talk!
