10x Autonomy

Assessing the performance of AI models trained on simulated LiDAR data and developing methods to reduce the Sim2Real gap

Description of the company

10x Autonomy is a Hamburg-based startup that offers a fully automated, AI-driven data analytics solution designed to accelerate the development of Advanced Driver Assistance Systems (ADAS) and Autonomous Driving (AD) systems.

The status quo in developing driver assistance functions is that, after every code change, engineers have to evaluate manually how the performance of the system has changed and what exactly needs to be improved. This usually takes weeks, because these processes are still built around human assessment. We are changing that.

We are developing a Physical AI that acts as a digital driving inspector: it evaluates the system's performance overnight and gives the engineers direct feedback on what the system needs to improve. This automates the analysis of driving data at scale, enabling faster insights, detection of functional weaknesses, and automated KPI generation for AD and ADAS systems.

By joining this project, you will not just be running academic tests; your research on bridging the Sim2Real gap will directly influence the core AI architecture of an early-stage deep-tech startup.

Situation

Modern systems need to be trained on a large and representative amount of data. This data can originate either from simulations or from the real world. Collecting real-world data is time-consuming and expensive, and edge-case scenarios cannot be controlled. That is why simulated data is usually used as well, both to validate the performance of the system and as a source of training data. And because we validate the performance of other systems, we need to make sure that our own system performs very well. It is therefore essential to be able to use high-quality simulated data for development purposes. As a student team, you will work directly with the founding team, gaining firsthand mentorship in cutting-edge AI development.

Problem

Today's simulators, whether open source or commercial, generate data that deviates significantly from real-world data. This is the “Sim2Real gap”: if you train your AI models on simulated data, their performance does not fully transfer to real-world data. Some gap will always remain compared with training on data that comes from the real world. To reduce this gap, several aspects of the simulation have to be modeled in a physically accurate way to obtain realistic data. These include:

  • Sensor models: Representing the physics of the sensors
  • Noise models: Representing noise in the data collection process
  • Motion models: Accurately representing the motion of dynamic objects and its effect on the collected data
  • Environment models: Accurately representing the environment and conditions such as rain, fog, snow, etc.
  • Actors: How well certain actors (e.g. pedestrians, bikes, etc.) are represented and how this affects the performance of the AI model

Accurately representing these and many other parameters would lead to more accurate training data and to better AI models.
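
To make the idea of a noise model concrete, here is a minimal sketch, assuming a simulated scan is available as an (N, 3) NumPy array of points in the sensor frame. The function name add_lidar_noise and the default values for range_sigma and dropout_prob are illustrative assumptions, not parameters of our pipeline or of any particular sensor. The sketch jitters each point along its own ray and randomly drops returns.

```python
import numpy as np


def add_lidar_noise(points, range_sigma=0.02, dropout_prob=0.1, rng=None):
    """Perturb simulated LiDAR points with range noise and random dropout.

    points:       (N, 3) array of x, y, z coordinates in the sensor frame (metres).
    range_sigma:  std. dev. of Gaussian noise along each ray (assumed value).
    dropout_prob: probability that a return is lost (assumed value).
    """
    rng = np.random.default_rng() if rng is None else rng
    points = np.asarray(points, dtype=float)

    # Distance of every point from the sensor origin.
    ranges = np.linalg.norm(points, axis=1)
    # Unit ray directions; guard against division by zero at the origin.
    directions = points / np.maximum(ranges, 1e-9)[:, None]

    # Jitter each range along its own ray, never letting it become negative.
    jitter = rng.normal(0.0, range_sigma, size=ranges.shape)
    noisy_ranges = np.maximum(ranges + jitter, 0.0)
    noisy_points = directions * noisy_ranges[:, None]

    # Randomly drop returns to mimic missed detections.
    keep = rng.random(len(noisy_points)) >= dropout_prob
    return noisy_points[keep]
```

A real sensor would likely need range-dependent or intensity-dependent noise; the constant sigma here is only a starting point for experiments.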

Aims of the project

1. Performance evaluation for data manipulation/augmentation of training data: Based on the data used to train an AI model, investigate the strengths and weaknesses of its real-world performance. You will receive the training dataset as well as the real-world dataset to analyse the AI's performance. Based on that, derive hypotheses about the most significant parameters, with the goal of implementing manipulation/augmentation techniques and investigating their influence on the resulting performance.

2. Data manipulation/augmentation techniques: To train an AI model reliably, data augmentation techniques are always applied in addition to the raw data and data manipulation techniques, in order to make the model more robust against noise or other phenomena in the data. Both can be studied.
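
As an illustration of what a data augmentation technique can look like for point clouds, the following sketch applies three commonly used geometric augmentations: a random rotation about the vertical axis, a random global scaling, and a random mirror. It assumes an (N, 3) NumPy array with z pointing up; the function name augment_point_cloud and all default ranges are hypothetical choices for this example, not a prescribed method.

```python
import numpy as np


def augment_point_cloud(points, max_yaw=np.pi, scale_range=(0.95, 1.05),
                        flip_prob=0.5, rng=None):
    """Apply random yaw rotation, global scaling and y-flip to (N, 3) points."""
    rng = np.random.default_rng() if rng is None else rng
    points = np.asarray(points, dtype=float).copy()

    # Random rotation about the vertical (z) axis.
    yaw = rng.uniform(-max_yaw, max_yaw)
    c, s = np.cos(yaw), np.sin(yaw)
    rotation = np.array([[c, -s, 0.0],
                         [s, c, 0.0],
                         [0.0, 0.0, 1.0]])
    points = points @ rotation.T

    # Random global scaling of the whole scene.
    points *= rng.uniform(scale_range[0], scale_range[1])

    # Random mirror across the x-z plane (flip the y coordinate).
    if rng.random() < flip_prob:
        points[:, 1] *= -1.0
    return points
```

If the training data carries labels such as bounding boxes, the same transformation would have to be applied to them so that points and labels stay consistent.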

Scope

  • Investigate the Root Causes: Conduct deep-dive research into the specific parameters causing the Sim2Real gap and evaluate the exact differences between our training and evaluation datasets.
  • Engineer Data Solutions: Research, design, and implement advanced data manipulation and augmentation techniques tailored to LiDAR and simulated autonomous driving data.
  • Validate on Real-World Data: Put your theories to the test by evaluating how your implemented techniques actually perform on real datasets, directly measuring your success in closing the Sim2Real gap.
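
One simple way to start quantifying differences between a simulated and a real dataset is to compare the distributions of a basic point feature. The sketch below, under the assumption that scans are (N, 3) NumPy arrays, compares histograms of point ranges using the Jensen-Shannon divergence (base 2, so values lie between 0 and 1). The helper names range_histogram and js_divergence and the choice of range as the feature are illustrative assumptions; the project may well call for richer statistics.

```python
import numpy as np


def range_histogram(points, bins):
    """Normalised histogram of point distances from the sensor origin."""
    ranges = np.linalg.norm(np.asarray(points, dtype=float), axis=1)
    counts, _ = np.histogram(ranges, bins=bins)
    total = counts.sum()
    if total == 0:
        # No points inside the bins: fall back to a uniform distribution.
        return np.full(len(counts), 1.0 / len(counts))
    return counts / total


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (log base 2) between two histograms."""
    # A small epsilon avoids log(0) for empty bins.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log2(p / m))
    kl_qm = np.sum(q * np.log2(q / m))
    return 0.5 * (kl_pm + kl_qm)
```

For example, with bins = np.linspace(0.0, 100.0, 21), calling js_divergence(range_histogram(sim_points, bins), range_histogram(real_points, bins)) gives a single number that can be tracked before and after a manipulation or augmentation step; sim_points and real_points are placeholders for your own data.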

Target group (students)

Study programs:

  • Computer Science
  • Informatik-Ingenieurwesen
  • Data Science
  • Robotics
  • Mechatronics
  • Interdisciplinary Mathematics
  • Technomathematik

Required Skills:

  • Strong Python skills
  • Experience with NumPy, Open3D, PyTorch or similar frameworks
  • Experience with Git
  • Experience with C++ is a plus, but not required
  • Experience with LiDAR data is a plus, but not required

Dates
Please save these dates: Fishing for Experience dates

Registration
You can apply for Fishing for Experience online.