PhD-Thesis, Technische Universität Kaiserslautern, ISBN 978-3-8439-4555-4, Dr.Hut, München, 9/2020.
Abstract
The great potential of Augmented Reality (AR) has begun to be realized in recent years. This
surge has brought new challenges that, once overcome, will
allow AR to reach technological maturity and become an essential part of everyday life. Even
if pose tracking is a solved problem in controlled environments, challenges beyond that remain
concerning geometric and semantic scene understanding. Dense mapping and semantic labeling
of the environment are required for the creation of meaningful virtual content that is able to fully
interact with the real world. When seen at a system level, scalability in the sense of device
accessibility and content generation is the current main challenge for AR.
In this thesis we provide novel solutions to several current challenges of AR concerning
tracking, mapping, and applications. We base these solutions on generating prior knowledge
using machine learning techniques. Deep Learning has already superseded traditional computer
vision in areas such as classification but is challenged in 3D estimation problems. In this work
we advocate a combination of initial hypotheses or prior estimates provided by Deep Learning
with traditional computer vision to achieve refined and robust results.
Thus, we propose visual-inertial tracking using a learned model of the inertial sensor fused
with a visual feature tracker. We address the need for efficient dense mapping with a piece-wise
planar SLAM system using Deep Learning for planar area hypotheses generation, fused incrementally
with a point cloud reconstructed using multiple view geometry. Furthermore, we show
that initial object poses can be efficiently estimated through learning and can subsequently be
refined and tracked using feature-based methods. The use of synthetic images as a reference
or as machine learning training data offers significant advantages but also presents us with the
Synthetic-to-Real representation gap problem for which we introduce novel solutions.
Finally, we investigate the factors that hinder AR applications from achieving mass adoption.
We consider these to be the lack of universality and semantic relevance. Therefore, we propose
a concept of objects that store and share their own AR experiences and tracking information in
order to decouple tracking applications from target objects and content generation; a web-based
AR tracking framework that is independent of device type and operating system; and an edge
computing architecture for remote processing that enables devices of limited computational
capabilities to support AR applications.
@phdthesis{pub11160,
author = {
Rambach, Jason Raphael
},
title = {Learning Priors for Augmented Reality Tracking and Scene Understanding},
year = {2020},
month = {9},
publisher = {Dr.Hut},
isbn = {978-3-8439-4555-4}
}