Product Description

MO3R is an application for creating 3D data from a video sequence. In its current form it tracks the camera's movement through the world while simultaneously constructing 3D points in front of the camera.

While this computation is performed, the user is free to move around and inspect the result. More information about using the product can be found in the user manual.

Background

The goal of the project was to create a system that can effectively reconstruct an outdoor environment of fairly large scale. The customer wanted an application capable of reconstructing a garden to some degree, but did not specify the requirements any further. The available hardware ranged from a stereoscopic camera named Bumblebee and a Microsoft Kinect to simpler setups. The choice of a mobile phone camera came from the group's conviction that an application working with easily accessible hardware would result in a much more versatile product that could be used by more people.

3D Reconstruction using only the technology and tools most people already own and use daily

Heritage

We started out by looking at works such as PTAM (Parallel Tracking and Mapping) and its unofficial follow-up DTAM (Dense Tracking and Mapping).

Our algorithm is largely based on the ideas of PTAM, adopting the same general division of work between tracking and mapping in two separate threads. However, the code is written entirely from scratch by us, which gives us intricate knowledge of its behavior. The code also implements a number of extensions that have previously been tested by other researchers.
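The thread split described above can be sketched as a simple producer/consumer pattern: the tracker runs at frame rate and hands selected keyframes to the mapper, which refines the map in the background. This is an illustrative sketch only; the class names, keyframe policy, and placeholder functions are invented and do not reflect MO3R's actual code.

```python
import threading
import queue

keyframe_queue = queue.Queue()
map_points = []          # shared map; guarded by map_lock
map_lock = threading.Lock()

def estimate_pose(frame):
    """Placeholder pose estimation; a real tracker would match features."""
    return (0.0, 0.0, float(frame))

def triangulate(frame, pose):
    """Placeholder triangulation; returns one dummy 3D point per keyframe."""
    return [(frame, pose)]

def tracker(frames):
    """Per-frame loop: estimate the camera pose, occasionally emit a keyframe."""
    for i, frame in enumerate(frames):
        pose = estimate_pose(frame)
        if i % 5 == 0:                       # simplistic keyframe policy
            keyframe_queue.put((frame, pose))
    keyframe_queue.put(None)                 # signal end of sequence

def mapper():
    """Background loop: integrate keyframes and refine map points."""
    while True:
        item = keyframe_queue.get()
        if item is None:
            break
        frame, pose = item
        with map_lock:
            map_points.extend(triangulate(frame, pose))

frames = list(range(20))
t = threading.Thread(target=tracker, args=(frames,))
m = threading.Thread(target=mapper)
t.start(); m.start()
t.join(); m.join()
print(len(map_points))  # 4: one batch per keyframe (frames 0, 5, 10, 15)
```

The key design point is that the tracker never blocks on map refinement: it only enqueues work, so pose estimation can keep up with the camera even when mapping is slow.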

Most notably, the algorithm uses double window optimisation, as introduced in "Double Window Optimisation for Constant Time Visual SLAM", which considerably improves optimisation speed compared to ordinary bundle adjustment as the environment grows.
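The core idea is to partition keyframes around the current one: an inner window of strongly covisible keyframes is optimised with full bundle adjustment, while an outer window contributes only pose-graph constraints, keeping the cost roughly constant as the map grows. The sketch below shows only the window selection step; the covisibility scores and keyframe names are made up for illustration.

```python
def select_windows(covisibility, current, inner_size, outer_size):
    """Rank keyframes by shared observations with `current` and split them
    into an inner window (full bundle adjustment) and an outer window
    (pose-pose constraints only)."""
    neighbours = covisibility[current]
    ranked = sorted(neighbours, key=neighbours.get, reverse=True)
    inner = ranked[:inner_size]
    outer = ranked[inner_size:inner_size + outer_size]
    return inner, outer

# Toy covisibility graph: keyframe -> {neighbour: shared feature count}
covis = {
    "kf9": {"kf8": 120, "kf7": 90, "kf6": 40, "kf2": 15, "kf1": 5},
}
inner, outer = select_windows(covis, "kf9", inner_size=2, outer_size=2)
print(inner)  # ['kf8', 'kf7']
print(outer)  # ['kf6', 'kf2']
```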

MO3R also relies on more advanced feature matching than the methods it is modelled on. In our implementation either FREAK or ORB descriptors may be chosen, whereas PTAM uses patch matching.
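Both FREAK and ORB produce binary descriptors, so matching reduces to finding the candidate with the smallest Hamming distance. The following is a minimal pure-Python sketch of that step, with invented two-byte descriptors; a real implementation would typically use OpenCV's brute-force matcher with the Hamming norm over full 32- or 64-byte descriptors.

```python
def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def match(query, candidates, max_distance=64):
    """Return the index of the closest candidate, or None if all exceed
    max_distance (rejecting poor matches avoids spurious correspondences)."""
    best_idx, best_dist = None, max_distance + 1
    for i, cand in enumerate(candidates):
        d = hamming(query, cand)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx

query = bytes([0b10110010, 0b01000001])
candidates = [bytes([0b10110011, 0b01000001]),   # 1 bit different
              bytes([0b00000000, 0b11111111])]   # 10 bits different
print(match(query, candidates))  # 0
```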

Here is a list of some of the most important documents that served as inspiration to us:

- Klein and Murray, "Parallel Tracking and Mapping for Small AR Workspaces" (PTAM)
- Newcombe, Lovegrove and Davison, "DTAM: Dense Tracking and Mapping in Real-Time"
- Strasdat, Davison, Montiel and Konolige, "Double Window Optimisation for Constant Time Visual SLAM"

Future Work

As with any project, there are a number of features that would have been added, and some problems that might have been resolved, had the group had more time.

In order to reach the kind of performance that, for example, PTAM is able to achieve, parallelisation has to be extended further and the optimisation step tweaked further. There may also be a need to use CUDA to speed up parts of this process.

Large loop closures are not handled. To include this feature, some method of detecting when the same environment is viewed again needs to be implemented, for example a bag-of-words approach combining image similarity with position estimates. The system would then have to efficiently propagate scale differences in order to correct for scale drift.
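The detection step could work roughly as follows: each image is summarised as a histogram over a fixed visual vocabulary, and candidate revisits are found by comparing histograms. This sketch uses plain cosine similarity over invented word IDs; real place-recognition systems such as DBoW2 additionally weight words by tf-idf and verify candidates geometrically before closing a loop.

```python
import math
from collections import Counter

def bow_histogram(word_ids, vocab_size):
    """Histogram of visual-word occurrences for one image."""
    counts = Counter(word_ids)
    return [counts.get(w, 0) for w in range(vocab_size)]

def cosine(u, v):
    """Cosine similarity between two histograms (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

VOCAB = 8
current = bow_histogram([0, 0, 3, 5, 5, 7], VOCAB)
database = {
    "kf1": bow_histogram([1, 2, 2, 4, 6], VOCAB),     # different place
    "kf2": bow_histogram([0, 3, 5, 5, 7, 7], VOCAB),  # likely revisit
}
scores = {kf: cosine(current, h) for kf, h in database.items()}
best = max(scores, key=scores.get)
print(best)  # kf2
```

Once a revisit like this is confirmed, the remaining (harder) work is distributing the accumulated pose and scale error around the loop, which is where the scale-drift correction mentioned above comes in.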