In this experiment, I tried to find a way to produce a 3D model from 360-degree video footage.
The video was recorded with an Insta360 camera. (I used the same recording
for testing a web viewer with interactive maps.)
The basic idea was to split the video into three normal video streams
with a left, frontal and right view (FOV: 120 degrees) and to use them as input for two popular photogrammetry software
packages: RealityCapture and Jawset Postshot.
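Such a split can also be scripted. As a minimal sketch (not the tool chain I used below), ffmpeg's v360 filter can reproject an equirectangular export into three rectilinear views; the file names, output size and vertical FOV are assumptions:

```python
# Sketch: extract left / frontal / right 120-degree views from an
# equirectangular video with ffmpeg's v360 filter (file names and
# output resolution are assumed, not the values used in this experiment).
import subprocess

VIEWS = {"left": -120, "front": 0, "right": 120}  # yaw per view, in degrees

for name, yaw in VIEWS.items():
    vf = f"v360=input=equirect:output=flat:h_fov=120:v_fov=90:yaw={yaw}:w=1920:h=1080"
    subprocess.run(
        ["ffmpeg", "-i", "equirect_export.mp4", "-vf", vf, f"{name}.mp4"],
        check=True,
    )
```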
I tested different methods for splitting the video into separate streams and image sequences. In the final experiments,
presented here, I split the video export from Insta360 Studio with the 360-to-2D tool in the VSDC Video Editor. The three video files
for the left, frontal and right views were then converted into image sequences by sampling 1 frame per 2 seconds, resulting in
an image set of 1477 images (5760x2488 pixels).
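For reference, the frame sampling step can be scripted as well; a minimal sketch with ffmpeg's fps filter, with assumed file names:

```python
# Sketch: sample one frame every two seconds (fps=0.5) from each view
# and write numbered PNG files (file names are assumed).
import subprocess

for name in ("left", "front", "right"):
    subprocess.run(
        ["ffmpeg", "-i", f"{name}.mp4", "-vf", "fps=0.5", f"{name}_%04d.png"],
        check=True,
    )
```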
The first tests were performed with Jawset Postshot. The extraction of keypoints and their alignment into a point cloud
took one to two hours of computation time. The registration of the cameras took another 4 to 5 hours. The final step, the computation of
the Gaussian splats for rendering, could be followed in real time by inspecting the updates of the iterative rendering process.
I decided to stop this process after about half an hour for closer inspection. Navigation in this model turned out to be very
difficult because the scene was only recognizable from a limited number of viewpoints. Continuing the process did not yield
better results. The best renders were obtained by choosing a viewpoint close to one of the cameras.
The same image sets were also used as input for RealityCapture. I used the fully automated workflow and, after many hours
of computation, obtained a coarsely detailed model with colored textures. Unfortunately, I could only view this model in the RealityCapture software;
I could not find a way to get a manageable export of it.
Therefore, I decided to show some screenshots of the model in the pictures below.
Discussion and conclusions
In this experiment, obtaining a point cloud and camera positions did not seem to be a problem.
The point cloud could be useful as a navigation tool in the web viewer for 360-degree video.
This idea will be explored in further experiments.
However, rendering Gaussian splats (Jawset Postshot)
or textured meshes (RealityCapture) did turn out to be a big problem. The results are not really surprising, because the glass panels,
lighting and foliage in this scene are well known to be problematic for photogrammetry applications.
Producers of 360 video cameras make an effort to produce a spherical image from the planar images recorded by multiple sensors.
In this experiment, these spherical images are converted back to planar images. In all this processing, there is a risk that
important image features get lost or distorted. Better results might be obtained by methods that avoid these
processing steps.
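For reference, the spherical-to-planar conversion boils down to a reprojection followed by a resampling step, and that resampling is where detail can be lost. A minimal sketch of the mapping with NumPy and OpenCV; the frame name, output size and FOV are assumptions, not the settings of the tools used above:

```python
# Sketch: reproject 120-degree perspective views out of one equirectangular
# frame (assumed file names, output size and FOV).
import cv2
import numpy as np

def equirect_to_perspective(equi, out_w, out_h, h_fov_deg, yaw_deg):
    """Sample a pinhole view (horizontal FOV h_fov_deg, rotated by yaw_deg
    around the vertical axis) from an equirectangular image."""
    H, W = equi.shape[:2]
    f = (out_w / 2) / np.tan(np.radians(h_fov_deg) / 2)   # focal length in pixels

    # One ray per output pixel; the camera looks along +z, +y points down.
    xs = np.arange(out_w) - out_w / 2
    ys = np.arange(out_h) - out_h / 2
    xx, yy = np.meshgrid(xs, ys)
    rays = np.stack([xx, yy, np.full_like(xx, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Rotate the viewing direction to obtain left / frontal / right views.
    a = np.radians(yaw_deg)
    rot = np.array([[np.cos(a), 0.0, np.sin(a)],
                    [0.0,       1.0, 0.0],
                    [-np.sin(a), 0.0, np.cos(a)]])
    rays = rays @ rot.T

    # Ray direction -> longitude/latitude -> source pixel in the equirect frame.
    lon = np.arctan2(rays[..., 0], rays[..., 2])
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))
    map_x = ((lon / (2 * np.pi) + 0.5) * W).astype(np.float32)
    map_y = ((lat / np.pi + 0.5) * H).astype(np.float32)

    # The resampling step: this interpolation is where features can be lost.
    return cv2.remap(equi, map_x, map_y, cv2.INTER_LINEAR, borderMode=cv2.BORDER_WRAP)

frame = cv2.imread("equirect_frame.png")                  # assumed input frame
for name, yaw in [("left", -120), ("front", 0), ("right", 120)]:
    cv2.imwrite(f"{name}.png", equirect_to_perspective(frame, 1920, 1080, 120, yaw))
```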