Plenoptic cameras, also known as light field cameras, are the devices best suited to capturing light fields. How do robotic systems use light field data? What new capabilities does this type of camera offer? And how are such cameras calibrated properly?
A light field is a 4D dataset that offers great potential to improve the perception of future robots. The 4D data arises as follows: the 2D pixel matrix of the sensor is shifted through 3D space and records the light field at each position. Strictly speaking, this would yield five dimensions. However, because it makes no difference if the camera is shifted along the direction of the light ray, this dimension drops out. Compared to conventional 2D images, a 4D light field thus has two additional dimensions, which result from the shift in space and encode depth. They make it possible to obtain more information and to derive different data products than with regular image sensors. Most often, these are 2D images focused on a certain distance or 3D depth images.
The light field
Conventional cameras recreate the vision of the human eye. The camera views a scene from a fixed position and focuses on a specific object. It therefore focuses only a selection of the light rays from the scene to form a single image.
In contrast, a light field contains not only the light intensity at the position where a light ray hits the camera sensor, but also the direction from which the ray arrives. A 4D light field thus stores the intensity, i.e. the amount of light measured by the sensor, as a function of the position (u, v) and the direction angles (Θ, ζ). Images are then produced from this data by rendering.
Thanks to the two additional dimensions, other data products are possible. The most common are 2D images, for example images focused on a certain distance or images with an extended depth of field. Another possibility is to create a 3D depth image from the 4D light field data. A single image capture is thus enough to create several directly related image products.
However, due to the four dimensions of light fields, the entire process requires a relatively high computational effort.
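As a rough illustration of how a 2D image focused on a chosen distance can be derived from 4D data, the following sketch uses the common shift-and-sum idea: each directional view is shifted in proportion to its direction index, and the views are then averaged. The array layout (position first, then direction), the function name `refocus` and the `slope` parameter are assumptions made for this example and do not describe any particular camera's pipeline. The loop over all direction samples also hints at where the computational effort comes from.

```python
import numpy as np


def refocus(light_field: np.ndarray, slope: float) -> np.ndarray:
    """Shift-and-sum refocusing (illustrative sketch only).

    light_field: array of shape (n_u, n_v, n_theta, n_zeta), i.e. the
        position (u, v) first, then the two direction indices.
    slope: pixel shift per direction step; it selects the plane of focus
        (0 keeps the focus of the original capture).
    """
    n_u, n_v, n_theta, n_zeta = light_field.shape
    theta_center = (n_theta - 1) / 2
    zeta_center = (n_zeta - 1) / 2
    image = np.zeros((n_u, n_v), dtype=np.float64)
    for a in range(n_theta):
        for b in range(n_zeta):
            # One 2D view of the scene for a fixed direction (a, b).
            view = light_field[:, :, a, b]
            # Integer shifts keep the sketch short; np.roll wraps around at
            # the image border, which a real implementation would avoid.
            shift_u = int(round(slope * (a - theta_center)))
            shift_v = int(round(slope * (b - zeta_center)))
            image += np.roll(view, shift=(shift_u, shift_v), axis=(0, 1))
    return image / (n_theta * n_zeta)
```

Calling `refocus` with several `slope` values on the same capture yields a focal stack, which is one way to picture the "multiple directly related image products" from a single exposure.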
A plenoptic camera is created by modifying a conventional camera. Thanks to its theoretically infinite depth of field and its refocusing ability, the plane of focus can be shifted in object space after the image has been captured. The additional depth information means that a plenoptic camera can also be used as a 3D camera.
Light field detection with micro lenses
With a micro lens array (MLA), a plate of micro lenses sits in front of the camera sensor. A micro lens is a miniaturized control and focusing lens. It bundles the light optimally to prevent light rays from hitting the edge of the image sensor, which in turn prevents distortions and differences in brightness. Depending on the wavelength of the light, the micro lenses must be made of a different material:
- Wavelength between 150 nm and 4 μm: silicon dioxide
- Wavelength between 1.2 μm and 15 μm (infrared light): silicon
The MLA is a matrix of individual micro lenses, allowing it to capture the scene from different angles. Each of these lenses has a diameter between a few micrometers and a few millimeters, and its field of view covers several hundred pixels. The incident light rays are refracted by the micro lenses. Depending on its direction of incidence, a ray therefore falls on a particular sensor pixel below the micro lens. The position (u, v) thus corresponds to a micro lens, and the direction angles (Θ, ζ) correspond to a sensor pixel below that lens.
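The mapping from micro lens to position and from pixel under the lens to direction can be sketched as a simple array reshaping. This is a minimal sketch under strong assumptions that the text does not state: square micro lenses on a regular grid, perfectly aligned with the pixel grid, each covering exactly `pixels_per_lens` × `pixels_per_lens` pixels. Real sensors need rotation, offset and vignetting correction first. The function name `decode_lenslet_image` is hypothetical.

```python
import numpy as np


def decode_lenslet_image(raw: np.ndarray, pixels_per_lens: int) -> np.ndarray:
    """Rearrange a raw 2D lenslet image into a 4D light field.

    Returns an array of shape (n_u, n_v, p, p): the first two indices select
    the micro lens (position u, v), the last two select the pixel below that
    lens (direction indices for the angles theta, zeta).
    """
    p = pixels_per_lens
    rows, cols = raw.shape
    if rows % p or cols % p:
        raise ValueError("image size must be a multiple of pixels_per_lens")
    n_u, n_v = rows // p, cols // p
    # Split each sensor axis into (lens index, pixel-under-lens index) ...
    blocks = raw.reshape(n_u, p, n_v, p)
    # ... and reorder so that both lens indices come first.
    return blocks.transpose(0, 2, 1, 3)


# Usage with synthetic data: 3 x 4 micro lenses, 2 x 2 pixels under each.
raw_image = np.arange(6 * 8).reshape(6, 8)
light_field = decode_lenslet_image(raw_image, pixels_per_lens=2)
# Fixing the direction indices gives one view of the scene from one angle.
view_from_one_direction = light_field[:, :, 0, 0]
```

With this layout, the sensor pixel at row `u * p + a` and column `v * p + b` ends up at `light_field[u, v, a, b]`.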
Light field detection with camera arrays
A camera array is essentially a macroscopic extension of the micro lens approach. The individual cameras are controlled via an Ethernet switch, for example, and the data is retrieved via an Ethernet interface. To simplify the processing of the camera data, the cameras are arranged in a known regular pattern.
With multiple cameras, a scene is observed from different positions. Each camera can differ in the following parameters:
- Focal length of the imaging optics
- Focus settings
- Exposure time
- Recording time
- Use of spectral and polarization filters
With the help of a camera array, it is possible to inspect a scene completely (e.g. 3D survey), to determine spectral properties (e.g. color and chromatic aberration), or to acquire dielectric properties (e.g. polarization).
Distributed camera arrays
Distributed camera arrays consist of several cameras that are still modeled as single cameras. This means that the camera array as a whole cannot be described with common extrinsic parameters (i.e. the position of the camera in the world coordinate system). In addition, the spatial coverage areas often do not overlap. Typical applications are surveillance systems for different premises and industrial inspection, where separate object areas have to be covered.
Such systems can contain homogeneous sensors (e.g. surveillance systems) as well as heterogeneous sensors (e.g. inspection systems). Here, the recorded data complement each other. To avoid overlap, the number of cameras should be kept to the minimum the task requires.
Compact camera arrays
The cameras of a compact camera array are modeled together and therefore have additional extrinsic parameters by which it is possible to describe the position of the entire camera array with respect to the scene. In this case, the spatial coverage areas usually overlap considerably.
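One way to picture the joint modeling is as a chain of two rigid transforms: a shared pose that places the whole array relative to the scene, followed by a fixed per-camera offset inside the array. The sketch below uses 4×4 homogeneous matrices. The helper names, the 10 cm baseline and the example pose are illustrative assumptions and are not taken from any specific system.

```python
import numpy as np


def make_pose(rotation: np.ndarray, translation) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose


def world_to_camera(point_world, array_from_world, camera_from_array):
    """Map a world point into one camera's frame via the shared array pose."""
    point_h = np.append(np.asarray(point_world, dtype=float), 1.0)
    return (camera_from_array @ array_from_world @ point_h)[:3]


# Shared extrinsics of the whole array (assumed: no rotation, scene 1 m away).
array_from_world = make_pose(np.eye(3), [0.0, 0.0, 1.0])

# Fixed offsets of two cameras inside the array (assumed 10 cm baseline).
left_from_array = make_pose(np.eye(3), [0.05, 0.0, 0.0])
right_from_array = make_pose(np.eye(3), [-0.05, 0.0, 0.0])

point = [0.0, 0.0, 1.0]
in_left = world_to_camera(point, array_from_world, left_from_array)
in_right = world_to_camera(point, array_from_world, right_from_array)
```

Moving the array only changes `array_from_world`; the per-camera offsets stay fixed, which is what distinguishes this from modeling each camera on its own.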
Such a system usually contains homogeneous sensors. The acquired information can be complementary as well as distributed (only a joint evaluation of the images provides the desired information). Compact camera arrays can also capture univariate image series (variation of a single acquisition parameter) and multivariate image series (variation of multiple acquisition parameters).
Compact camera arrays are becoming increasingly important for many applications because they offer comprehensive capabilities to fully capture the visually detectable information of a scene.
Calibration of plenoptic cameras
Because companies use image-processing measurement systems, metric information must be obtained from the light field data. This information is derived from the calibration and the general characteristics of the plenoptic camera.
Commercially available plenoptic cameras provide distance values in non-metric units. This is a hurdle for robotics applications, which need metric distance values. By separating the traditional camera properties of a plenoptic camera from its new features, it becomes possible to use traditional camera calibration methods, which simplifies the plenoptic calibration process and increases accuracy. As with a traditional camera, the pinhole camera model is used.
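For reference, the pinhole camera model maps a 3D point in the camera frame to pixel coordinates by dividing by its depth and scaling with the focal length. The sketch below adds a single radial distortion coefficient `k1`, a common simplification; the function name, the parameter names and the one-coefficient distortion model are assumptions made for illustration, not the model of any specific calibration tool.

```python
import numpy as np


def project_pinhole(points_cam, focal_px, principal_point, k1=0.0):
    """Project 3D points (camera frame, Z pointing forward) to pixels.

    points_cam: array-like of shape (N, 3).
    focal_px: focal length expressed in pixels.
    principal_point: (cx, cy) in pixels.
    k1: first radial distortion coefficient (0 means an ideal pinhole).
    """
    points_cam = np.asarray(points_cam, dtype=float)
    # Normalized image coordinates: perspective division by the depth Z.
    x = points_cam[:, 0] / points_cam[:, 2]
    y = points_cam[:, 1] / points_cam[:, 2]
    # Radial distortion scales points by their squared distance to the axis.
    scale = 1.0 + k1 * (x**2 + y**2)
    x_px = focal_px * scale * x + principal_point[0]
    y_px = focal_px * scale * y + principal_point[1]
    return np.stack([x_px, y_px], axis=1)


pixels = project_pinhole(
    [[0.1, -0.2, 2.0], [0.0, 0.0, 1.0]],
    focal_px=1000.0,
    principal_point=(320.0, 240.0),
)
```

A traditional calibration estimates `focal_px`, the principal point and the distortion coefficients by minimizing the distance between such projections and the detected image points of a known target.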
The calibration is carried out in two steps, each with its own type of input data: 2D images with an extended depth of field and 3D depth images.
As a result, noise in the depth estimation no longer affects the estimation of traditional parameters such as the focal length or the radial lens distortion. This brings further advantages:
- Different optimizations can be applied to the two input data types, which makes it possible to reduce outliers for each data type in a more targeted way.
- The split into two steps is simpler and makes novel, faster initialization models for all parameters feasible.
- The models for lens distortion, depth distortion and internal quantities (focal length f, distance b between MLA and sensor, distance h between MLA and objective lens) are no longer mixed.
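The depth-related step can be pictured as a separate fit that turns the camera's non-metric depth values into metric distances, independent of the pinhole parameters. The text does not specify the depth model, so the low-order polynomial below is only a stand-in assumption; the function name `fit_depth_mapping` is hypothetical and the sample values are synthetic.

```python
import numpy as np


def fit_depth_mapping(raw_depth, metric_depth, degree=2):
    """Least-squares polynomial from raw (non-metric) depth to metres.

    raw_depth: depth values as reported by the camera for a target.
    metric_depth: reference distances of the same target in metres.
    """
    coefficients = np.polyfit(raw_depth, metric_depth, deg=degree)
    return np.poly1d(coefficients)


# Synthetic example: reference distances generated from a known curve.
raw_samples = np.linspace(1.0, 5.0, 9)
metric_samples = 0.02 * raw_samples**2 + 0.3 * raw_samples + 0.1

to_metres = fit_depth_mapping(raw_samples, metric_samples)
distance_m = to_metres(3.0)
```

Because this fit only sees depth data, outlier rejection tailored to noisy depth values could be added here without touching the estimation of focal length or lens distortion.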
In summary, plenoptic cameras can shift the depth of field even in retrospect and generate different data products from their images. To evaluate the data correctly, the calibration must be divided into two steps. This reduces the influence of noise, which in turn benefits the depth estimation. Plenoptic imaging techniques are therefore disruptive technologies that open up new application areas and allow traditional imaging techniques to be developed further.