Capturing Depth Information: Principles and Techniques of Time-of-Flight Sensors
Depth measurement with Time-of-Flight sensors is based on measuring the time that radiation needs to travel from a source to an object and, after being reflected, back to a sensor. As illustrated in Figure 1, a Time-of-Flight system fundamentally consists of a radiation source that emits an amplitude-modulated signal and a sensor that captures the signal reflected from objects. Typically, infrared (IR) or near-infrared (NIR) light is used, restricted to a narrow frequency band to suppress the influence of ambient light. In the simplest case, the sensor consists of a single pixel that provides a single depth value. The depth is calculated from the speed of light $c$ and the flight time $\Delta t$ that the light needs to reach the object and return to the sensor. Because the light covers the distance twice, the product is halved: \begin{equation}d = \frac{c \cdot \Delta t}{2}.\label{eq:distancefromtime}\end{equation}
At the speed of light, the flight time amounts to only a few nanoseconds; one meter of object distance corresponds to a round trip of about 6.7 ns. For laser scanners, Single Photon Avalanche Diodes (SPADs), also known as Geiger-mode APDs (GAPDs), are used. These detect photon arrival times with an accuracy of a few picoseconds, enabling distance measurement with an accuracy of a few millimeters [GVS18].
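The relation between flight time and distance can be illustrated with a minimal sketch. The function name and the example timings are assumptions chosen for illustration and are not taken from the cited sources.

```python
C = 299_792_458.0  # speed of light in m/s


def distance_from_flight_time(delta_t: float) -> float:
    """Return the object distance in metres for a round-trip time in seconds."""
    return C * delta_t / 2.0


# A round trip of 10 ns corresponds to an object distance of roughly 1.5 m.
d_object = distance_from_flight_time(10e-9)

# A timing uncertainty of 10 ps corresponds to roughly 1.5 mm of depth.
depth_resolution = distance_from_flight_time(10e-12)

print(f"{d_object:.3f} m, {depth_resolution * 1000:.2f} mm")
```

The second value shows why picosecond timing is needed for millimeter accuracy.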
Since measuring time directly places high demands on the precision of the components, an alternative approach integrates the modulated light signal over a period of time. The distance is then calculated either directly from the ratio of two intensities (pulse duration modulation) or from the phase shift of a continuous wave function (Continuous-Wave Modulation). Both methods are explained in detail in the subsections on Pulse Duration Modulation and Continuous-Wave Modulation.
Both methods use an amplitude-modulated signal to measure either the distance directly or a phase shift from which the distance can be calculated. In amplitude modulation, the amplitude of a high-frequency carrier signal is varied by a low-frequency modulating signal (the useful signal). For Time-of-Flight camera systems, the carrier signal is the emitted infrared light. The frequency of the carrier signal plays no role in calculating the distance; the carrier only serves to transmit the useful signal.
Pulse Duration Modulation
Time-of-Flight cameras that operate with pulse duration modulation emit light pulses at specific intervals with a defined duration $t_0$ and determine the time $\Delta t$ that the light takes to be reflected from an object's surface and return to the sensor. The light is emitted over a duration $t_0$ by an infrared LED or laser, and the returning light is captured by sensors. Depending on the distance, the light hits the sensor with a delay $\Delta t$.
Li [Li14] presents a solution for measuring the delay: several so-called buckets per pixel integrate the reflected light, each over a period of $t_0$ but with a different time offset, and the delay $\Delta t$ is approximated from the ratio of the collected intensities. In the simplest form, the system contains two buckets $C_1$ and $C_2$. Bucket $C_1$ captures the reflected light during the emission period $t_0$, while bucket $C_2$ starts offset by $t_0$ and captures the reflected light over a second period of the same duration $t_0$. Figure 2 illustrates the concept. The electrical charges $Q_1$ and $Q_2$ of the respective buckets $C_1$ and $C_2$ are used to calculate the time offset $\Delta t$ from the ratio \begin{equation}\Delta t = t_0 \cdot \frac{Q_2}{Q_1 + Q_2}.\end{equation}
The distance can be determined by substituting $\Delta t$ into equation 2.34 \begin{equation}d = \frac{c \cdot t_0}{2} \cdot \frac{Q_2}{Q_1 + Q_2}.\end{equation}
The maximum distance $d_{max}$ that can be measured is derived from the time the light takes to be reflected from an object and return to the sensor during the period $t_0$ \begin{equation}d_{max} = \frac{c \cdot t_0}{2}.\end{equation}
Light that needs longer than $t_0$ to return is no longer captured by $C_1$. For distances beyond $d_{max}$, the charge $Q_1$ is therefore 0 and the ratio $\frac{Q_2}{Q_1 + Q_2}$ equals 1 regardless of the actual distance, as long as $Q_2$ is nonzero. The distance can then no longer be calculated correctly.
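The two-bucket calculation and its range limit can be sketched as follows. The function names, the pulse duration of 50 ns, and the charge values are hypothetical and serve only to illustrate the equations above.

```python
C = 299_792_458.0  # speed of light in m/s


def max_pulse_distance(t0: float) -> float:
    """Maximum measurable distance for a pulse of duration t0 (seconds)."""
    return C * t0 / 2.0


def pulse_distance(q1: float, q2: float, t0: float) -> float:
    """Distance from the bucket charges Q1 and Q2 of a two-bucket pixel."""
    total = q1 + q2
    if total <= 0.0:
        raise ValueError("no reflected light was captured")
    return max_pulse_distance(t0) * q2 / total


t0 = 50e-9  # hypothetical pulse duration
d_max = max_pulse_distance(t0)

# A quarter of the reflected light falls into the second bucket,
# so the object is at a quarter of the maximum distance.
d_near = pulse_distance(q1=3.0, q2=1.0, t0=t0)

# Beyond d_max the first bucket stays empty and the result saturates at d_max.
d_saturated = pulse_distance(q1=0.0, q2=0.4, t0=t0)
```

Because the result saturates, a measurement equal to $d_{max}$ cannot be distinguished from an object that is farther away.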
Continuous-Wave Modulation
In contrast to simple pulse duration modulation, Continuous-Wave Modulation emits either rectangular or sinusoidal wave functions. Rectangular functions are usually used because they are easier to realize in electronic circuits [Li14].
If the frequency $f$ of a signal is known, the phase shift of the reflected signal can be calculated, allowing the flight time $\Delta t$ to be determined. The determination of the phase shift varies depending on the chosen modulation. It should be noted that the frequency $f$ refers not to the frequency of the infrared light but to the frequency of the useful signal.
Creath [Cre88] compared several phase measurement methods. According to Giancola et al. [GVS18], the four-bucket variant is the most widely used; it was also used by Meister et al. [MNK13] in their work on simulating a Time-of-Flight camera. Here, the incident photons are measured as electrical charges $Q_1$, $Q_2$, $Q_3$, and $Q_4$ in the four buckets $C_1$, $C_2$, $C_3$, and $C_4$, and the phase shift $\phi$ is estimated as \begin{equation}\phi = \arctan\Big(\frac{Q_3 - Q_4}{Q_1 - Q_2}\Big).\label{eq:phasefromcharge}\end{equation}
The time offset $\Delta t$ is then calculated using the phase shift $\phi$
\begin{equation}\Delta t = \frac{\phi}{2 \pi f}.\label{eq:timefromphase}\end{equation}
The resulting distance $d$ can finally be determined by substituting the time offset $\Delta t$ into equation 2.34.
Additionally, according to Li [Li14], the charges $Q_1$, $Q_2$, $Q_3$ and $Q_4$ can be used to calculate the intensity $A$ of the infrared signal and the offset $B$, caused by ambient lighting \begin{equation}A = \frac{\sqrt{(Q_1-Q_2)^2+(Q_3-Q_4)^2}}{2}\end{equation} \begin{equation}B = \frac{Q_1+Q_2+Q_3+Q_4}{4}.\end{equation}
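The four-bucket evaluation can be sketched in a few lines. Two points are assumptions of this sketch rather than part of the cited method: `math.atan2` replaces the plain arctangent so that the phase covers the full range from $0$ to $2\pi$, and the example charges are generated from an idealized sinusoidal model.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def four_bucket_estimate(q1, q2, q3, q4, f):
    """Return (phi, delta_t, d, amplitude, offset) for modulation frequency f in Hz."""
    # atan2 resolves the quadrant; the modulo maps the result to [0, 2*pi).
    phi = math.atan2(q3 - q4, q1 - q2) % (2.0 * math.pi)
    delta_t = phi / (2.0 * math.pi * f)
    d = C * delta_t / 2.0
    amplitude = math.hypot(q1 - q2, q3 - q4) / 2.0
    offset = (q1 + q2 + q3 + q4) / 4.0
    return phi, delta_t, d, amplitude, offset


# Hypothetical charges from an idealized model with known phase, amplitude and offset.
true_phi, true_a, true_b = math.pi / 3.0, 40.0, 100.0
q1 = true_b + true_a * math.cos(true_phi)
q2 = true_b - true_a * math.cos(true_phi)
q3 = true_b + true_a * math.sin(true_phi)
q4 = true_b - true_a * math.sin(true_phi)

phi, delta_t, d, amplitude, offset = four_bucket_estimate(q1, q2, q3, q4, f=20e6)
```

In this idealized case the estimate reproduces the phase, amplitude, and offset that were used to generate the charges.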
The maximum distance that can be measured unambiguously using Continuous-Wave Modulation is determined by the wavelength of the useful signal. Since the light travels to the object and back, this maximum distance $d_{amb}$ corresponds to half the wavelength: \begin{equation}d_{amb} = \frac{c}{2 f}.\end{equation}
Since the signal repeats after a phase shift of $2 \pi$, depth values estimated from the phase shift become ambiguous beyond the maximum measurable depth $d_{amb}$. One way to increase the maximum distance is to reduce the modulation frequency. However, this reduces accuracy, as the noise $\sigma$ increases. According to Li [Li14], the noise behavior can be approximated by the following equation: \begin{equation}\sigma = \frac{c}{4 \sqrt{2} \pi f} \cdot \frac{\sqrt{A+B}}{c_d A}.\label{eq:timeofflighterror}\end{equation}
The modulation contrast $c_d$ quantifies how well the Time-of-Flight sensor collects and separates photoelectrons. Equation 2.43 shows that a high amplitude and a high modulation frequency increase accuracy, but a high modulation frequency also leads to a shorter range.
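The trade-off between range and accuracy can be made concrete with a small sketch. The values for amplitude, offset, and modulation contrast are hypothetical, so only the ratio between the two noise values is meaningful here.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def ambiguity_distance(f: float) -> float:
    """Maximum unambiguous distance in metres for modulation frequency f in Hz."""
    return C / (2.0 * f)


def depth_noise(f: float, amplitude: float, offset: float, contrast: float) -> float:
    """Approximate depth noise according to the noise equation above."""
    return (C / (4.0 * math.sqrt(2.0) * math.pi * f)
            * math.sqrt(amplitude + offset) / (contrast * amplitude))


# Hypothetical signal parameters, identical for both frequencies.
A, B, C_D = 40.0, 100.0, 0.5

range_low = ambiguity_distance(20e6)    # about 7.49 m
range_high = ambiguity_distance(100e6)  # about 1.50 m
noise_low = depth_noise(20e6, A, B, C_D)
noise_high = depth_noise(100e6, A, B, C_D)

# Raising the frequency by a factor of 5 divides both range and noise by 5.
print(range_low / range_high, noise_low / noise_high)
```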
Some Time-of-Flight systems, such as the Kinect v2, therefore combine multiple recordings at different frequencies, as illustrated in Figure 3, to increase the accuracy of the estimated depth values while still measuring large distances. The true depth value is determined by comparing the candidate depth values of all frequencies and selecting the one on which all frequencies agree. Giancola et al. [GVS18] studied the structure and function of the Kinect v2 in detail and found that it uses three different frequencies. For each depth image, a recording is first made at 80 MHz for 8 ms, followed by a recording at 16 MHz for 4 ms and a final recording at 120 MHz for 8 ms. Each frequency has a different ambiguity distance $d_{amb}$. The frequency at which the ambiguities of all frequencies coincide is referred to as the beat frequency. It is usually lower than the individual frequencies and therefore has a larger maximum distance [Li14]. The beat frequency of the Kinect v2 is 8 MHz, allowing a maximum ambiguity-free distance of 18.73 meters.
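The following sketch illustrates the idea of resolving the ambiguity with several frequencies. It is a noise-free brute-force search written for illustration and is not the algorithm used by the Kinect v2; the function names and the test distance of 10 m are assumptions. The beat frequency is computed here as the greatest common divisor of the three frequencies, which yields the 8 MHz stated above.

```python
import math
from functools import reduce

C = 299_792_458.0  # speed of light in m/s
FREQUENCIES = [80_000_000, 16_000_000, 120_000_000]  # Hz, as reported for the Kinect v2


def ambiguity_distance(f):
    """Maximum unambiguous distance in metres for modulation frequency f in Hz."""
    return C / (2.0 * f)


def unwrap_distance(wrapped, freqs):
    """Find the distance below the beat-frequency range that matches all wrapped values."""
    beat = reduce(math.gcd, freqs)
    d_limit = ambiguity_distance(beat)
    f_ref = max(freqs)
    d_ref = ambiguity_distance(f_ref)
    w_ref = wrapped[freqs.index(f_ref)]
    best, best_err = None, math.inf
    n = 0
    while w_ref + n * d_ref < d_limit:
        candidate = w_ref + n * d_ref
        err = 0.0
        for f, w in zip(freqs, wrapped):
            d_amb = ambiguity_distance(f)
            diff = (candidate - w) % d_amb
            err += min(diff, d_amb - diff)  # circular difference
        if err < best_err:
            best, best_err = candidate, err
        n += 1
    return best


beat_frequency = reduce(math.gcd, FREQUENCIES)
true_distance = 10.0  # hypothetical object distance in metres
wrapped = [true_distance % ambiguity_distance(f) for f in FREQUENCIES]
recovered = unwrap_distance(wrapped, FREQUENCIES)
```

Each single frequency reports only the wrapped value, yet the combination recovers the full distance as long as it lies below the ambiguity distance of the beat frequency.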
Amplitude Modulation with Rectangular Signal
With a rectangular signal, light pulses are emitted at regular intervals, as with pulse duration modulation. Unlike in the pulse method, the duration of the light pulses equals the length of the intervals between them, creating a continuous signal whose phase shift can be determined using equation 2.38. As Figure 4 shows, this is done with the help of the above-mentioned four buckets, each offset by $90^\circ$. Using the charges $Q_1$, $Q_2$, $Q_3$ and $Q_4$, the phase shift of the reflected signal can be determined. The pulse duration $t_0$ of a rectangular pulse of a signal with frequency $f$ corresponds to half the period: \begin{equation}t_0 = \frac{1}{2f}.\end{equation}
Amplitude Modulation with Sinusoidal Signal
This section explains how the flight time $\Delta t$ of the light is determined by approximating the phase shift $\phi$ of a sine wave, following Hertzberg et al. [HF14]. Sinusoidal modulation avoids the high requirements on the edge steepness of the rectangular signal, which are particularly critical at high frequencies. Sine waves therefore allow a more robust and accurate estimation of the phase shift $\phi$ and thus of the time offset $\Delta t$ [GVS18].
The generated signal is a sine wave $\psi(t)$ with a known amplitude $I$, frequency $f$, and offset $c_o$
\begin{equation}\psi(t) = I \cdot \sin(2\pi f t) + c_o.\end{equation}
The offset is chosen so that it is greater than or equal to $I$, as no light can be emitted with negative intensity. A portion $\alpha$ of the emitted light returns to the sensor, with the phase of the signal shifted depending on the time $\Delta t$ the light was in transit. An additional constant amount of ambient lighting $c_B$ is also added. The measured light can thus be described as follows:
\begin{equation}z(t) = \alpha \cdot \psi(t - \Delta t) + c_B.\end{equation}
To determine the phase shift, the signal is integrated over half a period in each bucket $k \in \{0, 1, 2, 3\}$, with the integration window shifted by a quarter period per bucket
\begin{equation}s^{[k]}=\int_{\frac{k}{4f}}^{\frac{k+2}{4f}}z(t)dt=\Big[c_1t-\frac{2\alpha I}{4\pi f}\cos(2\pi f(t-\Delta t))\Big]_{\frac{k}{4f}}^{\frac{k+2}{4f}}\end{equation}
\begin{equation}=c_2+\frac{2\alpha I}{2\pi f} \cos\Big(\frac{\pi}{2}k-2\pi f \Delta t\Big).\end{equation}
Here, $c_1 = \alpha c_o + c_B$ and $c_2 = \frac{c_1}{2f}$ collect the constant terms.
The phase shift is determined analogously to equation 2.38 from the integrated bucket values $s^{[0]}$, $s^{[1]}$, $s^{[2]}$ and $s^{[3]}$
\begin{equation}\phi = \arctan\Big(\frac{s^{[1]} - s^{[3]}}{s^{[0]} - s^{[2]}}\Big).\label{eq:sin_phase_calculation}\end{equation}
The flight time is then calculated from equation 2.39 and the resulting distance is finally determined by substituting into equation 2.34.
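The derivation can be checked numerically with a short simulation. The sketch integrates $z(t)$ with the midpoint rule instead of using the closed form, uses `math.atan2` in place of the plain arctangent to cover the full phase range, and relies on hypothetical values for $I$, $c_o$, $\alpha$, $c_B$, the frequency, and the flight time.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def bucket(k, f, delta_t, i_amp=1.0, c_o=1.0, alpha=0.5, c_b=0.2, steps=2000):
    """Integrate z(t) over [k/(4f), (k+2)/(4f)] with the midpoint rule."""
    t_start = k / (4.0 * f)
    t_end = (k + 2) / (4.0 * f)
    h = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t = t_start + (i + 0.5) * h
        psi = i_amp * math.sin(2.0 * math.pi * f * (t - delta_t)) + c_o
        total += alpha * psi + c_b  # z(t): attenuated signal plus ambient light
    return total * h


def estimate_flight_time(f, s):
    """Recover the flight time from the four bucket values s[0] to s[3]."""
    phi = math.atan2(s[1] - s[3], s[0] - s[2]) % (2.0 * math.pi)
    return phi / (2.0 * math.pi * f)


f_mod = 20e6         # hypothetical modulation frequency in Hz
true_delta_t = 10e-9  # hypothetical flight time in seconds
s = [bucket(k, f_mod, true_delta_t) for k in range(4)]
delta_t = estimate_flight_time(f_mod, s)
distance = C * delta_t / 2.0  # roughly 1.5 m
```

The constant terms, which contain the ambient light, cancel in both differences, so the estimate does not depend on $c_B$ in this idealized setting.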
Bibliography | |
---|---|
[GVS18] | Silvio Giancola, Matteo Valenti, and Remo Sala. A Survey on 3D Cameras: Metrological Comparison of Time-of-Flight, Structured-Light and Active Stereoscopy Technologies. Springer-Verlag GmbH, 2018. |
[Li14] | Larry Li. Time-of-flight camera - an introduction. Texas Instruments-Technical White Paper, 2014. |
[Cre88] | Katherine Creath. V phase-measurement interferometry techniques. Progress in optics, 26, 1988. |
[MNK13] | Stephan Meister, Rahul Nair, and Daniel Kondermann. Simulation of time-of-flight sensors using global illumination, 2013. doi:10.2312/pe.vmv.vmv13.033040. |
[Wya82] | James C. Wyant. Interferometric optical metrology-basic principles and new systems. Laser Focus with Fiberoptic Technology, 1982. |
[HF14] | Christoph Hertzberg and Udo Frese. Detailed modeling and calibration of a time-of-flight camera. In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics. SCITEPRESS - Science and Technology Publications, 2014. doi:10.5220/0005067205680579. |