Once, when I was chatting with a friend outside the automotive industry about the not-so-new concept of autonomous driving, he replied, “Autonomous driving? At least ten years away.” That reply shows, from another angle, that many people still know little about even assisted driving, let alone autonomous driving.
Therefore, if given the opportunity, I would like to use the simplest words possible to talk with everyone about the current progress of autonomous driving, and this inevitably involves the currently popular “LiDAR”.
First of all, it helps to keep in mind the three basic elements of autonomous driving: perception-decision-execution. The perception layer is the very first step in the whole pipeline, and whether perception is accurate plays a decisive role in the decision-making and execution that follow.
You can imagine a scene here: for a driver with 500-degree myopia (about -5.00 diopters), driving with or without glasses makes a qualitative difference. That is the importance of the “perception layer”.
LiDAR exists to address this issue, and its value is not only that it gives the machine a pair of “glasses” to wear; more importantly, with these glasses on, the machine can perceive the world farther away and in 360 degrees with no blind spots.
Why Do We Need LiDAR
In the previous autonomous driving market, high-end players included Google Waymo, Cruise, and Baidu Apollo, among others. For these technology giants, their revenue does not come from selling cars, but from selling systems and solutions.
Therefore, their primary task is to first create a “finished product”, which means that they can achieve autonomous driving by piling up BOMs regardless of cost.
For the “down-to-earth” passenger car market, adopting the approach of these technology giants will only result in either cars being sold at exorbitant prices or sold at a loss, as if it were philanthropy. Commercialization is what they should be more concerned about.
In other words, what they want to do is to produce cars that are affordable to most people and have a certain level of profitability. Therefore, a fusion solution of lower-cost millimeter-wave radar + cameras has become their preferred option, which is what we often hear about as L2 level assisted driving.
However, the upper limit of this combination’s capabilities is not high and it is difficult to achieve true autonomous driving. The main reason is that the perception ability is not enough, in other words, the machine’s understanding of the world is not clear enough.
First, let’s talk about millimeter-wave radar, which is mainly used to measure distance, speed, and azimuth. The principle is to continuously emit and receive electromagnetic waves: distance comes from the time difference between emission and return, speed from the Doppler effect, and azimuth from the phase difference of the reflected wave. You can picture playing ping-pong against a wall: from the ball’s trajectory you can work out how far you are from the wall, how fast the ball is moving, and at what angle you are standing to the wall; millimeter-wave radar does the same thing with electromagnetic waves.
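To make those three measurements concrete, here is a minimal Python sketch with made-up numbers for a generic 77 GHz automotive radar; the formulas (range from round-trip time, speed from Doppler shift, azimuth from the phase difference between two receive antennas) are the textbook ones, not any particular sensor’s implementation.

```python
import math

C = 3.0e8                     # speed of light, m/s
F_CARRIER = 77e9              # assumed carrier frequency, Hz
WAVELENGTH = C / F_CARRIER    # ~3.9 mm

# 1) Range from the round-trip time of the echo: d = c * dt / 2.
round_trip_time_s = 4.0e-7    # 400 ns, assumed measurement
distance_m = C * round_trip_time_s / 2
print(f"distance: {distance_m:.1f} m")            # -> 60.0 m

# 2) Relative speed from the Doppler shift: v = f_d * lambda / 2.
doppler_shift_hz = 5.13e3     # assumed measurement
speed_mps = doppler_shift_hz * WAVELENGTH / 2
print(f"relative speed: {speed_mps:.1f} m/s")     # ~10 m/s closing

# 3) Azimuth from the phase difference between two receive antennas
#    spaced half a wavelength apart: sin(theta) = dphi * lambda / (2*pi*d).
antenna_spacing_m = WAVELENGTH / 2
phase_diff_rad = 0.8          # assumed measurement
azimuth_rad = math.asin(phase_diff_rad * WAVELENGTH
                        / (2 * math.pi * antenna_spacing_m))
print(f"azimuth: {math.degrees(azimuth_rad):.1f} deg")   # ~14.7 deg off-center
```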
However, millimeter-wave radar has many drawbacks. For one, even with an elevation antenna it cannot properly perceive height. If the radar is mounted on a car’s front bumper and an obstacle hangs in the air above its detection plane, the radar cannot detect it because the heights do not match.
Of course, that is only one of its shortcomings. There is also the fact that most objects on the road produce echoes, so millimeter-wave radar must filter out the echoes from stationary objects in its algorithm. In a curve, for example, a utility pole standing directly ahead may be mistaken for an obstacle in the lane, triggering a braking maneuver and the strange “ghost braking” behavior.
In addition, millimeter-wave radar produces no echoes from transparent objects such as raindrops. This makes the machine think the road ahead is completely clear, which directly degrades perception accuracy on rainy days.
After talking about millimeter wave, let’s talk about cameras.
First of all, the machine, stupid as it is, has no ability to associate, so it cannot understand the meaning of a video directly. It can, however, understand pictures, thanks to image recognition from artificial intelligence. The principle is that the machine splits the video into individual frames, then works through each frame in the most primitive “pixel-local-whole” progression to identify what it shows.
Along the way, “algorithm logic” is essential: it tells the machine what it is looking at and classifies the images, so that the next time the machine sees the same thing, it knows what it is.
Take this taillight as a simple example: I know it belongs to a BMW 5 Series. Why do I know it is a 5 Series? First, because I have seen one before; second, because my brain identifies it as a 5 Series.
So here are two questions. First, if this 5 Series were repainted white, I would still recognize it; would the machine? Second, if an Audi A4L turned up instead, would the machine recognize that?
The answer is: it recognizes the first, but not the second. The first is thanks to the convolutional neural network used in deep learning. Put simply, I decide whether it is a 5 Series not by its color but by its taillights; they are the core feature. A convolutional neural network breaks the image into many pixel regions and looks for the key features in each; as long as the taillight region matches, the network makes the identification, regardless of color. At this point, the machine has learned to “think.”
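As a toy illustration of that idea (not a real convolutional network, just the intuition behind a single filter of one), the sketch below slides a small hand-crafted “taillight-shaped” filter over a tiny grayscale image, following the same pixel-local-whole progression; because the filter responds to shape rather than absolute brightness, a repainted (darker or lighter) car produces its strongest response in the same place.

```python
import numpy as np

# A tiny grayscale "image": zeros are background, the bright blob stands in
# for a taillight. A real CNN learns its filters from data; this one is
# hand-crafted purely to show the mechanism.
image = np.zeros((6, 8))
image[1:3, 5:8] = 1.0   # horizontal bar of the "taillight"
image[1:4, 5] = 1.0     # vertical bar of the "taillight"

# A 3x3 filter shaped like that pattern. Convolution means sliding the
# filter over every local patch and scoring how well the patch matches.
kernel = np.array([[1.0, 1.0, 1.0],
                   [1.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0]])

def correlate2d(img, k):
    """Valid cross-correlation: the 'local' step of pixel-local-whole."""
    h, w = k.shape
    out = np.zeros((img.shape[0] - h + 1, img.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * k)
    return out

response = correlate2d(image, kernel)
i, j = np.unravel_index(np.argmax(response), response.shape)
print(f"strongest taillight-like response at patch ({i}, {j})")

# A darker repaint scales every pixel by the same factor, so the location of
# the strongest response (the "it's a 5 Series" decision) does not move.
darker = correlate2d(image * 0.4, kernel)
assert np.argmax(darker) == np.argmax(response)
```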
As for the second case: if the machine has never seen a scenario, it naturally cannot recognize it. This is the classic long-tail effect in autonomous driving; there is no 100% safety, because there will always be scenarios the machine has not seen. Tesla, for example, has closed the data loop: users feed road data directly back to Tesla, which iterates its algorithm logic and pushes the result back to users via OTA updates. Among mass-produced cars, though, only Tesla has the computing power, the road-scene data, and the algorithm logic to pull this off. What about everyone else? Unwilling to settle for second place, they lean on “external” aids such as high-precision maps. More on that later.
Returning to the topic, the “image recognition” of the camera can theoretically compensate for the problem of static object recognition filtered out by millimeter-wave radar, but this also requires excellent algorithms and sufficient road scene data, as well as being able to see further with the camera.
However, the camera’s stability is also affected by the external environment. In heavy rain or strong backlight, it is basically time to cue up the song “Liáng Liáng” (Chinese shorthand for “we’re done for”). On top of that, limited resolution and imprecise distance measurement are further weaknesses of the vision-based solution.
Ultimately, the millimeter-wave radar collects distance and speed information while the camera collects road image information, and each passes its information to the chip, which makes the decision. This is called “fusion”. Here, the computing power of the chip is crucial, but we will discuss “chip” later. Today’s main character is “LiDAR”.
What is LiDAR used for?
Because the millimeter-wave + camera fusion solution struggles with weather and with recognizing stationary objects in high-speed scenarios, and because of the long-tail effect, adding LiDAR as a “safety redundancy” has become the politically correct move for car companies.
However, the most crucial reason is the rapid fall in LiDAR cost. A simple example: a few years ago, a Velodyne 64-line mechanical LiDAR cost $70,000, and technical support was only offered on orders above $1 million. Luminar’s soon-to-be-mass-produced 300-line, 1550 nm LiDAR, by contrast, is priced at no more than $1,000 including both hardware and software, a friendly price for car companies.

Working principle of LiDAR
LiDAR consists mainly of a laser emitter, an optical receiver, and an information-processing unit. It continuously emits laser beams outward; when they hit an obstacle, they are reflected back, and the returned light pulses are received by the sensor for calculation. Since the speed of light is known (roughly 300,000 kilometers per second), the time and phase difference between the emitted and returned signals gives the relative distance between the object and the vehicle. By scanning horizontally, or steering the beam with phased scanning, it measures the angle of objects, and by obtaining returns at different pitch angles it gains height information about the world.
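In formula form, the ranging step is plain time-of-flight: the pulse covers the distance twice, so d = c·Δt/2. A minimal sketch with an assumed round-trip time:

```python
C = 299_792_458.0  # speed of light, m/s

def tof_distance_m(round_trip_seconds: float) -> float:
    """Distance from one laser pulse's round-trip time: d = c * dt / 2."""
    return C * round_trip_seconds / 2

# A pulse that returns after about 1 microsecond corresponds to ~150 m.
print(f"{tof_distance_m(1.0e-6):.1f} m")   # -> 149.9 m
```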
With the three measurements above, LiDAR can sense the distance and angles of surrounding objects and then use software algorithms to build a 3D model, turning the real world we see into a virtual world the machine can understand.
In this process, each time the LiDAR emits and receives a laser signal, it records one position measurement, a point; the collection of these points is what we call a “point cloud.” The more points collected, the more accurately the LiDAR perceives the world, but this also demands more computing power.
As the vehicle is moving and the surrounding environment is changing in real time, to collect complete point cloud information, the speed of the sensor to collect point clouds must keep up; in other words, the number of output points should be enough. However, looking only at the “output point count” is still not enough. The scanning frequency and ranging sampling rate are equally important, and the three factors combined make up the LiDAR’s real-time perception of position information. To put it simply, we need to update the world we can see in real time, just like our eyes.
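To see how those three numbers multiply together, here is a back-of-the-envelope sketch; the line count, horizontal sample count, and frame rate are illustrative assumptions, not any specific product’s spec.

```python
# Rough point-rate budget for a scanning LiDAR (all numbers are assumptions).
lines = 300                # vertical channels / equivalent lines
horizontal_samples = 1000  # ranging samples per line in one sweep of the scene
frames_per_second = 10     # how often the whole field of view is refreshed

points_per_frame = lines * horizontal_samples
points_per_second = points_per_frame * frames_per_second
print(f"{points_per_frame:,} points per frame")      # 300,000
print(f"{points_per_second:,} points per second")    # 3,000,000
# The output point count, the scan frequency, and the ranging sample rate
# all have to rise together for the point cloud to stay dense in real time.
```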
In addition, to be able to sense the world on a larger scale, the angle that LiDAR can sense becomes important. The traditional way is to use a rotating component, which means that the LiDAR rotates and performs a 360-degree scan.
Solid-state LiDAR also has corresponding solutions. I’ll take the example of a MEMS LiDAR, which has a relatively simple structure. This technology scans the surrounding environment by rotating an embedded reflective mirror, bouncing the laser off the mirror and reflecting it at various angles. In principle, this is similar to the boring game of reflecting sunlight with a mirror that we used to play when we were kids.
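The mirror trick rests on a basic property of reflection: tilting the mirror by an angle θ swings the reflected beam by 2θ, so a tiny, fast-oscillating MEMS mirror can steer a fixed laser across a wide field of view. A minimal sketch with an assumed mirror tilt range:

```python
def beam_deflection_deg(mirror_tilt_deg: float) -> float:
    """Law of reflection: tilting the mirror by theta steers the beam by 2 * theta."""
    return 2.0 * mirror_tilt_deg

# A MEMS mirror oscillating +/-15 degrees (an assumed figure) sweeps the
# outgoing beam across 60 degrees of the scene.
fov_deg = beam_deflection_deg(15.0) - beam_deflection_deg(-15.0)
print(f"optical field of view: {fov_deg:.0f} degrees")   # -> 60 degrees
```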
Through the approaches above, LiDAR can project a laser grid onto its surroundings, receive the reflected pulses back, measure the relative distance between the emitter and the measured object, and directly obtain 3D vector data. Combined with the built-in positioning system, this determines the vehicle’s exact position in the environment, and it is more accurate than the parallax-based depth estimation that vision solutions run on 2D images. In short, you can roughly think of it as the difference between “eyeballing it” and “measuring it with a ruler.”

This not only solves the problem of unrecognizable static objects; the reflected laser pulses also trace out the shape and distance of objects, including the entire road and the other traffic participants on it, building a complete road scene that machines can understand. At the technical level, this goes a long way toward taming the long-tail effect.
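That “3D vector data” is literally a list of points: each return is a distance plus two angles, which converts straightforwardly into x, y, z coordinates. A minimal sketch (spherical-to-Cartesian conversion, with the axis convention an assumption of this example):

```python
import math

def to_cartesian(range_m: float, azimuth_deg: float, elevation_deg: float):
    """Turn one LiDAR return (distance + horizontal and vertical angles) into a
    3D point. Convention assumed here: x forward, y left, z up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = range_m * math.cos(el) * math.cos(az)
    y = range_m * math.cos(el) * math.sin(az)
    z = range_m * math.sin(el)
    return x, y, z

# One return 80 m away, 5 degrees to the left, 1 degree up (assumed values).
x, y, z = to_cartesian(80.0, 5.0, 1.0)
print(f"point: ({x:.1f}, {y:.1f}, {z:.1f}) m")   # -> (79.7, 7.0, 1.4) m
```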
LiDAR can do plenty more besides. For example, by analyzing the laser signals reflected back from different objects, it can classify those objects, much like the convolutional neural network mentioned earlier. At that point, LiDAR too has learned to “think for itself.”
Since LiDAR emits its own laser beams, its biggest advantage over cameras is that it does not depend on ambient light, so it works just as stably at night as during the day. As for how it copes with extreme weather such as rain, snow, fog, and frost, and with backlight, see the sections below.
What kind of LiDAR do we need
LiDAR comes in more varieties than almost any other sensor I have seen. By beam count there are 16-line, 32-line, 64-line units, and many more. By technical structure they divide into mechanical, hybrid solid-state, and pure solid-state; non-mechanical scanning methods include MEMS, phased arrays, and micro-lens arrays. By transmitter wavelength, there are 905 nm and 1550 nm.
Weighing all these factors, let me stake out a position: a LiDAR that truly meets automotive-grade requirements must satisfy three prerequisites, namely 300 lines, a solid-state structure, and a 1550 nm wavelength; everything else is transitional technology.
On the number of lines: 300 lines does not mean 300 laser beams stacked vertically as in a mechanical unit; rather, serial and parallel scanning achieves a resolution equivalent to 300 lines. In theory, the same technique could even reach the equivalent of 900 lines in the future.
Regarding the actual scene requirements of autonomous driving, the first problem LiDAR needs to solve is perception. The more lines, the more laser pulses are emitted, which means that more point cloud information is obtained and environmental perception is more accurate. The 300-line LiDAR can theoretically achieve image-level recognition accuracy, which is particularly friendly to machines. As mentioned before, only in this way can machines understand the world as seen by human eyes.
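For a sense of what that resolution means in practice, here is a rough calculation under an assumed 25-degree vertical field of view: 300 equivalent lines put neighboring beams about 0.08 degrees apart, which at 200 meters is roughly a 30 cm vertical gap between samples.

```python
import math

# Assumed vertical field of view, spread evenly across 300 equivalent lines.
vertical_fov_deg = 25.0
lines = 300

step_deg = vertical_fov_deg / lines
print(f"vertical angular step: {step_deg:.3f} deg")       # ~0.083 deg

# Vertical spacing between adjacent beams at a few distances.
for distance_m in (50, 100, 200):
    gap_m = 2 * distance_m * math.tan(math.radians(step_deg) / 2)
    print(f"at {distance_m:>3} m: ~{gap_m * 100:.0f} cm between neighboring lines")
```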
On the technology path, LiDAR divides into three types: mechanical rotary, hybrid solid-state, and pure solid-state. On cost and reliability grounds, mechanical rotary LiDAR will never be commercialized, because I have never seen a machine with continuously spinning parts that does not wear out and break down. And once the radar fails, a skilled technician has to replace parts and recalibrate it, which is both expensive and inconvenient. I am not optimistic about hybrid solid-state LiDAR either; only pure solid-state LiDAR is both wallet-friendly and stable.
Finally, let’s talk about wavelength; this deserves a few extra words. The safety of laser radiation depends on the combination of wavelength, output power, and exposure time. When strong light passes through the lens of the eye, it is focused onto the retina, which can burn it. If you ever focused sunlight through a magnifying glass as a kid, you understand the risk.
Visible light spans roughly 380-780 nm, from violet to red; that is why a rainbow shows only seven colors, as the rest cannot be seen. It also means that light at 905 nm is invisible to the naked eye, so at high power you cannot see the beam to avoid it, which is dangerous. For the protection of human eyes, the IEC laser-safety rules therefore impose strict power limits on wavelengths around 900 nm, which means a 905 nm LiDAR must operate at low output power. Rocket and defense applications, SpaceX for instance, are of course outside the scope of this discussion.
Interestingly, laser light at wavelengths of 1400 nm and above is absorbed by the transparent parts of the eyeball before it can reach the retina, so it does not harm the retina and is not bound by the same IEC limits. As a result, a 1550 nm LiDAR can operate at roughly 40 times the output power of a 905 nm one. The direct benefits of higher output power are higher point-cloud resolution, longer detection range, and better penetration in complex environments.

All three matter. First, higher point-cloud resolution means more accurate perception. Second, detection range directly determines whether the LiDAR fails you in high-speed scenarios: a longer range means earlier reaction time, leaving enough redundant distance, which is exactly the redundancy LiDAR is there to provide.
As for those “automotive-grade” LiDARs that can only see 100 meters at 10% reflectivity, send them back to the drawing board. Even if your car runs Brembo calipers and a Bosch iBooster that cut the braking distance from 100 km/h to 34 meters, what about 120 km/h? Do we fall back on millimeter-wave radar, which cannot even perceive height?
Even if the vehicle can stop within a 100-meter detection range and its own safety is assured (never mind how the passengers feel about it), the other vehicles on the road do not always have such hardware. It is also worth pointing out that any manufacturer that announces a LiDAR quoting only the maximum detection range, without the range at 10% reflectivity, is being deceptive.
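To put rough numbers on the 120 km/h worry: braking distance grows with the square of speed, and the car also covers ground while the system perceives and decides. A sketch under stated assumptions (the 34-meter figure quoted above, plus an assumed one-second perception-and-reaction budget):

```python
# Back-of-the-envelope stopping distances; every input here is an assumption
# for illustration, not a measured figure.
def stopping_distance_m(speed_kmh: float, reaction_s: float = 1.0,
                        ref_speed_kmh: float = 100.0,
                        ref_braking_m: float = 34.0) -> float:
    """Reaction distance plus braking distance, scaling braking with speed
    squared from the 100 km/h -> 34 m figure quoted in the text."""
    v_mps = speed_kmh / 3.6
    braking_m = ref_braking_m * (speed_kmh / ref_speed_kmh) ** 2
    return v_mps * reaction_s + braking_m

for kmh in (100, 120, 140):
    print(f"{kmh} km/h: ~{stopping_distance_m(kmh):.0f} m to stop")
# -> ~62 m, ~82 m, ~105 m: a 100 m detection range at 10% reflectivity leaves
#    a thin margin at highway speeds, while a 250 m range leaves plenty.
```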
Now for the third point, the straw that breaks the camel’s back. Full autonomous driving means the vehicle must handle all kinds of complex environments, including rain, snow, fog, frost, and even sandstorms.
First, it is common knowledge that light waves decay as they propagate through the atmosphere. Unlike millimeter-wave radar, which “does not produce echoes on semi-transparent objects,” LiDAR systems generate pulse echoes on semi-transparent objects such as “water droplets” in rainy or foggy conditions.
However, the reflection is only partial and scatters into the surroundings, and in thick fog especially, this degrades the accuracy of the returned signal. To use a rough analogy: when our glasses fog up or get covered in droplets, we cannot see the world clearly, because the “water” interferes with our visual perception.
At this point, LiDAR systems will usually improve their overall perception capability by “increasing the transmission power” and “elevating the receiver sensitivity.” Increasing the transmission power can also reduce the impact of environmental light on 1550 nm. That is, it offers a reliable solution to “backlight” scenarios. Don’t forget, increasing transmission power is the Achilles’ heel of 905 nm, which is why I am not optimistic about this wavelength.
One more thing about 905 nm: do not tell me that a shorter wavelength penetrates the atmosphere better. Once the power goes up 40 times, the shorter wavelength’s penetration is no match. And the prospect of some technical upgrade making 905 nm “more penetrating yet less harmful to human eyes” is essentially nil, because this is set by physics. It is like trying to change gravitational acceleration through technology; is that not a fairy tale?
Furthermore, I would like to explain why a LiDAR giant like Velodyne still insists on the 905 nm wavelength and claims it is “sufficient” for its products. First, Velodyne earns nearly half of its LiDAR revenue from non-automotive industries, and for non-automotive-grade applications the claim that “905 nm is sufficient” is no doubt true.
However, if Velodyne keeps pushing 905 nm LiDAR for autonomous vehicles, then I would completely agree with Elon Musk of SpaceX, who said that “only fools use LiDAR, and whoever uses LiDAR is doomed.”
To get to the point, I believe there is another important reason the 1550 nm wavelength beats 905 nm: the cost of the receiver sensor. A 905 nm receiver can be made from silicon; a 1550 nm receiver requires indium gallium arsenide, which costs roughly ten times as much, and the price of indium gallium arsenide shows no sign of falling. In short, debating “sufficient or not” while ignoring the technical barriers and manufacturing costs is pointless.
Now let’s talk about Luminar, the rising star of the industry. According to official sources, its LiDAR can detect targets out to 250 meters at less than 10% reflectivity, which is the kind of range I consider reasonable for automotive-grade use. Luminar also acquired Black Forest Engineering, a specialist in indium gallium arsenide, and together with its self-designed ASIC has solved the InGaAs cost problem: it can now keep the cost of a single receiver sensor down to 3 US dollars, which is genuinely impressive.
So I am quite curious how the other players on the 1550 nm route, such as Huawei, Innovusion, and DJI, have solved the receiver-cost problem. If anyone has information on this, please leave a comment in the comment section.
All of my views above are premised on the era of full autonomy combined with extreme scenarios. I also understand that at this stage, for cost reasons, some automakers may choose a 905 nm LiDAR whose detection range at 10% reflectivity is under 100 meters, and some have yet to fit LiDAR at all. I simply want to stress that cars equipped with that kind of LiDAR are not fully autonomous, and human intervention is still needed in specific situations.
In conclusion

While this article was being written, NIO unveiled Innovusion’s 300-line LiDAR at NIO Day, and Amnon Shashua, president and CEO of Mobileye, gave an excellent talk confirming the use of Luminar’s LiDAR.
Most interestingly of all, in late December Elon Musk’s other company, Tesla, was spotted road-testing with a Luminar LiDAR; my guess is that it was checking whether LiDAR can solve the pain points of the vision-only approach.
As for whether Tesla will ultimately adopt LiDAR: to my mind this is not really about eating its own words; after all, Tesla is used to that. What matters more is whether LiDAR’s cost can meet Musk’s expectations. Tesla’s stated mission, after all, is to accelerate the world’s transition to sustainable energy; if the price of the car cannot come down, how is that transition supposed to accelerate happily?
Finally, looking back at the industry: hardware will only ever be one side of the autonomous-driving contest. Already, LiDAR has reached centimeter-level positioning accuracy and single-chip computing power has passed 200 TOPS, and I believe that as the technology iterates, “millimeter-level” LiDAR and single chips beyond a thousand TOPS are not a dream.
What matters more is how each manufacturer puts that hardware to good use. The competition will therefore shift to the software and algorithm level: fusing the sensing information from millimeter-wave radar, cameras, and LiDAR more effectively, refining the autonomous-driving algorithm logic, and exploiting V2X application scenarios in the 5G era.
And only the combination of all these will be the foundation for autonomous driving to have the opportunity to turn from a dream into reality.
This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.