How useful can an ADAS system with a hardware cost of 6,000 yuan be?

Author: Driver

In 2022, driven by multiple factors, intelligent driving systems have seen explosive growth. According to data from the Ministry of Industry and Information Technology, the penetration rate of L2-level assisted driving in newly sold passenger cars reached 30% in the first half of the year, up 12.7 percentage points year on year.

In the past, intelligent driving was treated as a synonym for new car-making forces such as Tesla, NIO, XPeng, and Li Auto. Today, intelligent driving is gradually becoming a standard feature of cars.

The reason behind this shift is easy to understand. Now that the intelligentization of cars has become a consensus, more and more traditional automakers are developing L2/L2+ products, either in-house or in partnership with autonomous-driving companies such as Baidu, Haomo.AI, and Huawei, to stay competitive in the new era.

At this time, the adoption of intelligent driving functions is no longer a rare occurrence.

How to obtain high-quality, high-value data from large numbers of vehicles on the road, cheaply and efficiently, in order to iterate the technology has become the focus of competition among players. The entry of DJI has made that competition even more heated.

Trust Your Journey with DJI

Recently, we were invited to try DJI's self-developed intelligent driving system, the "Lingxi Smart Driving System". This was the first time DJI has demonstrated the actual functions of its self-developed system to users.

Let’s briefly introduce this system:

The "Lingxi Smart Driving System" is DJI's self-developed intelligent driving system for passenger vehicles. Its first mass-produced version is installed on Wuling's new KiWi model. Its biggest feature is a universal software architecture that DJI can tailor to the needs of different vehicle models.

As a result, it can be delivered in two modes: one covering the 0-80 km/h speed range and one covering 0-130 km/h.
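
To make the idea of one tailorable architecture concrete, here is a minimal configuration sketch. The trim names D80 and D130 are borrowed from later in the article; the feature flags and the mapping of each trim to a speed range are illustrative assumptions, not DJI's actual configuration format.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of a "tailorable" ADAS stack: one shared software
# architecture, with per-vehicle presets that trim the covered speed range and
# the enabled feature set. Names and flags are assumptions for illustration.

@dataclass
class AdasConfig:
    name: str
    max_speed_kph: int                      # upper bound of the covered speed range
    features: set = field(default_factory=set)

# A low-speed trim roughly matching the "0-80 km/h" mode described above.
D80 = AdasConfig(
    name="D80",
    max_speed_kph=80,
    features={"ACC", "LCC", "lever_lane_change", "AEB"},
)

# A higher trim roughly matching the "0-130 km/h" mode, adding highway NOA.
D130 = AdasConfig(
    name="D130",
    max_speed_kph=130,
    features=D80.features | {"highway_NOA"},
)

def feature_enabled(cfg: AdasConfig, feature: str, ego_speed_kph: float) -> bool:
    """A feature is available only within the trim's covered speed range."""
    return feature in cfg.features and ego_speed_kph <= cfg.max_speed_kph

print(feature_enabled(D80, "highway_NOA", 60))    # False: not in the D80 trim
print(feature_enabled(D130, "highway_NOA", 110))  # True
```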

In terms of scenes and functions, there are two key points: first, DJI can provide a standardized L2-level assisted driving system; second, it can also achieve high-level intelligent driving capabilities such as high-speed navigation and urban navigation assistance.

By trimming the hardware, it can meet the needs of most car manufacturers. Taking the KiWi as an example, the intelligent driving version can still be kept at roughly the 100,000-yuan price level.

Let's start with two overall takeaways:

  • “This is one of the top three basic L2 assisted driving systems that I have experienced.”
  • “Some details can still be optimized.”

Throughout the process, we roughly experienced these functions:

  • Lever-activated lane change (turn on the signal to change lanes automatically)
  • Turning ability
  • LCC and ACC
  • Handling of a preceding vehicle cutting in
  • Cone recognition and avoidance

It should be noted that the KiWi is equipped with a basic L2 system. Although it covers highway and urban scenarios, it differs from navigation-assisted driving: the system on the KiWi only performs basic lateral and longitudinal control.

Lever-Activated Lane Change

The lever-activated lane change (automatic lane change after the turn signal is flicked) already goes beyond what a typical L2 system offers, and on the KiWi EV it even surpasses the intelligent lane changes of the new car-making forces.

It can accelerate or decelerate by up to 15 km/h depending on conditions in the target lane. In simple terms, the system mimics the human habit of stepping on the gas to complete a lane change quickly (a minimal sketch of this decision logic follows the pros and cons below).

  • Advantages: the lane change is quick and the execution is decisive; the driver can clearly feel that the vehicle is changing lanes without any hesitation.
  • Disadvantages: because of the limited perception range of the fisheye cameras, if a car behind in the target lane accelerates just as the KiWi starts the lane change, the system still executes the manoeuvre very aggressively, which puts high psychological pressure on the driver.
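
Below is a minimal sketch of the speed-adjustment logic described above: adjust the ego speed by up to 15 km/h to merge into a gap in the target lane. The thresholds and the simple gap model are illustrative assumptions, not DJI's actual planner.

```python
# Illustrative lane-change speed adjustment: speed up or slow down by at most
# 15 km/h based on the gap situation in the target lane. Thresholds are
# assumptions, not DJI's actual logic.

MAX_ADJUST_KPH = 15.0

def lane_change_speed_target(ego_kph: float,
                             gap_ahead_m: float,
                             gap_behind_m: float,
                             rear_car_closing_kph: float) -> float:
    """Return the target speed for the merge, clamped to ego speed +/- 15 km/h."""
    if gap_ahead_m > 30 and rear_car_closing_kph > 5:
        # Room in front, but the car behind is closing: accelerate into the gap.
        target = ego_kph + MAX_ADJUST_KPH
    elif gap_behind_m > 30 and gap_ahead_m < 15:
        # Tight in front, room behind: slow down and slot in behind.
        target = ego_kph - MAX_ADJUST_KPH
    else:
        target = ego_kph
    return max(ego_kph - MAX_ADJUST_KPH, min(ego_kph + MAX_ADJUST_KPH, target))

# e.g. cruising at 70 km/h with a closing car behind and space ahead -> 85 km/h
print(lane_change_speed_target(70, gap_ahead_m=40, gap_behind_m=10, rear_car_closing_kph=8))
```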

Turning Ability

Turning ability is an important scenario that tests the sensing and control capabilities of a system.

Advantages:

  • It has a high success rate on large curves. In my experience, it handled all of them except two extreme ramp curves.

  • It keeps the car well centred in the lane during turns. A bad turning experience is usually the opposite: the car cannot hold its lane and may drift out of it.

  • Acceleration and deceleration control during turns is very good. The KiWi can correct its direction and decelerate at the same time in a turn; as for acceleration, in practice, when the KiWi has slowed down too much it picks speed back up, and the speedometer visibly climbs.

Disadvantages:

  • Deceleration control when entering a turn could be optimized. If the entry speed is too high and the system does not slow down in time, the situation can easily become dangerous and the pressure on the driver is too high.

LCC and ACC

The KiWi's LCC keeps the vehicle centred in the lane, with no drifting to the left or right in turns, whether the curve is large or small. The steering-wheel holding torque is moderate once assisted driving is engaged.

The ACC's car-following is very smooth. Whether it is reacting to the preceding vehicle decelerating or to a stationary vehicle ahead, two words sum it up: "fast" and "smooth".

What the ACC lacks is detail: its start timing when following is inconsistent. For example, in congested traffic, with the preceding vehicle stopped and no lane change involved, when both cars pull away at almost the same speed, the gap to the preceding vehicle widens noticeably, which makes it easy for other cars to cut in.
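
For context, here is a minimal constant-time-gap ACC sketch. It only illustrates the behaviour discussed above: the desired gap grows with speed, and sluggish tuning or reaction delay lets the gap to the lead car widen in stop-and-go traffic, which is exactly what invites cut-ins. All gains and parameters are illustrative assumptions.

```python
# Minimal constant-time-gap ACC sketch (illustrative gains, not DJI's tuning).

def acc_accel(ego_v: float, lead_v: float, gap: float,
              time_gap: float = 1.5, standstill_gap: float = 3.0,
              k_gap: float = 0.25, k_speed: float = 0.6) -> float:
    """Return commanded longitudinal acceleration (m/s^2)."""
    desired_gap = standstill_gap + time_gap * ego_v
    gap_error = gap - desired_gap          # positive: we are hanging too far back
    speed_error = lead_v - ego_v           # positive: the lead car is pulling away
    a = k_gap * gap_error + k_speed * speed_error
    return max(-3.5, min(2.0, a))          # comfort/safety limits

# Lead car pulls away from standstill: with weaker gains (or added reaction
# delay) the ego car starts late, the gap widens, and cut-ins become easy.
print(acc_accel(ego_v=0.0, lead_v=3.0, gap=4.0))                # responsive tuning
print(acc_accel(ego_v=0.0, lead_v=3.0, gap=4.0, k_speed=0.2))   # sluggish tuning
```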

Preceding Vehicle Cutting-In Handling

The KiWi's handling of cut-ins is both painful and enjoyable. Let's start with the good part. Assisted driving systems are often criticized for two things:

  1. When following in congestion, if the system does not react quickly enough when the preceding vehicle accelerates, the gap widens and other vehicles can easily cut in.

  2. When cruising at a steady 60 km/h, if another car cuts in and the system fails to recognize it, the situation becomes dangerous.

  • The "enjoyable" part of the KiWi experience is that the system reacts quickly while following, leaving other vehicles little opportunity to cut in.

  • The "painful" part is that the KiWi's strategy is too aggressive: at high speed, when another vehicle suddenly cuts in and comes close to colliding with the KiWi, the system still does not decelerate at all.

DJI's engineers said that the comfort-oriented cut-in handling is tied to the use of binocular stereo vision: recognition and ranging of preceding vehicles and obstacles are relatively accurate, so the system is confident enough to handle cut-ins smoothly rather than brake frequently.
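
A rough sketch of how cut-in handling can be structured: flag a neighbouring vehicle as a cut-in when it drifts toward the ego lane, then decide between easing off and braking based on time-to-collision. The thresholds are illustrative assumptions, not DJI's actual strategy, which, as noted above, deliberately leans toward comfort.

```python
# Illustrative cut-in handling: detect a vehicle entering the ego lane, then
# brake only if time-to-collision (TTC) is short. Thresholds are assumptions.

LANE_HALF_WIDTH_M = 1.75
TTC_BRAKE_S = 2.5          # brake only if the cut-in car is this close in time

def handle_cut_in(lateral_offset_m: float, lateral_vel_mps: float,
                  range_m: float, closing_speed_mps: float) -> str:
    entering_lane = abs(lateral_offset_m) < LANE_HALF_WIDTH_M and \
                    lateral_offset_m * lateral_vel_mps < 0   # moving toward lane centre
    if not entering_lane:
        return "ignore"
    ttc = range_m / closing_speed_mps if closing_speed_mps > 0 else float("inf")
    return "brake" if ttc < TTC_BRAKE_S else "ease_off"      # comfort-first handling

print(handle_cut_in(lateral_offset_m=1.2, lateral_vel_mps=-0.8,
                    range_m=12.0, closing_speed_mps=6.0))    # -> "brake"
```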

Consistency of the Experience

Since the system we experienced does not rely on high-precision maps, it does not behave like the systems from XPeng, NIO, or Li Auto, which can suddenly prompt for a takeover or simply exit on road sections where HD-map data is missing or inaccurate. The consistency of the experience is therefore very good.

Except for the handling at red light intersections, the system is consistent in its use on urban and high-speed roads.

Overall, DJI's system handles low-speed urban congestion very adeptly: it brakes only lightly rather than coming to a full stop, and once the preceding vehicle moves off it quickly catches up. On highways, the system can also change lanes to avoid large trucks or small obstacles, and subjectively its lateral movements are smaller and gentler.

After the test drive, the consistency and continuity of DJI's system throughout the process stood out:

  1. The success rate of lane changes on expressways is high;

  2. The experience on urban roads is almost the same as on highways;

  3. Even in heavily congested urban roads (around 4 pm on working days), where surrounding vehicles are queuing up, the system can still handle cut-ins smoothly and even complete low-speed lane-changing.

At a cost of less than RMB 6,000, DJI has delivered adaptive cruise control (ACC), lane keeping, and lever-activated lane changing, and those who have experienced it give it high praise. Can other companies keep up with DJI's experience?

Industry Status

Speaking of DJI's intelligent driving, it is necessary to look at the current state of the industry, which indicates the market's future ceiling and direction of development.

According to monitoring data from the Gaogong Intelligent Automobile Research Institute:

From January to July 2022, 10.63 million new passenger cars were registered for insurance in the Chinese market (excluding imports and exports), of which 2.839 million came with L2-level assisted driving as standard, a year-on-year increase of nearly 70% (counting sales plus optional installations, the figure may be close to 3 million).

Among these, apart from Tesla, Li Auto, and XPeng, which develop their systems in-house (with hardware from third-party suppliers), the rest of the top ten suppliers are still traditional foreign Tier 1s, with Denso, Bosch, and Continental the top three.

Among the mainstream single-camera ADAS solutions offered by Tier 1 suppliers today, Mobileye relies on its high-performance products to target the mid-to-high-end market, while Bosch wins market share on lower cost, performing well in the domestic market but on a downward trend.

In fact, the ADAS track led by Mobileye has not seen any stunning technological breakthroughs for a long time, and the framework built around the six basic safety functions has seen no breakthrough innovation.

2-megapixel perception has also hit a bottleneck for ADAS innovation: the upper limit of safety cannot be raised.

For example, mainstream ADAS has always scored poorly on small-target detection such as pedestrians. In the AEB-P tests conducted by the American Automobile Association (AAA) (AEB-P refers to AEB that responds to pedestrians, "an order of magnitude harder than plain AEB"), the assistance functions barely worked at all when adults crossed the road at night.

The reasons are multifaceted: a narrow field of view and low resolution greatly increase the instability of detection, and the core issue remains the stability of visual detection. Once detection fails, AEB, the last line of defence for safety, cannot be triggered and cannot play its proper role.
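
To make that dependency concrete, here is a toy AEB-trigger sketch: braking fires only if the pedestrian detection has been stable for several consecutive frames and the time-to-collision drops below a threshold, so flickering night-time detections mean the last line of defence simply never activates. All thresholds are illustrative assumptions.

```python
# Toy AEB trigger: requires stable detections AND a short TTC (illustrative).

TTC_THRESHOLD_S = 1.6
REQUIRED_CONSECUTIVE_HITS = 3

def aeb_should_brake(detections: list[bool], range_m: float, closing_mps: float) -> bool:
    """detections: per-frame pedestrian detection results, most recent last."""
    stable = len(detections) >= REQUIRED_CONSECUTIVE_HITS and \
             all(detections[-REQUIRED_CONSECUTIVE_HITS:])
    if not stable or closing_mps <= 0:
        return False
    return range_m / closing_mps < TTC_THRESHOLD_S

# Flickering detections at night: TTC is already critical, yet AEB never fires.
print(aeb_should_brake([True, False, True, False, True], range_m=10.0, closing_mps=8.0))
```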

The 2-megapixel single-camera perception scheme is easy to attack and hard to defend. With more choices on the market and its cost-effectiveness advantage fading, even though these assisted driving systems are mature enough, higher-end car manufacturers are starting to develop ADAS functions in-house to achieve breakthroughs in software algorithms.

The traditional 1V1R configuration uses one camera and one radar, extended in more complex cases to trinocular or multi-camera setups. Equipping such systems with a 1.5-megapixel camera is common practice, which limits functionality and can lead to hardware redundancy.

In addition, for ordinary mid-to-low-end brands, spending thousands of yuan to install an ADAS that only issues warnings squeezes profit margins, which easily makes price-sensitive car companies hesitate.

On the other hand, the new car-making forces, represented by Tesla, NIO, XPeng, and Li Auto, have moved from high-speed NOA toward urban assisted driving, and some have even made lidar standard equipment. High-performance computing platforms and high-spec sensors are pushing the new forces to new heights.

New Tier 1 companies, represented by Baidu, Huawei, and Momenta, are also exploring the field, but they mainly focus on high-speed navigation or urban assisted driving, and none has reached large-scale production yet. DJI, as a new Tier 1 entrant, has taken the stage with a distinctive positioning.

On closer analysis, the biggest difference between DJI and other Tier 1s, old or new, is its more clearly defined product positioning. Compared with Bosch, DJI has more of a product mindset: although its cost is higher than a traditional 1V1R setup, the experience it delivers is better and closer to users' needs than single-point function development.

Compared with the new Tier 1s, DJI's product positioning is also clearer. Its D80 and D130 series go up against other companies' combined driving-and-parking products, and the D130 adds high-speed NOA on top of the D80.

In other words, DJI places more emphasis on urban expressways than other companies do, optimizing specifically and independently for that scenario.

Now, let’s talk about why DJI’s product experience is so good.

Speaking of independent optimization: even the leading new forces in assisted driving, such as XPeng, which offers the best assisted-driving experience among them, still leave a user-experience gap here that is hard to close. The lack of differentiation in their marketing has pushed them toward urban assisted driving, and even in the urban-expressway NOA sub-track they will face fierce competition from DJI, which is prioritizing its investment there.

The Lingxi smart-driving hardware on the KiWi includes:

  • One forward-facing binocular (stereo) camera;
  • Four surround-view fisheye cameras;
  • One forward millimeter wave radar;
  • Twelve ultrasonic radars.

The biggest difference lies in how far DJI's hardware requirements and cost sit below those of the new forces on the market. With that much more hardware, the new forces ought to exceed DJI in experience; if their experience is not better, it only highlights DJI's distinctive strengths.

If focusing on polishing a single scenario is DJI's "Battle of the Chariots" at the product-positioning level, its unique technological advantages are the spear it uses to break into the market, and the advantages inherited from the drone business cannot be ignored.

DJI's in-vehicle capabilities rest on two major advantages: one is many years of accumulated experience in developing intelligent systems, with R&D output at the million-unit scale; the other is a highly vertically integrated software and hardware supply chain.

The Lingxi intelligent driving system on the KiWi is not lavish in its use of sensors; it could even be said to have squeezed costs to the limit. Facing the hardware cost pressure the whole industry is under, DJI stated:

In fact, engineering development should be viewed separately from user experience. Whatever the system architecture, the ultimate goal is to give users a "safe and easy to use" system. So we did not design the hardware architecture around cost compression, but around whether that architecture could give users a better experience, at least better than comparable solutions on the market. That is the logic behind our system design.

The 2-megapixel binocular stereo camera was chosen because of its strong distance and depth perception. Besides accurately identifying dynamic and static targets and road elements, it can obtain point-cloud depth information for arbitrary obstacles, effectively reducing the missed-detection rate.

"Arbitrary obstacle" means the system does not need to know what the obstacle actually is; it only needs to know that something is ahead, detect it, slow down, and avoid it.
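
A minimal sketch of that class-agnostic logic, assuming a stereo-derived point cloud in the vehicle frame: if enough 3D points fall inside the driving corridor within a safe distance, slow down, regardless of what the object is. The corridor size and point threshold are illustrative assumptions.

```python
import numpy as np

# Class-agnostic obstacle gating from a stereo point cloud (illustrative
# parameters): no classification needed, just "enough points in the corridor,
# closer than a safe distance -> decelerate".

def generic_obstacle_ahead(points_xyz: np.ndarray,
                           corridor_half_width_m: float = 1.2,
                           max_height_m: float = 2.5,
                           stop_distance_m: float = 25.0,
                           min_points: int = 30) -> bool:
    """points_xyz: (N, 3) array in vehicle frame (x forward, y left, z up)."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    in_corridor = (np.abs(y) < corridor_half_width_m) & (z > 0.2) & (z < max_height_m)
    in_range = (x > 0) & (x < stop_distance_m)
    return int(np.count_nonzero(in_corridor & in_range)) >= min_points

# Example: a cluster of points 15 m ahead in the lane triggers a slow-down.
cluster = np.random.normal(loc=[15.0, 0.0, 0.8], scale=[0.3, 0.3, 0.3], size=(50, 3))
print(generic_obstacle_ahead(cluster))  # True -> decelerate / plan avoidance
```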

In contrast, a single-camera solution needs additional sensor data to identify static obstacles, and when fusing the algorithms, the large accuracy differences between heterogeneous sensors can cause problems.

To summarize:

With a single camera, depth can only be inferred from motion, and the result is not very reliable.

Binocular stereo is the most commonly used method, but it cannot obtain depth in occluded regions, that is, places visible to the left camera but not to the right. Current methods infer depth there by interpolation, which is not the true value.

A trinocular setup can eliminate these occlusions using the middle camera: the disparity of each pixel can be computed several times, making the error small enough, and sub-pixel disparity also becomes more reliable.

However, the computational cost of a trinocular system is at least twice that of a binocular one, and a vision algorithm is useless if it cannot even reach five frames per second. In terms of compute allocation, trinocular vision is therefore not cost-effective compared with binocular: running the same algorithm on three cameras needs twice the computing power. If that extra computing power is instead spent improving the stereo-matching algorithm, smaller objects and finer textures become visible, and accuracy comparable to a trinocular system can be achieved.
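
To make the compute trade-off concrete, here is a generic binocular matching sketch using OpenCV's semi-global matcher; a trinocular rig would run a matcher like this at least twice. The parameters are generic defaults, not DJI's pipeline (which runs its own matching within roughly 20 TOPS).

```python
import numpy as np
import cv2

# Generic binocular stereo matching with OpenCV's SGBM matcher (illustrative
# parameters only). Each extra camera pair means running a matcher like this again.
left = np.random.randint(0, 256, (480, 640), dtype=np.uint8)   # stand-in image
right = np.roll(left, -8, axis=1)                               # synthetic ~8 px disparity

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # sub-pixel units

valid = disparity > 0
print("median disparity (px):", float(np.median(disparity[valid])))
```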

That said, installing a binocular stereo camera does not by itself guarantee accurate and effective depth perception, because binocular stereo vision has two main technical difficulties: first, high-precision calibration is hard; second, the demands on algorithms and computing power are high.

Binocular stereo vision works by observing the same object from two viewpoints in two images, simulating human vision.

Specifically, two cameras at different positions (or one camera that has been rotated and moved) photograph the same scene. The pixel disparity between the two images is computed, the depth of each point is recovered from that disparity by triangulation, and finally the three-dimensional shape of the object is reconstructed from the depth information.
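
A small worked example of the triangulation step, assuming a rectified stereo pair: depth follows from the pixel disparity d as Z = f · B / d, where f is the focal length in pixels and B the baseline between the cameras. The numbers are assumptions, not the KiWi's actual optics; note how much a half-pixel disparity error matters at long range, which is why sub-pixel disparity and calibration quality are stressed above.

```python
# Depth from disparity for a rectified stereo pair: Z = f * B / d.
# FOCAL_PX and BASELINE_M are assumed values for illustration only.
FOCAL_PX = 1000.0      # focal length in pixels
BASELINE_M = 0.18      # distance between the left and right cameras, in metres

def depth_from_disparity(disparity_px: float) -> float:
    """Depth (m) of a point observed with the given left/right pixel disparity."""
    if disparity_px <= 0:
        return float("inf")   # zero disparity -> point effectively at infinity
    return FOCAL_PX * BASELINE_M / disparity_px

# A 0.5 px disparity error barely matters up close but is large at long range,
# which is why sub-pixel matching and precise calibration are so important.
for d in (20.0, 19.5, 5.0, 4.5):
    print(f"disparity {d:>5.1f} px -> depth {depth_from_disparity(d):6.2f} m")
```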

This requires aligning the relationship between pixel coordinates and world coordinates, which is calibration. The whole pipeline demands substantial computing power, yet surprisingly, DJI's system has only about 20 TOPS.

How can DJI drive a binocular stereo vision system and achieve these capabilities with so little computing power?

The answer starts with DJI's drone technology. Drones must use cameras to autonomously handle tasks such as obstacle avoidance, hovering, and following, which is essentially the same capability an intelligent car needs: getting a device safely from point A to point B.

DJI had already developed online self-calibration for its drones, which uses natural and semantic features in real-time images to re-estimate the correct positional relationship between the left and right cameras.
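
As a toy illustration of what online self-calibration between the two cameras can look like, the generic OpenCV recipe below matches natural image features across a left/right pair and re-estimates their relative pose from the essential matrix. It is only a sketch under those assumptions; DJI's actual method, which also uses semantic features, is not public in detail.

```python
import numpy as np
import cv2

K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])   # assumed shared camera intrinsics

def refine_stereo_extrinsics(img_left: np.ndarray, img_right: np.ndarray):
    """Re-estimate the relative pose of the right camera w.r.t. the left one
    from natural image features (a generic recipe, for illustration only)."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img_left, None)
    kp2, des2 = orb.detectAndCompute(img_right, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t   # updated rotation and translation direction (unit scale)

# Call with a synchronized pair of grayscale frames (uint8 numpy arrays).
```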

This algorithmic groundwork gave DJI rich experience to draw on for vehicle-side development, mainly in three areas: 1) recognition of target objects; 2) optimization of algorithm capability; and 3) matching of software to hardware.

It can be said that the algorithmic capability accumulated in the drone field laid the foundation for DJI's intelligent driving system to build on visual perception.

In addition to binocular vision, DJI has added surround-view fisheye cameras. The industry usually reserves fisheye cameras for parking, but DJI has independently developed a high-precision 3D object detection algorithm that also works in driving scenarios: it supports both fisheye and pinhole camera models, meets the detection requirements for vehicles alongside the car and for vulnerable road users (VRUs), and supports functions such as lane changing and overtaking.
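
The snippet below contrasts the pinhole and fisheye (equidistant) camera models mentioned here: the same 3D point projects to different pixels under each model, which is why a detector must handle both to reuse the surround fisheye cameras for driving. The focal length and the choice of the equidistant model are illustrative assumptions.

```python
import numpy as np

# Pinhole vs. fisheye (equidistant) projection of the same 3D point
# (illustrative focal length; distortion terms omitted for clarity).

F_PX = 400.0   # focal length in pixels

def project_pinhole(p):
    x, y, z = p
    return np.array([F_PX * x / z, F_PX * y / z])          # u = f * X/Z

def project_fisheye_equidistant(p):
    x, y, z = p
    r = np.hypot(x, y)
    theta = np.arctan2(r, z)                                # angle from the optical axis
    scale = F_PX * theta / r if r > 1e-9 else 0.0           # r_img = f * theta
    return np.array([scale * x, scale * y])

point = np.array([2.0, 0.5, 4.0])   # 3D point in camera coordinates (metres)
print("pinhole :", project_pinhole(point))
print("fisheye :", project_fisheye_equidistant(point))
```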

In fact, this amounts to an integrated driving-and-parking solution. Built around binocular vision, the system performs strong online local perception, including visual-inertial fusion localization (VINS), binocular BEV lane detection, binocular 3D object detection, drivable-area and obstacle detection, and dense depth estimation with online reconstruction that combines deep learning and geometry.

The system can accurately estimate a vehicle's position, orientation, and shape, and detect fine parts such as wheels, supporting identification of irregularly shaped, occluded, or truncated vehicles.

In addition, it detects high-curvature bends and uphill/downhill lane markings with higher precision, resulting in lower lateral error and stronger lane keeping, and it supports rich semantic detection of complex urban road structures such as forks/merges, curbs, and traffic islands.

This is the technology that enables DJI to handle high curvature bends.

These advantages are why DJI has been able to make its mark in intelligent driving: it beats the new players on cost and the traditional Tier 1s on user experience.

What kind of ADAS does the market really need?

After all that, I would like to share some thoughts on what kind of ADAS the market really needs.

Currently, most automakers still rely on the 1V1R scheme, but due to technological stagnation and limited room for optimization, this scheme’s experience has not improved significantly. On the other hand, the integration of vehicle driving and parking demonstrated by DJI’s solution offers richer functionality and an experience that surpasses traditional schemes, while keeping costs under control.

I believe that after more market education, the industry will answer this question: Is traditional lane-keeping functionality still important, or is high-speed navigation in different scenarios more important?

At the moment, I think that suppliers should focus on polishing the basic ADAS capabilities in depth to replace traditional 1V1R offerings.

High-speed navigation ADAS has already landed on new energy vehicle models, and those players are now pushing into a new field, making urban ADAS an essential element of a complete product lineup.

However, for suppliers, both high-speed and basic ADAS capabilities erode the market of traditional vendors. Therefore, they should focus on optimizing user experience in depth, regardless of whether they are DJI or new players. As consumers, what we need is ADAS with an outstanding user experience, not endless technical PR. The market will ultimately determine which solution is better.

This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.