Introduction

Three versions of the video, distinguished by length, are available online: a one-minute, a three-minute, and a seven-minute version. We have collected them and provide the longest, most informative version below.

IM car drives autonomously for 40 minutes in a city area

When I first saw the car's performance in the video, I assumed it was a model equipped with LiDAR, given its high precision in identifying key targets and its ability to handle severe GPS antenna occlusion, overlapping old and new lane markings, and chaotic jostling with surrounding traffic. After verification, however, I found that this was in fact a version without LiDAR, which makes it truly worth careful analysis.

From the video, we can see that this autonomous driving run includes unprotected left turns in the city, intersection driving, entering and exiting elevated highways, high-speed cruising, and car-following in congestion. In short, it covers most of the driving scenes we encounter in daily life.

Below, I will evaluate the ability of this autonomous driving vehicle from a technical perspective.

Technical Details of Autonomous Driving Without Control

Hardware Configuration

First, let’s take a look at the hardware configuration of the car in the video.

This IM car is equipped with 12 cameras, 5 millimeter-wave radars, and 12 ultrasonic radars.

It is worth mentioning that IM has reserved a slot for LiDAR in its sensor architecture, so the hardware can later be upgraded by installing a LiDAR unit. This leaves an upgrade path even for mid- and lower-end trims. Beyond the LiDAR sensor, the computing platform can also be upgraded: the main control chip can go from an NVIDIA Xavier (30 TOPS) to an NVIDIA Orin X (500+ TOPS) to support more advanced automated driving functions.

Perception and Planning Control Ability

The quality of an autonomous driving system can be evaluated mainly along two axes: perception ability, and planning and control ability.

In the upper-right corner of the video we can see some of the system's environmental perception and vehicle control information, including vehicle speed, steering wheel angle, target detection results, predicted trajectories of other traffic, and the vehicle's planned trajectory.

Environmental perception and vehicle control information display

Perception Ability

Perception is the general term for an autonomous vehicle's awareness of its environment. It is usually divided into perception of the vehicle's own position and perception of the surroundings, corresponding to the localization and target detection modules of the autonomous driving system, respectively.

Localization

Localization is one of the most basic capabilities of an autonomous driving system. It answers the question of where the autonomous vehicle is, so that the system can correctly plan the global trajectory to the destination.

As the vehicle in the video carries no LiDAR, the localization module must rely on GPS/IMU, cameras, and mmWave radar. A localization pipeline that does not depend on LiDAR roughly proceeds as: rough localization → lateral precise localization → longitudinal precise localization.

First, the vehicle's rough position on the high-precision map is determined from the historical GPS trajectory, and the map data for that area is retrieved (rough localization). Next, the camera's lane detection establishes the vehicle's lateral position relative to the lane and lane markings, refined further by the mmWave radar's detection of roadside barriers (lateral precise localization). Finally, ground markers such as arrows and numbers are matched against the corresponding markers on the high-precision map (longitudinal precise localization).
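The three stages above can be sketched as a minimal pipeline. This is an illustrative reconstruction under my own assumptions; the function and field names (`coarse_localize`, `refine_lateral`, etc.) are invented for clarity and are not IM's actual software interfaces, and the fusion weights are arbitrary.

```python
# Hypothetical sketch of the three-stage, LiDAR-free localization flow.
# All names and constants are illustrative, not IM's actual API.

from dataclasses import dataclass

@dataclass
class Pose:
    x: float  # longitudinal position on the HD map (m)
    y: float  # lateral offset from the lane center (m)

def coarse_localize(gps_history):
    """Stage 1: average recent GPS fixes to pick the HD-map area."""
    xs = [p[0] for p in gps_history]
    ys = [p[1] for p in gps_history]
    return Pose(sum(xs) / len(xs), sum(ys) / len(ys))

def refine_lateral(pose, camera_lane_offset, radar_barrier_offset):
    """Stage 2: fuse camera lane detection with mmWave barrier returns
    to fix the lateral position relative to the lane."""
    # Simple weighted fusion: trust the camera, sanity-check with radar.
    fused = 0.7 * camera_lane_offset + 0.3 * radar_barrier_offset
    return Pose(pose.x, fused)

def refine_longitudinal(pose, observed_marker_x, map_marker_x):
    """Stage 3: match a ground marker (arrow/number) seen by the camera
    to the same marker on the HD map and correct the along-track error."""
    correction = map_marker_x - observed_marker_x
    return Pose(pose.x + correction, pose.y)
```

Each stage only tightens one dimension of the estimate, which is why the pipeline degrades gracefully: losing GPS hurts stage 1, but stages 2 and 3 can still hold the vehicle in lane for a while.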

The challenges of a localization system without a LiDAR mainly involve the following points:

  1. The GPS antenna is severely blocked

Starting at 2 minutes 4 seconds in the video, the vehicle is inside a multi-level viaduct. The GPS antenna is severely occluded there, and the GPS signal is essentially unavailable. On top of that, the vehicle is about to enter a forked road and a high-curvature ramp, which challenges the localization system. Nevertheless, the vehicle performs stably, with no deviation or jitter.

Severely blocked GPS antenna scene

  2. New and old lane markings left over from road changes

At 4 minutes 58 seconds in the video, both new and old lane markings appear on the road surface. The perception system will generally detect multiple markings here; if the localization system picks the wrong marking for lateral positioning, the result is a lateral deviation.

The autonomous vehicle in the video did not sway laterally on this section, indicating that the system can stably handle mismatches between the high-precision map and reality.

New and old lane marking scene
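One plausible way to stay robust to repainted markings is to gate detected markings against the position the HD map predicts, rejecting outliers so a stale line cannot yank the lateral fix. The following heuristic is my own sketch, not IM's disclosed method; the 0.5 m gate is an assumed threshold.

```python
# Illustrative heuristic for choosing among multiple detected lane
# markings (e.g. fresh vs. faded paint). The gate value is an assumption.

def select_lane_marking(detected_offsets, map_predicted_offset, gate=0.5):
    """detected_offsets: lateral offsets (m) of all detected markings.
    Returns the offset best matching the map prediction, or None if
    every candidate falls outside the gating window."""
    candidates = [d for d in detected_offsets
                  if abs(d - map_predicted_offset) <= gate]
    if not candidates:
        return None  # fall back to dead reckoning for this frame
    return min(candidates, key=lambda d: abs(d - map_predicted_offset))
```

Returning `None` rather than the least-bad candidate matters: a frame of dead reckoning is usually cheaper than latching onto the wrong paint line.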

Overall, this LiDAR-free localization system performs stably on both highways and urban roads, with high overall robustness. A stable and reliable localization system is one of the most basic and important capabilities of autonomous driving.

Object Detection

From the video, we can see that the autonomous vehicle encounters cars, buses, cyclists, tricycles, pedestrians, and parked electric bikes while passing through urban streets and the elevated ring road. These are the common traffic participants in urban driving. It is said that beyond the targets shown in the video, the system can also recognize handcarts, power-assisted bicycles, traffic cones, water-filled barriers, and more. We look forward to more such videos from IM Auto.

Beyond the variety of detected classes, detection accuracy also matters. As seen in the upper-right corner of the video at 4 minutes 31 seconds, in a congested scene the lateral position of the lead car remains very stable while a following car overtakes from behind.

Perception of side vehicles in a low-speed congestion scene

At low speed and close range, only the millimeter-wave radars mounted on the four corners of the car can detect the positions of surrounding vehicles. But the radar signal-to-noise ratio is low in this regime, and unstable detections (as shown below) are common. A fairly high-performance tracking algorithm is needed to achieve the effect seen in the video.
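A common family of trackers for smoothing jittery radar detections is the fixed-gain alpha-beta (or full Kalman) filter. The sketch below is a generic constant-velocity alpha-beta tracker, offered only as an example of the technique; the gains and cycle time are illustrative, not production values from any vendor.

```python
# Minimal constant-velocity alpha-beta tracker: a standard way to smooth
# noisy, low-SNR radar returns. Gains are illustrative, not tuned.

class AlphaBetaTracker:
    def __init__(self, x0, alpha=0.5, beta=0.1, dt=0.05):
        self.x = x0      # smoothed lateral position (m)
        self.v = 0.0     # estimated lateral velocity (m/s)
        self.alpha, self.beta, self.dt = alpha, beta, dt

    def update(self, measurement):
        # Predict forward one radar cycle under constant velocity...
        pred = self.x + self.v * self.dt
        resid = measurement - pred
        # ...then correct position and velocity with fixed gains.
        self.x = pred + self.alpha * resid
        self.v += self.beta * resid / self.dt
        return self.x
```

Feeding it raw detections that bounce ±0.3–0.4 m around the true position yields a track that deviates far less, which is the kind of stability visible in the lead-car box in the video.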

Lateral movement of right-moving vehicles

Planning and Control Capability

From the speed bar in the video, the longitudinal control (along the vehicle's direction of travel) is smooth throughout the drive, with few hard brakes. Apart from the active merge into congested traffic at 4 minutes 26 seconds, the driver rarely leans forward, whether the car is following in congestion or handling a cut-in by another vehicle, suggesting that heavy braking is rare and ride comfort is probably quite good.

Active merging into congested traffic flow speed curve

In terms of lateral control, the system has the basic functions of overtaking and changing lanes according to navigation. Also on the lateral side, the two-car merging game starting at 3 minutes 12 seconds in the video is worth a closer look.

Lateral overtaking merge

The car I was in entered a lane-changing game with another car at a merge point, but found that the other car already occupied part of the target lane, preventing the lane change from completing. The car therefore entered a "Hold" state, waiting to see what the other car would do. After running parallel with it for about 2 seconds and finding it had no intention of yielding the lane, the car switched logic and returned to its original lane. The process was smooth and the decision-making very human-like, which poses a considerable coordination challenge for perception, prediction, and control. The engineers must have spent a lot of effort tuning this logic.
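The hold-then-abort behavior described above can be pictured as a tiny state machine. This is my hypothetical reconstruction from the narration, not IM's actual decision code; the state names and the 2-second timeout simply mirror what the video shows.

```python
# Hypothetical merge-game state machine: attempt the change, hold while
# the gap is blocked, abort back to the original lane after the timeout.

HOLD_TIMEOUT_S = 2.0  # assumed from the ~2 s of parallel running

def merge_decision(state, gap_blocked, hold_time_s):
    """Returns the next state: 'CHANGE', 'HOLD', or 'ABORT'."""
    if state in ("CHANGE", "HOLD"):
        if not gap_blocked:
            return "CHANGE"              # gap opened: complete the merge
        if hold_time_s >= HOLD_TIMEOUT_S:
            return "ABORT"               # opponent won't yield: back off
        return "HOLD"                    # keep pacing the other car
    return state                         # ABORT is terminal in this sketch
```

The hard part in practice is not this skeleton but feeding it reliable inputs: `gap_blocked` depends on perception and on predicting whether the other car intends to yield.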

This scene reminds me of an autonomous driving video released earlier by ARCFOX (JiHu), which uses Huawei's autonomous driving system. It too had to handle a complex gaming scenario, with oncoming cars and electric scooters along the road as the opponents.

ARCFOX autonomous car in city driving

Although the sense of pressure on the driver is weaker in low-speed autonomous driving than in high-speed scenarios, so there is little urge to take over, the ability to pass smoothly through such a complex scene of mixed pedestrian and vehicle traffic fully illustrates the importance of gaming logic in autonomous driving.

Overall, the 40-minute video released by ZhiJi genuinely demonstrates strong perception and control capabilities, and the system's performance on real roads is remarkable.

Conclusion:
ZhiJi Auto has achieved remarkable autonomous driving performance on the streets of Shanghai even without LiDAR. Although there was some jitter in detecting pedestrians and cyclists at the roadside in urban scenes, these non-critical targets did not affect the vehicle's driving. Once LiDAR is added, the perception ability and accuracy of ZhiJi's autonomous driving system should improve considerably, allowing it to handle even more complex weather and road conditions.

The big wave of autonomous driving in China will arrive in early 2022. Many heavyweight models equipped with urban autonomous driving functions will launch then, and users who choose high-end trims will be the first to experience autonomous driving functions across all scenarios (city, highway, parking lot).

This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.