XNGP: Small Steps Towards Full Autonomous Driving for Xpeng Motors

Author | Zhu Shiyun

Editor | Qiu Kaijun

In April, Xpeng G9 and P7i Max owners became the second group of users worldwide to start using the smart driving system with “Red Line” SR (Automatic Driving Environment Simulation Display).

The “Red Line” is not merely a UI color scheme, but rather the feasible driving space delineated by the XNGP Intelligent Driving Assistance System. It currently covers a width of 8 lanes and a forward distance of 100 meters. Apart from Tesla, Xpeng is the only company capable of this.

With this, Xpeng Motors has begun to leave the “rails” of high-precision maps and drive by “recognizing roads” on its own.

On March 31, Xpeng Motors began pushing a new OTA version, Xmart OS 4.2.0, which releases the first-stage capabilities of XNGP. In areas covered by high-precision maps, point-to-point urban NGP (Intelligent Navigation Assistance Driving) is open to G9 and P7i Max models in Shanghai, Shenzhen, and Guangzhou, and to Xpeng P5 P-series models in Shanghai. In areas without high-precision map coverage, an enhanced version of LCC is offered that can cross lanes and recognize traffic lights while driving straight through intersections.

Xpeng Motors' Vice President of Autonomous Driving, Wu Xinzhou, at the XNGP Technology Sharing Conference

“XNGP is the ultimate form of intelligent driving assistance before the realization of autonomous driving.” That is how Xpeng’s Vice President of Autonomous Driving, Wu Xinzhou, defines XNGP. He also stated that Xpeng is more than a year ahead of its domestic competitors in smart driving, and that in Chinese scenarios its performance will be competitive even with Tesla’s FSD in North America.

Is that the case? And what does XNGP mean for the troubled Xpeng Motors?

Reach L4 Level-Ready in Just Two Years with BEV

XNet (Xpeng’s next-generation perception network architecture) is what gives Xpeng the confidence to discuss the last generation of smart driving systems and intelligent cars before tackling autonomous driving.

“At present, I have not seen the limit of XNet, nor have I seen any problems we cannot solve. We have great confidence in using the vision network as effectively as high-precision maps,” Wu Xinzhou told “Electric Vehicle Observer.”

XNet is the first domestically produced BEV (bird’s-eye view) perception architecture to be installed on vehicles in China.

Xpeng XNet Network Architecture

Previously, domestic BEV perception stitched single-frame images from multiple cameras into a bird’s-eye view for deep learning, then compared the results with the underlying static environment provided by high-precision maps through logical judgment algorithms, requiring LiDAR to supplement depth information.

As a result, vehicles “see” discrete spacetime “snapshots,” requiring high-precision maps as “rails” to ensure continuous movement.

However, starting with XNet, similar to Tesla’s “Hydra,” Xpeng learns from multi-camera, multi-frame data, directly outputting dynamic and static environmental perception results with a temporal dimension.

In other words, vehicles “see” sequences of “short videos” arranged in chronological order, through which they “understand” the real world.

On one hand, by semantically understanding timestamped static elements such as lane lines, traffic lights, signs, barricades, trees, and buildings, the driving system can build its own high-precision map of the road section ahead and “plan” the drivable space. It no longer relies on high-precision maps for guidance.

On the other hand, timestamped dynamic objects allow the system to track their speed and trajectory, predict their future movements, and support interaction and game-theoretic decision-making with other road users.
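To make the idea of timestamped tracking concrete, here is a minimal sketch of estimating an object's velocity from a short track and extrapolating it forward. It assumes a simple constant-velocity model; the `Observation` type and both function names are hypothetical illustrations, and nothing here reflects how XNet itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One timestamped bird's-eye-view position of a tracked object (hypothetical type)."""
    t: float  # time in seconds
    x: float  # forward distance from the ego vehicle, metres
    y: float  # lateral offset from the ego vehicle, metres


def estimate_velocity(track: list[Observation]) -> tuple[float, float]:
    """Average velocity between the first and last observation of a track."""
    first, last = track[0], track[-1]
    dt = last.t - first.t
    if dt <= 0:
        raise ValueError("track must span a positive time interval")
    return (last.x - first.x) / dt, (last.y - first.y) / dt


def predict_position(track: list[Observation], horizon_s: float) -> tuple[float, float]:
    """Extrapolate the latest position forward, assuming constant velocity."""
    vx, vy = estimate_velocity(track)
    last = track[-1]
    return last.x + vx * horizon_s, last.y + vy * horizon_s


# A vehicle ahead that closes 4 m on the ego car every second.
track = [
    Observation(0.0, 20.0, 3.0),
    Observation(0.5, 18.0, 3.0),
    Observation(1.0, 16.0, 3.0),
]
print(estimate_velocity(track))      # (-4.0, 0.0)
print(predict_position(track, 2.0))  # (8.0, 3.0)
```

A single frame gives only the position; velocity and any prediction require at least two timestamped observations, which is why multi-frame input matters here.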

Seeing a continuous world through XNet is the first step for Xpeng to venture beyond high-precision maps and achieve navigation-assisted driving in map-free areas.

Next, Xpeng XNGP’s prediction model will transition from the current logical judgment to a neural network-based framework, but the decision-making layer for planning and control will continue to use logical judgment algorithms.

“Control layers require high explainability (neural networks are black boxes with unexplainable results and processes). I have set a clear guideline for the team: temporarily refrain from using deep learning networks to solve problems that can be addressed with mathematics. For a long time, we will not consider replacing the planning and control layer (with neural network algorithms),” Wu Xinzhou stated.

XNet’s perception range currently exceeds 100 meters. Based on this, XNGP will achieve point-to-point intelligent navigation with an almost L4 experience between 2024 and 2025. Under human supervision, the system will operate in all scenarios without intervention.

Profitability, functional availability, and rapid urban expansion

XNet moves Xpeng further away from reliance on high-precision maps and closer to generating revenue through advanced intelligent driving systems.

“XNet truly allows us to break free from the constraints of high-precision maps, or at least reduce our dependence on them,” said Wu Xinzhou. “This is the reason we’ll be able to roll out XNGP in dozens of map-less cities in the second half of the year.”

Xpeng LCC Enhanced navigates autonomously through an intersection:

During the test drive on March 31, the LCC Enhanced experience, made possible through XNet, was even more impressive than the city NGP achieved using high-precision maps:

The test vehicle could brake and accelerate based on traffic signals while navigating straight through intersections with no lane lines and mixed vehicular and pedestrian traffic, even seizing the opportunity to pass through a yellow light.

This indicates that the LCC function can recognize and understand the meaning of traffic signals, align itself with the corresponding lanes based on the signals, “fill in the blanks” for missing lane lines at intersections by analyzing adjacent lane markings, ground traffic signs, and other vehicles’ behavior, create logical connections between yellow light duration and the vehicle’s speed, and interact with mixed vehicular and pedestrian traffic.
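The link between yellow-light duration and vehicle speed can be illustrated with a rule-based sketch. This is only an assumed textbook formulation (stop if a comfortable stop fits, otherwise proceed if the stop line can be reached in time); the function names and the 3 m/s² comfort deceleration are hypothetical, and the article does not describe Xpeng's actual logic.

```python
def can_stop_comfortably(distance_m: float, speed_mps: float,
                         max_decel_mps2: float = 3.0) -> bool:
    """True if the braking distance v^2 / (2a) fits before the stop line."""
    return speed_mps ** 2 / (2 * max_decel_mps2) <= distance_m


def can_clear_on_yellow(distance_m: float, speed_mps: float,
                        yellow_left_s: float) -> bool:
    """True if holding the current speed reaches the stop line before yellow ends."""
    if speed_mps <= 0:
        return False
    return distance_m / speed_mps <= yellow_left_s


def yellow_light_action(distance_m: float, speed_mps: float, yellow_left_s: float,
                        max_decel_mps2: float = 3.0) -> str:
    """Pick an action on yellow: prefer a comfortable stop, else proceed if feasible."""
    if can_stop_comfortably(distance_m, speed_mps, max_decel_mps2):
        return "stop"
    if can_clear_on_yellow(distance_m, speed_mps, yellow_left_s):
        return "proceed"
    return "brake_hard"  # neither option is comfortable; stopping is the safer fallback


print(yellow_light_action(40.0, 12.0, 2.0))  # stop
print(yellow_light_action(15.0, 12.0, 2.0))  # proceed
print(yellow_light_action(20.0, 12.0, 1.0))  # brake_hard
```

In the second case the car is 15 m from the line at 12 m/s: a comfortable stop would need 24 m, but the line is reached in 1.25 s, within the 2 s of yellow remaining, so passing through is the consistent choice.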

Xpeng LCC Enhanced maneuvers around and disembarks passengers:

This is just the beginning.

Wu Xinzhou said that XNet has so far completed the geometric aspects of the static environment (the dimensions and relative positions of the ground, guardrails, traffic lights, trees, buildings, and other static objects), while the understanding of semantics (e.g., guardrails are impassable, slow down for zebra crossings, turn left at arrow signs, and navigate around bollard lines) is still in progress.

“Although we have already achieved the ability to output both static and dynamic environments, there is still a great deal of work required to fully realize the potential of XNet and apply it in various cities. We need to complete the semantic understanding of the elements within the environment, integrate this sensing capability into the full-stack algorithms, and undergo testing,” Wu Xinzhou said, explaining the timeline for the next phase. “In another six months, we hope to make the experiences with and without maps extremely similar for everyone.”

According to the plan, in the second half of this year, Xpeng will enable LCC-based turn capabilities in non-map areas in most major cities across the country, making the user experience in non-map areas comparable to that of urban NGP.

Furthermore, the non-map LCC Enhanced version and urban NGP with maps are not entirely separate tracks.

Electric Vehicle Observer learned that Xpeng will accelerate city expansion by combining map-based and non-map technologies as it rolls out urban NGP.

Validating and completing high-precision maps is an indispensable step for automakers in acquiring and using maps, and it consumes a significant amount of working time. In the future, Xpeng will validate and complete maps only for hotspot areas, and will use its non-map capabilities to bridge the blank regions between them, achieving point-to-point urban navigation.
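The hybrid approach can be pictured as tagging each stretch of a route with the capability that would handle it. The sketch below is purely illustrative: the `Segment` type, the mode labels, and the example route are assumptions made for this example, not a description of Xpeng's planner.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One stretch of a planned route (hypothetical type for illustration)."""
    name: str
    length_m: float
    has_validated_map: bool  # True inside a validated high-precision-map hotspot


def assign_modes(route: list[Segment]) -> list[tuple[str, str]]:
    """Label each segment with the driving capability that would cover it."""
    return [
        (seg.name, "urban NGP" if seg.has_validated_map else "LCC Enhanced")
        for seg in route
    ]


def mapless_share(route: list[Segment]) -> float:
    """Fraction of the route length that has no validated map coverage."""
    total = sum(seg.length_m for seg in route)
    mapless = sum(seg.length_m for seg in route if not seg.has_validated_map)
    return mapless / total


route = [
    Segment("hotspot A", 1000.0, True),
    Segment("gap", 1000.0, False),
    Segment("hotspot B", 2000.0, True),
]
print(assign_modes(route))
# [('hotspot A', 'urban NGP'), ('gap', 'LCC Enhanced'), ('hotspot B', 'urban NGP')]
print(mapless_share(route))  # 0.25
```

The point of the example is that a trip stays point-to-point as long as every gap between mapped hotspots can be handled by the mapless capability.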

Previously, due to incomplete high-precision map coverage of individual cities, and a limited scope of covered cities, urban navigation became a “chicken ribs” feature with limited usefulness.

However, as its non-map capabilities keep improving, Xpeng’s urban NGP is becoming more usable within individual cities and is rapidly reaching more of them, giving it a solid foundation to become a truly valuable selling point.

“We firmly believe that XNGP will become a highly usable product in the field of assisted driving. When users can use it every day and find it effective, they will naturally be willing to pay for it. Moreover, (function-based payment) will soon be realized, which is a very clear mission for this generation (system),” states Liu Yilin, Senior Director of Product at Xpeng’s Autonomous Driving Center.

Scale, Rapid Iteration + Cost Reduction

Tesla mass-produced the “Hydra” in 2021, and by April 2022, its FSD test versions had been installed in over 100,000 vehicles. By the end of 2022, the number had exceeded 400,000, generating $324 million in revenue in Q4 2022.

Will Xpeng replicate Tesla’s scaling with the mass production of XNet?

Technically, it’s possible.

BEV architecture, Transformer algorithms, and recurrent neural networks (RNNs) are all open technologies, neither mysterious nor exclusive.

However, a vast engineering gap lies between open-source technology and mass-produced functionality. The gap is so wide that, although Tesla revealed its solution in 2021, the first “student” in fast-following China to hand in an answer emerged only in 2023.

Bridging this engineering gap requires the technical ability to tap the potential of algorithms, computing power, and data, as well as the ability to drive rapid iteration across all three within one system.

In terms of computing power, Xpeng has reserved substantial “development space” on the vehicle end: XNet occupies only 4.5% of the computing power on its dual-Orin models. The 600 PFLOPS Fuyao Supercomputing Center serves as XNet’s dedicated training ground.
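As a rough back-of-the-envelope check of what 4.5% means, the arithmetic below assumes NVIDIA's quoted 254 TOPS per DRIVE Orin chip, a figure that does not appear in the article, and treats the 4.5% as a share of that peak figure.

```python
ORIN_TOPS = 254.0   # assumption: NVIDIA's quoted peak per Orin chip, not from the article
XNET_SHARE = 0.045  # from the article: XNet uses 4.5% of dual-Orin compute

total_tops = 2 * ORIN_TOPS              # dual-Orin configuration
xnet_tops = total_tops * XNET_SHARE     # compute attributed to XNet
headroom_tops = total_tops - xnet_tops  # what is left for everything else

print(f"XNet budget: {xnet_tops:.2f} TOPS of {total_tops:.0f}")  # XNet budget: 22.86 TOPS of 508
print(f"Headroom: {headroom_tops:.2f} TOPS")                      # Headroom: 485.14 TOPS
```

Under these assumptions, XNet would run in roughly 23 TOPS, leaving well over 95% of the onboard compute available for other workloads and future features.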

In terms of data, shadow mode and the simulation engine provide ample, targeted training data for XNet’s “golden backbone” network to undergo self-supervised learning, enabling rapid iterative releases for specific scenarios.

Organizationally, in 2021 Xpeng implemented urban NGP on the 30 TOPS NVIDIA Xavier, which gave its autonomous driving team substantial practice and refined its development system. “Our R&D system has reached a stable state, and the collaboration between teams has become very smooth, giving us great confidence in embarking on the path toward mapless, point-to-point, and cost-reduction solutions,” said Wu Xinzhou.

By overcoming the engineering gap in mass-producing BEV architecture, Xpeng Automobile has established software and hardware capabilities and an R&D engineering system. This not only assists in rapidly, broadly, and effectively implementing XNGP but also serves as a powerful tool for cost reduction.

After all, if Tesla, building on “Hydra” and DOJO, can deliver FSD at a hardware cost of 1,000 US dollars, Xpeng should be able to cut the cost of its intelligent driving system by another 50%, as company founder He Xiaopeng has demanded.

However, from a business perspective, Xpeng XNGP still faces great challenges in scaling up as quickly as Tesla. After all, Tesla’s sales volume benefits from its excellent car design and manufacturing capabilities.

Xpeng P7i

Xpeng, with its commitment to full intelligence, once prioritized investing in intelligent technology over automotive industrial design, which led to criticisms of the exterior and driving comfort of the P5 model.

Although the P7i demonstrates Xpeng’s renewed focus on and commitment to industrial design, it still falls short of Tesla in terms of handling and comfort, while competitors like SAIC Fei Fan and Changan Shenlan are hot on its heels.

At the same time, the recently adjusted organizational structure and marketing system have collided with an unprecedented price war. Apart from XNGP, Xpeng needs more “ammunition”.

The good news is: Xpeng has announced that it currently has a sufficient number of orders for the P7i model and is actively working to increase its production capacity to meet demand.

Wu Xinzhou stated that the next major update will push a brand-new highway NGP feature to users with the XNGP system, providing a highway navigation experience very close to L4 level: “No procrastination, no degradation, no freeze, no disturbance, and zero takeover.”

Xpeng’s daring leap has already begun to take off.

This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.