Horizon Unveils New BPU Architecture for Smart Driving: BPU Nash - Built for Transformer and Large Models

In 2023, the development of intelligent driving will reach a critical turning point.

Specifically, as major automakers push forward, urban navigation-assisted driving functions are beginning to appear in some mass-produced models for testing, and the industry is exploring commercial models. Together, these give the industry its first real chance to see intelligent driving deployed at scale.

However, the user experience of current intelligent driving still has many problems, there is still plenty of room for technical progress, and today’s systems still fall within the L2 category. How intelligent driving will evolve, both technically and commercially, is still being explored by upstream and downstream players across the industry.

Among these players, Horizon, which sits upstream in the intelligent driving industry chain, is of particular interest to us.

On the one hand, drawing on its existing mass production experience, Horizon has gradually found a path for bringing high-level intelligent driving to large-scale deployment. On the other hand, looking ahead, Horizon has not stopped there: it continues to search for the optimal solution for intelligent driving computation through breakthroughs in core underlying technology.

The technical breakthrough point that Horizon has chosen is the new generation of BPU architecture.

New generation of BPU architecture, born for Transformer and large models

On April 18th, just before the first day of the 2023 Shanghai Auto Show, Horizon officially released its latest-generation BPU intelligent computing architecture, named BPU Nash, at a themed press conference called “Journey and Sharing, A Road Taken Together”.

From a technical perspective, BPU Nash has the following features:

  • A unique design of a three-level on-chip storage architecture, with high-efficiency coordination between cores, and ultimate optimization of bandwidth bottlenecks under large parameters.
  • Equipped with a multi-chip acceleration engine, flexible data flow between engines achieves high energy efficiency and low bandwidth utilization.
  • A data transformation engine that flexibly supports Transformer small operators.
  • A floating-point vector acceleration unit with general and flexible characteristics, satisfying the precision requirements of key operators.
  • Tight-coupling heterogeneous computing units efficiently accelerate different types of data processing.
  • Multi-directional data flow with efficient and flexible intra-core, inter-core, and inter-chip communication, achieving dynamic computation scheduling and flexible tuning.
  • Virtualization technology transparently enhances multi-task parallel processing capabilities.
  • Data-driven power optimization, tuned to the dynamic-range characteristics of neural network data, reduces power consumption by 30%.

From a technical standpoint, the overall design philosophy of BPU Nash is to keep improving bandwidth, energy efficiency, versatility, parallel computing capability, and flexibility, while also reducing power consumption.

According to the official statement, BPU Nash is designed specifically for large-parameter Transformers and large-scale interactive game-theoretic decision-making (the “interactive gaming” among road users), and is optimized for cutting-edge algorithms to achieve the best algorithm efficiency. It also uses AI-assisted design to greatly enhance the architecture’s programmability.

Furthermore, it also has a super-heterogeneous computing architecture that can significantly enhance the diversity of computing power.

Regarding the BPU Nash architecture, Horizon Robotics co-founder and CTO Huang Chang told 42 Garage that it was developed for the next five to ten years and is a scalable architecture that includes computing units, storage units, etc., and can cover single-core or multi-core systems.

Therefore, chips based on the BPU Nash architecture can theoretically achieve computing power ranging from tens of TOPS to 1000 TOPS, and their transistor count can also scale accordingly.

Huang Chang stated that the BPU Nash architecture responds to an important trend in the computing chip industry: a new style of computing that is large-scale, data-driven, built on large-parameter models, and “de-rationalized”. In the future, more and more types of computing will be data-driven and based on large parameter counts, large compute, and large models, and in his view these workloads should run on BPUs rather than CPUs or GPUs.

In 42 Garage’s view, BPU Nash was also built for the latest technological trends in intelligent driving.

After all, as of 2023, the intelligent driving industry has settled on a very clear technological trend: solutions based on BEV perception and large Transformer models are now promoted by many players, and they will keep improving as data volume, computing power, and algorithm refinement increase.

In this context, the arrival of BPU Nash can be said to be timely.

Of course, viewed against the overall development of Horizon’s BPU, BPU Nash is the latest iteration of the company’s dedicated computing architecture for intelligent driving. It reflects the co-design of algorithms, compilers, and architecture, and at its core it represents Horizon’s “intelligent evolution” towards the latest neural network architectures and higher-level autonomous driving applications.

Through such “intelligent evolution,” Horizon hopes to create the “strongest brain” for intelligent driving.

The BPU architecture has completed three generations of mass production verification

Behind the release of the latest generation of BPU architecture lies not only Horizon’s understanding of future technological trends but also an important premise: since its inception, the BPU has gone through three iterations and has strongly supported Horizon’s commercialization.

In its early days, the BPU architecture was shaped mainly by technological advancement and business exploration. In August 2019, based on BPU Bernoulli 1.0, Horizon officially launched Journey 2, China’s first automotive-grade AI chip. It can handle multiple AI tasks efficiently and flexibly, perform real-time detection and accurate recognition of multiple targets, and be applied to visual perception for autonomous driving, crowdsourced high-precision mapping and positioning, visual ADAS, and smart human-machine interaction.

In March 2020, seven months later, through cooperation with Changan Automobile, Horizon’s Journey 2 chip was first installed in the UNI-T model, where it implemented the DMS function in the smart cockpit. Although this first step with an automaker did not take Horizon directly into ADAS, the cooperation pushed shipments of the Journey 2 chip past 100,000 in 2020.

It also showed that Horizon had won automakers’ recognition and that its investment in automotive AI chips had paid off.

Also in 2020, Horizon released the Journey 3 chip, built on BPU Bernoulli 2.0 and a 16nm process, with 5 TOPS of computing power and a typical power consumption of 2.5W. It supports application scenarios such as advanced driver assistance, the smart cockpit, automated parking assistance, high-level autonomous driving, and crowdsourced high-precision mapping and positioning. In terms of computing power, it is a chip that can confront Mobileye EyeQ4 head-on.

For Horizon, the commercialization of Journey 3 was even more crucial: it was selected by Li Auto and installed in its star model, the 2021 Li ONE. As a result, more and more car manufacturers approached Horizon to collaborate on intelligent driving chips, and Horizon’s path to commercialization accelerated.
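As a rough illustration of what the Journey 3 figures quoted above imply, the Python sketch below divides the stated 5 TOPS by the stated 2.5W typical power. The helper name `tops_per_watt` is hypothetical, and the result is only a nominal ratio of two headline numbers, not a measured efficiency.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Nominal compute per watt: headline TOPS divided by typical power."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return tops / watts


# Figures taken from the article: 5 TOPS at a typical 2.5 W.
journey3 = tops_per_watt(tops=5.0, watts=2.5)
print(f"Journey 3: {journey3:.1f} TOPS/W")  # prints "Journey 3: 2.0 TOPS/W"
```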

In July 2021, based on BPU Bayesian, Horizon released its third-generation automotive-grade chip, Journey 5. It is a genuinely domestic high-performance chip, with a maximum computing power of 128 TOPS, support for 16 channels of camera perception input, and support for the multi-sensor fusion, prediction, and planning and control required for high-level autonomous driving. Notably, Journey 5 is the only domestically developed high-performance autonomous driving chip to have reached mass production.

In September 2022, the Horizon Journey 5 chip was installed in another Li Auto star model, the Li L8 Pro.

Since the announcement in September 2022 that Journey 5 had entered mass production with Li AD Pro, shipments of Journey 5 have exceeded 100,000, and the chip has won mass production contracts for nearly 20 models from 9 car manufacturers, including Li Auto, BYD, NIO, and EA, as well as foreign and Sino-foreign joint venture car companies. More models equipped with Journey 5 will enter mass production this year.

Looking at the evolution of the BPU as a whole, from the first mass production announcement in March 2020 until now, shipments of Horizon’s Journey series chips have exceeded 3 million. Horizon has reached mass production agreements with more than 20 car manufacturers covering over 120 models, and its chips have accompanied users for billions of kilometers.

Based on this continuous evolution of the BPU, Horizon’s chosen technology path has been thoroughly validated in commercial terms.

In the past three years, Horizon has successively mass-produced the Journey 2, Journey 3, and Journey 5 chips. Single-chip computing power has increased from 4 TOPS to 128 TOPS, and Journey 5’s real-world performance of 1,718 frames per second has driven the leap from ADAS to highway NOA. To achieve a 1000-fold improvement in the average MPI (miles per intervention) of autonomous driving within 5 years, and to keep up with constantly evolving algorithms and growing model sizes, the computing power and bandwidth available for autonomous driving must keep improving. Horizon believes they still need to increase by at least 1 to 2 orders of magnitude.
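The scaling figures in this section reduce to simple arithmetic. The Python sketch below is only an illustration: the helper names `fold_increase` and `annual_factor` are hypothetical, and it assumes the 1000-fold MPI target compounds evenly over the five years, which the article does not state.

```python
def fold_increase(old: float, new: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


def annual_factor(total_fold: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `total_fold` over `years`.

    Assumption (not from the article): growth is spread evenly across years.
    """
    return total_fold ** (1.0 / years)


# Single-chip compute quoted in the article: 4 TOPS -> 128 TOPS.
compute_gain = fold_increase(4, 128)

# MPI target quoted in the article: 1000-fold within 5 years.
mpi_yearly = annual_factor(1000, 5)

print(f"Compute gain: {compute_gain:.0f}x")             # prints "Compute gain: 32x"
print(f"Implied MPI gain per year: {mpi_yearly:.2f}x")  # prints "Implied MPI gain per year: 3.98x"
```

Under that even-compounding assumption, the MPI goal means roughly quadrupling the distance between interventions every year.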

It is worth noting that this year Horizon proposed an end-to-end algorithm framework based on BEV+Transformer, which has been verified in a closed loop on Journey 5. Its technologies, including pure-vision BEV perception of static and dynamic environments, will soon be ready for mass production. This trend is clearly tied to the forward-looking design of BPU Nash.

Overall, the continuous evolution of Horizon’s BPU is the result of combining intelligent driving technology development with commercial deployment practice. It shows strong technological foresight and clear business continuity.

Idealism and Realism of Horizon

The release of Horizon’s new generation of BPU comes against an undeniable background: intelligent driving is moving towards large-scale production and deployment.

Against this background, 42 Garage believes that some new trends have emerged in how intelligent driving is deployed. For example, in mass-produced cars, stacking processors to gain computing power is no longer the mainstream approach, because it rarely delivers user value that matches the hardware investment. At the same time, once the demand for computing power is met, many players in the industry pay more attention to continuously optimizing and iterating on software and algorithms.

This strongly echoes the views of Horizon founder and CEO Yu Kai on intelligent driving. Yu previously stated at the China EV100 Forum:

“The computing power of chips is not completely proportional to user experience. Simply stacking computing power on chips cannot create a better intelligent driving system. The computing power of current intelligent driving systems ranges from dozens of TOPS to 1000 TOPS, but in fact, the difference in user experience is not so significant. So what we really need to do now is to enhance the L2+ user experience through continuous algorithm optimization and more data, approach the system engineering limit, and create value for consumers.”

It is precisely for this reason that Horizon’s latest generation of BPU builds a unified computing architecture around large-parameter Transformers, optimizing computing efficiency and reducing power consumption. In essence, this reflects Horizon’s persistence in combining software and hardware on its technology path.

It is worth emphasizing that Yu Kai’s attitude towards the future of autonomous driving is very sober.

He believes that autonomous driving faces the most severe challenges in system-level technology: open natural scenes, system randomness and uncertainty, time-varying systems, and game-theoretic interaction among multiple agents. It may be the most challenging systems engineering effort in the history of human industry. Therefore, intelligent driving will remain at the L2+ stage for the next ten years, and fully autonomous driving will only be achieved on some dedicated roads.

What the entire industry can do, then, is keep improving the user’s advanced assisted driving experience. Horizon’s approach is to combine software and hardware and to keep advancing in computing power, model scale, algorithms, data, infrastructure, and more, in order to drive the gradual development of intelligent driving.

In the eyes of 42 Garage, this is clearly a more practical and industry-oriented strategy.

At the same time, facing the overall development of the intelligent driving industry, Horizon’s self-positioning is also very practical and open. Specifically:

Positioned as a Tier-2 supplier, Horizon adheres to a business model it describes as “flexible and open, rich and frugal”. Through approaches such as open software IP licensing and BPU IP licensing, it aims to create an “ARM + Android” model for the era of intelligent automobiles. With an open technology offering of “chip + toolchain + reference algorithm”, it helps car companies and industry partners efficiently develop and deploy differentiated intelligent driving solutions.

Even licensing of the underlying BPU IP is included, which shows how thorough Horizon’s open strategy is.

However, while adhering to a rational and practical deployment strategy, Horizon still holds ideals and firm beliefs about the future of intelligent driving: within the next ten years, electric vehicles will have autonomous driving systems as standard equipment, requiring a human takeover only once every 100,000 kilometers, with commuting efficiency 10% higher than that of human drivers, 5-star comfort, and coverage of 99% of roads within the commuting range.

Regarding this, Dr. Yu Yinan, Vice President of Horizon and President of the Software Platform Product Line, also stated:

“In the end, with the joint promotion of end-to-end giant model algorithms, computing power close to that of human brains and super-large-scale cloud computing platforms, we are very confident that within 10 years, autonomous driving technology will move from the current phased paradigm to the unified expression of the physical world. By modeling the diversity of the world, we can produce driving models containing world knowledge. By combining the cognitive slow system and the instinctive fast system, we can complete the ten-year vision of autonomous driving.”

Interestingly, the prediction that “L2+ will still dominate the next decade” shows Horizon’s realism about commercialization, while the ten-year vision of “autonomous driving systems as standard equipment for electric vehicles” shows its idealism about technology. From this perspective, the release of the latest generation of BPU architecture can also be seen as a significant choice that Horizon has made by weighing idealism and realism together.

Of course, at the intersection of realism and idealism, the image of Horizon Robotics as an important driver for the large-scale commercialization of intelligent driving in China also becomes clearer and more specific.

This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.