WAIC 2024: Can DriveAGI Revolutionize Autonomous Driving?

The 2024 World Artificial Intelligence Conference and the High-level Conference on Global Governance of Artificial Intelligence (WAIC 2024) took place in Shanghai from July 4th to July 7th. SHENGTU Shadow exhibited intelligent driving and cockpit products based on the SHENGTU “Day Day New 5.5” multimodal large model, attracting widespread attention.

At the conference, SHENGTU Shadow displayed its autonomous driving large model DriveAGI and launched the in-car generative interaction interfaces “FlexInterface” and “AgentFlow,” among other in-car AI Agent applications. SHENGTU Shadow’s autonomous minibus also made its debut at WAIC, where it carried out shuttle tasks and showcased its capabilities in practical use.

At the Artificial Intelligence Forum on July 5th, Shengtu Technology released the “Day Day New 5.0” model and demonstrated the multimodal interaction experience of the “Day Day New 5.5” large model. Wang Xiaogang, co-founder of Shengtu Technology, said that SHENGTU Shadow is promoting the integration of multimodal large models with intelligent cars, aiming to provide a more natural human-computer interaction experience.

Multimodal large models can fuse voice, text, image, gesture, video, and other modalities to provide a natural human-computer interaction experience. SHENGTU’s “Day Day New 5.5” multimodal large model is an end-to-end model that can handle text, voice, and video information simultaneously and can adapt to different computing platforms.

The universal autonomous driving model UniAD, proposed by SHENGTU and its joint laboratories, won an award at the Conference on Computer Vision and Pattern Recognition (CVPR) in 2023. SHENGTU Shadow has demonstrated UniAD on real vehicles in complex road scenarios and has developed DriveAGI, a large model for driving decision-making and planning. DriveAGI improves the system’s interpretability and interactivity: it can understand the complex real world, explain the reasoning behind its driving decisions, and accept voice or gesture commands that control autonomous driving behavior.

SHENGTU Shadow is building “CockpitBrain,” a multimodal large model engine, and has launched the generative interaction interface products “FlexInterface” and “AgentFlow.” These products can dynamically generate personalized interfaces and complete complex tasks from natural-language instructions. SHENGTU Shadow also displayed the “Multimodal Sentinel,” which can recognize and respond to behaviors that may damage the vehicle, helping to keep the vehicle safe.

SHENGTU Shadow’s large model products have been adopted by several mainstream automotive manufacturers and are used in their mass-produced models. Autonomous driving shuttle services have also been launched at multiple locations, demonstrating the products’ broad application prospects in intelligent cars and smart mobility.

This article is a translation by AI of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.