The Technology Industry in 2021 is Ignited by Two Leather Jackets
One belongs to NVIDIA’s CEO, Jensen Huang (Huang Renxun), whose kitchen keynote featured a leather-jacketed version of him rendered entirely with “digital human” technology. The 14-second clip fooled many technology enthusiasts and spawned numerous “Fake or Real Huang” stories before NVIDIA officially clarified the matter.
The other is the leather jacket that Musk wore at Tesla’s AI Day on August 20. Aren’t you curious about it? It is still summer in America, and the colleagues on stage with Musk were all in the “programmer standard” of jeans and T-shirts, yet Musk himself was wrapped in a leather jacket.
Readers of this article have probably already heard something about the newly announced Tesla Bot, a humanoid robot 1.7 meters tall. So we can boldly predict that in three months there may be an article titled “Musk Tricked the World: Tesla Bot Was Hidden under His Leather Jacket” (just kidding).
Back to the point: from “Autonomy Day” and “Battery Day” to this year’s “AI Day,” Musk has been packaging Tesla as a full-stack artificial intelligence company. As he said when discussing the Tesla Bot, Tesla is confident in the robotics project because, in a sense, a car is also a form of robot.
Compared with Musk’s 28-minute delivery ceremony for the new Model S Plaid, the nearly two-hour live stream was “a bit long, please bear with me.” After sorting through all the content, we came away with three key elements:
- D1: giving Tesla the ability to catch up with chip manufacturers;
- FSD: deepening the pure-vision autonomous driving route through cloud-based neural networks;
- Tesla Bot: the physical form of AI.
Believe me, “AI” is the sexiest word a tech man in a leather jacket can utter. But when Musk uses this word to tug at the hearts of tech geeks, you must remain calm. After all, everyone has long been familiar with Musk’s grand visions, and they are roughly half “heartfelt words” and half “absurd sweet talk.”
Tesla is Really Focusing on “Chips” this Time!
Before AI Day, discussion of Tesla’s Dojo supercomputer had already heated up. Surprisingly, Musk talked not only about the Dojo supercomputer but also about the D1, a new supercomputer chip developed in-house by Tesla.
How capable is the D1 chip? A single chip delivers 362 TFLOPS, and a group of 25 chips reaches about 9 PFLOPS (25 × 362 TFLOPS = 9,050 TFLOPS), with an interface bandwidth of 36 TB/s.
What is the Dojo supercomputer? Dojo is a supercomputer built from groups of D1 chips (25 chips per group). A full system of 120 chip groups, the configuration Tesla presented as the ExaPOD, has a theoretical computing power of 120 × 9 PFLOPS = 1,080 PFLOPS.
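The arithmetic above can be checked with a short sketch. It uses only the figures quoted in this article; the constant and function names are illustrative assumptions, not Tesla terminology. Working in whole TFLOPS avoids rounding each group to 9 PFLOPS, which is why the total comes out slightly above the rounded 1,080 PFLOPS.

```python
# Figures quoted in the article; all names here are illustrative assumptions.
D1_TFLOPS = 362        # theoretical throughput of one D1 chip
CHIPS_PER_GROUP = 25   # D1 chips in one chip group
GROUPS_TOTAL = 120     # chip groups in the full configuration


def group_tflops(chip_tflops: int = D1_TFLOPS, chips: int = CHIPS_PER_GROUP) -> int:
    """Theoretical throughput of one chip group, in TFLOPS."""
    return chip_tflops * chips


def total_pflops(groups: int = GROUPS_TOTAL) -> float:
    """Theoretical throughput of all chip groups combined, in PFLOPS."""
    return group_tflops() * groups / 1000  # 1 PFLOPS = 1,000 TFLOPS


if __name__ == "__main__":
    print(f"one group: {group_tflops() / 1000:.2f} PFLOPS")
    print(f"full system: {total_pflops():.0f} PFLOPS")
```

These are peak, theoretical numbers: the sketch simply multiplies them out and says nothing about sustained performance.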
Numbers like these are hard to grasp on their own, so let us compare Dojo with the world’s most powerful supercomputer. The current number one is Japan’s Fugaku, with a performance of 442 PFLOPS. On paper, Dojo’s performance is more than twice that of Fugaku. We stress that this is theoretical computing power, and the two figures are not measured the same way: Fugaku’s is a double-precision benchmark result, while Tesla quotes Dojo at the lower-precision formats used for AI training. Musk has also stated on social media that the biggest problem Dojo faces is energy consumption, and whether Project Dojo can be fully implemented depends on solving it. At least the exploded view of the D1 module shows that Tesla is trying new heat dissipation technologies.
The ranking in the table above shows that the top four supercomputers are all “national team” players, with Fugaku ranking first and China’s Sunway TaihuLight ranking fourth. Tesla is achieving on its own what others have needed national resources to do, much as SpaceX is doing with Mars exploration.
With the launch of the D1, Tesla has put itself on a level playing field with chip industry giants such as NVIDIA and Google. So what is Tesla’s purpose in entering the supercomputing arena?
Visual Perception is Musk’s Focus
Dojo means “training ground.” One reason Tesla is building the Dojo supercomputer is to provide a “training ground” for strengthening the algorithms behind FSD’s vision-based approach to autonomous driving.
![](https://upload.42how.com/article/image_20210820235107.png)
In autonomous driving solutions based on fusion perception, all of the perception hardware sits on the vehicle, which is why everyone "stacks" vehicle-side computing power when adding LiDAR. This redundancy of perception hardware has pushed vehicle-grade AI chips past the 1,000 TOPS mark.
![](https://upload.42how.com/article/image_20210820235115.png)
Tesla's approach is based entirely on visual perception. The vehicle's perception hardware consists of only eight cameras, so the computing power of Tesla's vehicle-side AI chip is relatively low: the FSD computer delivers only 144 TOPS. With current technology, pure visual perception is the more demanding route, because it puts far more weight on the AI and its ability to learn. This is also why many automakers and suppliers choose to combine LiDAR with visual perception.
![](https://upload.42how.com/article/image_20210820235121.png)
Compared to other methods, pure visual recognition is the most "human-like" solution. The on-board AI has to identify other vehicles, pedestrians, lane markings, and even cats and dogs from what the cameras capture. Because this is technically so difficult, there have been cases where Tesla's FSD misidentified shadows as vehicles and applied emergency braking. Solutions that use LiDAR to assist visual perception can largely avoid such misjudgments, because LiDAR supplies 3D point clouds to the vehicle-side AI, and that 3D perception improves the autonomous driving system's ability to handle varied situations.
Because most people think pure visual perception demands too much of today's AI, they have turned to fusion perception. On that path, at least, all obstacles on the road can be recognized. Autonomous driving is already a fragile industry, and every breakthrough in it must be guaranteed "foolproof." But Musk refuses to accept this. The arrival of Dojo shows Musk's determination to follow the pure vision route to the end.
![](https://upload.42how.com/article/image_20210820235128.png)
Dojo will surely become a powerful tool for refining FSD’s algorithms. In the future, visual processing work will move from the vehicle side to the cloud. Tesla’s onboard cameras continuously capture real road data, and the Dojo supercomputer can automatically label the objects in it. In the past, large AI datasets usually had to be labeled by hand before an AI could learn from them, but Dojo will run unsupervised learning algorithms that perceive and annotate road conditions on their own. More and more data will then flow back through Dojo into the cloud-based neural network learning system, so that the autonomous driving algorithms can iterate continuously.
On the pure vision route, Tesla will build a complete system linking the vehicle and the cloud, and the entire system is developed in-house. Many people believe that the fusion perception route, backed by LiDAR, will overtake the pure vision route in the near future, and the cloud-based neural perception network is Tesla’s countermeasure. With the support of the Dojo supercomputer, FSD will once again widen the gap with competitors’ products, but “functional breakthroughs” will still take time.
When Tesla released FSD HW3.0 in 2019, Musk stated that by 2022 Tesla would have one million robotaxis worldwide and that cars would become revenue-generating tools. Now that 2021 is more than half over, the promise that “Scammer Musk” made in the HW3.0 era has not yet been realized, and the vehicle-side chip HW4.0 is coming soon.
So the question is: has HW3.0 reached the limit of its capabilities? Can HW4.0 bring more functional breakthroughs? The answer to both questions is probably no. The NIO ET7 uses a computing platform with more than 1,000 TOPS of computing power to pave the way for future iterations of its autonomous driving functions, yet under ethical and legal restrictions, commercially deployed autonomous driving functions are still at the L2 level. “Unlocking functions” is restricted, but that does not mean “technology iteration” is restricted.
However, Tesla’s path to unlocking functions is very different from that of its competitors, which is also related to the pure vision route. Over the past three years, we have seen autonomous driving functions evolve from L2 to L2+. For most solution providers, this evolution has been achieved through hardware upgrades and the rollout of “sub-functions.” Basic L2 requires lateral and longitudinal control within a single lane, and the hardware it requires is a forward-facing camera. After adding side and rear cameras and side and rear millimeter-wave radars, the vehicle can offer the “lane change assistance” sub-function. The step from L2 to L2+ is therefore mostly achieved by stacking sub-functions.
For Tesla, however, combining multiple functions is not the path to a higher autonomy level. Tesla regards driving as a single task, and the gap between L2 and L5 is only a difference in how much human supervision is required. The problem before Tesla is not a functional breakthrough but how to truly replace human drivers with AI. The arrival of Dojo is therefore likely to let Tesla’s pure vision route make a “driver-replacing” leap forward.
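The “stacking sub-functions” route that most manufacturers follow can be pictured as a lookup from installed sensors to unlocked features. The sketch below is only an illustration of that idea: the sensor and feature names are hypothetical labels taken from the two examples in the text, and no manufacturer's actual feature matrix is implied.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical mapping based only on the two examples given in the text.
FEATURE_REQUIREMENTS: Dict[str, FrozenSet[str]] = {
    "single_lane_l2": frozenset({"front_camera"}),
    "lane_change_assist": frozenset(
        {"front_camera", "side_rear_cameras", "side_rear_radars"}
    ),
}


def unlocked_features(sensors: Set[str]) -> List[str]:
    """Return the sub-functions whose sensor requirements are all installed."""
    return sorted(
        name
        for name, required in FEATURE_REQUIREMENTS.items()
        if required <= sensors  # every required sensor is present
    )


if __name__ == "__main__":
    print(unlocked_features({"front_camera"}))
    print(unlocked_features({"front_camera", "side_rear_cameras", "side_rear_radars"}))
```

In this picture each new sensor package unlocks one more entry in the table, whereas the approach attributed to Tesla treats driving as a single task rather than a growing list of features.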
Tesla Bot, Musk’s “Absurd Sweet Talk”
What exactly is the Tesla Bot for? Simply put, it is a story Musk came up with to attract talent, as if to say, “Hey guys, do you see how amazing the thing I want to build is? Hurry up and join us.” Musk himself made it clear before AI Day that the event was, in a sense, also a recruiting drive.
There are two reasons why the Tesla Bot is considered absurd: 1. The difficulty of humanoid robot technology is extremely high; 2. There is no commercial prospect.
To see how difficult the technology is, we have to look at the “ivory tower” of the robotics field, Boston Dynamics, a company that does not consider the interests of investors and rarely explains itself to the outside world. Interestingly, Boston Dynamics also recently released a video of its bipedal humanoid robot, which keeps its balance on two legs and runs over obstacles without any problems. Although everyone saw that humanoid robots are mechanically feasible, they did not see the manpower, materials, money and time that Boston Dynamics invested behind the scenes.
Boston Dynamics has also explicitly stated that the manufacturing of bipedal humanoid robots is only to explore the platform of such robots and the control logic behind them.
Why is the commercial prospect of bipedal robots so bleak? Because a robot must be matched to a scenario to realize its value. Only within a specific scenario can a robot deliver efficiency gains.
Humans move on two feet, the way of getting around that nature chose for them. Cars are machines on four wheels, and they cannot cross every kind of terrain, while humans can. But walking is far less efficient than driving, so humans invented cars. The scenario that cars are matched to is the highway: to make their means of transport run efficiently, humans spent enormous manpower and resources building roads. The same applies to automated agricultural machinery in the fields and automated production lines in factories.
Elon Musk said the Tesla Bot will take over tedious work from humans, which makes it sound as if this robot can do everything. On careful consideration, though, it is hard for a humanoid robot to surpass human efficiency in any scenario. Moreover, a bowl of rice lets an ordinary person work for an afternoon. Can a robot do the same work on less energy?
Perhaps Musk himself is not entirely clear about many of these things, but for tech geeks, is there a bigger or more fascinating tall tale than creating an “enemy of humanity”?
Conclusion
Musk is the “geekiest scumbag” of this era. He has done many unimaginable things, and he has made many boasts that even he cannot make good on. Some people sneer at him as a “leek harvesting machine” (someone who fleeces retail investors), while others are his devoted fans.
At this point, Musk, who even skipped the Tesla conference call, may be brewing another piece of big news.
This article is a translation by ChatGPT of a Chinese report from 42HOW. If you have any questions about it, please email bd@42how.com.