Large models on the march: ByteDance's AI software and hardware ecosystem is taking shape

STAR Market Daily, December 18 (Reporter Huang Xinyi): At the Volcano Engine Force conference, Volcano Engine president Tan Dai announced that as of mid-December, average daily token usage of the Doubao general-purpose model had exceeded 4 trillion, 33 times the level at its first release seven months earlier.

An earlier global ranking of monthly active users showed that the Doubao app's MAU was close to 60 million, second in the world behind OpenAI's ChatGPT.

At the conference, ByteDance officially released the Doubao visual understanding model and the Doubao 3D generation model, along with fully upgraded versions of the Doubao general-purpose model pro, the music model, and the text-to-image model. The input price of the Doubao visual understanding model is only 0.003 yuan per thousand tokens, so one yuan can process 284 720P images. The conference also announced that ByteDance will launch version 1.5 of the Doubao video generation model, with longer video generation capability, in the spring of 2025, and that the Doubao end-to-end real-time voice model will launch soon.
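The pricing claim can be checked with a little arithmetic. The sketch below takes the article's figure of 284 720P images per yuan and, treated here as an assumption, an input price of 0.003 yuan per thousand tokens. The function names are illustrative, and the tokens-per-image result is only an implied estimate, not a published figure.

```python
# Figure quoted in the article.
IMAGES_PER_YUAN = 284

# Assumption: input price in yuan per 1,000 tokens.
PRICE_PER_1K_TOKENS_YUAN = 0.003


def cost_per_image(images_per_yuan: int) -> float:
    """Yuan spent on one image if one yuan covers `images_per_yuan` images."""
    return 1.0 / images_per_yuan


def implied_tokens_per_image(images_per_yuan: int, price_per_1k: float) -> float:
    """Tokens billed per image implied by the two figures above."""
    tokens_per_yuan = 1000.0 / price_per_1k
    return tokens_per_yuan / images_per_yuan


if __name__ == "__main__":
    print(f"cost per image: {cost_per_image(IMAGES_PER_YUAN):.5f} yuan")
    tokens = implied_tokens_per_image(IMAGES_PER_YUAN, PRICE_PER_1K_TOKENS_YUAN)
    print(f"implied tokens per 720P image: {tokens:.0f}")
```

Under these assumptions, each 720P image costs roughly 0.0035 yuan and corresponds to a little under 1,200 input tokens.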

STAR Market Daily has learned exclusively that Volcano Engine Edge Cloud and Runxin Technology are jointly building an AI voice toy whose Wi-Fi module comes from Hengxuan Technology. Tuya Smart is also involved, providing related modules and the integration platform.

In the on-site demo, consumers could talk to the AI toy "puppy" by voice, and it answered questions and offered companionship.

One industry-chain source was optimistic about follow-up sales of AI voice toys. He told STAR Market Daily that the first batch of AI toys is expected to reach the market around the end of this year and the beginning of next year, and that a large number of players are expected to enter the field in the first half of next year. However, bringing AI toys to market also poses challenges. "First, AI toys need high-quality knowledge bases tailored to different age groups in order to deliver better human-computer interaction. In addition, AI toys are used frequently every day. For users, the cost of cloud inference will be a major expense, and that is another obstacle to adoption."

At the conference, Volcano Engine Video Cloud, Espressif Systems (Lexin Technology), and ToyCity jointly launched the AI+ Hardware Smart Jump Program. It combines the Doubao large model, Volcano Engine's human-like voice dialogue technology, ToyCity's designer-toy design, and Espressif's AI chips to promote the adoption of AI designer toys. Espressif will reportedly provide a one-stop hardware solution for AI designer toys, including on-device audio and video processing.

In robotics, D-Robotics, a Horizon company, and Volcano Engine Edge Cloud are developing intelligent robots on top of the edge large model gateway, building a robot perception and control system solution around it.

For robot calls, the large model gateway uses the edge to route each request to a nearby large model service based on where the device-side request originates. The edge large model gateway's product capabilities improve response speed and keep calls stable, giving robots nearby access and faster queries to large models, which they can call on demand at lower cost and higher speed.
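The nearby-routing idea can be illustrated with a minimal sketch: given the request's originating position, pick the closest healthy edge node. This is not Volcano Engine's implementation. The `EdgeNode` type, the node list, the health flag, and the use of great-circle distance as the proximity metric are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeNode:
    """Hypothetical edge node that fronts a large model service."""
    name: str
    lat: float
    lon: float
    healthy: bool = True


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_node(nodes: list[EdgeNode], lat: float, lon: float) -> EdgeNode:
    """Return the nearest healthy node to the request's origin."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        raise RuntimeError("no healthy edge node available")
    return min(candidates, key=lambda n: haversine_km(lat, lon, n.lat, n.lon))


# Illustrative node locations only; not an actual deployment map.
NODES = [
    EdgeNode("beijing", 39.90, 116.40),
    EdgeNode("shanghai", 31.23, 121.47),
    EdgeNode("guangzhou", 23.13, 113.26),
]

if __name__ == "__main__":
    # A robot sending a request from Hangzhou.
    print(pick_node(NODES, 30.27, 120.15).name)
```

A production gateway would more likely route on measured latency and load than on raw distance; distance is used here only because it keeps the sketch self-contained.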

In addition, Leju Robot is working with the Doubao model. Leju robots are currently used mainly for scientific research and exhibition hall tours, and the company is also exploring industrial applications.

In on-device AI, the Doubao model has been integrated into a range of smart terminals such as mobile phones and PCs, covering about 300 million devices, and calls to the Doubao model from smart terminals have grown 100-fold in half a year.

HONOR's magic retouching and AI summary features are reportedly powered by the Doubao model, and vivo phones use the Doubao music model to give photo album users music creation capabilities. The Doubao music model writes AI lyrics and songs from the material users provide and generates personalized videos for them.

Tan Dai told STAR Market Daily that most Android phone makers in China are working with Doubao. "For phone makers, Doubao will be used in some scenarios, other large models in others, or several models will be mixed within one scenario. Enterprise users definitely need a multi-cloud or multi-model strategy, which I think is normal. In the end, whoever offers better capability at lower cost gets used. That calculation is easy to make."

The STAR Market Daily reporter found on site that a large number of application scenarios have been explored on Coze, ByteDance's AI agent development platform.

For example, ByteDance is working with Supor to explore AI-generated personalized recipes and improve the service quality of its cooking machines.

In an AI fish-keeping project with Gizwits (Ji Zhiyun), an agent automatically gives users optimization plans based on real-time data from fish tank equipment. For example, when water quality falls below standard, the agent can automatically adjust the pump's operating mode to improve the growing environment for fish or plants.

The Coze agent developer community reportedly has more than 1 million active developers, who have created more than 2 million agents.

Among automakers, Dongfeng Motor, IM Motors (Zhiji), and Mercedes-Benz's smart brand have worked with the Doubao model on smart cockpits and other areas. Tan Dai said that more than 80% of mainstream domestic automobile brands are working with the Doubao large model.

ByteDance is expected to launch version 1.5 of the Doubao video generation model, with longer video generation capability, in the spring of 2025. Asked about the computing power challenges this may bring, Tan Dai said, "Volcano Ark provides MaaS inference services for the Doubao large model, and I think our own reserves are sufficient. That is why we can now offer the industry's largest TPM (tokens per minute) and RPM (requests per minute). When a user experiences lag or gets blocked, insufficient computing power is not necessarily the cause. After all, you are using an application and a system. Even a problem with user authentication will affect the smoothness of the whole system, as will the level of engineering optimization. This is not simply a computing power problem."

On future competition in the large model market, Tan Dai said the market is still in its early stages. "To be honest, I don't care much about competition right now, because this market is still very early, and perhaps only one-thousandth of it has been developed. At this point you don't really need to care about competition. What you should care about is which user needs are not yet being met."

(STAR Market Daily reporter Huang Xinyi)