Published: 15:43, November 28, 2024
Homegrown AI tools play key role in preserving cultural heritage
By Xinhua
The launch of a cloud computing infrastructure for AI technology in Jiaxing, Zhejiang province, on Nov 19, 2024. (PHOTO / XINHUA)

China's generative AI tools are carving out a unique niche, offering a blend of entertainment and practical benefits while also playing a key role in preserving cultural heritage.

Among them is Vidu-1.5, an image-to-video tool launched last week by Beijing-based AI startup ShengShu Technology, which bills it as a multimodal model that supports multi-entity consistency.

In practice, this means AI can generate a video from as few as three input images. For example, in a video shared by the company, the inputs — a man, a futuristic mecha suit and a bustling nighttime cityscape — are seamlessly blended into a cohesive montage, all within just 30 seconds.

Understanding and controlling multiple entities — such as the person, attire and environment — are the biggest challenges in AI-generated video technology.

Ever since OpenAI, the maker of ChatGPT, introduced its pioneering Sora, multiple Chinese tech firms have swiftly stepped up to the plate, rolling out products that boast unique characteristics. ShengShu Technology's Vidu is one popular example.

"Look how consistent the suit is," Stefano Rivera, an AI product aficionado, tweeted with admiration, calling himself a "superfan" of Vidu "from day 1".

This AI-generated content tool has already ignited a surge of creative enthusiasm among global individual creators, leading to playful and imaginative clips like Leonardo DiCaprio showcasing haute couture on the runway, Elon Musk cruising on an electric scooter in a flamboyant Chinese jacket and a series of Japanese anime scenes.

Vidu's greatest breakthrough is establishing logical relationships among multiple user-specified objects within a scene, says Tang Jiayu, the CEO of ShengShu Technology, in a written response to Xinhua.

With previous text-to-video tools, generating scenes like "a boy holding the cake in a crystal setting" would yield different images of the boy, the cake and the crystal setting each time, much like opening a blind box. Now, with multi-subject consistency, the identity of the boy, cake and crystal can be preserved throughout the video, maintaining continuity, says Tang.

Visitors look at devices for AI's core computing power developed by Beijing-based Sugon at an expo during the 2024 World Internet Conference in Zhejiang on Nov 19, 2024. (PHOTO / XINHUA)

Chinese entrepreneurs like Tang, along with global investors with substantial capital, are rapidly pouring into the AI-generated content (AIGC) sector, expanding their market footprint in China.

China has registered and launched more than 180 generative AI models that can provide services to the public, according to an official from the Cyberspace Administration of China in August.

Out of over 1,300 AI large language models globally, China accounts for more than 30 percent, making it the second-largest contributor after the United States, according to a white paper on the global digital economy released in July by the China Academy of Information and Communications Technology.

Generative AI is set to add an estimated $7 trillion to the global economy, with China expected to contribute nearly a third of this amount, accounting for approximately $2 trillion, as shown by a McKinsey report.

Beyond facilitating entertainment creation for online users, AIGC tools are being increasingly applied across diverse scenarios in China. The preservation and promotion of cultural heritage is one of them.

A homegrown generative AI tool called Jimeng, developed by ByteDance, has been employed to craft a fully AI-generated sci-fi short drama aimed at promoting ancient Chinese culture, the first of its kind in the country.

Sanxingdui: Future Apocalypse, released in July, follows a near-future narrative in which protagonists venture into a digitally reconstructed ancient Shu kingdom dating back over 3,000 years to avert an impending civilization crisis.

The 12-episode series employed multiple generative technologies, including AI script-writing, concept and storyboard design, image-to-video conversion, video editing and media content enhancement.

Leveraging its proprietary multimodal large-scale model, ShengShu's AI engineers analyzed extensive collections of ancient mural data from Yongle Palace in Shanxi province, the largest Taoist temple in China.

The 800-year-old temple's murals are beset by problems like color fading, dust and deterioration. Their grand scale, distinctive style and rich intricacy significantly complicate restoration efforts.

Engineers have trained the AI with Chinese mural art data, allowing it to comprehend and replicate the distinctive style of those murals, from colors to brush techniques.

This enabled automated restoration tasks like digital coloring and filling in missing details. AI can mimic the brushwork of mural painters to redraw the facial features of deities in the murals, says Tang.