Tencent Introduces Follow-Your-Click: Transforming Still Images to Animated Videos

Chinese internet giant Tencent Holdings has introduced Follow-Your-Click, an image-to-video artificial intelligence (AI) model. Developed with academic partners from Hong Kong University of Science and Technology and Tsinghua University, Follow-Your-Click lets users click on a specific part of an image and enter a text prompt describing how that part should move, turning the still image into a short animated video. The approach targets a limitation of other image-to-video models, which tend to move the entire scene rather than specific objects in the picture.

The project was announced on Friday with a release on GitHub, the Microsoft-owned code-hosting platform. Tencent plans to release the full code for the model in April, but a demo is already available. Examples shown by the researchers include an image of a bird with the prompt “flap the wings”, which becomes a video of a rainbow-colored bird twitching one of its wings, and an image of a girl with the prompt “storm”, which becomes an animation with lightning flashing in the background.

The researchers behind Follow-Your-Click say it offers simpler yet precise user control and better generation performance than previous methods. In a paper published on arXiv, they explain that other models require users to write elaborate descriptions of how and where the image should move. With Follow-Your-Click, users simply click on the desired area and provide a text prompt, which makes the tool easier to use.

This development comes as video generation attracts significant attention following the debut of Sora, the text-to-video model from Microsoft-backed OpenAI. Chinese companies are now keen to catch up in generative AI. Pika Labs, a startup co-founded by Guo Wenjing, a Chinese PhD candidate at Stanford University, has secured $55 million in seed capital and Series A funding for its text- and image-to-video generation technology. Alibaba Group Holding has also recently launched EMO, a portrait video-generation tool that turns images and audio prompts into videos of the subject singing and talking.

Follow-Your-Click joins Tencent’s open-source toolbox, VideoCrafter2, which was released earlier this year. VideoCrafter2 is an updated version of VideoCrafter1 and lets users generate and edit videos, although it is currently limited to videos just two seconds long. With Follow-Your-Click, Tencent is adding to its portfolio of image and video generation tools.

As the technology advances, further AI tools that change how people work with images and videos can be expected. From turning static images into animated videos to transforming audio prompts into singing and talking portraits, AI is steadily expanding its creative capabilities, and Follow-Your-Click is one example of the progress in this field.


Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.