
Google DeepMind Unveils Veo 2 to Rival OpenAI's Sora | Image Source: techcrunch.com
CALIFORNIA, December 16, 2024 – Google DeepMind has officially released Veo 2, its next-generation AI video model, marking a significant step forward in the race to master AI-generated video content. According to TechCrunch, Veo 2 can produce two-minute videos at an impressive 4K resolution (4096 x 2160 pixels), a feat that exceeds OpenAI's Sora, which generates videos at up to 1080p and 20 seconds in length. However, this theoretical ceiling is constrained in practice: today, Google caps Veo 2's output at 720p and limits clips to eight seconds in its experimental VideoFX tool.
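For scale, the gap between Veo 2's claimed 4K ceiling, its current 720p cap, and Sora's 1080p maximum can be quantified with a quick back-of-envelope pixel count (a minimal sketch; the resolution figures are the ones cited above):

```python
# Back-of-envelope comparison of the per-frame pixel counts cited in the article.
VEO2_CLAIMED = (4096, 2160)  # 4K ceiling claimed for Veo 2
VEO2_CURRENT = (1280, 720)   # 720p cap in VideoFX today
SORA_MAX = (1920, 1080)      # Sora's 1080p maximum

def pixels(res):
    """Total pixels per frame for a (width, height) resolution."""
    width, height = res
    return width * height

print(f"Veo 2 claimed: {pixels(VEO2_CLAIMED):>9,} px/frame")
print(f"Sora max:      {pixels(SORA_MAX):>9,} px/frame")
print(f"Veo 2 today:   {pixels(VEO2_CURRENT):>9,} px/frame")
print(f"The claimed 4K frame holds {pixels(VEO2_CLAIMED) / pixels(SORA_MAX):.1f}x "
      f"the pixels of a 1080p frame")
```

In other words, the headline 4K figure is roughly four times the pixel count of Sora's maximum, while the 720p clips users can actually generate today sit well below both.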
VideoFX, Google's exclusive platform for trying Veo 2, is still behind a waitlist. However, Eli Collins, Vice President of Product at DeepMind, told TechCrunch that the user base is expanding. Collins stressed that Veo 2 would soon be accessible through Google's Vertex AI developer platform once the model is production-ready. "In the coming months, we'll continue to iterate based on user feedback," Collins said. "We'll look to integrate Veo 2's updated capabilities into compelling use cases across the Google ecosystem." The company intends to share more updates in the coming year.
Improved physics, motion, and camera control
DeepMind claims Veo 2 delivers notable improvements over its predecessor. According to DeepMind, the model generates clearer, more detailed images, especially in scenes with complex motion. The AI has a better understanding of physics, movement dynamics, and the properties of light. For example, it can more accurately simulate fluid dynamics – such as poured liquids – and capture complex cinematic effects, including realistic shadows, reflections, and depth-of-field adjustments.
Veo 2 also offers superior control over virtual camera movements, allowing smoother pans, tilts, and zooms. This improved precision lets the AI capture subjects and environments from various angles with greater realism. Despite this progress, some challenges remain. DeepMind admits that character consistency, intricate details, and fast-moving objects are still areas for improvement. Collins said: "Coherence and consistency are areas for growth. Veo can consistently adhere to a prompt for a couple of minutes, but [it can't] adhere to complex prompts over long horizons."
Collaboration with Creatives for Refinement
As part of the development process, Google DeepMind actively engaged with artists and creators. According to Collins, the company has worked with figures such as Donald Glover, The Weeknd, and the musical artist d4vd to gather insights into creative workflows. This collaboration has refined Veo 2's features and ease of use. "Our work with creators on Veo 1 informed the development of Veo 2, and we look forward to working with trusted experts and creators to get feedback on this new model," Collins added. DeepMind continues to stress its commitment to supporting artists and producers, ensuring that its technology meets real-world creative needs.
However, despite the progress, Veo 2 is not immune to the uncanny valley. TechCrunch shared examples of generated content that included visual imperfections, such as lifeless expressions on animated characters and blending inconsistencies in complex scenes. DeepMind acknowledges these gaps and anticipates further improvements to address them.
Training data and copyright
Veo 2 was trained on a large number of video data sets, although DeepMind has not revealed any specific sources. According to TechCrunch, YouTube is a plausible data source, as Google owns the platform. Eli Collins explained that Veo 2 was trained on "high-quality video-description pairings," meaning the AI learns video patterns from footage paired with descriptive metadata. However, this practice raises questions about copyright and data ownership, especially since creators cannot opt out of training sets that have already been assembled.
While Google allows webmasters to block its data-scraping bots going forward, there is no system for removing previously collected content. DeepMind defends its use of public data under the fair use doctrine, but that position has generated legal and ethical debate within the creative community. With studies suggesting that AI could disrupt thousands of film and television jobs, several companies, including the AI art app Midjourney, face lawsuits alleging unauthorized use of data. In response to these concerns, Collins said: "We're committed to working with creators and our partners to achieve common goals."
Safety measures and deepfake mitigation
To address the risks associated with AI-generated content, DeepMind has implemented safety features within Veo 2. The model incorporates prompt-level filters to prevent the generation of violent, explicit, or otherwise inappropriate material. In addition, Google's proprietary watermarking technology, SynthID, is embedded in the generated video frames. SynthID provides an invisible marker that helps identify AI-generated content and curb the misuse of such videos for deepfakes. However, like other watermarking technologies, SynthID is not infallible.
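SynthID's actual embedding scheme is proprietary and designed to survive compression and editing; nothing about it is public in detail. Purely as an illustration of the general idea of invisible watermarking – hiding a detectable signature in a frame without visibly changing it – a toy version might tuck a known bit pattern into the least-significant bits of a frame's pixel values (this sketch does not reflect SynthID's real method):

```python
# Toy invisible watermark: hide and detect a bit pattern in pixel LSBs.
# Illustrative only -- a real scheme like SynthID must survive re-encoding,
# cropping, and edits, which this naive approach does not.

WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit signature

def embed(frame, mark=WATERMARK):
    """Write `mark` into the least-significant bits of the first len(mark) pixels."""
    out = list(frame)
    for i, bit in enumerate(mark):
        out[i] = (out[i] & ~1) | bit  # clear the LSB, then set it to the mark bit
    return out

def detect(frame, mark=WATERMARK):
    """Return True if the leading pixels carry the expected LSB pattern."""
    return [p & 1 for p in frame[:len(mark)]] == mark

frame = [200, 13, 97, 54, 180, 33, 76, 129, 240, 12]  # fake grayscale pixels
marked = embed(frame)
print(detect(marked))  # True
print(detect(frame))   # False for this particular frame
```

Because each pixel changes by at most 1, the mark is imperceptible to a viewer, yet a detector that knows the signature can flag the frame as generated. The fragility of this toy scheme also hints at why the article notes that no watermark, SynthID included, is infallible.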
Currently, Google is not extending its indemnity policy – which protects users from copyright claims – to Veo 2. Collins stated that this policy would only take effect once the model reaches general availability. In the meantime, DeepMind continues to monitor and mitigate the risk of model regurgitation, where the AI too closely replicates its training data.
Imagen 3: Improved image generation
Alongside Veo 2, Google DeepMind also announced improvements to Imagen 3, its commercial image generation model. Imagen 3 now offers greater detail and compositional precision across various artistic styles, including photorealism, impressionism, and anime. The updated model follows user prompts more faithfully and generates richer, clearer visual textures. These improvements are rolling out to users through ImageFX, Google's dedicated image generation tool.
DeepMind also introduced user interface updates to ImageFX to make the tool more intuitive. A new "chiplets" feature highlights key terms in the user's prompts and offers dropdown suggestions for related words. This lets users iterate smoothly and explore creative variations. These improvements aim to simplify image production while maintaining accuracy and flexibility.
As Google continues to refine its AI tools, Veo 2 and Imagen 3 represent the company's biggest push yet into generative video and image models. With growing competition from OpenAI, Midjourney, and other AI developers, Google DeepMind's innovations highlight both the growing capabilities and the limitations of current AI technology.
While Veo 2 marks significant progress, its practical limitations and ethical concerns underscore the challenges that remain for video generation AI. According to TechCrunch, the coming year will be critical for Google DeepMind as it works to scale Veo 2, address consistency problems, and navigate the ongoing debates over AI training data.