OpenAI, the Microsoft-backed artificial intelligence (AI) company, has unveiled its latest breakthrough, ‘Sora’, a text-to-video model, it says in a blog post. The tool is showcased as being capable of translating textual instructions into "both realistic and imaginative" visual scenes.
In the company’s official announcement, OpenAI reveals that ‘Sora’ allows users to create one-minute-long, photorealistic videos based on their provided prompts. OpenAI says Sora is capable of constructing intricate scenes featuring multiple characters, motion patterns, and detailed foreground and background elements.
"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” OpenAI states in its blogpost.
"We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals," the blog post adds.
The existing model still exhibits shortcomings, and keen observers may discern nuances betraying its AI origins, such as discrepancies in simulating complex physics. OpenAI acknowledges these limitations, emphasising its continuous efforts to enhance the model's performance.
“It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark,” the company says in the blog post.
On the strengths side, “Sora can also create multiple shots within a single generated video that accurately persist characters and visual style,” the blog post adds.
The emergence of Sora mirrors a broader trend in AI development, with a noticeable shift towards improving video-generation capabilities. Competitors like Runway, Pika, and Google's Lumiere have also made progress in this arena, presenting their own text-to-video models.
As of now, Sora is available only to a select group of individuals designated as "red teamers," who are responsible for evaluating the model for potential risks and harms. As mentioned, OpenAI has also granted access to visual artists, designers, and filmmakers to gather feedback, recognising the pivotal role of community input in refining its technology.
Despite its advancements, OpenAI remains vigilant against the misuse of its AI products. The recent addition of watermarks to its text-to-image tool, DALL-E 3, underscores the company's commitment to combating the proliferation of fake, AI-generated content.