OpenAI, the Microsoft-backed artificial intelligence (AI) company, has unveiled its latest breakthrough, ‘Sora’, a text-to-video model, it says in a blog post. The tool is showcased as being capable of translating textual instructions into "both realistic and imaginative" visual scenes.
In the company’s official announcement, OpenAI reveals that ‘Sora’ allows users to create one-minute-long, photorealistic videos based on their provided prompts. OpenAI says Sora is capable of constructing intricate scenes featuring multiple characters, motion patterns, and detailed foreground and background elements.
"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” OpenAI states in its blogpost.
"We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals," the blog post adds.
The existing model still exhibits shortcomings, and keen observers may discern nuances betraying its AI origins, such as discrepancies in simulating complex physics. OpenAI acknowledges these limitations, emphasising its continuous efforts to enhance the model's performance.
“It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect. For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark,” the company says in the blog post.
On the strengths side, “Sora can also create multiple shots within a single generated video that accurately persist characters and visual style,” the blog post adds.
The emergence of Sora mirrors a broader trend in AI development, with a noticeable shift towards improving video-generation capabilities. Competitors like Runway, Pika, and Google's Lumiere have also made progress in this arena, presenting their own text-to-video models.
As of now, Sora is available only to a select group of individuals designated as "red teamers," who are responsible for evaluating the model for potential risks and harms. As mentioned, OpenAI has also granted access to visual artists, designers, and filmmakers to gather feedback, recognising the pivotal role of community input in refining its technology.
Despite its advancements, OpenAI remains vigilant against the misuse of its AI products. The recent addition of watermarks to its text-to-image tool, DALL-E 3, underscores the company's commitment to combating the proliferation of fake, AI-generated content.