Blog Hour

Sora by OpenAI Set to Revolutionize AI Video Generation

OpenAI, the company behind the chatbot ChatGPT, has unveiled Sora, a text-to-video model that can generate videos up to one minute long from a text prompt. It can create realistic and imaginative scenes from text instructions.

Sam Altman, the CEO of OpenAI, posted on X asking users to suggest complex prompts for Sora to showcase its capabilities. He then shared the resulting videos, which looked vivid and realistic.

 

Visual Data Training

ChatGPT is trained on a large text dataset that is broken into tokens. Sora is trained on a large set of images and videos that are broken into units called visual patches. According to OpenAI's research team, “Whereas LLMs have text tokens, Sora has visual patches. Patches have previously been shown to be an effective representation for models of visual data. We find that patches are a highly scalable and effective representation for training generative models on diverse types of videos and images.”
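To make the idea of patches concrete, here is a minimal sketch of cutting a video into non-overlapping blocks that span both space and time, then flattening each block into a vector that plays the role of a token. It is an illustration only: OpenAI describes compressing videos into a latent space before extracting patches, while this toy version works on raw pixel values, and the function name and patch sizes are assumptions chosen for the example.

```python
from typing import List

# A grayscale video indexed as video[t][y][x], one number per pixel.
Video = List[List[List[float]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[float]]:
    """Split a video into non-overlapping spacetime patches.

    pt, ph and pw are the patch extent in frames, rows and columns.
    These sizes are illustrative assumptions, not values published by OpenAI.
    Each patch is returned as a flat list of pt * ph * pw pixel values.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Gather every pixel inside this block of frames, rows and columns.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# A toy clip: 4 frames of 4x6 pixels, where each value encodes its position.
clip = [[[t * 100 + y * 10 + x for x in range(6)] for y in range(4)] for t in range(4)]
tokens = patchify(clip, pt=2, ph=2, pw=3)
print(len(tokens), len(tokens[0]))  # 8 patches, each holding 12 values
```

Because the number of patches simply grows with the length and size of the clip, the same procedure applies to videos of any duration, resolution or aspect ratio, and to single images treated as one-frame videos.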

 

The Superiority of Sora

 

Sampling flexibility

Sora can sample videos in a range of aspect ratios, from widescreen 1920x1080 to vertical 1080x1920. This lets it create content for different devices directly at their native aspect ratios, with no manual adjustment. Sora can also prototype content quickly at lower resolutions before generating the final output at full resolution, all with the same model.
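The low-resolution prototyping workflow comes down to simple arithmetic: pick a smaller size with roughly the same aspect ratio as the final output. The sketch below shows one way to compute such a preview size. The helper name `preview_size` and the choice to snap each side to a multiple of 8 pixels are assumptions made for this example, not details of how Sora works.

```python
from typing import Tuple


def preview_size(width: int, height: int, scale: float, multiple: int = 8) -> Tuple[int, int]:
    """Shrink a target resolution by `scale`, keeping roughly the same aspect ratio.

    Each side is snapped to the nearest multiple of `multiple` pixels.
    The default of 8 is an arbitrary assumption for illustration.
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be in the range (0, 1]")

    def snap(value: int) -> int:
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


# Quarter-size previews for the two formats mentioned above.
widescreen = preview_size(1920, 1080, scale=0.25)
vertical = preview_size(1080, 1920, scale=0.25)
print(widescreen, vertical)  # (480, 272) (272, 480)
```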

 

Improved Framing and Composition

Sora is trained on videos at their native aspect ratios, which OpenAI says improves composition and framing. Generative models are commonly trained on videos that have been cropped to a square, which can produce videos where the subject is only partially in view.
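A quick calculation shows how much of a frame a square crop throws away. The sketch below assumes a simple center crop, which is one common way to square a video; the function names are made up for this example.

```python
from typing import Tuple


def center_square_crop(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) box of a centered square crop."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def kept_fraction(width: int, height: int) -> float:
    """Fraction of the original frame area that survives the square crop."""
    side = min(width, height)
    return (side * side) / (width * height)


box = center_square_crop(1920, 1080)
kept = kept_fraction(1920, 1080)
print(box, kept)  # (420, 0, 1500, 1080) 0.5625
```

For a 1920x1080 frame, the crop discards a 420-pixel strip on each side, so a subject standing near the left or right edge is cut off entirely or in part.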

 

Language Understanding

Text-to-video models need to be trained on a large number of videos paired with text captions, so that the model learns to connect the words in a prompt with the objects and actions they describe.

The videos used to train Sora were captioned automatically by a separate AI model built to write highly descriptive captions. Training on these detailed captions helps the model produce videos that follow the prompt more accurately.

 

Image Generation

Sora can also generate highly detailed still images. The model can produce images of variable sizes, up to 2048x2048 resolution.

 

Prompting and Animating Using Images

Besides text, Sora can be prompted with existing images or videos. Given a still image, it can animate it into a video, which makes the model versatile.

 

Video-to-Video Editing

Yes, you read that correctly. Sora can edit videos, too. A video of a car driving through a desert can be edited so that the car appears to drive along a road in a lush green rainforest. The model can also extend a video either forward or backward in time.

It can also blend two videos with entirely different subjects and scenes, creating a smooth transition between them.

 

A Work in Progress

Sora can generate videos with dynamic camera motion: when the camera moves, the other elements in the scene move consistently with it. The research team is still working to improve and stabilise this.

Interaction with the world is another area that needs work as the model evolves. For example, a person may take a bite out of a burger in an AI-generated video, but the burger may not show a bite mark afterwards.

The model can also simulate video games: Sora can control the player and render the digital world at the same time.

As a simulator, though, Sora has limitations. It does not accurately model the physics of many basic interactions, such as glass shattering. Other interactions, like eating food, only sometimes produce the correct change in the object's state.

Even so, this is a promising technology with the potential to change video editing, production and generation. It should improve with time, just as people get better at a skill with practice.

 
