OpenAI’s Sora just made another brain-melting music video and we’re starting to see a theme

OpenAI's text-to-video tool has been a busy bee recently, helping to make a short film about a man with a balloon for a head and giving us a glimpse of the future of TED Talks – and now it's rustled up its first official music video for the synth-pop artist Washed Out (below).

This isn't the first music video we've seen from Sora – earlier this month we saw this one for independent musician August Kamp – but it is the first official commissioned example from an established music video director and artist.

That director is Paul Trillo, an artist who's previously made videos for the likes of The Shins and shared this new one on X (formerly Twitter). He said the video, which flies through a tunnel-like collage of high school scenes, was “an idea I had almost 10 years ago and then abandoned”, but that he was “finally able to bring it to life” with Sora.

It isn't clear exactly why Sora was an essential component for executing a fairly simple concept, but it helped make the process much simpler and quicker. Trillo points to one of his earlier music videos, The Great Divide for The Shins, which uses a similar effect but was “entirely 3D animated”.

As for how this new Washed Out video was made, it required less non-Sora help than the Shy Kids' Air Head video, which involved some lengthy post-production to create the necessary camera effects and consistency. For this one, Trillo said he used text-to-video prompts in Sora, then cut the resulting 55 clips together in Premiere Pro with only “very minor touch-ups”.

The result is a video that, like Sora's TED Talks creation (which was also created by Trillo), hints at the tool's strengths and weaknesses. While it does show that digital special effects are going to be democratized for visual projects with tight budgets, it also reveals Sora's issues with coherency across frames (as characters morph and change) and its persistent sense of uncanny valley.

As in the TED Talks video, the workaround for these limitations is the dreamy fly-through technique, which ensures that characters are only on-screen fleetingly and that any weird morphing is a part of the look rather than a jarring mistake. While it works for this video, it could quickly become a trope if it's overused.

A music video tradition

Two people sitting on the top deck of a bus

(Image credit: OpenAI / Washed Out)

Music videos have long been pioneers of new digital technology – the Dire Straits video for Money For Nothing in 1985, for example, gave us an early taste of 3D animation, while Michael Jackson's Black Or White showed off the digital morphing trick that quickly became ubiquitous in the early 90s (see Terminator 2: Judgment Day).

While music videos lack the cultural influence they once did, it looks like they'll again be a playground for AI-powered effects like the ones in this Washed Out creation. That makes sense because Sora, which OpenAI expects to release to the public “later this year”, is still well short of being good enough to be used in full-blown movies.

We can expect to see these kinds of effects everywhere by the end of the year, from adverts to TikTok promos. But like those landmark effects in earlier music videos, they will also likely date pretty quickly and become visual cliches that go out of fashion.

If Sora can develop at the same rate as OpenAI's flagship tool, ChatGPT, it could evolve into something more reliable, flexible, and mainstream – with Adobe recently hinting that the tool could soon be a plug-in for Adobe Premiere Pro. Until then, expect to see a lot more psychedelic Sora videos that look like a mashup of your dreams (or nightmares) from last night.

You might also like…

TechRadar – All the latest technology news

Read More

Turns out the viral ‘Air Head’ Sora video wasn’t purely the work of AI we were led to believe

A new interview with the director behind the viral Sora clip Air Head has revealed that AI played a smaller part in its production than was originally claimed. 

Patrick Cederberg, who did the post-production for the viral video, has confirmed in an interview with Fxguide that OpenAI's text-to-video program was far from the only force involved in its production. The 1-minute and 21-second clip was made with a combination of traditional filmmaking techniques and post-production editing to achieve the look of the final picture.

Air Head was made by ShyKids and tells the short story of a man with a literal balloon for a head. While a human voiceover is used, the way OpenAI was pushing the clip on social channels such as YouTube certainly left the impression that the visuals were purely powered by AI, but that's not entirely true.

As revealed in the behind-the-scenes clip, a ton of work was done by ShyKids who took the raw output from Sora and helped to clean it up into the finished product. This included manually rotoscoping the backgrounds, removing the faces that would occasionally appear on the balloons, and color correcting. 

Then there's the fact that Sora takes a ton of time to actually get things right. Cederberg explains that there were “hundreds of generations at 10 to 20 seconds a piece” which were then tightly edited in what the team described as a “300:1” ratio of what was generated versus what was primed for further touch-ups. 

Such manual work also included editing out the head which would appear and reappear, and even changing the color of the balloon itself which would appear red instead of yellow. While Sora was used to generate the initial imagery with good results, there was clearly a lot more happening behind the scenes to make the finished product look as good as it does, so we're still a long way out from instantly-generated movie-quality productions. 

Sora remains tightly under wraps save for a handful of carefully curated projects that have been allowed to surface, with Air Head among the most popular. The clip has over 120,000 views at the time of writing, with OpenAI touting it as “experimentation” with the program, downplaying the obvious work that went into the final product.

Sora is impressive but we're not convinced

While OpenAI has done a decent job of showcasing what its text-to-video model can do, the lack of transparency is worrying.

Air Head is an impressive clip by a talented team, but it was subject to a ton of editing to get the final product to where it is in the short. 

It's not quite the one-click-and-you're-done approach that many of the tech's boosters have represented it as. It turns out that it is merely a tool which could be used to enhance imagery rather than create it from scratch, which is something that is already common enough in video production, making Sora seem less revolutionary than it first appeared.

OpenAI’s new Sora video is an FPV drone ride through the strangest TED Talk you’ve ever seen – and I need to lie down

OpenAI's new Sora text-to-video generation tool won't be publicly available until later this year, but in the meantime it's serving up some tantalizing glimpses of what it can do – including a mind-bending new video (below) showing what TED Talks might look like in 40 years.

To create the FPV drone-style video, TED Talks worked with OpenAI and the filmmaker Paul Trillo, who's been using Sora since February. The result is an impressive, if slightly bewildering, fly-through of futuristic conference talks, weird laboratories and underwater tunnels.

The video again shows both the incredible potential of OpenAI Sora and its limitations. The FPV drone-style effect has become a popular one for hard-hitting social media videos, but it traditionally requires advanced drone piloting skills and expensive kit that goes way beyond the new DJI Avata 2.

Sora's new video shows that these kinds of effects could be opened up to new creators, potentially at a vastly lower cost – although that comes with the caveat that we don't yet know how much OpenAI's new tool itself will cost and who it'll be available to.

But the video (above) also shows that Sora is still quite far short of being a reliable tool for full-blown movies. The people in the shots are on-screen for only a couple of seconds and there's plenty of uncanny valley nightmare fuel in the background.

The result is an experience that's exhilarating, while also leaving you feeling strangely off-kilter – like touching down again after a sky dive. Still, I'm definitely keen to see more samples as we hurtle towards Sora's public launch later in 2024.

How was the video made?

A video created by OpenAI Sora for TED Talks

(Image credit: OpenAI / TED Talks)

OpenAI and TED Talks didn't go into detail about how this specific video was made, but its creator Paul Trillo recently talked more broadly about his experiences of being one of Sora's alpha testers.

Trillo told Business Insider about the kinds of prompts he uses, including “a cocktail of words that I use to make sure that it feels less like a video game and something more filmic”. Apparently these include prompts like “35 millimeter”, “anamorphic lens”, and “depth of field lens vignette”, which are needed or else Sora will “kind of default to this very digital-looking output”.

Right now, every prompt has to go through OpenAI so it can be run through its strict safeguards around issues like copyright. One of Trillo's most interesting observations is that Sora is currently “like a slot machine where you ask for something, and it jumbles ideas together, and it doesn't have a real physics engine to it”.

This means that it's still a long way off from being truly consistent with people and object states, something that OpenAI admitted in an earlier blog post. OpenAI said that Sora “currently exhibits numerous limitations as a simulator”, including the fact that “it does not accurately model the physics of many basic interactions, like glass shattering”.

These incoherencies will likely limit Sora to being a short-form video tool for some time, but it's still one I can't wait to try out.

Watch this: Adobe shows how AI and OpenAI’s Sora will change Premiere Pro and video editing forever

OpenAI's Sora gave us a glimpse earlier this year of how generative AI is going to change video editing – and now Adobe has shown off how that's going to play out by previewing some fascinating new Premiere Pro tools.

The new AI-powered features, powered by Adobe Firefly, effectively bring the kinds of tricks we've seen from Google's photo-focused Magic Editor – erasing unwanted objects, adding objects and extending scenes – to video. And while it isn't the first piece of software to do that, seeing these tools in an industry standard app that's used by professionals is significant.

For a glimpse of what's coming “this year” to Premiere Pro and other video editing apps, check out the video below. In a new Generative panel, there's a new 'add object' option that lets you type in an object you want to add to the scene. This appears to be for static objects, rather than things like a galloping horse, but it looks handy for b-roll and backgrounds.

Arguably even more helpful is 'object removal', which uses Firefly's AI-based smart masking to help you quickly select an object to remove then make it vanish with a click. Alternatively, you can just combine the two tools to, for example, swap the watch that someone's wearing for a non-branded alternative.

One of the most powerful new AI-powered features in photo editing is extending backgrounds – called Generative Fill in Photoshop – and Premiere Pro will soon have a similar feature for video. Rather than extending the frame's size, Generative Extend will let you add frames to a video to help you, for example, pause on your character's face for a little longer. 

While Adobe hasn't given these tools a firm release date, only revealing that they're coming “later this year”, it certainly looks like they'll change Premiere Pro workflows in several major ways. But the bigger AI video change could be yet to come…

Will Adobe really plug into OpenAI's Sora?

A laptop screen showing AI video editing tools in Adobe Premiere Pro

(Image credit: Adobe)

The biggest Premiere Pro announcement, and also the most nebulous one, was Adobe's preview of third-party models for the editing app. In short, Adobe is planning to let you plug generative AI video tools including OpenAI's Sora, Runway and Pika Labs into Premiere Pro to sprinkle your videos with their effects.

In theory, that sounds great. Adobe showed an example of OpenAI's Sora generating b-roll with a text-to-video prompt, and Pika powering Generative Extend. But these “early examples” of Adobe's “research exploration” with its “friends” from the likes of OpenAI are still clouded in uncertainty.

Firstly, Adobe hasn't committed to launching the third-party plug-ins in the same way as its own Firefly-powered tools. That shows it's really only testing the waters with this part of the Premiere Pro preview. Also, the integration sits a little uneasily with Adobe's current stance on generative AI tools.

A laptop screen showing AI video editing tools in Adobe Premiere Pro

(Image credit: Adobe)

Adobe has sought to set itself apart from the likes of Midjourney and Stable Diffusion by highlighting that Adobe Firefly is only trained on the Adobe Stock image library, which is apparently free of commercial, branded and trademark imagery. “We’re using hundreds of millions of assets, all trained and moderated to have no IP,” Adobe's VP of Generative AI, Alexandru Costin, told us earlier this year.

Yet a new report from Bloomberg claims that Firefly was partially trained on images generated by Midjourney (with Adobe suggesting that could account for 5% of Firefly's training data). And these previews of new alliances with generative video AI models, which are similarly opaque when it comes to their training data, again sit uneasily with Adobe's stance.

Adobe's potential get-out here is Content Credentials, a kind of nutrition label that's also coming to Premiere Pro and will add watermarks to clarify when AI was used in a video and with which model. Whether or not this is enough for Adobe to balance making a commercially-friendly pro video editor with keeping up in the AI race remains to be seen.

OpenAI’s Sora just made its first music video and it’s like a psychedelic trip

OpenAI recently published a music video for the song Worldweight by August Kamp made entirely by their text-to-video engine, Sora. You can check out the whole thing on the company’s official YouTube channel and it’s pretty trippy, to say the least. Worldweight consists of a series of short clips in a wide 8:3 aspect ratio featuring fuzzy shots of various environments. 

You see a cloudy day at the beach, a shrine in the middle of a forest, and what looks like pieces of alien technology. The ambient track coupled with the footage results in a uniquely ethereal experience. It’s half pleasant and half unsettling. 

It’s unknown what text prompts were used on Sora; Kamp didn’t share that information. But she did explain the inspiration behind them in the description. She states that while she was creating the track, she imagined what a video representing Worldweight would look like. However, she lacked a way to share her thoughts. Thanks to Sora, this is no longer an issue as the footage displays what she had always envisioned. It's “how the song has always ‘looked’” from her perspective.

Embracing Sora

If you pay attention throughout the entire runtime, you’ll notice hallucinations. Leaves turn into fish, bushes materialize out of nowhere, and flowers have cameras instead of petals. But because of the music’s ethereal nature, it all fits together. Nothing feels out of place or nightmare-inducing. If anything, the video embraces the nightmares.

We should mention August Kamp isn’t the only person harnessing Sora for content creation. Media production company Shy Kids recently published a short film on YouTube called “Air Head” which was also made on the AI engine. It plays like a movie trailer about a man who has a balloon for a head.

Analysis: Lofty goals

It's hard to say if Sora will see widespread adoption judging by this content. Granted, things are in the early stages, but ready or not, that hasn't stopped OpenAI from pitching its tech to major Hollywood studios. Studio executives are apparently excited at the prospects of AI saving time and money on production. 

August Kamp herself is a proponent of the technology stating, “Being able to build and iterate on cinematic visuals intuitively has opened up categorically new lanes of artistry for me”. She looks forward to seeing “what other forms of storytelling” will appear as artificial intelligence continues to grow.

In our opinion, tools such as Sora will most likely enjoy a niche adoption among independent creators. Both Kamp and Shy Kids appear to understand what the generative AI can and cannot do. They embrace the weirdness, using it to great effect in their storytelling. Sora may be great at bringing strange visuals to life, but whether it can make “normal-looking content” remains to be seen.

People still talk about how weird or nightmare-inducing content made by generative AI is. Unless OpenAI can surmount this hurdle, Sora may not amount to much beyond niche usage.

It’s still unknown when Sora will be made publicly available. OpenAI is holding off on a launch, citing potential interference in global elections as one of its reasons, although there are plans to release the AI by the end of 2024.

If you're looking for other platforms, check out TechRadar's list of the best AI video makers for 2024.

OpenAI just gave artists access to Sora and proved the AI video tool is weirder and more powerful than we thought

A man with a balloon for a head is somehow not the weirdest thing you'll see today thanks to a series of experimental video clips made by seven artists using OpenAI's Sora generative video creation platform.

Unlike OpenAI's ChatGPT AI chatbot and the DALL-E image generation platform, the company's text-to-video tool still isn't publicly available. However, on Monday, OpenAI revealed it had given Sora access to “visual artists, designers, creative directors, and filmmakers” and revealed their efforts in a “first impressions” blog post.

While all of the films, ranging in length from 20 seconds to a minute-and-a-half, are visually stunning, most are what you might describe as abstract. OpenAI's Artist In Residence Alex Reben's 20-second film is an exploration of what could very well be some of his sculptures (or at least concepts for them), and creative director Josephine Miller's video depicts models melded with what looks like translucent stained glass.

Not all the videos are so esoteric.

OpenAI Sora AI-generated video image by Don Allen Stevenson III (Image credit: OpenAI sora / Don Allen Stevenson III)

If we had to give out an award for most entertaining, it might be multimedia production company shy kids' “Air Head”. It's an on-the-nose short film about a man whose head is a hot-air-filled yellow balloon. It might remind you of an AI-twisted version of the classic film, The Red Balloon, although only if you expected the boy to grow up and marry the red balloon and…never mind.

Sora's ability to convincingly merge the fantastical balloon head with what looks like a human body and a realistic environment is stunning. As shy kids' Walter Woodman noted, “As great as Sora is at generating things that appear real, what excites us is its ability to make things that are totally surreal.” And yes, it's a funny and extremely surreal little movie.

But wait, it gets stranger.

The other video that will have you waking up in the middle of the night is digital artist Don Allen Stevenson III's “Beyond Our Reality,” which is like a twisted National Geographic nature film depicting never-before-seen animal mergings like the Girafflamingo, flying pigs, and the Eel Cat. Each one looks as if a mad scientist grabbed disparate animals, carved them up, and then perfectly melded them to create these new chimeras.

OpenAI and the artists never detail the prompts used to generate the videos, nor the effort it took to get from the idea to the final video. Did they all simply type in a paragraph describing the scene, style, and level of reality and hit enter, or was this an iterative process that somehow got them to the point where the man's balloon head somehow perfectly met his shoulders or the Bunny Armadillo transformed from grotesque to the final, cute product?

That OpenAI has invited creatives to take Sora for a test run is not surprising. It's their livelihoods in art, film, and animation that are most at risk from Sora's already impressive capabilities. Most seem convinced it's a tool that can help them more quickly develop finished commercial products.

“The ability to rapidly conceptualize at such a high level of quality is not only challenging my creative process but also helping me evolve in storytelling. It's enabling me to translate my imagination with fewer technical constraints,” said Josephine Miller in the blog post.

Go watch the clips but don't blame us if you wake up in the middle of the night screaming.

OpenAI’s Sora will one day add audio, editing, and may allow nudity in content

OpenAI’s Chief Technology Officer Mira Murati recently sat down with The Wall Street Journal to reveal interesting details about their upcoming text-to-video generator Sora.

The interview covers a wide array of topics from the type of content the AI engine will produce to the security measures being put into place. Combating misinformation is a sticking point for the company. Murati states Sora will have multiple safety guardrails to ensure the technology isn’t misused. She says the team wouldn’t feel comfortable releasing something that “might affect global elections”. According to the article, Sora will follow the same prompt policies as Dall-E meaning it’ll refuse to create “images of public figures” such as the President of the United States. 

Watermarks are going to be added too. A transparent OpenAI logo can be found in the lower right-hand corner indicating that it's AI footage. Murati adds that they may also adopt content provenance as another indicator. This uses metadata to give information on the origins of digital media. That's all well and good, but it may not be enough. Last year, a group of researchers managed to break “current image watermarking protections”, including those belonging to OpenAI. Hopefully, they come up with something tougher.

Generative features

Things get interesting when they begin to talk about Sora's future. First off, the developers have plans to “eventually” add sound to videos to make them more realistic. Editing tools are on the itinerary as well, giving online creators a way to fix the AI’s many mistakes. 

As advanced as Sora is, it makes a lot of errors. One of the prominent examples in the piece revolves around a video prompt asking the engine to generate a video where a robot steals a woman’s camera. Instead, the clip shows the woman partially becoming a robot. Murati admits there is room for improvement stating the AI is “quite good at continuity, [but] it’s not perfect”.

Nudity is not off the table. Murati says OpenAI is working with “artists… to figure out” what kind of nude content will be allowed.  It seems the team would be okay with allowing “artistic” nudity while banning things like non-consensual deep fakes. Naturally, OpenAI would like to avoid being the center of a potential controversy although they want their product to be seen as a platform fostering creativity. 

Ongoing tests

When asked about the data used to train Sora, Murati was a little evasive. 

She started off by claiming she didn’t know what was used to teach the AI other than it was either “publicly available or licensed data”. What’s more, Murati wasn’t sure if videos from YouTube, Facebook, or Instagram were a part of the training. However, she later admitted that media from Shutterstock was indeed used. The two companies, if you’re not aware, have a partnership which could explain why Murati was willing to confirm it as a source.

Murati states Sora will “definitely” launch by the end of the year. She didn’t give an exact date although it could happen within the coming months. For now, the developers are safety testing the engine looking for any “vulnerabilities, biases, and other harmful results”.

If you're thinking of one day trying out Sora, we suggest learning how to use editing software. Remember, it makes many errors and might continue to do so at launch. For recommendations, check out TechRadar's best video editing software for 2024.

OpenAI’s impressive new Sora videos show it has serious sci-fi potential

OpenAI's Sora, its equivalent of image creation but for videos, made huge shockwaves in the swiftly advancing world of AI last month, and we’ve just caught a few new videos which are even more jaw-slackening than what we have already been treated to.

In case you somehow missed it, Sora is a text-to-video AI meaning you can write a simple request and it’ll compose a video (just as image generation previously worked, but obviously a much more complex endeavor).

An eye with the iris being a globe

(Image credit: OpenAI)

Now OpenAI’s Sora research lead Tim Brooks has released some new content generated by Sora on X (formerly Twitter). 

This is Sora’s crack at fulfilling the following request: “Fly through tour of a museum with many paintings and sculptures and beautiful works of art in all styles.”

Pretty impressive to say the least. On top of that, Bill Peebles, also a Sora research lead, showed us a clip generated from the following prompt: “An alien blending in naturally with new york city, paranoia thriller style, 35mm film.”

An alien character walking through a street

(Image credit: OpenAI)

Content creator Blaine Brown then stepped in to embellish the above clip, cutting it to repeat the footage and make it longer, while having the alien rapping, complete with lip-syncing. The music is generated by Suno AI by the way (with the lyrics written by Brown, mind), and lip-syncing is done with Pika Labs AI.

Analysis: Still early days for Sora

Two people having dinner

(Image credit: OpenAI)

It’s worth underlining how fast things seem to be progressing with the capabilities of AI. Image creation powers were one thing – and extremely impressive in themselves – but this is entirely another. Especially when you remember that Sora is still just in testing at OpenAI, with a limited set of ‘red teamers’ (testers hunting out bugs and smoothing over those wrinkles).

The camera work in the museum fly-through flows realistically and feels nicely imaginative in the way it swoops around (albeit with the occasional judder). And the last tweet shows how you can take a base clip and flesh it out with content including AI-generated music.

Of course, AI can write a script as well, and so it raises the question: how long will it be before a blue alien is starring in an AI-generated post-apocalyptic drama? Or an (unintentional) comedy, perhaps?

You get the idea, and we’re getting carried away, of course, but still – what AI could be capable of in just a few years is potentially mind-blowing, frankly.

Naturally, we’ll be seeing the cream of the crop of what Sora is capable of in these teasers, and there have been some buggy and weird efforts aired too. (Just as when ChatGPT and other AI chatbots first rolled onto the scene, we saw AI hallucinations and general unhinged behavior and replies).

Perhaps the broader worry with Sora, though, is how this might eventually displace, rather than assist, content creators. But that’s a fear to chew over on another day – not forgetting the potential for misuse with AI-created videos which we recently discussed in more depth here.

What is OpenAI’s Sora? The text-to-video tool explained and when you might be able to use it

ChatGPT maker OpenAI has now unveiled Sora, its artificial intelligence engine for converting text prompts into video. Think Dall-E (also developed by OpenAI), but for movies rather than static images.

It's still very early days for Sora, but the AI model is already generating a lot of buzz on social media, with multiple clips doing the rounds – clips that look as if they've been put together by a team of actors and filmmakers.

Here we'll explain everything you need to know about OpenAI Sora: what it's capable of, how it works, and when you might be able to use it yourself. The era of AI text-prompt filmmaking has now arrived.

OpenAI Sora release date and price

In February 2024, OpenAI Sora was made available to “red teamers” – that's people whose job it is to test the security and stability of a product. OpenAI has also now invited a select number of visual artists, designers, and movie makers to test out the video generation capabilities and provide feedback.

“We're sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon,” says OpenAI.

In other words, the rest of us can't use it yet. For the time being there's no indication as to when Sora might become available to the wider public, or how much we'll have to pay to access it. 

Two dogs on a mountain podcasting

(Image credit: OpenAI)

We can make some rough guesses about timescale based on what happened with ChatGPT. That AI chatbot was released to the public in November 2022, following a predecessor called InstructGPT earlier that year. Also, OpenAI's DevDay typically takes place annually in November.

It's certainly possible, then, that Sora could follow a similar pattern and launch to the public at a similar time in 2024. But this is currently just speculation and we'll update this page as soon as we get any clearer indication about a Sora release date.

As for price, we similarly don't have any hints of how much Sora might cost. As a guide, ChatGPT Plus – which offers access to the newest Large Language Models (LLMs) and Dall-E – currently costs $20 (about £16 / AU$30) per month.

But Sora also demands significantly more compute power than, for example, generating a single image with Dall-E, and the process also takes longer. So it still isn't clear exactly how well Sora, which is effectively a research paper, might convert into an affordable consumer product.

What is OpenAI Sora?

You may well be familiar with generative AI models – such as Google Gemini for text and Dall-E for images – which can produce new content based on vast amounts of training data. If you ask ChatGPT to write you a poem, for example, what you get back will be based on lots and lots of poems that the AI has already absorbed and analyzed.

OpenAI Sora is a similar idea, but for video clips. You give it a text prompt, like “woman walking down a city street at night” or “car driving through a forest” and you get back a video. As with AI image models, you can get very specific when it comes to saying what should be included in the clip and the style of the footage you want to see.

To get a better idea of how this works, check out some of the example videos posted by OpenAI CEO Sam Altman – not long after Sora was unveiled to the world, Altman responded to prompts put forward on social media, returning videos based on text like “a wizard wearing a pointed hat and a blue robe with white stars casting a spell that shoots lightning from his hand and holding an old tome in his other hand”.

How does OpenAI Sora work?

On a simplified level, the technology behind Sora is the same technology that lets you search for pictures of a dog or a cat on the web. Show an AI enough photos of a dog or cat, and it'll be able to spot the same patterns in new images; in the same way, if you train an AI on a million videos of a sunset or a waterfall, it'll be able to generate its own.

Of course there's a lot of complexity underneath that, and OpenAI has provided a deep dive into how its AI model works. It's trained on “internet-scale data” to know what realistic videos look like, first analyzing the clips to know what it's looking at, then learning how to produce its own versions when asked.

So, ask Sora to produce a clip of a fish tank, and it'll come back with an approximation based on all the fish tank videos it's seen. It makes use of what are known as visual patches, smaller building blocks that help the AI to understand what should go where and how different elements of a video should interact and progress, frame by frame.
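
To make the idea of patches a little more concrete, here's a toy sketch in Python. OpenAI hasn't published Sora's code, so the function name `split_into_patches`, the patch sizes and the tiny made-up "video" are all our own assumptions; it only illustrates the general idea of chopping a clip into small blocks that span both space and time.

```python
def split_into_patches(video, patch_t, patch_h, patch_w):
    """Cut a video (frames x rows x columns of pixel values) into small blocks."""
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % patch_t or height % patch_h or width % patch_w:
        raise ValueError("video dimensions must be divisible by the patch size")
    patches = []
    for t in range(0, frames, patch_t):
        for y in range(0, height, patch_h):
            for x in range(0, width, patch_w):
                # Flatten one block of pixels that spans a few frames
                # and a small square of the picture into a single list
                patch = [
                    video[t + dt][y + dy][x + dx]
                    for dt in range(patch_t)
                    for dy in range(patch_h)
                    for dx in range(patch_w)
                ]
                patches.append(patch)
    return patches


# A hypothetical 4-frame, 4x4-pixel "video" whose pixel values are running numbers
toy_video = [
    [[f * 16 + r * 4 + c for c in range(4)] for r in range(4)]
    for f in range(4)
]
patches = split_into_patches(toy_video, patch_t=2, patch_h=2, patch_w=2)
print(len(patches), "patches of", len(patches[0]), "values each")
```

Each patch here covers two frames and a 2x2 square of pixels, so it carries a little information about how that corner of the picture changes over time. That is the rough intuition behind treating patches as building blocks.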

OpenAI Sora

Sora starts messier, then gets tidier (Image credit: OpenAI)

Sora is based on a diffusion model, where the AI starts with a 'noisy' response and then works towards a 'clean' output through a series of feedback loops and prediction calculations. You can see this in the frames above, where a video of a dog playing in the snow turns from nonsensical blobs into something that actually looks realistic.
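
The noisy-to-clean process can be sketched in a few lines of Python. This is a heavily simplified illustration and not how Sora itself is implemented: in a real diffusion model, a trained neural network predicts the noise to remove, whereas our stand-in simply cheats by peeking at the answer. The names `toy_denoise` and `total_error`, the step count and the four-value "image" are all assumptions made for the example.

```python
import random


def total_error(sample, target):
    """How far the current sample is from the clean target."""
    return sum(abs(s - t) for s, t in zip(sample, target))


def toy_denoise(target, steps=10, seed=0):
    rng = random.Random(seed)
    # Start from pure random noise, one value per "pixel"
    sample = [rng.gauss(0.0, 1.0) for _ in target]
    history = [sample]
    for step in range(steps):
        # Stand-in for the trained network's guess at the remaining noise.
        # A real model learns this from data; here we cheat and use the target.
        predicted_noise = [s - t for s, t in zip(sample, target)]
        # Remove a growing share of that noise; the final step removes all of it
        fraction = 1.0 / (steps - step)
        sample = [s - fraction * n for s, n in zip(sample, predicted_noise)]
        history.append(sample)
    return history


clean = [0.2, 0.8, 0.5, 0.1]  # hypothetical "clean" pixel values
history = toy_denoise(clean)
errors = [total_error(sample, clean) for sample in history]
print([round(e, 3) for e in errors])
```

The printed list shrinks step by step, which mirrors the blobs-to-dog progression in the frames above: each pass strips away a bit more noise until only the clean result is left.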

And like other generative AI models, Sora uses transformer technology (the last T in ChatGPT stands for Transformer). Transformers use a variety of sophisticated data analysis techniques to process heaps of data – they can understand the most important and least important parts of what's being analyzed, and figure out the surrounding context and relationships between these data chunks.
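
The part of a transformer that weighs up what's important is called attention, and its core calculation (known as scaled dot-product attention) is compact enough to sketch in plain Python. This is a generic textbook version with made-up numbers, not Sora's actual code, and the function names `softmax` and `attention` are our own.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    peak = max(scores)  # subtracting the peak keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over small lists of vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score each key by how closely it matches the query
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the values, giving more say to the better-matching keys
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical data: one query and three chunks of data to compare it against
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
outputs, weights = attention(queries, keys, values)
print(outputs, weights)
```

The weights are the "most important and least important" judgement in numerical form: the chunk whose key best matches the query gets the biggest weight, and so contributes most to the result.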

What we don't fully know is where OpenAI sourced its training data – it hasn't said which video libraries have been used to power Sora, though we do know it has partnerships with content databases such as Shutterstock. In some cases, you can see the similarities between the training data and the output Sora is producing.

What can you do with OpenAI Sora?

At the moment, Sora is capable of producing HD videos of up to a minute, without any sound attached, from text prompts. If you want to see some examples of what's possible, we've put together a list of 11 mind-blowing Sora shorts for you to take a look at – including fluffy Pixar-style animated characters and astronauts with knitted helmets.

“Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt,” says OpenAI, but that's not all. It can also generate videos from still images, fill in missing frames in existing videos, and seamlessly stitch multiple videos together. It can create static images too, or produce endless loops from clips provided to it.

It can even produce simulations of video games such as Minecraft, again based on vast amounts of training data that teach it what a game like Minecraft should look like. We've already seen a demo where Sora is able to control a player in a Minecraft-style environment, while also accurately rendering the surrounding details.

OpenAI does acknowledge some of the limitations of Sora at the moment. The physics don't always make sense, with people disappearing or transforming or blending into other objects. Sora isn't mapping out a scene with individual actors and props; it's making an incredible number of calculations about where pixels should go from frame to frame.

In Sora videos people might move in ways that defy the laws of physics, or details – such as a bite being taken out of a cookie – might not be remembered from one frame to the next. OpenAI is aware of these issues and is working to fix them, and you can check out some of the examples on the OpenAI Sora website to see what we mean.

Despite those bugs, further down the line OpenAI is hoping that Sora could evolve to become a realistic simulator of physical and digital worlds. In the years to come, the Sora tech could be used to generate imaginary virtual worlds for us to explore, or enable us to fully explore real places that are replicated in AI.

How can you use OpenAI Sora?

At the moment, you can't get into Sora without an invite: it seems as though OpenAI is picking out individual creators and testers to help get its video-generating AI model ready for a full public release. How long this preview period is going to last, whether it's months or years, remains to be seen – but OpenAI has previously shown a willingness to move as fast as possible when it comes to its AI projects.

Based on the existing technologies that OpenAI has made public – Dall-E and ChatGPT – it seems likely that Sora will initially be available as a web app. Since its launch ChatGPT has got smarter and added new features, including custom bots, and it's likely that Sora will follow the same path when it launches in full.

Before that happens, OpenAI says it wants to put some safety guardrails in place: you're not going to be able to generate videos showing extreme violence, sexual content, hateful imagery, or celebrity likenesses. There are also plans to combat misinformation by including metadata in Sora videos that indicates they were generated by AI.

You might also like

TechRadar – All the latest technology news

Read More

OpenAI’s new Sora text-to-video model can make shockingly realistic content

OpenAI has broken new ground by revealing its first text-to-video model, Sora, which is capable of creating shockingly realistic content.

We’ve been wondering when the company was finally going to release its own video engine, as so many of its rivals, from Stability AI to Google, have beaten it to the punch. Perhaps OpenAI wanted to get things just right before a proper launch. At this rate, the quality of its outputs could eclipse its contemporaries. According to the official page, Sora can generate “realistic and imaginative scenes” from a single text prompt, much like other text-to-video AI models. The difference with this engine is the technology behind it.

Lifelike content

OpenAI claims its artificial intelligence can understand how people and objects “exist in the physical world”. This gives Sora the ability to create scenes featuring multiple people, varying types of movement, facial expressions, textures, and objects with a high amount of detail. Generated videos lack the plastic look or the nightmarish forms seen in other AI content – for the most part, but more on that later.

Sora is also multimodal. Users will reportedly be able to upload a still image to serve as the basis of a video. The content inside the picture will become animated with a lot of attention paid to the small details. It can even take a pre-existing video “and extend it or fill in missing frames.”

You can find sample clips on OpenAI’s website and on X (the platform formerly known as Twitter). One of our favorites features a group of puppies playing in the snow. If you look closely, you can see their fur and the snow on their snouts have a strikingly lifelike quality to them. Another great clip shows a Victoria crowned pigeon bobbing around like an actual bird.

A work in progress

As impressive as these two videos may be, Sora is not perfect. OpenAI admits its “model has weaknesses.” It can struggle to simulate the physics of an object, confuse left and right, and misunderstand “instances of cause and effect.” You can have an AI character bite into a cookie, but the cookie lacks a bite mark.

It makes a lot of weird errors too. One of the funnier mishaps involves a group of archeologists unearthing a large piece of paper which then transforms into a chair before ending up as a crumpled piece of plastic. The AI also seems to have trouble with words. “Otter” is misspelled as “Oter” and “Land Rover” is now “Danover”.

Moving forward, the company will be working with its “red teamers”, a group of industry experts, “to assess critical areas for harms or risks.” It wants to make sure Sora doesn’t generate false information or hateful content, or show any bias. Additionally, OpenAI is going to implement a text classifier to reject prompts that violate its policy. These include inputs requesting sexual content, violent videos, and celebrity likenesses, among other things.

No word on when Sora will officially launch. We reached out for info on the release. This story will be updated at a later time. In the meantime, check out TechRadar's list of the best AI video editors for 2024.
