What is OpenAI’s Sora? The text-to-video tool explained and when you might be able to use it

ChatGPT maker OpenAI has now unveiled Sora, its artificial intelligence engine for converting text prompts into video. Think Dall-E (also developed by OpenAI), but for movies rather than static images.

It's still very early days for Sora, but the AI model is already generating a lot of buzz on social media, with multiple clips doing the rounds – clips that look as if they've been put together by a team of actors and filmmakers.

Here we'll explain everything you need to know about OpenAI Sora: what it's capable of, how it works, and when you might be able to use it yourself. The era of AI text-prompt filmmaking has now arrived.

OpenAI Sora release date and price

In February 2024, OpenAI Sora was made available to “red teamers” – that's people whose job it is to test the security and stability of a product. OpenAI has also now invited a select number of visual artists, designers, and movie makers to test out the video generation capabilities and provide feedback.

“We're sharing our research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon,” says OpenAI.

In other words, the rest of us can't use it yet. For the time being there's no indication as to when Sora might become available to the wider public, or how much we'll have to pay to access it. 

Two dogs on a mountain podcasting

(Image credit: OpenAI)

We can make some rough guesses about the timescale based on what happened with ChatGPT. That AI chatbot was released to the public in November 2022, preceded earlier in the year by InstructGPT. OpenAI's DevDay also typically takes place in November.

It's certainly possible, then, that Sora could follow a similar pattern and launch to the public at a similar time in 2024. But this is currently just speculation and we'll update this page as soon as we get any clearer indication about a Sora release date.

As for price, we similarly don't have any hints of how much Sora might cost. As a guide, ChatGPT Plus – which offers access to the newest large language models (LLMs) and Dall-E – currently costs $20 (about £16 / AU$30) per month.

But generating a video with Sora demands significantly more compute power than generating a single image with Dall-E, and the process also takes longer. So it still isn't clear exactly how well Sora, which is effectively still a research preview, might convert into an affordable consumer product.

What is OpenAI Sora?

You may well be familiar with generative AI models – such as Google Gemini for text and Dall-E for images – which can produce new content based on vast amounts of training data. If you ask ChatGPT to write you a poem, for example, what you get back will be based on lots and lots of poems that the AI has already absorbed and analyzed.

OpenAI Sora is a similar idea, but for video clips. You give it a text prompt, like “woman walking down a city street at night” or “car driving through a forest” and you get back a video. As with AI image models, you can get very specific when it comes to saying what should be included in the clip and the style of the footage you want to see.


To get a better idea of how this works, check out some of the example videos posted by OpenAI CEO Sam Altman – not long after Sora was unveiled to the world, Altman responded to prompts put forward on social media, returning videos based on text like “a wizard wearing a pointed hat and a blue robe with white stars casting a spell that shoots lightning from his hand and holding an old tome in his other hand”.

How does OpenAI Sora work?

On a simplified level, the technology behind Sora is the same technology that lets you search for pictures of a dog or a cat on the web. Show an AI enough photos of a dog or cat, and it'll be able to spot the same patterns in new images; in the same way, if you train an AI on a million videos of a sunset or a waterfall, it'll be able to generate its own.

Of course there's a lot of complexity underneath that, and OpenAI has provided a deep dive into how its AI model works. It's trained on “internet-scale data” to know what realistic videos look like, first analyzing the clips to know what it's looking at, then learning how to produce its own versions when asked.

So, ask Sora to produce a clip of a fish tank, and it'll come back with an approximation based on all the fish tank videos it's seen. It makes use of what are known as visual patches, smaller building blocks that help the AI to understand what should go where and how different elements of a video should interact and progress, frame by frame.
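If you're curious what those building blocks look like in practice, here's a rough, illustrative Python sketch – not OpenAI's actual code, and the clip size and patch dimensions are just assumptions – showing how a short clip might be chopped into flattened spacetime patches:

```python
import numpy as np

# A toy "video": 16 frames of 64x64 RGB pixels (values 0-1).
video = np.random.rand(16, 64, 64, 3)

def to_patches(frames, patch=16, t_patch=4):
    """Split a (T, H, W, C) clip into flat spacetime patches.

    Each patch covers a small square of pixels across a few
    consecutive frames - the 'building blocks' described above.
    """
    T, H, W, C = frames.shape
    patches = []
    for t in range(0, T, t_patch):
        for y in range(0, H, patch):
            for x in range(0, W, patch):
                block = frames[t:t + t_patch, y:y + patch, x:x + patch, :]
                patches.append(block.reshape(-1))  # flatten to one vector
    return np.stack(patches)

tokens = to_patches(video)
print(tokens.shape)  # (64, 3072): 64 patches, each a 3072-number vector
```

Each of those flattened patches can then be treated much like a word in a sentence, which is what lets the rest of the system reason about how the scene should evolve.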

OpenAI Sora

Sora starts messier, then gets tidier (Image credit: OpenAI)

Sora is based on a diffusion model, where the AI starts with a 'noisy' response and then works towards a 'clean' output through a series of feedback loops and prediction calculations. You can see this in the frames above, where a video of a dog playing in the snow turns from nonsensical blobs into something that actually looks realistic.
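To picture that noisy-to-clean loop, here's a heavily simplified Python sketch. The denoise_step function is a stand-in for the trained neural network (which would also be conditioned on your text prompt), so treat this as an illustration of the idea rather than OpenAI's implementation:

```python
import numpy as np

def denoise_step(frames, step, total_steps):
    """Stand-in for the trained model: predicts a slightly cleaner
    version of the current frames. A real model would be a large
    neural network guided by the text prompt."""
    target = np.zeros_like(frames)          # pretend the 'true' video is all zeros
    blend = 1.0 / (total_steps - step)      # trust the prediction more each step
    return frames * (1 - blend) + target * blend

# Start from pure noise and refine it over many small steps.
frames = np.random.randn(16, 64, 64, 3)     # noisy 16-frame clip
steps = 50
for step in range(steps):
    frames = denoise_step(frames, step, steps)

print(round(float(np.abs(frames).mean()), 4))  # close to 0: the 'clean' target
```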

And like other generative AI models, Sora uses transformer technology (the last T in ChatGPT stands for Transformer). Transformers use a variety of sophisticated data analysis techniques to process heaps of data – they can understand the most important and least important parts of what's being analyzed, and figure out the surrounding context and relationships between these data chunks.
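And for a flavor of how a transformer weighs up those relationships, here's a tiny, self-contained sketch of the attention calculation at its core – illustrative only, since real models use learned projection weights and many stacked layers:

```python
import numpy as np

def attention(x):
    """Scaled dot-product self-attention over a sequence of vectors.

    Each item in the sequence 'looks at' every other item and takes a
    weighted average, with the weights showing which parts matter most.
    """
    scores = x @ x.T / np.sqrt(x.shape[-1])          # how relevant is each item to each other?
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
    return weights @ x                               # context-aware mixture of the inputs

sequence = np.random.rand(5, 8)   # 5 'patches', each an 8-number vector
print(attention(sequence).shape)  # (5, 8): same shape, now context-aware
```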

What we don't fully know is where OpenAI sourced its training data – it hasn't said which video libraries have been used to power Sora, though we do know it has partnerships with content databases such as Shutterstock. In some cases, you can see the similarities between the training data and the output Sora is producing.

What can you do with OpenAI Sora?

At the moment, Sora is capable of producing HD videos of up to a minute, without any sound attached, from text prompts. If you want to see some examples of what's possible, we've put together a list of 11 mind-blowing Sora shorts for you to take a look at – including fluffy Pixar-style animated characters and astronauts with knitted helmets.

“Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt,” says OpenAI, but that's not all. It can also generate videos from still images, fill in missing frames in existing videos, and seamlessly stitch multiple videos together. It can create static images too, or produce endless loops from clips provided to it.

It can even produce simulations of video games such as Minecraft, again based on vast amounts of training data that teach it what a game like Minecraft should look like. We've already seen a demo where Sora is able to control a player in a Minecraft-style environment, while also accurately rendering the surrounding details.

OpenAI does acknowledge some of the limitations of Sora at the moment. The physics don't always make sense, with people disappearing or transforming or blending into other objects. Sora isn't mapping out a scene with individual actors and props; it's making an incredible number of calculations about where pixels should go from frame to frame.

In Sora videos people might move in ways that defy the laws of physics, or details – such as a bite being taken out of a cookie – might not be remembered from one frame to the next. OpenAI is aware of these issues and is working to fix them, and you can check out some of the examples on the OpenAI Sora website to see what we mean.

Despite those bugs, further down the line OpenAI is hoping that Sora could evolve to become a realistic simulator of physical and digital worlds. In the years to come, the Sora tech could be used to generate imaginary virtual worlds for us to explore, or enable us to fully explore real places that are replicated in AI.

How can you use OpenAI Sora?

At the moment, you can't get into Sora without an invite: it seems as though OpenAI is picking out individual creators and testers to help get its video-generating AI model ready for a full public release. How long this preview period is going to last, whether it's months or years, remains to be seen – but OpenAI has previously shown a willingness to move as fast as possible when it comes to its AI projects.

Based on the existing technologies that OpenAI has made public – Dall-E and ChatGPT – it seems likely that Sora will initially be available as a web app. Since its launch ChatGPT has got smarter and added new features, including custom bots, and it's likely that Sora will follow the same path when it launches in full.

Before that happens, OpenAI says it wants to put some safety guardrails in place: you're not going to be able to generate videos showing extreme violence, sexual content, hateful imagery, or celebrity likenesses. There are also plans to combat misinformation by including metadata in Sora videos that indicates they were generated by AI.


Apple could be working on a new AI tool that animates your images based on text prompts

Apple may be working on a new artificial intelligence tool that will let you create basic animations from your photos using a simple text prompt. If the tool comes to fruition, you’ll be able to turn any static image into a brief animation just by typing in what you want it to look like. 

According to 9to5Mac, Apple researchers have published a paper that details procedures for manipulating image graphics using text commands. The tool, Apple Keyframer, will use natural language text to tell the proposed AI system to manipulate the given image and animate it. 

Say you have a photo of the view from your window, with trees in the background and even cars driving past. From what the paper suggests, you’ll be able to type commands such as ‘make the leaves move as if windy’ into the Keyframer tool, which will then animate the specified part of your photo.

You may recognize the name ‘keyframe’ if you’re an Apple user, as it’s already part of Apple’s Live Photos feature – which lets you go through a ‘live photo’ GIF and select which frame, the keyframe, you want to be the actual still image for the photo. 

Better late than never? 

Apple has been notably slow to jump onto the AI bandwagon, but that's not exactly surprising. The company is known to play the long game and let others iron out the kinks before it makes its move, as we've seen with its recent foray into mixed reality with the Apple Vision Pro (this is also why I have hope for a foldable iPhone coming soon).

I'm quite excited for the Keyframer tool if it does come to fruition, because it'll put basic animation tools into the hands of every iPhone user – even those who wouldn't know where to start with animation, let alone with making their photos move.

Overall, the direction Apple seems to be taking with AI tools is a positive one. The Keyframer tool comes right off the back of Apple's AI-powered image editing tool, which again reinforces a focus on improving the user experience rather than just putting out things that mirror the competition from companies like OpenAI, Microsoft, and Google.

I'm personally glad to see that Apple's dive into the world of artificial intelligence tools isn't just another AI chatbot like ChatGPT or Google Gemini, but is instead focused on tools that offer unique new features for iOS and macOS products. While this project is still in its very early stages, I'm pretty hyped about the idea of making funny little clips of my cat being silly, or creating moving memories of my friends, with just a few word prompts.

As for when we'll get our hands on Keyframer, unfortunately there's no release date in sight just yet – but based on previous feature launches, Apple willingly revealing details at this stage suggests that it's probably not too far off and, more importantly, isn't likely to get tossed aside. After all, Apple isn't Google.


Apple working on a new AI-powered editing tool and you can try out the demo now

Apple says it plans on introducing generative AI features to iPhones later this year. It's unknown what they are; however, a recently published research paper indicates one of them may be a new type of editing software that can alter images via text prompts.

It's called MGIE, or MLLM-Guided (multimodal large language model) Image Editing. The tech is the result of a collaboration between Apple and researchers from the University of California, Santa Barbara. The paper states MGIE is capable of “Photoshop-style [modifications]” ranging from simple tweaks like cropping to more complex edits such as removing objects from a picture. This is made possible by the MLLM, a type of AI capable of processing both “text and images” at the same time.

In its report, VentureBeat explains that MLLMs show “remarkable capabilities in cross-modal understanding”, although they have yet to be widely implemented in image editing software despite their supposed efficacy.

Public demonstration

The way MGIE works is pretty straightforward. You upload an image to the AI engine and give it clear, concise instructions on the changes you want it to make. VentureBeat says people will need to “provide explicit guidance”. As an example, you can upload a picture of a bright, sunny day and tell MGIE to “make the sky more blue.” It’ll proceed to saturate the color of the sky a bit, but it may not be as vivid as you would like. You’ll have to guide it further to get the results you want. 

MGIE is currently available on GitHub as an open-source project. The researchers are offering “code, data, [pre-trained models]”, as well as a notebook teaching people how to use the AI for editing tasks. There’s also a web demo available to the public on the collaborative tech platform Hugging Face. With access to this demo, we decided to take Apple’s AI out for a spin.
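If you'd rather script the demo than click around the web page, Gradio-hosted demos on Hugging Face can usually be driven from Python with the gradio_client package. The sketch below is hypothetical: the space name, argument order, and api_name are assumptions rather than MGIE's documented interface, so check the demo's own API details before relying on it:

```python
# pip install gradio_client
from gradio_client import Client, handle_file

# Hypothetical space name - point this at the actual MGIE demo on Hugging Face.
client = Client("apple/mgie")

# Most Gradio image demos take an image plus a text instruction; the argument
# order and api_name below are assumptions, not MGIE's documented interface.
result = client.predict(
    handle_file("cat.jpg"),        # the photo you want to edit
    "make the sky more blue",      # the plain-language instruction
    api_name="/predict",
)
print(result)  # usually a path to the generated image
```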

Cat picture on MGIE: the original, with a new background, and with a lightning background

(Image credit: Cédric VT/Unsplash/Apple)

In our test, we uploaded a picture of a cat that we got from Unsplash and then instructed MGIE to make several changes – and it did okay. In one instance, we told it to change the background from blue to red; instead, MGIE made the background a darker shade of blue with static-like texturing. In another, we prompted the engine to add a purple background with lightning strikes, and it created something much more dynamic.

Inclusion in future iPhones

At the time of this writing, you may experience long queue times while attempting to generate content. If it doesn’t work, the Hugging Face page has a link to the same AI hosted over on Gradio which is the one we used. There doesn't appear to be any difference between the two.

Now the question is: will this technology come to a future iPhone or iOS 18? Maybe. As alluded to at the beginning, Apple CEO Tim Cook told investors AI tools are coming to its devices later in the year, but didn't give any specifics. Personally, we can see MGIE morphing into the iPhone version of Google's Magic Editor, a feature that can completely alter the contents of a picture. If you read the research paper on arXiv, that certainly seems to be the path Apple is taking with its AI.

MGIE is still a work in progress and outputs are not perfect – one of the sample images shows a kitten turning into a monstrosity – but we do expect the bugs to be worked out down the line. If you prefer a more hands-on approach, check out TechRadar's guide to the best photo editors for 2024.


Windows 11’s Snipping Tool could get new powers for taking screenshots – but is Microsoft in danger of overcomplicating things?

Windows 11’s Snipping Tool is set to get a handy feature to embellish screenshots, or at least it seems that way.

Leaker PhantomOfEarth discovered the new abilities by tinkering with bits and pieces in version 11.2312.33.0 of Snipping Tool. As shown in the leaker's post on X, the functionality allows the user to draw shapes (and fill them with color) and lines.


That means you can highlight parts of screenshots by pointing with arrows – for an instructional step-by-step tutorial you’ve made with screen grabs, for example – or add different shapes as needed.

Note that this feature isn't in testing yet – as noted, the leaker needed to play with the app's configuration to get it going. However, the hidden functionality does seem to be working fine, more or less, so it's likely that a rollout to Windows 11 testers isn't far off.


Analysis: A feature drive with core apps

While you could furnish your screenshots from Snipping Tool with these kinds of extras simply by opening the image in Paint, it’s handy to have this feature on tap to directly work on a grab without needing to go to a second app.

Building out some of the basic Windows 11 apps is very much becoming a theme for Microsoft of late. For example, Snipping Tool has recently been testing a 'combined capture bar' (for easily switching between capturing screenshots or video clips), and the ability to lift text straight from screenshots, which is really nifty in some scenarios.

Elsewhere, core apps like Paint and Notepad are getting an infusion of AI (with Cocreator and a rumored Cowriter addition), and there’s been a lot of work in other respects with Notepad such as adding tabs.

We think these initiatives are a good line of attack for Microsoft, although there are always folks who believe that simple apps like Snipping Tool or Notepad should be kept basic, and advanced functionality is in danger of cluttering up these streamlined utilities. We get where that sentiment comes from, but we don’t think Microsoft is pushing those boundaries yet.

Via Windows Central


Google's Nearby Share tool appears to adopt the name of Samsung's similar utility, and we wonder what's going on

Google has suddenly changed the name of its file-sharing tool from Nearby Share to Quick Share, which is what Samsung calls its own tool.

It's a random move that has people scratching their heads, wondering what it could mean for Android in the future. The update appears to have been discovered by industry insider Kamila Wojciechowska, who shared her findings on X (the platform formerly known as Twitter). Wojciechowska revealed that she received a notification on her phone informing her of the change after installing Google Mobile Services version 23.50.13.

In addition to the new name, Google has altered the feature's logo as well as its user interface. The logo now consists of two arrows moving toward each other in a half-circle motion on a blue background. As for the UI, it now displays a Quick Settings tile for fast configuration, text explaining what the various options do, and an easier-to-use layout. There's even a new option that lets people restrict Quick Share visibility to a ten-minute window.

Wojciechowska states this update is not widely available, and even among the people who do receive the patch, the Nearby Share name change isn't common – this may be something only a handful will see. She admits to being confused as to why Google is doing this, although the evidence she found suggests it could be the start of a new collaboration between the two companies.

Start of a new partnership

In its report, Android Authority claims Wojciechowska discovered proof of a “migration education flow” for Quick Share after digging through the Play Services app. This could suggest Google and Samsung are combining their file-sharing tools into one – or, at the very least, “making them interoperable”.

If this is the case, two of the biggest Android brands coming together to unify their services could be a huge benefit for users. Two currently separate but similarly behaving features could coalesce into one that works on Galaxy and non-Galaxy smartphones alike. It's a quality-of-life upgrade that would reduce software clutter.

Android Authority makes it clear, though, that there isn't any concrete proof the two tools will merge – it's just that, given the circumstances, that seems to be the case. Plus, the whole thing wouldn't make sense if it wasn't the result of an upcoming collaboration. Think about it: why would Google give one of its mobile tools the same name as a competitor's software? That would only confuse users.

There has to be something more to it, so we reached out to both companies for more information. This story will be updated at a later time.

Until then, check out TechRadar's list of the best smartphones for 2023.


Microsoft Copilot’s new AI tool will turn your simple prompts into songs

Thanks to a newfound partnership with music creation platform Suno, Microsoft Copilot can now generate short-form songs with a single text prompt.

The content it creates consists not only of instrumentals but also of fleshed-out lyrics and actual singing voices. Microsoft states in the announcement that you don't need any pre-existing music-making skills – all you need is an idea in your head. If any of this sounds familiar, that's because both Meta and Google have their own versions of this technology in the form of MusicGen and Instrument Playground, respectively. These two function similarly, although they run on proprietary AI models instead of something third-party.

How to use the Suno plugin

To use this feature, you’ll have to first launch Microsoft Edge, as the update is exclusive to the browser, then head on over to the Copilot website, sign in, and click the Plugin tab in the top right corner. Make sure that Suno is currently active. 

Suno plugin

(Image credit: Future)

Once everything is in place, enter a text prompt into Copilot and give it enough time to finish. It does take a little while for the AI to create something according to the prompt. In our experience, it took Copilot about ten minutes to make lyrics to a pop song about having an adventure with your family. Strangely, we didn’t receive any audio.

Copilot told us it had made a link to Suno's official website where we could listen to the track, but the URL disappeared the moment it was finished. We then prompted the AI to generate another song; however, it only wrote the lyrics. When asked where the audio was, Copilot told us to imagine the melody in our heads or to sing the words out loud.

This is the first time we’ve had a music-generative AI flat-out refuse to produce audio.

Microsoft Copilot refusing to generate

(Image credit: Future)

Good performance… when it works

From here, we went to Suno's website to get an idea of what the tech can do. The audio genuinely sounded great in our experience, and the vocal performances were surprisingly good, although not amazing – they're not total gibberish like Google's Instrument Playground, but they're not super clear either.

We couldn't find out how good Copilot’s music-making skills are, but if it’s anything like the base Suno model, the content it can create will outshine anything that MusicGen or Instrument Playground can churn out.

Rollout of the Suno plugin has already begun and will continue over the coming weeks. There's no word on whether Microsoft plans to expand the feature to other browsers, although we did reach out to ask if this is in the works and if Microsoft is going to address the issues we encountered – we would've loved to hear the music. This story will be updated at a later time.

In the meantime, check out TechRadar's list of the best free music-making software in 2023.


YouTube’s new AI tool will let you create your dream song with a famous singer’s voice

YouTube is testing a pair of experimental AI tools giving users a way to create short songs either via a text prompt or their own vocal sample.

The first one is called Dream Track, a feature harnessing the voices of a group of nine mainstream artists to generate 30-second music tracks for YouTube Shorts. The way it works is you enter a text prompt describing what you want to hear and then select a singer from the tool's carousel. Participating musicians include John Legend, Sia, and T-Pain, all of whom gave their consent to be a part of the program. Back in late October, a Bloomberg report made the rounds stating YouTube was working on AI tech allowing content creators “to produce songs using the voices of famous singers”, but couldn't launch it due to ongoing negotiations with record labels. Dream Track appears to be that self-same AI.

YouTube's Dream Track on mobile

(Image credit: YouTube)

For the initial rollout, Dream Track will be available to a small group of American content creators on mobile devices. There's no word on if and when it'll see a wider release or a desktop version.

The announcement post has a couple of videos demonstrating the feature. One of them simulates a user asking the AI to create a song about “a sunny morning in Florida” using T-Pain’s voice. In our opinion, it does a pretty good job of emulating his style and coming up with lyrics on the fly, although the performance does sound like it’s been through an Auto-Tune filter.

Voices into music

The second experiment is called Music AI Tools which, as we alluded to earlier, can generate bite-sized tracks by transforming an uploaded vocal sample. For example, a short clip of you humming can turn into a guitar riff. It even works in reverse, as chords coming from a MIDI keyboard can be morphed into a choir.

An image on Google's DeepMind website reveals what the user interface for the Music AI Tools desktop app may look like. At first, we figured the layout would be relatively simple like Dream Track's; however, it is a lot more involved.

YouTube's Music AI Tool

(Image credit: Google)

The interface resembles a music editing program with a timeline at the top highlighting the input alongside several editing tools. These presumably would allow users a way to tweak certain elements in a generated track. Perhaps a producer wants to tone down the distortion on a guitar riff or bump up the piano section.

Google says it is currently testing this feature with those in YouTube’s Music AI Incubator program, which is an exclusive group consisting of “artists, songwriters, and producers” from across the music industry. No word on when it’ll see a wide release.

Analysis: Treading new waters

YouTube is pitching this recent foray as a new way for creative users to express themselves – a way to empower fledgling musicians who may lack the resources to grow. Look at it from the artists' perspective, however, and the attitude is not so positive. The platform compiled a series of quotes from the group of nine singers regarding Dream Track. Several mention the inevitability of generative AI in music and the need to be a part of it, with a few stating they will remain cautious towards the tech.

We may be reading too much into this, but we get the vibe that some aren’t totally on board with this tech. To quote one of our earlier reports, musicians see generative AI as something “they’ll have to deal with or they risk getting left behind.” 

YouTube says it’s approaching the situation with the utmost respect, ensuring “the broader music community” benefits. Hopefully, the platform will maintain its integrity moving forward.

While we have you, be sure to check out TechRadar's list of the best free music-making software for 2023.


WhatsApp is upgrading its voice chat tool so it can host a lot more people

WhatsApp is upgrading the Voice Chat feature on mobile so users can now host large group calls with up to 128 participants. 

The platform has yet to make a formal announcement of the changes through its usual avenues although details can be found on its Help Center support website. On the surface, the tool’s functionality is pretty straightforward. You can start a group voice chat by going to a group chat, tapping the audio read-out icon in the upper right-hand corner, and selecting Start Voice Chat. The company states this is “only available on your primary device” and calls will automatically end the moment everyone leaves. Additionally, they instantly end after an hour if no one “joins the first or last person in the chat”. 

Silent calls

There is more to this update than what’s on the support page as other news reports reveal a much more robust feature. According to TechCrunch, Voice Chat for Larger Groups is “designed to be less disruptive” than a regular group call. Participants will not be rung when a call starts. Instead, they will “receive a push notification” with an in-chat bubble you have to tap in order to join. 

At the top of the screen is a series of controls where you can mute, unmute, or message other people in the group without having to leave. Of course, you can hang up any time you want using the same controls. Like with all forms of messaging on WhatsApp, the large voice chats will be end-to-end encrypted.

Availability

The Verge states the patch will be rolling out to the Android and iOS apps over the coming weeks; however, it'll first be made available to bigger groups hosting 33 to 128 participants. It's unknown why smaller chats will have to wait to receive the same feature, but as The Verge points out, it could be because the Group Voice Call tool already exists. Meta is seemingly prioritizing the larger chats first before moving on to all users.

There's no word on whether WhatsApp plans to expand this to its desktop app, although we did ask. This story will be updated at a later time.

With Black Friday around the corner, we expect a lot of discounts from major brands. If you want to see what's out there, check out TechRadar's roundup of the best Black Friday phone deals for 2023.


The AI backlash begins: artists could protect against plagiarism with this powerful tool

A team of researchers at the University of Chicago has created a tool aimed at helping online artists “fight back against AI companies” by inserting, in essence, poison pills into their original work.

Called Nightshade, after the family of toxic plants, the software is said to introduce poisonous pixels to digital art that mess with the way generative AIs interpret it. The way models like Stable Diffusion work is they scour the internet, picking up as many images as they can to use as training data. What Nightshade does is exploit this “security vulnerability”. As explained by the MIT Technology Review, these “poisoned data samples can manipulate models into learning” the wrong thing – for example, seeing a picture of a dog as a cat, or a car as a cow.
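To make the general idea of data poisoning concrete, here's a deliberately crude Python sketch. It mislabels a small slice of a scraped dataset – a toy stand-in for Nightshade's far subtler pixel-level perturbations, not the tool's actual technique:

```python
import random

# A toy scraped dataset of (image_file, caption) pairs an AI might train on.
dataset = [(f"dog_{i}.jpg", "a photo of a dog") for i in range(1000)]

def poison(samples, fraction=0.05):
    """Mislabel a small fraction of samples so a model trained on them
    learns the wrong association. Nightshade instead perturbs the pixels
    themselves, but the effect on training is conceptually similar."""
    poisoned = list(samples)
    for i in random.sample(range(len(poisoned)), int(len(poisoned) * fraction)):
        image, _ = poisoned[i]
        poisoned[i] = (image, "a photo of a cat")  # dog pictures now 'mean' cat
    return poisoned

training_data = poison(dataset)
print(sum(caption.endswith("cat") for _, caption in training_data))  # ~50 poisoned samples
```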

Poison tactics

As part of the testing phase, the team fed Stable Diffusion infected content and “then prompted it to create images of dogs”. After being given 50 samples, the AI generated pictures of misshapen dogs with six legs. After 100, you begin to see something resembling a cat. Once it was given 300, dogs became full-fledged cats. Below, you'll see the other trials.

Nightshade tests

(Image credit: University of Chicago/MIT Technology Review)

The report goes on to say Nightshade also affects “tangentially related” ideas because generative AIs are good “at making connections between words”. Messing with the word “dog” jumbles similar concepts like puppy, husky, or wolf. This extends to art styles as well. 

Nightshade's tangentially related samples

(Image credit: University of Chicago/MIT Technology Review)

It is possible for AI companies to remove the toxic pixels. However, as the MIT post points out, it is “very difficult to remove them” – developers would have to “find and delete each corrupted sample.” To give you an idea of how tough this would be, a 1080p image has over two million pixels. If that wasn't difficult enough, these models “are trained on billions of data samples”, so imagine looking through a sea of pixels to find the handful messing with the AI engine.

At least, that’s the idea. Nightshade is still in the early stages. Currently, the tech “has been submitted for peer review at [the] computer security conference Usenix.” MIT Technology Review managed to get a sneak peek.

Future endeavors

We reached out to team lead Professor Ben Y. Zhao at the University of Chicago with several questions.

He told us they do have plans to “implement and release Nightshade for public use.” It’ll be a part of Glaze as an “optional feature”. Glaze, if you’re not familiar, is another tool Zhao’s team created giving artists the ability to “mask their own personal style” and stop it from being adopted by artificial intelligence. He also hopes to make Nightshade open source, allowing others to make their own venom.

Additionally, we asked Professor Zhao if there are plans to create a Nightshade for video and literature. Right now, multiple literary authors are suing OpenAI, claiming the program is “using their copyrighted works without permission.” He states developing toxic software for other media will be a big endeavor “since those domains are quite different from static images”, and the team has “no plans to tackle those, yet.” Hopefully someday soon.

So far, initial reactions to Nightshade are positive. Junfeng Yang, a computer science professor at Columbia University, told Technology Review this could make AI developers “respect artists’ rights more” – and maybe even be willing to pay out royalties.

If you're interested in picking up illustration as a hobby, be sure to check out TechRadar's list of the best digital art and drawing software in 2023.


YouTube working on an AI music tool that’ll let you use the voices of famous musicians

YouTube is apparently working on a new AI tool that could give content creators the ability to produce songs using the voices of famous singers and musicians.

According to a recent Bloomberg report, the platform has approached several record labels with this technology, with negotiations still ongoing. YouTube is trying to obtain the rights to use certain songs to train the AI while also trying not to step on any land mines that would lead to it getting sued to high heaven. We're already seeing a similar situation play out with OpenAI, which is currently being sued by 17 authors, including A Song of Ice and Fire creator George R.R. Martin, who all allege ChatGPT is illegally using their work. Bloomberg states musicians and labels want to maintain control over their work so developers aren't using it “to train models without permission or compensation.”

Originally, a beta of this tech was supposed to be shown off during the Made On YouTube event last month. Billboard states in its report that the beta would have had a “select pool of artists [give] permission to” certain creators to use their likeness on the platform. Eventually, it would officially launch as a feature where everybody can try using the voices of consenting artists.

Mixed response

The response from the music industry at large has been mixed. Bloomberg claims “companies have been receptive”, agreeing to work with YouTube on this project. However, Billboard states record executives have had a tough time finding artists willing to participate. Some acts feel anxious about putting their voices into “the hands of unknown creators who could use them to make statements or sing lyrics” that they don’t agree with.

YouTube is trying to position itself as everybody's best friend – as a partner to help the music industry figure this whole thing out. However, the air is gloomy. The industry sees generative AI as an unstoppable force, and it knows it is no immovable object itself: the technology is an inevitability that it'll have to deal with, or risk getting left behind.

Ray of positivity

There’s another snag in all this regarding publishing. Making music isn’t a one-person show as there are entire teams involved in production. To solve this, a Billboard source says YouTube will probably give labels one big licensing fee that they have to “figure out how to divide among” songwriters.

Despite the dour attitude, there is some positivity. Billboard claims rights holders are engaging in “good faith to get a deal done” amicably. A few artists do “recognize these models could open new avenues for creative expression.” Record executives may be less keen as another Billboard source states AI can put “companies at a disadvantage”.

We’ll just have to wait and see what comes from all this. Again, YouTube’s new model could help people explore their creative side assuming deals are made fairly.

While we're on the topic of production, be sure to check out TechRadar's list of the best free music-making software for 2023.
