The ChatGPT ‘Sky’ assistant wasn’t a deliberate copy of Scarlett Johansson’s voice, OpenAI claims

OpenAI's high-profile run-in with Scarlett Johansson is turning into a sci-fi story to rival the movie Her, and now it's taken another turn, with OpenAI sharing documents and an updated blog post suggesting that the 'Sky' voice in the ChatGPT app wasn't a deliberate attempt to copy the actress's voice.

OpenAI preemptively pulled its 'Sky' voice option in the ChatGPT app on May 19, just before Scarlett Johansson publicly expressed her “disbelief” at how “eerily similar” it sounded to her own (in a statement shared with NPR). The actress also revealed that OpenAI CEO Sam Altman had previously approached her twice to license her voice for the app, and that she'd declined on both occasions. 

But now OpenAI is on the defensive, sharing documents with The Washington Post suggesting that its casting process for the various voices in the ChatGPT app was kept entirely separate from its reported approaches to Johansson.

The documents, recordings and interviews with people involved in the process suggest that “an actress was hired to create the Sky voice months before Altman contacted Johansson”, according to The Washington Post. 

The agent of the actress chosen for the Sky voice also apparently confirmed that “neither Johansson nor the movie 'Her' were ever mentioned by OpenAI” during the process, nor was the actress's natural speaking voice tweaked to sound more like Johansson.

OpenAI's lead for AI model behavior, Joanne Jang, also shared more details with The Washington Post on how the voices were cast. Jang stated that she “kept a tight tent” around the AI voices project and that Altman was “not intimately involved” in the decision-making process, as he was “on his world tour during much of the casting process”.

Clearly, this case is likely to rumble on, but one thing's for sure – we won't be seeing ChatGPT's 'Sky' voice reappear for some time, if at all, despite the vocal protestations and petitions of its many fans.

What happens next?

OpenAI logo on wall

(Image credit: Shutterstock.com / rafapress)

With Johansson now reportedly lawyering up in her battle with OpenAI, the case looks likely to continue for some time.

Interestingly, the case isn't completely without precedent, despite the involvement of new tech. As noted by Mitch Glazier (chief executive of the Recording Industry Association of America), there was a similar case in the 1980s involving Bette Midler and the Ford Motor Company.

After Midler declined Ford's request to use her voice in a series of ads, Ford hired an impersonator instead – which resulted in a legal battle that Midler ultimately won, after a US court found that her voice was distinctive and should be protected against unauthorized use.

OpenAI is now seemingly distancing itself from suggestions that it deliberately did something similar with Johansson in its ChatGPT app, highlighting that its casting process started before Altman's apparent approaches to the actress. 

This all follows an update to OpenAI's blog post, which included a statement from CEO Sam Altman claiming: “The voice of Sky is not Scarlett Johansson's, and it was never intended to resemble hers. We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better.”

But Altman's post on X (formerly Twitter) just before OpenAI's launch of GPT-4o, which simply stated “her”, doesn't help distance the company from suggestions that it was attempting to recreate the famous movie in some form, regardless of how explicit that was in its casting process. 


Turns out the viral ‘Air Head’ Sora video wasn’t purely the work of AI, as we were led to believe

A new interview with the team behind the viral Sora clip Air Head has revealed that AI played a smaller part in its production than was originally claimed.

In an interview with Fxguide, Patrick Cederberg (who handled post-production on the viral video) confirmed that OpenAI's text-to-video program was far from the only tool involved in its production. The 1-minute, 21-second clip combined traditional filmmaking techniques with post-production editing to achieve the look of the final picture.

Air Head was made by ShyKids and tells the short story of a man with a literal balloon for a head. While the voiceover is human, the way OpenAI was pushing the clip on social channels such as YouTube certainly left the impression that the visuals were purely powered by AI – but that's not entirely true.

As revealed in the behind-the-scenes clip, a ton of work was done by ShyKids, who took the raw output from Sora and cleaned it up into the finished product. This included manually rotoscoping the backgrounds, removing faces that would occasionally appear on the balloons, and color correcting.

Then there's the fact that Sora takes a ton of time to actually get things right. Cederberg explains that there were “hundreds of generations at 10 to 20 seconds a piece”, which were then tightly edited in what the team described as a “300:1” ratio of what was generated versus what was kept for further touch-ups.

That manual work also included editing out the head, which kept reappearing, and even correcting the color of the balloon itself, which would appear red instead of yellow. While Sora was used to generate the initial imagery to good effect, there was clearly a lot more happening behind the scenes to make the finished product look as good as it does – so we're still a long way from instantly generated, movie-quality productions.

Sora remains tightly under wraps, save for a handful of carefully curated projects that have been allowed to surface, with Air Head among the most popular. The clip has over 120,000 views at the time of writing, with OpenAI touting it as “experimentation” with the program and downplaying the obvious work that went into the final product.

Sora is impressive but we're not convinced

While OpenAI has done a decent job of showcasing what its text-to-video model can do, the lack of transparency is worrying.

Air Head is an impressive clip by a talented team, but it took a ton of editing to get the final product to where it is.

It's not quite the one-click-and-you're-done approach that many of the tech's boosters have represented it as. It turns out to be a tool for enhancing imagery rather than creating it from scratch, something that's already common enough in video production, which makes Sora seem less revolutionary than it first appeared.


Apple March Event – if peek wasn’t a typo, what does it mean?

With Apple's March event now confirmed and rumored to feature a new iPhone SE 3, a new iPad Air, and possibly a new M1 Mac, fans are already trying to find clues in the invite that was sent out on Tuesday, March 2.

This is nothing new. For years, Apple has sent out invites that hint at what its events may show off. Last year's invite hinted at 'Hyperspeed', which turned out to mean the new M1 Pro and M1 Max MacBook Pro laptops.

Going way back to 2012, the invites for the iPhone 5 event featured the shadow of a number 5 – a hint, about as subtle as a sledgehammer, that a new iPhone was on its way.

iPhone 5 invite

(Image credit: Apple)

But since the March invite was sent out, many have been wondering why Apple chose the word 'Peek' instead of 'Peak' when it alluded to 'Peek performance'.

While it's extremely unlikely to be a typo from a company like Apple, the word gives customers an idea of what March 8 could entail.

Is there a difference between 'Peak' and 'Peek' for Apple?

The Oxford Dictionary defines 'Peak' as:

Reach the highest point, either of a specified value or at a specified time.

In other words, it's the absolute highest that something can reach – whether that's how fast a machine such as Apple's M1 chip can go, or how a 5G chip could bring new highs to the iPhone SE, a line that has yet to see the benefits of 5G.

But it's when you look up 'Peek' in the dictionary that things become interesting:

To look or glance quickly or furtively, especially through a small opening or from a concealed location; peep; peer.

To me, this signals that we're going to see something that goes beyond the rumors, and it reminds me of a moment back in 2006, when Steve Jobs was on stage.

We've been here before

Steve Jobs demoing Apple TV

(Image credit: Apple)

Apple's co-founder was on stage in 2006, showcasing games for the iPod Video, a new iPod nano line, and iTunes offering movies as well as TV shows.

But there was One More Thing, a segment Jobs was known to pull out from time to time at events. These would showcase an update to an existing product, or something completely out of the blue. This time, it was a sneak peek at the Apple TV, first called iTV.

Jobs demoed the media box in the style that has since become iconic, communicating its benefits to everyone while making it clear that this was a preview of what was to come.

It was rare for this to happen, as Apple liked to announce products that were almost ready to go, even in 2006. But in the years since that event, the company has described Apple TV as a hobby – a testing ground.

In 2022, we're about to see another sneak peek, which makes me suspect we're going to see a new Mac, possibly a Mac Pro. It may be a product that launches towards the end of the year, with an Apple Silicon chip that isn't quite ready yet.

'Peak' and 'peek' can mean the same thing for Apple – it could offer a sneak peek of its highest-performing Mac, the peak of the M1 chip, even though it's simply not ready to be sold just yet.

I've enjoyed using my M1 Pro MacBook Pro since October, but there are some Apple users I know who want a Mac that's not constrained by running on a battery – they want pure power with no compromise. There are plenty of wallets ready to splurge on an Apple Silicon Mac that's powered only by a cable, not a battery.

However, despite the reference to 'peek', I don't see an augmented reality headset appearing next week, as some people are hoping, mainly because a new product category doesn't fit a March event. A new category needs its own space, and developers need time to take it in and see how it fits their apps, which is why I believe there's a better chance of it appearing at WWDC this year.

We don't have long to wait, but if you're hoping for a headset, this year's WWDC – once it's official – could be your best bet for a first look at the Apple wearable.
