ChatGPT fans are furious as OpenAI delays rollout of next-gen Voice Mode

Ever since OpenAI showed off ChatGPT's new Voice Mode – and incurred the wrath of Scarlett Johansson – fans of the AI assistant have been desperate to take the feature for a spin. Well, they'll have to wait a little longer, as OpenAI says its advanced Voice Mode has now been delayed.

In a post on X (formerly Twitter), OpenAI said that alpha testing for its startlingly lifelike Voice Mode has been pushed back a month so it can focus on “improving the user experience” and on its “ability to detect and refuse certain content.” In other words, it's still not quite ready for the questionable requests the real world might throw at it.

So when exactly will the Voice Mode be pushed out beyond this initial “small group of users”? OpenAI says, “We are planning for all Plus users to have access in the fall.” But there's a slightly worrying caveat that “exact timelines depend on meeting our high safety and reliability bar.” Further delays, then, could well be on the cards.

This is all a far cry from OpenAI's previous rollout plans. When it demoed the new Voice Mode in May at its Spring event, it said the feature would “be rolling out in the coming weeks.” That's still technically true, but the reality is that it'll now be more like months.

Delays for new tech features aren't exactly new, but ChatGPT subscribers aren't happy. “Biggest rug pull in history,” concluded one X commenter, with others stating that the huge demo “was probably misleading to many” and that they'd also cancel their Plus account until it actually rolls out. 

A casualty of the AI arms race

The outpouring of frustration from ChatGPT fans about this Voice Mode delay might seem disproportionate, but it's also understandable. An unfortunate side effect of the AI arms race is the staging of whizz-bang demos with optimistic rollout promises, followed by delays and vague talk of launches in the 'coming weeks' or, even worse, 'coming months.'

OpenAI's explanation for the delays to ChatGPT's most exciting new feature is certainly reasonable on the surface. As it notes in its statement, the new Voice Mode takes us closer to real-time, natural conversations with AI chatbots – and that is a potentially dangerous tool if it goes off the rails in the wild.

Then again, the timing of OpenAI's Spring update event – on May 13, just a day before Google I/O 2024 – did seem conveniently designed to steal some thunder from Google's AI announcements. So, the theories that ChatGPT's new voice was demoed a little prematurely do have some credence.

Still, the several demo videos of the new Voice Mode that OpenAI has released on its YouTube channel – featuring the controversial 'Sky' voice, which was later paused following Scarlett Johansson's complaints that it sounded a little too much like her AI character in Her – suggest it's far from marketing vaporware.

ChatGPT also remains one of the best AI tools around even without it, despite increasing pressure from the likes of Anthropic's new Claude 3.5 Sonnet model.

You might also like…

TechRadar – All the latest technology news

Read More

ChatGPT shows off impressive voice mode in new demo – and it could be a taste of the new Siri

ChatGPT's highly anticipated new Voice mode has just starred in another demo that shows off its acting skills – and the video could be a taste of what we can expect from the reported new deal between Apple and OpenAI.

The ChatGPT app already has a Voice mode, but OpenAI showed off a much more impressive version during the launch of its new GPT-4o model in May. Unfortunately, that was then overshadowed by OpenAI's very public spat with Scarlett Johansson over the similarity of ChatGPT's Sky voice to her own in the movie Her. But OpenAI is hyping up the new mode again in the clip below.

The video shows someone writing a story and getting ChatGPT to effectively do improv drama, providing voices for a “majestic lion” and a mouse. Beyond the expressiveness of the voices, what's notable is how easy it is to interrupt the ChatGPT voice for a better conversational flow, and also the lack of latency.     

OpenAI says the new mode will “be rolling out in the coming weeks” and that's a pretty big deal. Not least because, as Bloomberg's Mark Gurman has again reported, Apple is expected to announce a new partnership with OpenAI at WWDC 2024 on June 10.   

Exactly how OpenAI's tech is going to be baked into iOS 18 remains to be seen, but Gurman's report states that Apple will be “infusing its Siri digital assistant with AI”. That means some of its off-device powers could tap into ChatGPT – and if it's anything like OpenAI's new demo, that would be a huge step forward from today's Siri.

Voice assistants finally grow up?

Siri's reported AI overhaul will likely be one of the bigger stories of WWDC 2024. According to Dag Kittlaus, who co-founded and ran Siri before Apple acquired it in 2010, the deal with OpenAI will likely be a “short- to medium-term relationship” while Apple plays catch up. But it's still a major surprise.

It's possible that Siri's AI improvements will be restricted to more minor, on-device functions, with Apple instead using its OpenAI partnership solely for text-based queries. After all, from iOS 15 onwards, Apple switched Siri's audio processing to being on-device by default, which meant you could use it without an internet connection.

But Bloomberg's Gurman claims that Apple has “forged a partnership to integrate OpenAI’s ChatGPT into the iPhone’s operating system”. If so, it's possible that one unlikely move could be followed by another, with Siri leaning on ChatGPT for off-device queries and a more conversational flow. It's already been possible to use ChatGPT with Siri for a while now using Apple's Shortcuts.

It wouldn't be the first time that Apple has integrated third-party software into iOS. Back on the original iPhone, Apple made a pre-installed YouTube app which was later removed once Google had made its own version. Gurman's sources noted that by outsourcing an AI chatbot, “Apple can distance itself from the technology itself, including its occasional inaccuracies and hallucinations.”

We're certainly looking forward to seeing how Apple weaves OpenAI's tech into iOS – and potentially Siri – at WWDC 2024.


Truecaller’s new feature can turn your voice into a personal secretary

Caller ID service Truecaller is giving users the ability to create a digital assistant that has their voice and can respond on their behalf. If you’re unfamiliar with the app, Truecaller launched its AI Assistant feature in 2022 to screen phone calls and take messages, among other things. Up to this point, it utilized pre-made voices, but thanks to the power of Microsoft’s Azure AI Speech, you can now use your own.

Setting up your voice within Truecaller is quite easy; you just need a subscription to Truecaller Premium, which is $9.99 a month per account. Once that is set, the software will immediately ask you to select an AI assistant – but instead of picking one of the pre-made personalities, select “Add your Voice.”

You’ll then be asked to read a consent sentence and a brief training script out loud into your smartphone’s microphone. Doing so ensures the AI has a voice that mimics your “speaking style.” When done, Truecaller states that Microsoft’s Azure Custom Voice begins to process the recording to create a “high-quality digital replica.” The app will give you a demo sound bite to help you imagine what it’ll sound like when someone calls you. 
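Truecaller doesn't say exactly how it talks to Azure under the hood, but Azure's text-to-speech service accepts requests in SSML, where a voice – including a custom one created from a training recording – is selected by name inside a `<voice>` element. As a rough, hedged illustration (the voice name below is a hypothetical placeholder, not a real Truecaller identifier), the payload for a reply spoken in your cloned voice might be assembled like this:

```python
# Sketch: build the SSML payload that a text-to-speech request to Azure's
# speech service would carry. The voice name is a hypothetical stand-in for
# the custom voice Azure Custom Voice would create from your recording.

def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Wrap `text` in minimal SSML that selects a voice by name."""
    return (
        f"<speak version='1.0' "
        f"xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='{lang}'>"
        f"<voice name='{voice_name}'>{text}</voice>"
        f"</speak>"
    )

ssml = build_ssml(
    "Hi, this is my assistant speaking. Can I take a message?",
    voice_name="MyClonedVoice",  # hypothetical custom-voice name
)
print(ssml)
```

In a real integration this string would be sent to the speech service's synthesis endpoint with authentication; the point here is simply that, as far as the API is concerned, a “high-quality digital replica” is just another named voice.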

Truecaller's training script (Image credit: Truecaller)

Robo-voice

Keep in mind the technology isn’t perfect. While the digital assistant may sound like you, it does come across as rather robotic. The company published a YouTube video on its official channel showing what the AI sounds like. Admittedly, the software does a decent job at mimicking a person’s vocal inflections; however, responses still sound stiff. That said, it is an interesting and interactive way to screen calls as they come in, especially when stopping spam ones. 

Keep an eye out for the update when it arrives – we tried to create our own digital secretary on our Android phone but couldn't, as we haven't received the feature yet. It's unknown exactly when and where the update will be available. TechCrunch claims the tool will roll out “over the next few weeks” as a public beta across a small selection of countries. These include, but are not limited to, the US, Canada, Australia, and Sweden. Soon after, it'll become widely available “to all users in the eligible markets.”

We also reached out to Truecaller with a couple of questions, including how recordings are stored, whether they are saved on the device or uploaded to company servers, and more. If we hear back, this story will be updated.

While we have you, check out TechRadar's round up of the best encrypted messaging apps on Android for 2024.


The ChatGPT ‘Sky’ assistant wasn’t a deliberate copy of Scarlett Johansson’s voice, OpenAI claims

OpenAI's high-profile run-in with Scarlett Johansson is turning into a sci-fi story to rival the movie Her, and now it's taken another turn, with OpenAI sharing documents and an updated blog post suggesting that its 'Sky' chatbot voice in the ChatGPT app wasn't a deliberate attempt to copy the actress's voice.

OpenAI preemptively pulled its 'Sky' voice option in the ChatGPT app on May 19, just before Scarlett Johansson publicly expressed her “disbelief” at how “eerily similar” it sounded to her own (in a statement shared with NPR). The actress also revealed that OpenAI CEO Sam Altman had previously approached her twice to license her voice for the app, and that she'd declined on both occasions. 

But now OpenAI is on the defensive, sharing documents with The Washington Post suggesting that its casting process for the various voices in the ChatGPT app was kept entirely separate from its reported approaches to Johansson.

The documents, recordings and interviews with people involved in the process suggest that “an actress was hired to create the Sky voice months before Altman contacted Johansson”, according to The Washington Post. 

The agent of the actress chosen for the Sky voice also apparently confirmed that “neither Johansson nor the movie 'Her' were ever mentioned by OpenAI” during the process, nor was the actress's natural speaking voice tweaked to sound more like Johansson.

OpenAI's lead for AI model behavior, Joanne Jang, also shared more details with The Washington Post on how the voices were cast. Jang stated that she “kept a tight tent” around the AI voices project and that Altman was “not intimately involved” in the decision-making process, as he was “on his world tour during much of the casting process”.

Clearly, this case is likely to rumble on, but one thing's for sure – we won't be seeing ChatGPT's 'Sky' voice reappear for some time, if at all, despite the vocal protestations and petitions of its many fans.

What happens next?

OpenAI logo on wall (Image credit: Shutterstock.com / rafapress)

With Johansson now reportedly lawyering up in her battle with OpenAI, the case looks likely to continue for some time.

Interestingly, the case isn't completely without precedent, despite the involvement of new tech. As noted by Mitch Glazier (chief executive of the Recording Industry Association of America), there was a similar case in the 1980s involving Bette Midler and the Ford Motor Company.

After Midler declined Ford's request to use her voice in a series of ads, Ford hired an impersonator instead – which resulted in a legal battle that Midler ultimately won, after a US court found that her voice was distinctive and should be protected against unauthorized use.

OpenAI is now seemingly distancing itself from suggestions that it deliberately did something similar with Johansson in its ChatGPT app, highlighting that its casting process started before Altman's apparent approaches to the actress. 

This all follows an update to OpenAI's blog post, which included a statement from CEO Sam Altman claiming: “The voice of Sky is not Scarlett Johansson's, and it was never intended to resemble hers. We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better.”

But Altman's post on X (formerly Twitter) just before OpenAI's launch of GPT-4o, which simply stated “her”, doesn't help distance the company from suggestions that it was attempting to recreate the famous movie in some form, regardless of how explicit that was in its casting process. 


Bereft ChatGPT fans start petition to bring back controversial ‘Sky’ chatbot voice

OpenAI has pulled ChatGPT's popular 'Sky' chatbot voice after Scarlett Johansson expressed her “disbelief” at how “eerily similar” it sounded to her own. But fans of the controversial voice in the ChatGPT app aren't happy – and have now started a petition to bring it back.

The Sky voice, one of several available in the ChatGPT app for iOS and Android, is no longer on offer after OpenAI stated yesterday on X (formerly Twitter) that it had hit pause in order to address “questions about how we chose the voices in ChatGPT”.

Those questions became very pointed yesterday when Johansson said in a fiery statement given to NPR that she was “shocked, angered and in disbelief” that OpenAI CEO Sam Altman would “pursue a voice that sounded so eerily similar to mine” after she had apparently twice declined to license her voice for the ChatGPT assistant.

OpenAI has rejected those accusations, stating in a blog post that “Sky’s voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice.” But pressure from Johansson's lawyers, which NPR reports are demanding answers, has forced OpenAI to suspend the voice – and fans aren't happy.

In a fascinating example of how attached some people are already becoming to AI chatbots, a popular Reddit thread titled 'Petition to bring Sky voice back' includes a link to a Change.org petition, which currently has over 300 signatures.

In fairness, many of the Reddit comments and signatures predate Johansson's statement and OpenAI's reasoning for pulling the Sky voice option in the ChatGPT app. And it now looks increasingly likely that the voice won't simply be paused but instead put on indefinite hiatus.

But the thread is still an interesting, and mildly terrifying, glimpse of where we're headed with convincing AI chatbot voices, whether they're licensed from famous actresses or not. One comment from Redditor JohnDango states that “she was the only bot I spoke to that had a 'realness' about her that made it feel like a real step beyond chatbot,” while GaneshLookALike noted mournfully that “Sky was full of warmth and compassion.”

That voice, which we also found to be one of ChatGPT's most convincing options, is now on the backburner while the case rumbles on. 

What next for ChatGPT's Sky voice?

It doesn't sound like ChatGPT's 'Sky' voice is going to return anytime soon. In her statement shared with NPR, Scarlett Johansson said she'd been “forced to hire legal counsel” and send letters to OpenAI asking how the voice had been made. OpenAI's blog post looks like its response to those questions, though it remains to be seen whether that's enough to keep the lawyers at bay.

Johansson understandably sounds determined to pursue the issue, adding in her statement to NPR that “in a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity.” 

While there's no suggestion that OpenAI cloned Johansson's voice, the company did reveal in March that it had developed a new voice synthesizer that could apparently copy a voice from just 15 seconds of audio. That tool was never released to the public due to concerns about how it might be misused, with OpenAI stating that it was investigating the “responsible deployment of synthetic voices”.

OpenAI CEO Sam Altman also didn't exactly help his company's cause by simply posting “her” on X (formerly Twitter) on the eve of the launch of its GPT-4o model, which included the new voice mode that it demoed. That looks like a thinly-veiled reference to Spike Jonze's movie Her, about a man who develops a relationship with an AI virtual assistant Samantha, which was voiced by none other than Scarlett Johansson.

For now, then, it looks like fans of the ChatGPT app will need to make do with the other voices – including Breeze, Cove, Ember and Juniper – while this fascinating case rumbles on. This also shouldn't affect the rollout of GPT-4o's impressive new conversational voice powers, which OpenAI says it will be rolling out “in alpha within ChatGPT Plus in the coming weeks”.


Ray-Ban’s Meta glasses now let you listen to Apple Music with voice controls for maximum nerd points

The Ray-Ban Meta smart glasses are still waiting on their big AI update – which is set to bring features like ‘Look and Ask’ out of the exclusive beta and bring them to everyone – but while we wait, a useful upgrade has just rolled out to the specs.

The big feature for many will be native Apple Music controls (via 9to5Mac). Previously you could play Apple Music through the Ray-Ban Meta glasses by using the app on your phone and touch controls on the glasses' arms, but this update allows you to use the Meta AI voice controls to play songs, playlists, albums, and stations from your music library for a hands-free experience.

The update also brings new touch controls: you can now touch and hold the side of the glasses to have Apple Music automatically play tracks based on your listening history.

The Apple Music app icon against a red background on an iPhone (Image credit: Brett Jordan / Unsplash)

Beyond Apple Music integration, the new update also allows you to use the glasses as a video source for WhatsApp and Messenger calls. This improves on pre-existing interoperability that lets you send messages, images, and videos captured using the glasses to contacts in these apps via Meta AI.

You can also access a new command, “Hey Meta, what song is this?” to have your glasses tell you what song is playing through your smart specs. This isn’t quite as useful as recognizing tracks that are playing in public as you walk around, but could be handy if you like collecting playlists of new and unfamiliar artists.

To update your glasses to the latest version, simply go to the Meta View App, go to Settings, open the Your Glasses menu option, then Updates. You’ll also want to have your glasses to hand and make sure they’re turned on and connected to your phone via Bluetooth. If you can’t see the update – and your phone says it isn’t already on version 4.0 – then check the Play Store or App Store to see if the Meta View app itself needs an update.


OpenAI’s new voice synthesizer can copy your voice from just 15 seconds of audio

OpenAI has been rapidly developing its ChatGPT generative AI chatbot and Sora AI video creator over the last year, and it's now got a new artificial intelligence tool to show off: Voice Engine, which can create synthetic voices from just 15 seconds of audio.

In a blog post (via The Verge), OpenAI says it's been running “a small-scale preview” of Voice Engine, which has been in development since late 2022. It's actually already being used in the Read Aloud feature in the ChatGPT app, which (as the name suggests) reads out answers to you.

Once you've trained the voice from a 15-second sample, you can then get it to read out any text you like, in an “emotive and realistic” way. OpenAI says it could be used for educational purposes, for translating podcasts into new languages, for reaching remote communities, and for supporting people who are non-verbal.

This isn't something everyone can use right now, but you can go and listen to the samples created by Voice Engine. The clips OpenAI has published sound pretty impressive, though there is a slight robotic and stilted edge to them.

Safety first

Voice Engine is already used in ChatGPT’s Read Aloud feature (Image credit: OpenAI)

Worries about misuse are the main reason Voice Engine is only in a limited preview for now: OpenAI says it wants to do more research into how it can protect tools like this from being used to spread misinformation and copy voices without consent.

“We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” says OpenAI. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

With major elections due in both the US and UK this year, and generative AI tools getting more advanced all the time, it's a concern across every type of AI content – audio, text, and video – and it's getting increasingly difficult to know what to trust.

As OpenAI itself points out, this has the potential to cause problems with voice authentication measures, and scams where you might not know who you're talking to over the phone, or who's left you a voicemail. These aren't easy issues to solve – but we're going to have to find ways to deal with them.


You can now finally use Sonos Voice Control with Spotify tunes – here’s how

Audio brand Sonos is expanding the reach of its Voice Control software to now support Spotify, allowing users to verbally command the streaming service.

All you have to do to control the app, according to the company, is to say the words “Hey Sonos, play Spotify” into one of the brand’s supported speakers. That’s literally it. Doing so will play one of your Spotify playlists at random. If you want the software to play a specific playlist, you’ll have to mention it by name, like “Discover Weekly,” for example. The feature does extend beyond just being a glorified play button, as it can also be used to help manage your library. 

You can instruct Sonos to add certain songs to a playlist. It can also like or dislike tracks for you with the right command. Telling it “Hey Sonos, I like this song” will make the AI save that piece of music into your account’s 'Liked Songs.' 

Additionally, Voice Control can play specific genres or subgenres of music, be it jazz or classic alternative from the 1990s. You don’t have to be super specific; Spotify’s algorithm has a good understanding of what people are looking for.

Security and availability

It’s worth mentioning commands are processed locally on your Sonos speaker to ensure “fast response times and easy follow-ups”. The company also states no audio – be it from your voice or the surrounding environment – will be saved on any cloud server or listened to by some random third-party. 

Now, there are two ways to connect a Sonos speaker to Spotify: you can either manually set Spotify as the default source, or make it the music service you play most often through the speaker. Either way, users won't have to log in or make any changes to the settings.

It’s unknown if Voice Control will learn your listening habits. That is, if a Sonos device notices you frequently access Spotify, will it automatically adjust music sources? 

Spotify’s new support on Sonos Voice Control is available right now to both Premium subscribers as well as free users. Simply download the latest patch on your devices.

While we have you, check out TechRadar's roundup of the best soundbars for 2024. Spoiler alert: Sonos makes an appearance on the list.


Google has fixed an annoying Gemini voice assistant problem – and more upgrades are coming soon

Last week, Google rebranded its Bard AI bot as Gemini (matching the name of the model it runs on) and pushed out an Android app in the US – and while the new app has brought a few frustrations with it, Google is now busy trying to fix the major ones.

You can, if you want, use Google Gemini as a replacement for Google Assistant on your Android phone – and Google has made this possible even though Gemini lacks a lot of the basic digital assistant features that users have come to rely on.

One problem has now been fixed: originally, when chatting to Gemini using your voice, you had to manually tap on the 'send' arrow to submit your command or question – when you're trying to keep up a conversation with your phone, that really slows everything down.

As per 9to5Google, that's no longer the case, and Google Gemini will now realize that you've stopped talking (and respond accordingly) in the same way that Google Assistant always has. It makes the app a lot more intuitive to use.

Updates on the way

What's more, Google Gemini team member Jack Krawczyk has posted a list of features that engineers are currently working on – including some pretty basic functionality, such as the ability to interact with your Google Calendar and reminders.

A coding interpreter is apparently also on the roadmap, which means Gemini would not just be able to produce programming code, but also to emulate how it would run – all within the same app. Additionally, the Google Gemini team is working to remove some of the “preachy guardrails” that the AI bot currently has.

The “top priority” is apparently refusals – instances where Gemini declines to complete a task or answer a question. We've seen Reddit posts suggesting the AI bot will sometimes apologetically report that it can't help with a particular prompt – something that's clearly on Google's radar in terms of rolling out fixes.

Krawczyk says the Android app is coming to more countries in the coming days and weeks, and will be available in Europe “ASAP” – and he's also encouraging users to keep the feedback to the Google team coming.


Windows 11’s AI-powered Voice Clarity feature improves your video chats, plus setup has a new look (finally)

Windows 11 has a new preview build out that improves audio quality for your video chats and more besides.

Windows 11 preview build 26040 has been released in the Canary channel (the earliest test builds) complete with the Voice Clarity feature which was previously exclusive to owners of Surface devices.

Voice Clarity leverages AI to improve audio chat on your end, canceling out echo, reducing reverberation or other unwanted effects, and suppressing any intrusive background noises. In short, it helps you to be heard better, and your voice to be clearer.

The catch is that apps need to use Communications Signal Processing Mode to have the benefit of this feature, which is unsurprisingly what Microsoft’s own Phone Link app uses. WhatsApp is another example, plus some PC games will be good to go with this tech, so you can shout at your teammates and be crystal clear when doing so.

Voice Clarity is on by default – after all, there’s no real downside here, save for using a bit of CPU juice – but you can turn it off if you want.

Another smart addition here is a hook-up between your Android phone and Windows 11 PC for editing photos. Whenever you take a photo on your smartphone, it’ll be available on the desktop PC straight away (you’ll get a notification), and you can edit it in the Snipping Tool (rather than struggling to deal with the image on your handset).

For the full list of changes in build 26040, see Microsoft’s blog post, but another of the bigger introductions worth highlighting here is that the Windows 11 setup experience has been given a long overdue lick of paint.

Windows 11 Setup (Image credit: Microsoft)

Analysis: Setting the scene

It’s about time Windows setup got some attention, as it has had the same basic look for a long time now. It’d be nice for the modernization to get a touch more sparkle, we reckon, though the improvement is a good one, and it’s not exactly a crucial part of the interface (given that you don’t see it after you’ve installed the operating system, anyway).

We have already seen the capability for Android phone photos to be piped to the Snipping Tool appear in the Dev channel last week, but it’s good to see a broader rollout to Canary testers. It is only rolling out, though, so bear in mind that you might not see it yet if you’re a denizen of the Canary channel.

As for Voice Clarity, clearly that’s a welcome touch of AI for all Windows 11 users. Whether you’re chatting to your family to catch up at the weekend, or you work remotely and use your Windows 11 PC for meetings, being able to be heard better by the person (or people) on the other end of the call is obviously a good thing.
