OpenAI’s new voice synthesizer can copy your voice from just 15 seconds of audio

OpenAI has been rapidly developing its ChatGPT generative AI chatbot and Sora AI video creator over the last year, and it's now got a new artificial intelligence tool to show off: Voice Engine, which can create synthetic voices from just 15 seconds of audio.

In a blog post (via The Verge), OpenAI says it's been running “a small-scale preview” of Voice Engine, which has been in development since late 2022. It's actually already being used in the Read Aloud feature in the ChatGPT app, which (as the name suggests) reads out answers to you.

Once you've trained the voice from a 15-second sample, you can then get it to read out any text you like, in an “emotive and realistic” way. OpenAI says it could be used for educational purposes, for translating podcasts into new languages, for reaching remote communities, and for supporting people who are non-verbal.

This isn't something everyone can use right now, but you can go and listen to the samples created by Voice Engine. The clips OpenAI has published sound pretty impressive, though there is a slight robotic and stilted edge to them.

Safety first

Voice Engine is already used in ChatGPT’s Read Aloud feature (Image credit: OpenAI)

Worries about misuse are the main reason Voice Engine is only in a limited preview for now: OpenAI says it wants to do more research into how it can protect tools like this from being used to spread misinformation and copy voices without consent.

“We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities,” says OpenAI. “Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

With major elections due in both the US and UK this year, and generative AI tools getting more advanced all the time, it's a concern across every type of AI content – audio, text, and video – and it's getting increasingly difficult to know what to trust.

As OpenAI itself points out, this has the potential to cause problems with voice authentication measures, and scams where you might not know who you're talking to over the phone, or who's left you a voicemail. These aren't easy issues to solve – but we're going to have to find ways to deal with them.

You might also like

TechRadar – All the latest technology news

Read More

You can now finally use Sonos Voice Control with Spotify tunes – here’s how

Audio brand Sonos is expanding the reach of its Voice Control software to now support Spotify, allowing users to verbally command the streaming service.

All you have to do to control the app, according to the company, is to say the words “Hey Sonos, play Spotify” into one of the brand’s supported speakers. That’s literally it. Doing so will play one of your Spotify playlists at random. If you want the software to play a specific playlist, you’ll have to mention it by name, like “Discover Weekly,” for example. The feature does extend beyond just being a glorified play button, as it can also be used to help manage your library. 

You can instruct Sonos to add certain songs to a playlist. It can also like or dislike tracks for you with the right command. Telling it “Hey Sonos, I like this song” will make the AI save that piece of music into your account’s 'Liked Songs.' 

Additionally, Voice Control can play specific genres or subgenres of music, be it jazz or classic alternative from the 1990s. You don’t have to be super specific; Spotify’s algorithm has a good understanding of what people are looking for.

Security and availability

It’s worth mentioning that commands are processed locally on your Sonos speaker to ensure “fast response times and easy follow-ups”. The company also states that no audio – be it from your voice or the surrounding environment – will be saved on any cloud server or listened to by a third party.

Now, there are two ways to connect a Sonos speaker to Spotify. You can either manually set Spotify as the default source or let it be the music service played most often through the speaker. Users won’t have to log in or make any changes to the settings.

It’s unknown if Voice Control will learn your listening habits. That is, if a Sonos device notices you frequently access Spotify, will it automatically adjust music sources? 

Spotify’s new support on Sonos Voice Control is available right now to both Premium subscribers and free users. Simply download the latest patch on your devices.

While we have you, check out TechRadar's roundup of the best soundbars for 2024. Spoiler alert: Sonos makes an appearance on the list.

Google has fixed an annoying Gemini voice assistant problem – and more upgrades are coming soon

Last week, Google rebranded its Bard AI bot as Gemini (matching the name of the model it runs on), and pushed out an Android app in the US; and while the new app has brought a few frustrations with it, Google is now busy trying to fix the major ones.

You can, if you want, use Google Gemini as a replacement for Google Assistant on your Android phone – and Google has made this possible even though Gemini lacks a lot of the basic digital assistant features that users have come to rely on.

One problem has now been fixed: originally, when chatting to Gemini using your voice, you had to manually tap on the 'send' arrow to submit your command or question – when you're trying to keep up a conversation with your phone, that really slows everything down.

As per 9to5Google, that's no longer the case, and Google Gemini will now realize that you've stopped talking (and respond accordingly) in the same way that Google Assistant always has. It makes the app a lot more intuitive to use.

Updates on the way

What's more, Google Gemini team member Jack Krawczyk has posted a list of features that engineers are currently working on, including some pretty basic functionality such as the ability to interact with your Google Calendar and reminders.

A coding interpreter is apparently also on the roadmap, which means Gemini would not just be able to produce programming code, but also to emulate how it would run – all within the same app. Additionally, the Google Gemini team is working to remove some of the “preachy guardrails” that the AI bot currently has.

The “top priority” is apparently reducing refusals, where Gemini declines to complete a task or answer a question. We've seen Reddit posts that suggest the AI bot will sometimes apologetically report that it can't help with a particular prompt – something that's clearly on Google's radar in terms of rolling fixes out.

Krawczyk says the Android app is coming to more countries in the coming days and weeks, and will be available in Europe “ASAP” – and he's also encouraging users to keep the feedback to the Google team coming.

Windows 11’s AI-powered Voice Clarity feature improves your video chats, plus setup has a new look (finally)

Windows 11 has a new preview build out that improves audio quality for your video chats and more besides.

Windows 11 preview build 26040 has been released in the Canary channel (the earliest test builds) complete with the Voice Clarity feature which was previously exclusive to owners of Surface devices.

Voice Clarity leverages AI to improve audio chat on your end, canceling out echo, reducing reverberation or other unwanted effects, and suppressing any intrusive background noises. In short, it helps you to be heard better, and your voice to be clearer.

The catch is that apps need to use Communications Signal Processing Mode to have the benefit of this feature, which is unsurprisingly what Microsoft’s own Phone Link app uses. WhatsApp is another example, plus some PC games will be good to go with this tech, so you can shout at your teammates and be crystal clear when doing so.

Voice Clarity is on by default – after all, there’s no real downside here, save for using a bit of CPU juice – but you can turn it off if you want.

Another smart addition here is a hook-up between your Android phone and Windows 11 PC for editing photos. Whenever you take a photo on your smartphone, it’ll be available on the desktop PC straight away (you’ll get a notification), and you can edit it in the Snipping Tool (rather than struggling to deal with the image on your handset).

For the full list of changes in build 26040, see Microsoft’s blog post, but another of the bigger introductions worth highlighting here is that the Windows 11 setup experience has been given a long overdue lick of paint.

Analysis: Setting the scene

It’s about time Windows setup got some attention, as it has had the same basic look for a long time now. It’d be nice for the modernization to get a touch more sparkle, we reckon, though the improvement is a good one, and it’s not exactly a crucial part of the interface (given that you don’t see it after you’ve installed the operating system, anyway).

We have already seen the capability for Android phone photos to be piped to the Snipping Tool appear in the Dev channel last week, but it’s good to see a broader rollout to Canary testers. It is only rolling out, though, so bear in mind that you might not see it yet if you’re a denizen of the Canary channel.

As for Voice Clarity, clearly that’s a welcome touch of AI for all Windows 11 users. Whether you’re chatting to your family to catch up at the weekend, or you work remotely and use your Windows 11 PC for meetings, being able to be heard better by the person (or people) on the other end of the call is obviously a good thing.

ChatGPT steps up its plan to become your default voice assistant on Android

A recent ChatGPT beta is giving a select group of users the ability to turn the AI into their device’s new default voice assistant on Android.

This information comes from industry insider Mishaal Rahman on X (the platform formerly known as Twitter), who posted a video of himself trying out the feature live. According to the post, users can add a shortcut to ChatGPT Assistant, as it’s referred to, directly into an Android device’s Quick Settings panel. Tapping the ChatGPT entry there causes a new UI overlay to appear on-screen, consisting of a plain white circle near the bottom of the display. From there, you verbally give it a prompt, and after several seconds, the assistant responds with an answer.

The clip shows it does take the AI some time to come up with a response – about 15 seconds. Throughout this time, the white circle will display a bubbling animation to indicate it’s generating a reply. When talking back, the animation turns more cloud-like. You can also interrupt ChatGPT at any time just by tapping the screen. Doing so causes the circle to turn black.

Setting up

The full onboarding process of the feature is unknown, although 9to5Google claims in its report that you will need to pick a voice when you launch it for the first time. If you like what you hear, you can stick with a particular voice or go back a step to swap it for another. Previews of each voice can be found on OpenAI’s website too. There are three male and two female voices. Once all that is settled, the assistant will launch as normal with the white circle near the bottom.

To try out this update, you will need a subscription to ChatGPT Plus, which costs $20 a month. Next, install either ChatGPT for Android version 1.2024.017 or .018, whichever is available to you. Go to the Beta Features section in ChatGPT’s Settings menu and it should be there, ready to be activated. As stated earlier, only a select group of people will gain access, so it's not guaranteed.

Future default

Apparently, the assistant is present on earlier builds. 9to5Google states the patch is available on ChatGPT beta version 1.2024.010 with limited functionality. It claims the patch introduces the Quick Settings tile, but not the revamped UI.

Rahman in his post says no one can set ChatGPT as their default assistant at the moment. However, lines of code found in a ChatGPT patch from early January suggest this will be possible in the future. We reached out to OpenAI asking if there are plans to expand the beta’s availability. This story will be updated at a later time.

Be sure to check out TechRadar's list of the best ChatGPT extensions for Chrome that everyone should use. There are four in total.

Windows 11 is getting a voice-powered ability many users have been longing for, as Microsoft kills off Windows Speech Recognition for the far superior Voice Access tech

Windows 11 has a new preview build which further improves Voice Access, an area Microsoft has been putting a lot of effort into of late.

Preview build 22635.2915 (KB5033456) has just been rolled out to the Beta channel, and one of the additions is the ability to make customized voice shortcuts.

Using this feature, you can specify a trigger phrase for the command, and then the command itself.

Microsoft gives an example of an ‘insert work address’ command which, when spoken, automatically pastes in the specified address of your workplace. Any time you need to put that address into a document you’re working on, you just say the command – which is quite the timesaver.
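To make the idea concrete, here is a minimal Python sketch of how a trigger phrase might map to an action. It is purely illustrative and is not Microsoft's implementation: the `ShortcutRegistry` class, its methods, and the placeholder address are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class VoiceShortcut:
    trigger: str               # phrase the user speaks (hypothetical model)
    action: Callable[[], str]  # what runs when the phrase is recognized


class ShortcutRegistry:
    """Hypothetical registry mapping spoken phrases to actions."""

    def __init__(self) -> None:
        self._shortcuts: Dict[str, VoiceShortcut] = {}

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Ignore case and extra whitespace so small variations still match.
        return " ".join(phrase.lower().split())

    def register(self, trigger: str, action: Callable[[], str]) -> None:
        self._shortcuts[self._normalize(trigger)] = VoiceShortcut(trigger, action)

    def handle(self, spoken: str) -> Optional[str]:
        # Return the action's result, or None if no shortcut matches.
        shortcut = self._shortcuts.get(self._normalize(spoken))
        return shortcut.action() if shortcut else None


registry = ShortcutRegistry()
# Placeholder address, assumed for illustration only.
registry.register("insert work address", lambda: "1 Example Street, Example City")
```

Calling `registry.handle("Insert work address")` returns the stored text, while an unregistered phrase returns `None`.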

Language support for Voice Access has also been extended, and now the following are included (on top of the existing languages): French (France), French (Canada), German, Spanish (Spain) and Spanish (Mexico).

Finally for voice features, multiple monitors are now supported, meaning that when you summon a grid overlay – for directing mouse clicks to certain areas of the desktop – you can do so on any of the screens connected to your PC. (Before now, the grid overlay could only be used on the primary display.)

You can switch your focus to another monitor simply by using a letter (A, B, C and so on) or its phonetic equivalent (Alpha, Bravo, etc).
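As a rough illustration of that letter-or-phonetic scheme, the Python sketch below resolves a spoken identifier to a zero-based monitor index. The function name and the six-monitor limit are assumptions made for this example, not details from Microsoft.

```python
from typing import Optional

# First six entries of the phonetic alphabet; the cap of six monitors
# is an arbitrary assumption for this sketch.
PHONETIC = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def monitor_index(spoken: str) -> Optional[int]:
    """Map 'A'/'Alpha', 'B'/'Bravo', ... to a zero-based monitor index."""
    word = spoken.strip().lower()
    # Single letters: 'a' -> 0, 'b' -> 1, and so on.
    if len(word) == 1 and "a" <= word <= "f":
        return ord(word) - ord("a")
    # Phonetic equivalents: 'alpha' -> 0, 'bravo' -> 1, and so on.
    if word in PHONETIC:
        return PHONETIC.index(word)
    return None  # not a recognized monitor identifier
```

Here `monitor_index("B")` and `monitor_index("Bravo")` both resolve to the second display, and anything unrecognized yields `None`.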

Microsoft further notes that there’s a drag and drop feature to move files or shortcuts from one display to another.

Elsewhere in build 22635, screen casting in Windows 11 has been improved, with a help option now in the Cast flyout from Quick Settings. This can be clicked if you’re having trouble piping your desktop to another screen and want some troubleshooting advice.

Users are also getting the ability to rename their device for the Nearby Sharing feature to help identify it more easily.

For the full list of changes and fixes in this Beta build, peruse Microsoft’s blog post.


Analysis: Custom capers

This is some useful work with Voice Access, and those with multiple monitors who use the feature will no doubt be very pleased. Voice shortcuts are a powerful addition to the mix for voice functionality, too, and they come with a good range of options.

Beyond pasting a section of text, as in the example above, shortcuts can trigger tasks such as opening specified URLs in a browser, or opening a file or folder. You can combine multiple actions too, along with functions like mouse clicks or key presses. This is a feature we’ve been wanting for some time, so it’s great to see it arrive.

It’s also worth noting that Windows Speech Recognition has been removed from Windows 11 in this build, and when you open that old app, you’ll now get a message informing you of its deprecation, and recommending the far superior Voice Access capability instead.

We’re hoping that in the future, Voice Access is going to become an even more central part of the Windows 11 interface, and it seems a great candidate to be driven forward with AI – and maybe incorporated into Copilot.

WhatsApp launches self-destructing voice messages to Android and iOS

WhatsApp is officially giving users the ability to send out temporary voice messages to their contacts.

We say “officially” because this feature has actually been around for the past two months or so although it was in a beta state. People in the beta program were the only ones who had access at the time. Don’t worry about feeling like you missed out because the View Once messages, as they’re called, function exactly the same as before. Meta didn’t make any changes with the official release.

You start by holding down the record button, then swipe up to lock it. Recordings must be locked first in order to make the View Once icon (which is the number one inside the circle) appear in the bottom right-hand corner. Tap it once to activate it and a timer will be attached to the message. Hit Send and you’re done.

A few limitations

From there, the recipient has two weeks to listen to the recording. You’ll know they’ve listened when the little receipt marker appears below the message. If they ignore it the entire time, WhatsApp will automatically delete it. Do note you’ll be unable to save, share, or forward these self-destructing voice messages. 

It is possible to restore a recording from a backed up chat room, but only if it was never opened in the first place, according to a page on WhatsApp’s support website. If it was already heard, then you’re out of luck. Another one will have to be sent.

The update is currently rolling out globally to all WhatsApp users on Android and iOS devices. Be sure to keep an eye out for the patch when it arrives over the coming days. We reached out to Meta asking if there are plans to add the same feature to the desktop app. If you’re not aware, the company gave WhatsApp on desktop the ability to send self-destructing images and videos. Perhaps it’ll also receive support for temporary voice messages. This story will be updated at a later time.

While you wait, be sure to join TechRadar’s official WhatsApp channel to receive all our latest reviews and news stories right to your phone.

YouTube’s new AI tool will let you create your dream song with a famous singer’s voice

YouTube is testing a pair of experimental AI tools giving users a way to create short songs either via a text prompt or their own vocal sample.

The first one is called Dream Track, a feature harnessing the voices of a group of nine mainstream artists to generate 30-second music tracks for YouTube Shorts. The way it works is you enter a text prompt into the engine describing what you want to hear and then select a singer appearing in the tool’s carousel. Participating musicians include John Legend, Sia, and T-Pain, all of whom gave their consent to be a part of the program. Back in late October, a Bloomberg report made the rounds stating YouTube was working on AI tech allowing content creators “to produce songs using the voices of famous singers”, but couldn’t launch it due to the ongoing negotiations with record labels. Dream Track appears to be that self-same AI.

For the initial rollout, Dream Track will be available to a small group of American content creators on mobile devices. No word on if and when it’ll see a wider release or desktop version. 

The announcement post has a couple of videos demonstrating the feature. One of them simulates a user asking the AI to create a song about “a sunny morning in Florida” using T-Pain’s voice. In our opinion, it does a pretty good job of emulating his style and coming up with lyrics on the fly, although the performance does sound like it’s been through an Auto-Tune filter.

Voices into music

The second experiment is called Music AI Tools which, as we alluded to earlier, can generate bite-sized tracks by transforming an uploaded vocal sample. For example, a short clip of you humming can turn into a guitar riff. It even works in reverse, as chords coming from a MIDI keyboard can be morphed into a choir.

An image on Google’s DeepMind website reveals what the user interface for the Music AI Tool desktop app may look like. At first, we figured the layout would be relatively simple like Dream Track; however, it is a lot more involved.

The interface resembles a music editing program with a timeline at the top highlighting the input alongside several editing tools. These presumably would allow users a way to tweak certain elements in a generated track. Perhaps a producer wants to tone down the distortion on a guitar riff or bump up the piano section.

Google says it is currently testing this feature with those in YouTube’s Music AI Incubator program, which is an exclusive group consisting of “artists, songwriters, and producers” from across the music industry. No word on when it’ll see a wide release.

Analysis: Treading new waters

YouTube is pitching this recent foray as a new way for creative users to express themselves; a new way to empower fledgling musicians who may lack important resources to grow. However, if you look at this from the artists’ perspective, the attitude is not so positive. The platform compiled a series of quotes from the group of nine singers regarding Dream Track. Several mention the inevitability of generative AI in music and the need to be a part of it, with a few stating they will remain cautious towards the tech.

We may be reading too much into this, but we get the vibe that some aren’t totally on board with this tech. To quote one of our earlier reports, musicians see generative AI as something “they’ll have to deal with or they risk getting left behind.” 

YouTube says it’s approaching the situation with the utmost respect, ensuring “the broader music community” benefits. Hopefully, the platform will maintain its integrity moving forward.

While we have you, be sure to check out TechRadar's list of the best free music-making software for 2023.

WhatsApp is upgrading its voice chat tool so it can host a lot more people

WhatsApp is upgrading the Voice Chat feature on mobile so users can now host large group calls with up to 128 participants. 

The platform has yet to make a formal announcement of the changes through its usual avenues although details can be found on its Help Center support website. On the surface, the tool’s functionality is pretty straightforward. You can start a group voice chat by going to a group chat, tapping the audio read-out icon in the upper right-hand corner, and selecting Start Voice Chat. The company states this is “only available on your primary device” and calls will automatically end the moment everyone leaves. Additionally, they instantly end after an hour if no one “joins the first or last person in the chat”. 

Silent calls

There is more to this update than what’s on the support page as other news reports reveal a much more robust feature. According to TechCrunch, Voice Chat for Larger Groups is “designed to be less disruptive” than a regular group call. Participants will not be rung when a call starts. Instead, they will “receive a push notification” with an in-chat bubble you have to tap in order to join. 

At the top of the screen is a series of controls where you can mute, unmute, or message other people in the group without having to leave. Of course, you can hang up any time you want using the same controls. Like with all forms of messaging on WhatsApp, the large voice chats will be end-to-end encrypted.

Availability

The Verge states the patch will be rolling out to the Android and iOS apps over the coming weeks; however, it’ll first be made available to bigger groups hosting 33 to 128 participants. It’s unknown why smaller chats will have to wait to receive the same feature. But as The Verge points out, it could be because the Group Voice Call tool already exists. Meta is seemingly prioritizing the larger chats before moving on to all users.

No word on whether WhatsApp has plans to expand this to its desktop app, although we did ask. This story will be updated at a later time.

With Black Friday around the corner, we expect a lot of discounts for major brands. If you want to see what’s out there, check out TechRadar’s roundup of the best Black Friday phone deals for 2023.

WhatsApp is testing a new self-destructing voice messages feature

WhatsApp is currently testing a View Once mode for voice messages as a “new layer of privacy” on the mobile app.

The feature functions similarly to the disappearing images and videos present on the platform. Meta is merely expanding it elsewhere. According to WABetaInfo, a new icon sporting the number one will appear in the chat bar while you record a voice note with the lock on. Tapping said icon enables the View Once mode (well, it's more like Listen Once), preventing recipients from exporting, forwarding, saving, or recording messages. Once the message is sent, you, the sender, cannot listen to it, nor can the other person play it again after the first time. It’s gone forever.

As WABetaInfo points out, this tool has the potential to effectively eliminate “the risk of your personal or sensitive information falling into the wrong hands.” Messages can’t be shared with people outside the initial chat room, greatly reducing the odds “of unauthorized access.”

This update is available for both Android and iOS. If you’re interested in trying it out yourself, Android users can join the Google Play Beta Program and install version 2.23.78 of the WhatsApp beta. iPhone owners can try to join the TestFlight program for WhatsApp. However, at the time of this writing it’s no longer accepting any more entrants, although it is possible a slot could open soon.

Going quiet

As for the future of WhatsApp, things will be getting a little quiet. None of the other beta features are as impactful or noteworthy as the self-destructing voice messages. Looking through WABetaInfo’s other posts, we saw that Meta is working on implementing avatar reactions plus a redesigned audio and video menu for iOS. Nothing really ground-breaking.

It’s not surprising the platform is going silent at the moment as 2023 has been quite the year for WhatsApp. It’s seen multiple major updates these past 10 months or so from several quality-of-life changes to eight-person video calls on the Windows desktop app. And recently, the company began testing an AI-powered sticker generator for chats. Perhaps Meta is keeping its projects under wraps so it can kick off 2024 in a big way.

While we have you, be sure to follow TechRadar’s official WhatsApp channel. We post our latest reviews and news stories daily on there. 
