ChatGPT takes the mic as OpenAI unveils the Read Aloud feature for your listening pleasure

OpenAI has been hard at work lately, continuing to improve the GPT store and recently sharing demonstrations of one of the other highly sophisticated models in its pipeline, the video-generation tool Sora. Even so, it isn't resting on ChatGPT's previous success: the company is now giving the impressive AI chatbot the ability to read its responses out loud. The feature is being rolled out on both the web and mobile versions of the chatbot.

The new feature is called 'Read Aloud', as per an official X (formerly Twitter) post from the generative artificial intelligence (AI) company. It will come in useful for many users, including those with accessibility needs and people using the chatbot on the go.

Users can try it for themselves now, according to The Verge, on either the web version of ChatGPT or the mobile apps (iOS and Android), and they'll be able to choose from five different voices for ChatGPT to use. The feature is available whether you use the free version open to all users, GPT-3.5, or the premium paid version, GPT-4. As for languages, the Read Aloud feature supports 37 languages (for now), and ChatGPT will be able to autodetect the language the conversation is happening in.

If you want to try it on the desktop version of ChatGPT, there should be a speaker icon below the generated text that activates the feature. On the mobile apps, users can tap and hold the text to open the Read Aloud player. In the player, users can play, pause, and rewind the reading of ChatGPT's response. Bear in mind that the feature is still being rolled out, so not every user in every region will have access just yet.

A step in the right direction for ChatGPT

This isn't the first voice-related feature ChatGPT has received: OpenAI introduced a voice chat feature in September 2023, which allowed users to make inquiries using voice input instead of typing. Users can keep this setting on, prompting ChatGPT to always respond out loud to their inputs.

The debut of this feature comes at an interesting time, as Anthropic recently introduced similar features to its own generative AI models, including Claude. Anthropic is an OpenAI competitor that’s recently seen major amounts of investment from Amazon. 

Overall, this new feature is great news in my eyes (or ears), primarily for expanding accessibility to ChatGPT, but also because I've had a Read-Aloud plugin for ChatGPT in my browser for a while now. I find it interesting to listen to and analyze ChatGPT's responses out loud, especially as I'm researching and writing. After all, its responses are designed to be as human-like as possible, and a big part of how we process real-life human communication is by speaking and listening to each other.

Giving ChatGPT a capability like this can help users think about how well it is responding, as it makes use of another of our primary ways of receiving verbal information. Beyond the obvious accessibility benefits for blind or partially-sighted users, I think this is a solid move by OpenAI in cementing ChatGPT as the go-to generative AI tool, opening up another avenue for humans to connect to it.


TechRadar – All the latest technology news


Microsoft’s Copilot AI can now read your files directly, but it’s not the privacy nightmare it sounds like

Microsoft has begun rolling out a new feature for its Copilot AI assistant in Windows that will allow the bot to directly read files on your PC, then provide a summary, locate specific data, or search the internet for additional information. 

Copilot has already been aggressively integrated into Microsoft 365 and Windows 11 as a whole, and this latest feature sounds – at least on paper – like a serious privacy issue. After all, who would want an AI peeking at all their files and uploading that information directly to Microsoft?

Well, fortunately, Copilot isn’t just going to be snooping around at random. As spotted by @Leopeva64 on X (formerly Twitter), you have to manually drag and drop the file into the Copilot chat box (or select the ‘Add a file’ option). Once the file is in place, you can proceed to make a request of the AI; the suggestion provided by Leopeva64 is simply ‘summarize’, which Copilot proceeds to do.

Another step towards Copilot being genuinely useful

I’ll admit it, I’m a Copilot critic. Perhaps it’s just because I’m a jaded career journalist with a lifetime of tech know-how and a neurodivergent tilt towards unhealthy perfectionism, but I’ve never seen the value of an AI assistant built into my operating system of choice; however, this is the sort of Copilot feature I actually might use.

The option to summarize alone seems quite useful: more than once, I’ve been handed a chunky PDF with embargoed details about a new tech product, and it would be rather nice not to have to sift through pages and pages of dense legalese and tech jargon just to find the scraps of information that are actually relevant to TechRadar’s readership. Summarizing documents is already something that ChatGPT and Adobe Acrobat AI can do, so it makes sense for Copilot – an AI tool that's specifically positioned as an on-system helper – to be able to do it.

While I personally prefer to be the master of my own Googling, I can see the web-search capabilities being very helpful to a lot of users, too. If you’ve got a file containing partial information, asking Copilot to ‘fill in the blanks’ could save you a lot of time. Copilot appears capable of reading a variety of different file types, from simple text documents to PDFs and spreadsheets. Given the flexible nature of modern AI chatbots, there are potentially many different things you could ask Copilot to do with your files – though apparently, it isn’t able to scan files for viruses (at least, not yet).

If you’re keen to get your hands on this feature yourself, you hopefully won’t have to wait long. While it doesn’t seem to be widely available just yet, Leopeva64 notes that it appears Copilot’s latest new skill “is being rolled out gradually”, so it’ll likely start showing up for more Windows 11 users as time goes on.

The Edge version of Copilot will apparently be getting this feature too, as Leopeva64 points out that it's currently available in the experimental Canary build of the browser – if you want to check that out, you just have to sign up for the Edge Insider Program.

Forget ChatGPT – NExT-GPT can read and generate audio and video prompts, taking generative AI to the next level

2023 has felt like a year dedicated to artificial intelligence and its ever-expanding capabilities, but the era of pure text output is already losing steam. The AI scene might be dominated by giants like ChatGPT and Google Bard, but a new large language model (LLM), NExT-GPT, is here to shake things up – offering the full bounty of text, image, audio, and video output. 

NExT-GPT is the brainchild of researchers from the National University of Singapore and Tsinghua University. Pitched as an 'any-to-any' system, NExT-GPT can accept inputs in different formats and deliver responses in the desired output format: video, audio, image, or text. This means you can put in a text prompt and have NExT-GPT process it into a video, or give it an image and have that converted to an audio output.

OpenAI has only just announced that ChatGPT can 'see, hear and speak' – which is similar to what NExT-GPT is offering – but ChatGPT is going for a more mobile-friendly version of this kind of feature, and has yet to introduce video capabilities.

We’ve seen a lot of ChatGPT alternatives and rivals pop up over the past year, but NExT-GPT is one of the few LLMs we’ve seen so far that can match the text-based output of ChatGPT but also provide outputs beyond what OpenAI’s popular chatbot can currently do. You can head over to the GitHub page or the demo page to try it out for yourself. 

So, what is it like?

I've fiddled around with NExT-GPT on the demo site, and I have to say I'm impressed, but not blown away. Of course, this isn't a polished product with the benefit of public feedback, multiple updates, and so on – but it's still very good.

I asked it to turn a photo of my cat Miso into an image of him as a librarian, and I was pretty happy with the result. It may not be at the same level of quality as established image generators like Midjourney or Stable Diffusion, but it was still an undeniably very cute picture.

[Image: a cat in a library wearing glasses – probably one of the least cursed images I've personally generated using AI. (Image credit: Future via NExT-GPT)]

I also tested the video and audio features, but those didn't go quite as well as the image generation. The videos it produced weren't awful, but they had the very obvious 'made by AI' look that plagues a lot of generated images and videos, with everything appearing a little distorted and wonky. It was uncanny.

Overall, there's a lot of potential for this LLM to fill the audio and video gaps left by big AI names like OpenAI and Google. I hope that as NExT-GPT gets better and better, we'll see higher-quality outputs – and be able to make some excellent home movies starring our cats in no time.
