Google’s impressive Lumiere shows us the future of making short-form AI videos

Google is taking another crack at text-to-video generation with Lumiere, a new AI model capable of creating surprisingly high-quality content. 

The tech giant has certainly come a long way from the days of Imagen Video. Subjects in Lumiere videos are no longer these nightmarish creatures with melting faces. Now things look much more realistic. Sea turtles look like sea turtles, fur on animals has the right texture, and people in AI clips have genuine smiles (for the most part). What’s more, there's very little of the weird jerky movement seen in other text-to-video generative AIs. Motion is largely smooth as butter. Inbar Mosseri, Research Team Lead at Google Research, published a video on her YouTube channel demonstrating Lumiere’s capabilities. 

Google put a lot of work into making Lumiere’s content appear as lifelike as possible. The dev team accomplished this by implementing something called Space-Time U-Net architecture (STUNet). The technology behind STUNet is pretty complex, but as Ars Technica explains, it allows Lumiere to understand where objects are in a video and how they move and change, and to render those actions all at once, resulting in a smooth-flowing creation.

This runs contrary to other generative platforms that first establish keyframes in clips and then fill in the gaps afterward. Doing so results in the jerky movement the tech is known for.
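Google hasn’t released code for Lumiere, but the contrast between the two pipelines can be sketched in toy form. The snippet below is purely illustrative (the function names, and the frame-averaging stand-in for a learned U-Net, are our own inventions, not Lumiere’s actual implementation): the first function builds a clip by interpolating between sparse keyframes after the fact, while the second operates on the whole clip as one spatio-temporal volume.

```python
import numpy as np

def keyframe_then_interpolate(keyframes: np.ndarray, factor: int) -> np.ndarray:
    """Classic pipeline: generate sparse keyframes first, then fill the gaps
    afterward (here with simple linear interpolation)."""
    frames = []
    for a, b in zip(keyframes[:-1], keyframes[1:]):
        for t in np.linspace(0.0, 1.0, factor, endpoint=False):
            frames.append((1 - t) * a + t * b)  # each in-between frame is a guess
    frames.append(keyframes[-1])
    return np.stack(frames)

def space_time_generate(volume: np.ndarray) -> np.ndarray:
    """Space-time style: process the whole (time, height, width) volume at
    once. A 3-frame temporal average stands in for a learned U-Net pass."""
    padded = np.concatenate([volume[:1], volume, volume[-1:]], axis=0)
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
```

In the keyframe approach, every in-between frame is guessed without reference to the rest of the clip, which is where the characteristic jerkiness comes from; the space-time approach decides motion jointly across all frames.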

Well equipped

In addition to text-to-video generation, Lumiere has numerous features in its toolkit including support for multimodality. 

Users will be able to upload source images or videos to the AI so it can edit them according to their specifications. For example, you can upload an image of Girl with a Pearl Earring by Johannes Vermeer and turn it into a short clip where she smiles instead of blankly staring. Lumiere also has an ability called Cinemagraph which can animate highlighted portions of pictures.

Google demonstrates this by selecting a butterfly sitting on a flower. Thanks to the AI, the output video has the butterfly flapping its wings while the flowers around it remain stationary. 

Things become particularly impressive when it comes to video. Video Inpainting, another feature, functions similarly to Cinemagraph in that the AI can edit portions of clips. A woman’s patterned green dress can be turned into shiny gold or black. Lumiere goes one step further by offering Video Stylization for altering video subjects. A regular car driving down the road can be turned into a vehicle made entirely out of wood or Lego bricks.

Still in the works

It’s unknown if there are plans to launch Lumiere to the public or if Google intends to implement it as a new service. 

We could perhaps see the AI show up on a future Pixel phone as the evolution of Magic Editor. If you’re not familiar with it, Magic Editor uses AI processing to intelligently change spaces or objects in photographs on the Pixel 8. Video Inpainting, to us, seems like a natural progression for the tech.

For now, it looks like the team is going to keep it behind closed doors. As impressive as this AI may be, it still has its issues. Jerky animations are present. In other cases, subjects have limbs warping into mush. If you want to know more, Google’s research paper on Lumiere can be found on Cornell University’s arXiv website. Be warned: it's a dense read.

And be sure to check out TechRadar's roundup of the best AI art generators for 2024.

You might also like

TechRadar – All the latest technology news

Read More

ChatGPT will get video-creation powers in a future version – and the internet isn’t ready for it

The web's video misinformation problem is set to get a lot worse before it gets better, with OpenAI CEO Sam Altman going on the record to say that video-creation capabilities are coming to ChatGPT within the next year or two.

Speaking to Bill Gates on the Unconfuse Me podcast (via Tom's Guide), Altman pointed to multimodality – the ability to work across text, images, audio and “eventually video” – as a key upgrade for ChatGPT and its models over the next two years.

While the OpenAI boss didn't go into too much detail about how this is going to work or what it would look like, it will no doubt work along similar lines to the image-creation capabilities that ChatGPT (via DALL-E) already offers: just type a few lines as a prompt, and you get back an AI-generated picture based on that description.

Once we get to the stage where you can ask for any kind of video you like, featuring any subject or topic you like, we can expect to see a flood of deepfake videos hit the web – some made for fun and for creative purposes, but many intended to spread misinformation and to scam those who view them.

The rise of the deepfakes

Deepfake videos are already a problem of course – with AI-generated videos of UK Prime Minister Rishi Sunak popping up on Facebook just this week – but it looks as though the problem is about to get significantly worse.

Adding video-creation capabilities to a widely accessible and simple-to-use tool like ChatGPT will mean it gets easier than ever to churn out fake video content, and that's a major worry when it comes to separating fact from fiction.

The US will be going to the polls later this year, and a general election in the UK is also likely to happen at some point in 2024. With deepfake videos purporting to show politicians saying something they never actually said already circulating, there's a real danger of false information spreading online very quickly.

With AI-generated content becoming more and more difficult to spot, the best way of knowing who, and what, to trust is to stick to well-known and reputable publications online for your news sources – so not something that's been reposted by a family member on Facebook, or pasted from an unknown source on the platform formerly known as Twitter.


Google’s Instrument Playground offers a taste of an AI-generated musical future

Google has opened the gates to its latest experimental AI called Instrument Playground which allows people to generate 20-second music tracks with a single text prompt.

If that description sounds familiar, that’s because other companies have built something similar, such as Meta with MusicGen. Google’s version adds two unique twists. First, it’s claimed to be capable of emulating over 100 instruments from around the world, ranging from common ones like the piano to more obscure woodwinds like the dizi from China.

Secondly, the company states you can “add an adjective” to the prompt to give it a certain mood. For example, putting in the word “Happy” will have Instrument Playground generate an upbeat track while “Merry” will create something more Christmassy. It’s even possible to implement sound effects by choosing one of three modes: Ambient, Beat, and Pitch. For the musically inclined, you can activate Advanced mode to launch a sequencer where you can pull together up to four different AI instruments into one song.

Live demo

The Instrument Playground is publicly available so we decided to take it for a spin.

Upon going to the website, you’ll be asked what you want to play. If you’re having a hard time deciding, there is a link below the prompt that opens a list of 65 different instruments. We said we wanted an upbeat electric guitar, and to our surprise, the AI added backup vocals to the riff – sort of. Most of the lyrics are incomprehensible gibberish although Chrome’s Live Caption apparently picked up the word “Satan” in there.

The generated song plays once (although you can replay it at any time by clicking the Help icon). Afterward, you can use the on-screen keyboard to work on the track. It’s not very expansive as users will only be given access to 12 keys centered around the C Major and C Minor scales. What you see on the page is directly tied to the numbers on a computer keyboard so you can use those instead of having to slowly click each one with a mouse.

Instrument Playground example

(Image credit: Future)

You can use the three modes mentioned earlier to manipulate the file. Ambient lets you alter the track as a whole, Beat highlights what the AI considers to be the “most interesting peaks”, and Pitch can alter the length of a select portion. Users can even shift the octave higher or lower. Be aware the editing tools are pretty rudimentary. This isn’t GarageBand.

Upon finishing, you can record an audio snippet which you can then download as a .wav file to your computer. 

In the works

If you’re interested in trying out Instrument Playground, keep in mind this is an experimental technology that is far from perfect. We’re not musicians, but even we could tell there were several errors in the generated music. Our drum sample had a piano playing in the back and the xylophone sounded like someone hitting a bunch of PVC pipes. 

We reached out to Google with several questions, such as when the AI will support 100 instruments (as a reminder, it’s only at 65 at the time of this writing) and what the company intends to do with it. Right now, Instrument Playground feels like little more than a digital toy, only capable of creating simple beats. It'd be great to see it do more. This story will be updated at a later time.

While we have you, be sure to check out TechRadar's list of the best free music-making software in 2023


Amazon says you might have to pay for Alexa’s AI features in the future

Amazon might be mulling a subscription charge for Alexa’s AI features at some point down the road – though that may not be for some time yet, by the sound of things.

This nugget of info emerged from a Bloomberg interview with Dave Limp, who is SVP of Amazon Devices & Services currently, though he is leaving the company later this year. (Whispers on the grapevine are that Panos Panay, a big-hitting exec who just left Microsoft, will replace Limp).

Bloomberg’s Dave Lee broadly observed that the future of Alexa could involve a more sophisticated AI assistant, but one that device owners would need to pay a subscription fee to use.

This would be an avenue of monetization, given that the previous hope for spinning out some extra cash – having folks order more stuff online using Alexa, bolstering revenue that way – just hasn’t worked out for Amazon (not in any meaningful fashion, at least).

After Limp talked about Amazon pushing forward using generative AI to build out Alexa’s features, Lee fired out a question about whether there’ll come a time when those Alexa AI capabilities won’t be free – and are offered via a subscription instead.

Limp replied in no uncertain terms: “Yes, we absolutely think that,” noting the costs of training the AI model (properly), and then adding: “But before we would start charging customers for this – and I believe we will – it has to be remarkable.”


Analysis: Superhuman assistance?

Dave Limp (above) is currently SVP of Amazon Devices & Services, but is leaving the company later this year. (Image credit: Future / Lance Ulanoff)

So, there’s your weighty caveat. Limp makes it clear, in fact, that expectations would be built around the realization of a ‘superhuman’ assistant if Amazon was to charge for Alexa’s AI chops as outlined.

Limp clarifies that Alexa, as it is now, almost certainly won’t be charged for, and that the contemporary Alexa will remain free. He also suggested that Amazon has no idea of a pricing scheme yet for any future AI-powered Alexa that is super-smart.

This means the paid-for Alexa AI skills we’re talking about would be highly prized and a long way down the development road for Amazon’s assistant. This isn’t anything that will happen soon, but it is, nonetheless, a clear enough signal that this path of monetization is one Amazon is fully considering traveling down. Eventually.

As to exactly what timeframe we might be talking about, Limp couldn’t be drawn to commit beyond it not being “decades” or “years” away, with the latter perhaps hinting that maybe this could happen sooner than we may imagine.

We think it’ll be a difficult sell for Amazon in the nearer-term, though. Especially as plans are currently being pushed through to shove adverts into Prime Video early next year, and you’ll have to pay to avoid watching those ads. (As a subscriber to Prime, even though you’re paying for the video streaming service – and other benefits – you’ll still get adverts unless you stump up an extra fee).

If Amazon is seen to be watering down the value proposition of its services too much, or trying to force a burden of monetization in too many different ways, that’ll always run the risk of provoking a negative reaction from customers. In short, if the future of a super-sophisticated Alexa is indeed paying for AI skills, we’re betting this won’t be anytime soon – and the results better be darn impressive.

We must admit, we have trouble visualizing the latter, too, especially when as it currently stands, we can’t get Alexa to understand half the internet radio stations we want to listen to, a pretty basic duty for the assistant.

Okay, so Amazon did have some interesting new stuff to show off with Alexa’s development last week, but we remain skeptical on how that’ll pan out in the real-world, and obviously more so on how this new ‘superhuman’ assistant will be in the further future. In other words, we’ll keep heaping the salt on for the time being…


Step into the future of AR and VR technology

As the world embraces rapid technological advancements, augmented and virtual reality (AR and VR) have emerged as transformative tools with the potential to revolutionize industries and enhance human experiences. 

Whether for art, education, healthcare, entertainment or engineering, AR and VR will play a foundational role in the next phase of technology. That’s why across the world, initiatives are launching that aim to incubate and nurture innovative ideas in transformative technology.

One such program is the Creative Solutions program by the King Abdulaziz Center for World Culture (Ithra) in Saudi Arabia. From this, two innovative projects – MemoARable and Virtually There – have arisen, aiming to leverage AR and VR to add value to Saudi Arabia’s cultural and creative industries (CCI).

New horizons

MemoARable, spearheaded by Maryam Alfadhli and Lina Alismail, seeks to reimagine the customer-store relationship through an AR-powered app. 

By transforming memories into personalized gifts incorporating images, messages, and voice notes, MemoARable transcends traditional marketing strategies, opening doors for immersive ticketing and gift card possibilities, and expanding its application beyond initial expectations.

On the other hand, Maram Alghamdi and Ali AlEid's Virtually There aims to revolutionize the tourism industry by offering users full 3D, 360-degree access to Saudi Arabia's top destinations. Kicking off with AlUla, this immersive experience takes audiences on a journey through iconic tourist attractions. 

The roadmap also includes virtual visits to Riyadh, Jeddah, and a pilgrimage-focused tour of Makkah and Madinah, creating an exciting blend of culture and technology.

The prototypes of these projects were presented to a team of international tech experts, including inventor and tech consultant Simon Benson, as part of the Creative Solutions program. 

This program empowers digital content creation in immersive technologies and grants each of the five selected projects financial support of up to $20,000 to bring their ideas to fruition.

Room to grow

This year marks the third edition of the Creative Solutions showcase, welcoming participants to pitch their ideas once again. Successful applicants will develop their prototypes from September to December before presenting them to investors and the public in Q4.

The Creative Solutions program goes beyond mere financial support, as participants embark on a transformative journey featuring technical, creative, and entrepreneurial training and mentorship. 

Their prototypes are showcased in events attended by potential collaborators, incubators, accelerators, and other stakeholders, further promoting innovation and collaboration in the immersive tech space.

As the world embraces immersive technologies, projects like these will pave the way to unleashing the limitless potential of AR and VR. With MemoARable and Virtually There leading the way, the future is indeed bright for the intersection of creativity, technology, and human innovation.


ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future


Is ChatGPT old news already? It seems impossible, with the explosion of AI popularity seeping into every aspect of our lives – whether it’s digital masterpieces forged with the best AI art generators or helping us with our online shopping.

But despite being the leader in the AI arms race – and powering Microsoft’s Bing AI – it looks like ChatGPT might be losing momentum. According to SimilarWeb, traffic to OpenAI’s ChatGPT site dropped by almost 10% compared to last month, while metrics from Sensor Tower also demonstrated that downloads of the iOS app are in decline too.

As reported by Insider, paying users of the more powerful GPT-4 model (access to which is included in ChatGPT Plus) have been complaining on social media and OpenAI’s own forums about a dip in output quality from the chatbot.

The general consensus was that GPT-4 could generate outputs faster, but at a lower level of quality. Peter Yang, a product lead for Roblox, took to Twitter to decry the bot’s recent work, claiming that “the quality seems worse”. One forum user said the recent GPT-4 experience felt “like driving a Ferrari for a month then suddenly it turns into a beaten up old pickup”.

Why is GPT-4 suddenly struggling?

Some users were even harsher, calling the bot “dumber” and “lazier” than before, with a lengthy thread on OpenAI’s forums filled with all manner of complaints. One user, ‘bitbytebit’, described it as “totally horrible now” and “braindead vs. before”.

According to users, there was a point a few weeks ago where GPT-4 became massively faster – but at a cost of performance. The AI community has speculated that this could be due to a shift in OpenAI’s design ethos behind the more powerful machine learning model – namely, breaking it up into multiple smaller models trained in specific areas, which can act in tandem to provide the same end result while being cheaper for OpenAI to run.

OpenAI has yet to officially confirm this is the case, as there has been no mention of such a major change to the way GPT-4 works. It’s a credible explanation according to industry experts like Sharon Zhou, CEO of AI-building company Lamini, who described the multi-model idea as the “natural next step” in developing GPT-4.
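To make the speculation concrete, here’s a toy sketch of the multi-model idea. This is our own illustration of the community theory, not a confirmed OpenAI design, and the keyword routing and “expert” names are invented: a cheap router sends each query to one of several smaller specialists instead of a single giant model.

```python
# Toy mixture-of-specialists sketch: each "expert" is a stand-in for a
# smaller model trained on a specific area.
EXPERTS = {
    "code": lambda q: f"[code expert] answering: {q}",
    "math": lambda q: f"[math expert] answering: {q}",
    "general": lambda q: f"[general expert] answering: {q}",
}

def route(query: str) -> str:
    """Pick a specialist via crude keyword matching; a real system would
    use a learned router, but the cost argument is the same - only one
    small model runs per query."""
    lowered = query.lower()
    if any(word in lowered for word in ("python", "function", "bug")):
        topic = "code"
    elif any(word in lowered for word in ("sum", "integral", "equation")):
        topic = "math"
    else:
        topic = "general"
    return EXPERTS[topic](query)
```

The appeal for a provider is obvious: serving one small specialist per request is far cheaper than running the full model for everything, even if the combined system aims for the same end result.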

AIs eating AIs

However, there’s another pressing problem with ChatGPT that some users suspect could be the cause of the recent drop in performance – an issue that the AI industry seems largely unprepared to tackle.

If you’re not familiar with the term ‘AI cannibalism’, let me break it down in brief: large language models (LLMs) like ChatGPT and Google Bard scrape the public internet for data to be used when generating responses. In recent months, a veritable boom in AI-generated content online – including an unwanted torrent of AI-authored novels on Kindle Unlimited – means that LLMs are increasingly likely to scoop up materials that were already produced by an AI when hunting through the web for information.

ChatGPT app downloads have slowed, indicating a decrease in overall public interest. (Image credit: Future)

This runs the risk of creating a feedback loop, where AI models ‘learn’ from content that was itself AI-generated, resulting in a gradual decline in output coherence and quality. With numerous LLMs now available both to professionals and the wider public, the risk of AI cannibalism is becoming increasingly prevalent – especially since there’s yet to be any meaningful demonstration of how AI models might accurately differentiate between ‘real’ information and AI-generated content.
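The feedback loop is easy to demonstrate in miniature. In the toy simulation below (our own illustration, not a measurement of any real model), a Gaussian “model” is repeatedly fitted to a dataset and then asked to regenerate it; because the model reproduces the bulk of its training distribution while under-sampling the tails, diversity collapses generation after generation.

```python
import statistics

def train_on_own_output(data, generations):
    """Fit a Gaussian 'model' to the data, then replace the data with the
    model's own output, over and over - a toy analogue of an LLM scraping
    AI-generated text from the web."""
    spreads = [statistics.pstdev(data)]
    for _ in range(generations):
        model = statistics.NormalDist(statistics.fmean(data),
                                      statistics.pstdev(data))
        # The next "dataset" is the model's output at central quantiles
        # only; rare, tail-of-distribution material never gets regenerated.
        data = [model.inv_cdf(p) for p in (0.2, 0.35, 0.5, 0.65, 0.8)]
        spreads.append(statistics.pstdev(data))
    return spreads
```

Running it on any non-constant starting dataset shows the spread shrinking steadily toward zero: each generation, the “model” learns from an ever-narrower version of its own output.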

Discussions around AI have largely focused on the risks it poses to society – for example, Facebook owner Meta recently declined to open up its new speech-generating AI to the public after it was deemed ‘too dangerous’ to be released. But content cannibalization is more of a risk to the future of AI itself; something that threatens to ruin the functionality of tools such as ChatGPT, which depend upon original human-made materials in order to learn and generate content.

Do you use ChatGPT or GPT-4? If you do, have you felt that there’s been a drop in quality recently, or have you simply lost interest in the chatbot? I’d love to hear from you on Twitter. With so many competitors now springing up, is it possible that OpenAI’s dominance might be coming to an end? 


An iOS app aims to preserve the Hmong dialect for future generations

While you may enjoy apps that help you get through the tasks of the day ahead, or that scratch your daily itch in the latest Apple Arcade game, a few apps serve a more important purpose.

The Hmong people are one of the most marginalized Asian American groups in the US, and their language is in danger of being relegated to the history books.

This is where Hmong Phrases comes in. Its developer, Annie Vang, wants to preserve the Hmong language that has been in her family for generations. Alongside this, Vang also hosts a YouTube channel to showcase foods in the Hmong culture, as well as her other favorite foods.

It's available for iPhone and iPad devices running iOS 14 and iPadOS 14 or later for $0.99 / £0.69 / AU$1.09, and it can also work on a Mac with Apple Silicon. You can scroll through the different conversations and hear from Vang herself how to pronounce various words.

It feels personal and yet educational – you know that Vang has put everything into this app. And having spoken to her recently, we can say it looks as though she isn't done yet.

What could be next for the app?

Hmong Phrases app icon

(Image credit: Hmong Phrases)

The app has an elegant layout with a colorful scheme throughout its menus. The list of phrases may seem overwhelming to some at first, but you get used to it. You can use the search bar to find what you want.

While it's great to use it on iOS mainly, we asked Vang if there were any plans to add newer widgets, alongside an Apple Watch version, in the future.

Practicing phrases and words in Hmong on your wrist could appeal to many, especially as later Apple Watch models can use the speaker with some apps.

Vang was enthusiastic about these two ideas, and there's potentially a chance we could see them later in the year.

But whatever occurs in a future update, it's a great effort already to revive a language, and a culture, that should be preserved for future generations.


Apple, take some notes from this half a MacBook concept for a future Mac mini

If you browse forums and news sites, you'll most likely come across concept ideas from users who want to share their vision of what a future Apple or Microsoft product could look like.

Back when owning an iPhone was a wish for many in the early noughties, you would see concept images of iPod Videos with a 'Phone' menu in the same iPod body, or a design that looked similar to the Bondi Blue iMac from 1998.

However, one user has gone beyond this kind of concept idea and removed the display from a MacBook Pro, leaving the keyboard half intact.

This not only harks back to the days of the Amiga with its 2-in-1 design, but gives me the idea that this could be perfect as a replacement to the Mac mini.


An Amiga and Apple hybrid?

The Mac mini has been around since 2005, and Apple mentioned at the time of its Intel switch that the PowerPC to Intel CPU transition was what made such a compact design possible.

But with another transition in progress, Apple has repeated the same mantra, which is why we've seen a redesigned iMac and MacBook Pro so far.

While there have been efforts by others to prove that a smaller Mac mini could work with Apple Silicon chips, you still need a keyboard and trackpad in order to use it.

This is why this concept makes sense in the long run, instead of being an effort to go viral for a day.

This would reshape how a Mac mini could work, especially if this concept could also run on a battery if needed.

You could take this hybrid on a commute to work, and plug in the HDMI or Thunderbolt cable to start your day. This would cut down the setup you would normally have to do for a Mac mini, as the keyboard and trackpad are already there.

But this also harks back to the days of the Amiga, a home computer from the eighties that built the whole machine into the keyboard in much the same way.

Amiga 600 computer

(Image credit: Future)

It's one thing to look at an image, but seeing someone use a snapped MacBook as if it were an Amiga 600 in 2022 makes a lot of sense.

The design can work in an age where you can easily find a spare monitor in the office and get going on some work, without also having to find a keyboard and mouse.

If this was to replace how we see the Mac mini in the near future with an M2 chip, it could be the best recommendation from me for family and friends, especially if they're looking for a new device for their bedroom or office.


Dropbox and Microsoft warn macOS users of issues for future versions of cloud apps

While Dropbox is finishing up an update to its cloud service app for macOS that brings native Apple Silicon support, it's sent an email to users, warning them about potential issues if they don't update once a future version of macOS Monterey arrives.

But it turns out that it's not an isolated issue, with Microsoft also stating on a support page that not updating OneDrive on the Mac may bring problems in future macOS Monterey versions. As long as users download the rewritten Files-On-Demand app, there'll be no issue.

You've most likely used both apps before, whether that was at college or as a way to quickly get files from someone in a hurry. But it looks as though Apple has made a background change to macOS that both cloud apps rely on.

We've reached out to Apple to confirm what this change is, and why both Dropbox and Microsoft are warning users about potential issues in future macOS versions.


Analysis: What's changed so drastically?

It's telling that another potential issue from Apple involves the cloud, after developers' ongoing frustrations with 503 iCloud errors, which have been causing failures in syncing content across devices.

In an email to users, Dropbox explained, “Some applications on your Mac may have problems opening Dropbox files while they are online only. You will still be able to open Dropbox files by double-clicking them in Finder”.

While you can download the beta version of Dropbox for Apple Silicon, you may still encounter issues when macOS 12.3 arrives.

macOS 12.2 is currently available to developers and users who are signed up to the beta program, so there may be a forthcoming change in 12.3 that Apple has flagged to both Microsoft and Dropbox, giving the companies time to update their cloud apps and head off any further issues.
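Since the warnings hinge on a specific macOS release, the gating both apps would need amounts to a simple version comparison. The sketch below is our own illustration, not either company's code (the function names are invented, and the 12.3 threshold comes from this article's reporting):

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string like '12.2.1' into a comparable tuple,
    so '12.10' correctly sorts after '12.9'."""
    return tuple(int(part) for part in version.split("."))

def affected_by_monterey_change(macos_version: str) -> bool:
    """True on macOS 12.3 or later, where the old online-only file
    behavior reportedly breaks without an updated cloud app."""
    return parse_version(macos_version) >= (12, 3)
```

Tuple comparison avoids the classic string-comparison trap where "12.10" would sort before "12.3".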

For now, we recommend backing up your files if you use one or both of these apps, and making sure you have the latest updates to both for when macOS 12.3 arrives on your Mac.

Via 9To5Mac
