Midjourney just changed the generative image game and showed me how comics, film, and TV might never be the same

Midjourney, the generative AI platform that you can currently use on Discord, just introduced the concept of reusable characters, and I am blown away.

It's a simple idea: Instead of using prompts to create countless generative image variations, you create and reuse a central character to illustrate all your themes, live out your wildest fantasies, and maybe tell a story.

Until recently, Midjourney, which is built on a diffusion model (noise is added to an original image and the model learns about the image by removing that noise), could create beautiful and astonishingly realistic images from prompts you typed into the Discord channel (“/imagine: [prompt]”). But unless you asked it to alter one of its generated images, every image set and character would look different.

Now, Midjourney has cooked up a simple way to reuse your Midjourney AI characters. I tried it out and, for the most part, it works.

Midjourney AI character creation

I guess I don’t know how to describe myself. (Image credit: Future)

Things are getting weird (Image credit: Future)

In one prompt, I described someone who looked a little like me, chose my favorite of Midjourney's four generated image options, and upscaled it for more definition. Then, using the new “--cref” parameter and the URL for my generated image (with the character I liked), I forced Midjourney to generate new images with the same AI character in them.

Later, I described a character with Charles Schulz's Peanuts character qualities and, once I had one I liked, reused him in a different prompt scenario where he had his kite stuck in a tree (Midjourney couldn't or wouldn't put the kite in the tree branches).

Midjourney AI character creation

An homage to Charles Schulz (Image credit: Future)

It's far from perfect. Midjourney still tends to over-adjust the art, but I contend the characters in the new images are the same ones I created in my initial images. The more descriptive you make your initial character-creation prompts, the better the results you'll get in subsequent images.

Perhaps the most startling thing about Midjourney's update is the utter simplicity of the creative process. Writing natural language prompts has always been easy, but training the system to make your character do something would typically take some programming or even AI model expertise. Here it's just a simple prompt, one parameter, and an image reference.

Midjourney AI character creation

Got a lot closer with my photo as a reference (Image credit: Future)

While it's easier to take one of Midjourney's own creations and use that as your foundational character, I decided to see what Midjourney would do if I turned myself into a character using the same “--cref” parameter. I found an online photo of myself and entered this prompt: “/imagine: making a pizza --cref [link to a photo of me]”.
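If you plan to reuse one character across a lot of prompts, the prompt text itself is easy to assemble in a script. What follows is only a minimal sketch: the helper name build_cref_prompt is hypothetical, example.com is a placeholder, and the one thing taken from Midjourney is the parameter itself (two hyphens followed by cref, then the image URL). It produces the text you would paste after /imagine in Discord; it does not talk to Midjourney.

```python
from urllib.parse import urlparse


def build_cref_prompt(description: str, character_url: str) -> str:
    """Return prompt text that ends with a --cref character reference.

    Hypothetical helper: it only builds a string and sends nothing anywhere.
    """
    parsed = urlparse(character_url)
    # Assumption: the reference must be a hosted image, so require a web URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a web URL: {character_url!r}")
    return f"{description.strip()} --cref {character_url}"


if __name__ == "__main__":
    # example.com is a placeholder, not a real character image.
    print(build_cref_prompt("making a pizza", "https://example.com/me.png"))
```

Swapping the description while keeping the same URL is the whole trick: the scene changes, the character reference stays put.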

Midjourney quickly spit out an interpretation of me making a pizza. At best, it's the essence of me. I selected the least objectionable one and then crafted a new prompt using the URL from my favorite me.

Midjourney AI character creation

Oh, hey, Not Tim Cook (Image credit: Future)

Unfortunately, when I entered this prompt: “interviewing Tim Cook at Apple headquarters”, I got a grizzled-looking Apple CEO eating pizza and another image where he's holding an iPad that looks like it has pizza for a screen.

When I removed “Tim Cook” from the prompt, Midjourney was able to drop my character into four images. In each, Midjourney Me looks slightly different. There was one, though, where it looked like my favorite me enjoying a pizza with a “CEO” who also looked like me.

Midjourney AI character creation

Midjourney me enjoying pizza with my doppelgänger CEO (Image credit: Future)

Midjourney's AI will improve and soon it will be easy to create countless images featuring your favorite character. It could be for comic strips, books, graphic novels, photo series, animations, and, eventually, generative videos.

Such a tool could speed storyboarding but also make character animators very nervous.

If it's any consolation, I'm not sure Midjourney understands the difference between me and a pizza, or between a pizza and an iPad – at least not yet.

You might also like

TechRadar – All the latest technology news

Read More

Did we just catch our first glimpse of Windows 12? If so, we won’t get the new OS until 2025

We might have just caught our first glimpse of Windows 12, although we can’t be sure about that – but what we do know is that Microsoft is making a big change with test builds of Windows.

XenoPanther on X (formerly Twitter) noticed that the internal Canary versions of Windows 11 – those in the earliest testing channel, in other words – were just forked with a new build 27547 coming into play.

The most recent Canary channel build is version 26040, which comes with a new Voice Clarity feature to improve video chats, as you may be aware if you follow these preview releases.

So, now we have builds in the 26XXX range and also the 27XXX range, prompting the obvious question: Is the latter Windows 12 in its first test phase? Let’s discuss that in more depth next.


Analysis: I’m giving her all she’s got, Captain!

As Zac Bowden, the well-known Microsoft leaker (of Windows Central fame) points out, the likelihood here is that the next release of Windows is the 26XXX branch, which is currently rumored (by Bowden) to be Windows 11 24H2 coming later this year.

That means the 27XXX preview versions could be the next incarnation of Windows after that, the one arriving in 2025 (and these builds probably won’t go into testing with Windows Insiders for some time yet). Hence the (tentative) conclusion that this might be Windows 12, or an all-new Windows, whatever it may be called.

(Although we should further note that technically, Windows 11 24H2 will be all-new. Not the front-end, mind, but the underlying foundations – it will be built on a new platform known as Germanium, which will offer considerable performance and security benefits deep under the hood.)

At any rate, this pretty much underlines the idea that Windows 12 (or next-gen Windows, whatever the final name) is not coming this year, and will probably arrive next year. After all, Windows 10 gets ditched in 2025, so it makes some sense that a new OS comes in as one shuffles out the exit door (in October 2025 to be precise).

As we’ve discussed before, one of the dangers of bringing in Windows 12 this year is that the move would fragment the desktop user base into three camps, which is clumsy and a headache for organizing updates. So that scenario is neatly avoided if Windows 12 doesn’t turn up until 2025.

As a side note, Microsoft has codenames for its OS development semesters, and the next one should have been Arsenic – but due to it being perceived as “scary and violent”, Bowden tells us, the software giant has avoided it, and is instead using the codename Dilithium. Which is pretty cool for Star Trek fans (maybe Duranium will be next in line when another unsuitable real-world element pops up).

Via Neowin, Deskmodder


Has ChatGPT been getting a little lazy for you? OpenAI has just released a fix

It would seem reports of 'laziness' on the part of the ChatGPT AI bot were pretty accurate, as its developer OpenAI just announced a fix for the problem – which should mean the bot takes fewer shortcuts and is less likely to fail halfway through trying to do something.

The latest update to the ChatGPT code is “intended to reduce cases of 'laziness' where the model doesn’t complete a task” according to OpenAI. However, it's worth noting that this only applies to the GPT-4 Turbo model that's still in a limited preview.

If you're a free user on GPT-3.5 or a paying user on GPT-4, you might still notice a few problems in terms of ChatGPT's abilities – although we're assuming that eventually the upgrade will trickle its way down to the other models as well.

Back in December, OpenAI mentioned a lack of updates and “unpredictable” behavior as reasons why users might be noticing subpar performance from ChatGPT, and it would seem that the work to try and get these issues resolved is still ongoing.

More thorough

ChatGPT voice chat

ChatGPT is pushing forward on mobile too (Image credit: Future)

One of the tasks that GPT-4 Turbo can now complete “more thoroughly” is generating code, according to OpenAI. More complex tasks can also be completed from a single prompt, while the model will also be cheaper for users to work with.

Many of the other model upgrades mentioned in the OpenAI blog post are rather technical – but the takeaways are that these AI bots are getting smarter, more accurate, and more efficient. A lot of improvements are related to “embeddings”, the numerical representations that AI bots use to understand words and the context around them.
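To make “numerical representations” a little more concrete, here is a small sketch of how embeddings are typically compared. The three-number vectors are invented for illustration and do not come from any OpenAI model (real embeddings have hundreds or thousands of dimensions), and cosine similarity is just one common way to measure how close two of them are.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings", chosen so that related words point the same way.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

if __name__ == "__main__":
    print(cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["kitten"]))
    print(cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["invoice"]))
```

With these toy numbers, “cat” and “kitten” score far closer to each other than “cat” and “invoice” do, which is the property that lets a model treat related words as related.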

ChatGPT recently got its very own app store, where third-party developers can showcase their own custom-made bots (or GPTs). However, there are rules in place that ban certain types of chatbots – like virtual girlfriends.

It also appears that OpenAI is busy pushing ChatGPT forward on mobile, with the latest ChatGPT beta for Android offering the ability to load up the bot from any screen (much as you might do with Google Assistant or Siri).


Google Lens just got a powerful AI upgrade – here’s how to use it

We've just seen the Samsung Galaxy S24 series unveiled with plenty of AI features packed inside, but Google isn't slowing down when it comes to upgrading its own AI tools – and Google Lens is the latest to get a new feature.

The new feature is actually an update to the existing multisearch feature in Google Lens, which lets you tweak searches you run using an image: as Google explains, those queries can now be more wide-ranging and detailed.

For example, Google Lens already lets you take a photo of a pair of red shoes, and append the word “blue” to the search so that the results turn up the same style of shoes, only in a blue color – that's the way that multisearch works right now.

The new and improved multisearch lets you add more complicated modifiers to an image search. So, in Google's own example, you might search with a photo of a board game, and ask “what is this game and how is it played?” at the same time. You'd get instructions for playing it from Google, rather than just matches to the image.

All in on AI

Two phones on an orange background showing Google Lens

(Image credit: Google)

As you would expect, Google says this upgrade is “AI-powered”, in the sense that image recognition technology is being applied to the photo you're using to search with. There's also some AI magic applied when it comes to parsing your text prompt and correctly summarizing information found on the web.

Google says the multisearch improvements are rolling out to all Google Lens users in the US this week: you can find them by opening up the Google app for Android or iOS, and then tapping the camera icon to the right of the main search box.

If you're outside the US, you can try out the upgraded functionality, but only if you're signed up for the Search Generative Experience (SGE) trial that Google is running – that's where you get AI answers to your searches rather than the familiar blue links.

Also just announced by Samsung and Google is a new Circle to Search feature, which means you can just circle (or scribble on) anything on screen to run a search for it on Google, making it even easier to look up information visually on the web.


Microsoft just gave Windows Copilot a ChatGPT-4 boost and the ability to explain screenshots

Microsoft came out hot with its Microsoft 365 Copilot and Copilot for Windows announcements last year, and presented Copilot as a general virtual assistant to help with your whole digital life. So far, we’ve yet to see Copilot reach its potential, but it looks like we’re one step closer: Microsoft is reportedly gearing up to add a powerful new feature that will allow users to take a screenshot, submit it to Copilot, and ask Copilot to explain what’s in the screenshot.

As far as we know, an “add a screenshot” button is rolling out to the general public – meaning you may already be able to try it. If not, you should be able to very soon. This button should appear in the Copilot panel, prompting you to select a part of the screen, confirm that you’ve captured what you want to discuss with Copilot, and then upload it to the Copilot or Bing right-side panel. When I tried it, I also needed to submit some text to go along with it, such as a question or additional context.

A screenshot in a Microsoft Edge window with a Copilot panel open on the right, with arrow pointing to new

(Image credit: Future)

Once the screenshot is uploaded, you can talk about it and ask about anything within it or relating to it with Bing Chat or Copilot on Windows 11.

As you can see in the screenshot, the new button sits next to the existing image upload button. You can try the new feature at Microsoft’s Copilot website, or over at Bing Chat.

Windows Latest has their own demonstration of this new feature, and I had a go of my own. First, I took a screenshot selection which showed a full description of video creation platform HeyGen's YouTube channel and asked it to tell me two things: whose channel it is and what it’s about.

Copilot returned this: 

This is the YouTube channel of HeyGen. It is a next-gen video creation platform that turns text into professional spokesperson videos in minutes. They offer premium avatars speaking in multiple languages and professional video templates for various use cases including marketing, e-learning, and corporate communication. You can find more information about HeyGen on their website.

This is pretty accurate, and reminded me of one feature in particular that I really like about Bing Chat and Copilot – they readily and very visibly provide sources and websites that you can visit to double check the information. 

Once you make a selection of your screen, you can make markings on it and draw on it. You can also add specific instructional visuals to help Copilot understand your query, and you can move your selection window around to a different part of the screen altogether. 

According to Windows Latest, Bing Chat recently got a ChatGPT-4 boost granting it a new level of functionality, and this is likely making its way into Copilot as well. This development apparently enables Copilot to engage in conversations about emotions. Currently, only a limited pool of users can try this for themselves, with access seemingly given at random, but it will be available to all who access Windows Copilot and Bing Chat very shortly.

Microsoft Bing logo on a white smartphone screen

(Image credit: Shutterstock / Primakov)

Microsoft charts a course ahead with Copilot

Microsoft has been pretty definitive in its messaging that Copilot is a big deal for the company, and will be a central feature in several products like Microsoft 365 and Windows, but not just those. 

In a pretty major (yet not terribly surprising) development, Microsoft is planning to add an actual physical Copilot button into the hardware of newly manufactured products as early as 2024. Microsoft is doing this in its continuing effort to make computing, especially AI-powered computing, simpler and more seamless for users. This was detailed and confirmed in a recent Windows Experience Blog post written by Yusuf Mehdi, Executive Vice President and Consumer Chief Marketing Officer at Microsoft.

Those of us not ready to throw out our older Windows devices quite yet for this new button can bring up Windows Copilot with the shortcut Win+C (if you have updated your Windows 11 version to one that has Windows Copilot included).

According to Microsoft itself, the introduction of the Copilot key will be the most notable upgrade to the Windows keyboard in almost thirty years. It likens this future introduction to the addition of the Windows Start key, which is putting a lot of faith in Copilot itself, so I imagine we’ll continue to see major developments to Copilot throughout this year. Especially given Copilot’s development, I think Microsoft is one of the most exciting companies to watch this year.


Microsoft just launched a free Copilot app for Android, powered by GPT-4

If you're keen to play around with some generative AI tech on your phone, you now have another option: Microsoft has launched an Android app for its Copilot chatbot, and like Copilot in Windows 11, it's free to use and powered by GPT-4 and DALL-E 3.

As spotted by @techosarusrex (via Neowin), the Copilot for Android app is available now, and appears to have arrived on December 19. It's free to use and you don't even need to sign into your Microsoft account – but if you don't sign in, you are limited in terms of the number of prompts you can input and the length of the answers.

In a sense, this app isn't particularly new, because it just replicates the AI functionality that's already available in Bing for Android. However, it cuts out all the extra Bing features for web search, news, weather, and so on.

There's no word yet on a dedicated Copilot for iOS app, so if you're using an iPhone you're going to have to stick with Bing for iOS for now if you need some AI assistance. So far, Microsoft hasn't said anything officially about its new Android app.

Text and images

The functionality inside the new app is going to be familiar to anyone who has used Copilot or Bing AI anywhere else. Microsoft has been busy adding the AI everywhere, and has recently integrated it into Windows 11 too.

You can ask direct questions like you would with a web search, get complex topics explained in simple terms, have Copilot generate new text on any kind of subject, and much more. The app can work with text, image and voice prompts too.

Based on our testing of the app, it seems you get five questions or searches per day for free if you don't want to sign in. If you do tell Microsoft who you are, that limit is lifted, and signing in also gives you access to image generation capabilities.

With both Apple's Siri and Google Assistant set to get major AI boosts in the near future, Microsoft won't want to be left behind – and the introduction of a separate Copilot app could help position it as a standalone digital assistant that works anywhere.


My jaw hit the floor when I watched an AI master one of the world’s toughest physical games in just six hours

An AI just mastered Labyrinth in six hours, and I am questioning my own existence.

I started playing Labyrinth in the 1970s. While it may look deceptively simple and is fully analog, Labyrinth is an incredibly difficult, nearly 60-year-old physical board game that challenges you to navigate a metal ball through a hole-riddled maze by changing the orientation of the game platform using only the twistable knobs on two adjacent sides of the game's box frame.

I still remember my father bringing Labyrinth home to our Queens apartment, and my near-total obsession with mastering it. If you've never played, then you have no idea how hard it is to keep a metal ball on a narrow path between two holes just waiting to devour it.

It's not like you get past a few holes and you're home free; there are 60 along the whole meandering path. One false move and the ball is swallowed, and you have to start again. It takes fine motor control, dexterity, and a lot of real-time problem-solving to make it through unscathed. I may have successfully navigated the treacherous route a few times.

In the intervening years, I played sporadically (once memorably with a giant labyrinth at Google I/O), but mostly I forgot about the game, though I guess I never really forgot the challenge.

Perhaps that's why my mouth dropped open as I watched CyberRunner learn and beat the game in just six hours.

In a recently released video, programmers from the public research university ETH Zurich showed off their bare-bones AI robot, which uses a pair of actuators that act as the 'hands' to twist the Labyrinth knobs, an overhead camera to watch the action, and a computer running an AI algorithm to learn and, eventually, beat the game.

In the video, developers explain that “CyberRunner exploits recent advances in model-based reinforcement learning and its ability to make informed decisions about potentially successful behaviors by planning into the future.”

Initially, CyberRunner was no better than me or any other average human player. It dumped the metal ball into holes less than a tenth of the way through the path, and then less than a fifth of the way through. But with each attempt, CyberRunner got better – and not just a little better, but exponentially so.

In just six hours, according to the video, “CyberRunner's able to complete the maze faster than any previously recorded time.” 

The video is stunning. The two motors wiggle the board at a super-human rate, and manage to keep the ball so perfectly on track that it's never in danger of falling into any of the holes. CyberRunner's eventual fastest time was a jaw-dropping 14.8 seconds. I think my best time was… well, it could often take many minutes.

I vividly recall playing, and how I would sometimes park the ball in the maze, taking a break mid-challenge to prepare myself for the remainder of the still-arduous journey ahead. Not so with CyberRunner. Its confidence is the kind that's only possible with an AI. It has no worries about dropping its metal ball into a hole; no fear of failure.

It also, initially, had no fear of getting caught cheating.

As CyberRunner was learning, it did what computers do and looked for the best and fastest path through the maze, which meant it sometimes ignored the path and took shortcuts. That's called cheating. Thankfully, the researchers caught CyberRunner, and reprogrammed it so it was forced to follow the full maze.

Of course, CyberRunner's accomplishment is not just about beating humans at a really difficult game. This is a demonstration of how an AI can solve physical-world problems based on vision, physical interaction, and machine learning. The only question is, what real-world problems will this open-source project solve next?

As for me, I need to go dig my Labyrinth out of my parents' closet.


Apple Books just got a Spotify Wrapped-style recap for readers – and it beats Apple Music Replay

Apple has just launched Year in Review, a Spotify Wrapped-style round-up for its Books app, where you’ll be able to see personalized stats covering all the books you read in the app over the past year. If you’re curious about who your most-read author is and how long you spent leafing through literature in 2023, you’ll want to take a look.

You’ll need to open the Books app and select the Read Now tab in the bottom-left corner, then find the 'Your Year in Review' card under the Top Picks header. Tap that and you’ll find a bunch of fascinating facts about your reading habits from the last 12 months. Note that you’ll need to have marked at least three books as completed to get your reading summary.

For example, Apple has created six ‘reader types’ that are defined by the way you read or listen to literature. These types include 'The Completionist' for readers who consume multiple books in a series, and 'The Contemporary' for people who love trending titles.

Apple has also published several lists of the most-read books across all Books users – Spare by Prince Harry took the top spot for a non-fiction title, while Only the Dead by Jack Carr was the top fiction audiobook. The company did something similar for its Podcasts app, where you can see all the top-ranked shows among listeners.

Better than Apple Music Replay

Three iPhones side-by-side showing the Apple Books Year in Review feature.

(Image credit: Apple)

Apple has put an emphasis on sharing this year, with book cover collages, graphs and statistics to send to your friends. All of this reading info is contained within Apple’s Books app, which makes it easy to catch up with your year-end review in between reading a novel or listening to an audiobook.

That makes it very different from Apple Music Replay. This is Apple Music’s take on Spotify Wrapped and, like Books’ Year in Review, gives you a deep dive into your music tastes in 2023.

The difference, though, is that Apple Music Replay is hosted on Apple’s website, not in the Apple Music app. You can still see all the same stats and figures as you’d expect, but there’s an extra degree of friction in the process. Compare that to Spotify, where its Wrapped round-up is right there at your fingertips in the app.

Why Apple built the Year in Review into its Books app but still refrains from making Apple Music Replay an app-based feature is a mystery. Regardless, head over to Apple’s Books app if you want to get the lowdown on your reading habits in 2023.


Google Maps just made it a lot easier to plan holiday trips with better travel tools

With the holiday season just over the horizon, Google Maps is receiving an update to make planning and traveling around these hectic times more manageable.

The patch consists of three new features. First, the app will gain updated “transit directions” that’ll tell you “the best route to your destination based on key factors”. This includes the overall length of the trip, estimated time of arrival, plus the number of transfers you’ll have to take in order to get there. It’ll even be possible to customize the route using filters telling Google Maps to focus on a specific type of transit, like subways, or if you want one with minimal walking. 

Additionally, the app will tell you where you can find the entrances and exits to stations “in over 80 cities around the world,” including Boston, London, New York City, Sydney, and Toronto. It'll point out “what side of the street they're on” as well as whether there is a “clear walking route”.

Newfound collaboration

Next, the collaborative list tool will allow invited users to vote on an activity via emoji reactions. You can choose between a heart, a smiley face, a flame, or a flying stack of cash if you’re interested in going. For those who aren’t, a thumbs-down icon will be available.

Speaking of which, people can also react to publicly posted photographs on Google Maps with an emoji. The company states that “in some cases” you’ll be given the opportunity to use mashup reactions via Emoji Kitchen. The emoji mashup selections seem to depend on what the app’s AI sees in an image. For example, if it detects a bagel, the mashup will include the food item, and potentially, the yummy face. These custom-made icons will automatically be generated.

Everything you see here will be rolling out globally to Android and iOS devices starting today. The rest of the announcement consists of the tech giant shouting out certain Google Maps tools that you can use to help “navigate the holidays” like finding nearby charging stations for electric vehicles or purchasing train tickets right on the app.

If you’re interested in what else it can do, check out TechRadar’s list of the 10 things you didn’t know Google Maps could do.


Google Photos just made it much easier to tidy up your library – here’s how

Google Photos is introducing a pair of AI-powered features to help you organize all the family pictures and screenshots in your messy profile.

Moving forward, the service will be able to identify photographs “that were taken close together” and then group them together into what Google calls Photo Stacks. It appears the AI operates by selecting images that have visual similarities to each other. The software is not going to pick out pictures with a different composition or subjects in them. Once the selections have been made, Google Photos will choose one of them to be the lead image. Of course, you do have the option to manually pick the lead, “modify the stacks, or turn off” the feature entirely. 
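Google hasn't published how Photo Stacks decides what belongs together, so treat the following as a rough sketch of the “taken close together” half of the idea only. The function name stack_by_time and the ten-second gap are assumptions made for illustration, and the visual-similarity check described above is not modeled at all.

```python
from datetime import datetime, timedelta


def stack_by_time(timestamps: list[datetime], max_gap: timedelta) -> list[list[datetime]]:
    """Group capture times so neighbours within a stack are at most max_gap apart.

    Hypothetical helper: it looks only at timestamps, not at image content.
    """
    stacks: list[list[datetime]] = []
    for ts in sorted(timestamps):
        # Extend the current stack if this shot follows the previous one closely.
        if stacks and ts - stacks[-1][-1] <= max_gap:
            stacks[-1].append(ts)
        else:
            stacks.append([ts])
    return stacks


if __name__ == "__main__":
    start = datetime(2023, 12, 2, 12, 0, 0)
    shots = [start, start + timedelta(seconds=3), start + timedelta(minutes=10)]
    for stack in stack_by_time(shots, max_gap=timedelta(seconds=10)):
        print(len(stack), "photo(s), starting at", stack[0])
```

In this sketch the earliest photo in each group could stand in for the lead image, though in Google Photos the service picks one for you and lets you change it.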

Tidying-up screenshots

Google Photos will be doing something similar for “screenshots and documents in your gallery” by automatically categorizing them “into more helpful albums”. There will be an album for images of your ID card, and receipts, plus one for “event information” like an upcoming concert or festival. The goal here is to make it easier to locate “what you need when you need it without having” to dig through a mess of photographs. 

The AI will also allow you to set reminders on your phone calendar using the information from a screenshot of a ticket or “flyer for an upcoming event.” As an example, let’s say you took a screenshot of a ticket for a concert scheduled for December 2. You will see a “Set Reminder” option at the bottom of the picture in Google Photos. Tapping it causes a calendar entry to show up where you can enter more information or edit it. The company explains you can choose to “automatically archive your screenshots… after 30 days” which will hide them from the main gallery. They can still, however, be found in their respective albums.

The announcement states the Google Photos update is currently rolling out to Android and iOS. Be sure to keep an eye out for the patch when it arrives. No word if there will be a desktop version, although we did ask Google for more information. This story will be updated if we hear back.

While we have you, be sure to check out TechRadar’s list of the best photo storage and sharing sites in 2023.
