Google’s Gemini AI can now handle bigger prompts thanks to next-gen upgrade

Google’s Gemini AI has only been around for two months at the time of this writing, and already, the company is launching its next-generation model dubbed Gemini 1.5.

The announcement post explains the AI’s improvements in detail. It’s all rather technical, but the main takeaway is that Gemini 1.5 will deliver “dramatically enhanced performance.” This was accomplished by implementing a “Mixture-of-Experts architecture” (or MoE for short), which sees multiple AI models working together. This structure made Gemini easier to train as well as faster at learning complicated tasks than before.
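For readers curious what a Mixture-of-Experts setup looks like in principle, here is a minimal toy sketch in Python. It is not Google's implementation, and nothing in it comes from the announcement: the four "experts", the gate weights, and the function names are all hypothetical. It only illustrates the general idea that a gate scores every expert for a given input, and only the top-scoring few do any work.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts and blend their outputs.

    Hypothetical toy gate: each expert's score is gate_weight * x.
    """
    probs = softmax([w * x for w in gate_weights])
    # Keep only the top_k experts with the highest gate probability.
    ranked = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize the chosen experts' probabilities so they sum to 1.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Four made-up "experts"; in a real model these would be neural sub-networks.
EXPERTS = [lambda x: 2 * x, lambda x: x + 10, lambda x: -x, lambda x: x * x]
GATE_WEIGHTS = [0.5, -0.2, 1.0, 0.1]  # assumed values, for illustration only

output, chosen = moe_forward(3.0, EXPERTS, GATE_WEIGHTS, top_k=2)
print("experts used:", chosen)
print("blended output:", round(output, 4))
```

In this toy run, only two of the four experts are evaluated for the input. That selective routing is the general reason MoE designs are often described as more efficient than running one large model on everything.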

There are plans to roll out the upgrade to all three major versions of the AI, but the only one being released today for early testing is Gemini 1.5 Pro. 

What sets it apart is that the model has “a context window of up to 1 million tokens.” Tokens, as they relate to generative AI, are the smallest pieces of data LLMs (large language models) use “to process and generate text.” Bigger context windows allow the AI to handle more information at once. And a million tokens is huge, far exceeding what GPT-4 Turbo can do. OpenAI’s engine, for the sake of comparison, has a context window cap of 128,000 tokens.
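To put those two figures side by side, here is a small Python sketch. The window sizes are the ones quoted above; everything else is an assumption for illustration. In particular, `rough_token_count` is a hypothetical stand-in that counts whitespace-separated words, whereas real LLM tokenizers split text into sub-word pieces and will give different counts. The 300,000-token document is also a made-up example.

```python
GEMINI_1_5_PRO_WINDOW = 1_000_000  # tokens, as quoted in the announcement
GPT_4_TURBO_WINDOW = 128_000       # tokens, as quoted above

def rough_token_count(text):
    """Crude assumption: one token per whitespace-separated word."""
    return len(text.split())

def fits_in_window(token_count, window):
    """Return True if a prompt of token_count tokens fits in the window."""
    return token_count <= window

# How many times larger is the bigger window?
ratio = GEMINI_1_5_PRO_WINDOW / GPT_4_TURBO_WINDOW
print(f"Gemini 1.5 Pro's window is about {ratio:.1f}x the size")

# A hypothetical 300,000-token document, checked against both caps.
doc_tokens = 300_000
print("fits Gemini 1.5 Pro:", fits_in_window(doc_tokens, GEMINI_1_5_PRO_WINDOW))
print("fits GPT-4 Turbo:", fits_in_window(doc_tokens, GPT_4_TURBO_WINDOW))
```

On these numbers, the larger window works out to roughly 7.8 times the smaller one, so a document that would have to be split into several chunks for one model could be submitted to the other in a single prompt.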

Gemini Pro in action

With all these numbers being thrown around, the question is: what does Gemini 1.5 Pro look like in action? Google made several videos showcasing the AI’s abilities. Admittedly, it’s pretty interesting stuff, as they reveal how the upgraded model can analyze and summarize large amounts of text according to a prompt.

In one example, they gave Gemini 1.5 Pro the 400-plus-page transcript of the Apollo 11 moon mission. It showed the AI could “understand, reason about, and identify” certain details in the document. The prompter asks the AI to locate “comedic moments” during the mission. After 30 seconds, Gemini 1.5 Pro managed to find a few jokes that the astronauts cracked while in space, noting who told each one and explaining any references made.

These analysis skills can be used for other modalities. In another demo, the dev team gave the AI a 44-minute Buster Keaton movie. They uploaded a rough sketch of a gushing water tower and then asked for the timestamp of a scene involving a water tower. Sure enough, it found the exact part ten minutes into the film. Keep in mind this was done without any explanation about the drawing itself or any other text besides the question. Gemini 1.5 Pro understood it was a water tower without extra help.

Experimental tech

The model is not available to the general public at the moment. Currently, it’s being offered for free as an early preview to “developers and enterprise customers” through Google’s AI Studio and Vertex AI platforms. The company is warning testers they may experience long latency since the model is still experimental. There are plans, however, to improve speeds down the line.

We reached out to Google asking for information on when people can expect the launch of Gemini 1.5 and Gemini 1.5 Ultra plus the wider release of these next-gen AI models. This story will be updated at a later time. Until then, check out TechRadar's roundup of the best AI content generators for 2024.

TechRadar – All the latest technology news

Zoom could be planning even bigger events

Although Zoom may be best known for its video conferencing software, its platform also supports virtual events, and the company's latest acquisition will allow these events to be both larger and more complex.

According to a new blog post, the company believes that the future of events will include a combination of virtual and in-person formats. As a result, its customers will require a holistic solution that allows them to build, host and manage virtual and hybrid events.

Zoom first introduced Zoom Video Webinars back in 2014 to enable organizations to share information and interactive video presentations with up to 50k people. Then, in July of this year, the company unveiled Zoom Events to make it possible for businesses and other organizations to host in-person events that also have a virtual element.

To showcase some of the new capabilities in Zoom Events, Zoom used its new Conference event type for Zoomtopia 2021, which saw over 33k virtual guests attend the tech conference from around the world. Now, though, the company has acquired several tools as well as some top talent from the startup Liminal to make it easier for organizations to produce professional programs and performances from anywhere in the world.

Bridging the gap

As reported by The Verge, Zoom has announced that it has acquired two add-ons from the startup Liminal that can be used to create professional virtual events.

The first, ZoomOSC, allows customers to enhance professional meetings and events using the Open Sound Control (OSC) protocol. This add-on also enables users to integrate Zoom Events with third-party software, hardware controllers and media servers. The second add-on, ZoomISO, makes it possible to export each participant's video feed as a separate output to professional production hardware, with the capability to export five feeds in HD.

With the acquisition of these two add-ons, it will be possible to bridge the gap between emerging and traditional event control tools, according to Zoom. This will likely be quite useful for broadcast studios, theaters and other organizations that want to create professional streams using the company's video conferencing software.

In addition to the acquisition of ZoomOSC and ZoomISO from Liminal, two of the startup's co-founders (Andy Carluccio and Jonathan Kokotajlo) will also be joining Zoom.

We've also rounded up the best video conferencing software and best online collaboration tools.

Via The Verge
