Microsoft is working hard towards proving the 'intelligence' part in artificial intelligence, and has just revealed the latest version of its Turing Bletchley series of machine intelligence models, Turing Bletchley v3.
As explained in an official blog post, Turing Bletchley v3 is a multilingual vision-language foundation model, and will be integrated into many existing Microsoft products. If the name of this model sounds scary, don’t worry – let’s break it down.
The ‘multilingual’ part is self-explanatory – the model helps Microsoft products function better in more than ninety languages. The ‘vision-language’ part means that the model handles images and language at the same time, which is why this kind of model is known as ‘multimodal’. Finally, the ‘foundation model’ part means it's a large, general-purpose model trained on a broad set of data, which can then be adapted to many different tasks and products rather than being built for just one.
The first version of this multimodal model was launched in November 2021, and in 2022, Microsoft started testing the latest version – v3. Turing Bletchley v3 is pretty impressive because making a model that can “understand” even one type of input (say, text or images) is already a big undertaking. This model combines both text and image processing, and in the case of Bing, that combination is used to improve search results.
Incorporating neural networks
The Turing Bletchley v3 model makes use of neural networks, a machine learning approach loosely inspired by the way the human brain works. These neural networks allow it to make connections between images and text in the following manner, as described by Microsoft itself:
“Given an image and a caption describing the image, some words in the caption are masked. A neural network is then trained to predict the hidden words conditioned on both the image and the text. The task can also be flipped to mask out pixels instead of words.”
The model is trained over and over in this way, not unlike how we learn. The model is also continuously monitored and improved by Microsoft developers.
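To make the masking idea a little more concrete, here's a minimal sketch in Python of just the first step Microsoft describes: hiding some words in a caption and keeping a record of what was hidden. This is not Microsoft's code. The function names, the `[MASK]` placeholder and the 30% masking ratio are all assumptions made for illustration, and the image and the neural network that would do the actual predicting are left out entirely.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder; Microsoft's post does not name one


def mask_caption(caption, mask_ratio=0.3, seed=None):
    """Hide a random subset of caption words.

    Returns (masked_words, targets), where targets maps each hidden
    position to the original word a model would be asked to predict.
    """
    rng = random.Random(seed)
    words = caption.split()
    if not words:
        return [], {}
    # Mask at least one word, and never more words than the caption has.
    n_mask = min(len(words), max(1, round(len(words) * mask_ratio)))
    positions = rng.sample(range(len(words)), n_mask)
    targets = {i: words[i] for i in sorted(positions)}
    masked = [MASK_TOKEN if i in targets else w for i, w in enumerate(words)]
    return masked, targets


def restore(masked, targets):
    """Fill the hidden words back in; perfect predictions recover the caption."""
    return " ".join(targets.get(i, w) for i, w in enumerate(masked))


if __name__ == "__main__":
    caption = "a brown dog catching a frisbee in the park"
    masked, targets = mask_caption(caption, seed=0)
    print(" ".join(masked))  # the caption with some words hidden
    print(targets)           # the words the model would have to guess
```

In real training, the masked caption would be fed to the network along with the image, and the network would be scored on how well its guesses match the hidden words in `targets`.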
Where else the new model is being used
Bing Search isn’t the only product that’s been revamped with Turing Bletchley v3. It’s also being used for content moderation in Microsoft’s Xbox Live game service. The model helps the Xbox moderation team to identify inappropriate and harmful content uploaded by Xbox users to their profiles.
Content moderation is a massive job in terms of scale and is often mentally exhausting, so any assistance that means moderators have to see less upsetting content is a big win in my eyes. I can see Turing Bletchley v3 being deployed in content moderation for Bing Search in a similar manner.
This sounds like a significant improvement for Bing Search. The AI-aided heat is on, especially between Microsoft and Google. Recently, Microsoft brought Bing AI to Google Chrome, and now it’s coming for image search. I don’t see how Google could view this as anything other than direct competition. Google still enjoys the greatest popularity in terms of both browser usage and search volume, but nothing is set in stone. Your move, Google.