Generative AI Watch: Google I/O Fires on All Cylinders, but What Will Become of the Search Business?

B. Valle

Summary Bullets:

• Google’s I/O event for developers included the announcements of Project Astra, Gemini 1.5 Flash, and a new tensor processing unit (TPU) architecture.

• Google also updated Gemini 1.5 Pro, extending its context window from one million to two million tokens and refining its code-generation and reasoning capabilities.

Google I/O 2024 included an impressive array of generative AI (GenAI) announcements, including upgrades to Google Gemini, an AI assistant called Project Astra, and a new chipset architecture. The release of Trillium, the sixth generation of Google Cloud TPUs, confirms Google’s position in the contested silicon market, where a competitive microprocessor architecture has become an essential element in the arsenal of every hyperscaler trying to beef up its GenAI strategy. This was demonstrated once again when Microsoft recently announced a new collaboration with AMD. As the original pioneer of custom-made, proprietary semiconductors for AI, Google is keeping its platform fresh with timely upgrades. The new architecture includes the next generation of SparseCore, an accelerator for processing the embeddings found in AI-based ranking and recommendation systems.

In the realm of AI software, one of the most exciting announcements was Project Astra, an AI assistant developed by Google DeepMind that uses video and voice recognition to deliver contextual responses. One much-discussed prerecorded demo showcased Project Astra’s visual understanding capabilities: the assistant helped a user find where she had left her glasses and described in detail a project a software developer was working on simply by looking at his computer screen. However, the launch was overshadowed by controversy: only days before, OpenAI had released its own multimodal assistant, GPT-4o, a model that can reason across audio, vision, and text with minimal latency.

Gemini 1.5 Flash was also made available in public preview. The fastest Gemini model served in the API, it is optimized for high-volume, high-frequency tasks and is more cost-efficient than other models. Like the other models in the family, it benefits from native multimodality. The Gemini family came to market a bit later than competing offerings, thus taking advantage of the relatively greater maturity of GenAI technologies at the time. The new model has a very large context window of one million tokens, though even that is dwarfed by the new two-million-token context window of its sibling, Gemini 1.5 Pro. It seems like only yesterday (in fact, around six months ago) that the company released Gemini Pro with a 32,000-token context window. By upgrading the high-end Gemini 1.5 Pro in this way, Google puts even Anthropic’s Claude 3, whose standard context window tops out at 200,000 tokens, to shame. Because the new version of Gemini 1.5 Pro has been made available globally via Google Workspace Labs, developers using Google Cloud Platform now have access to a longer context window in Gemini than they get with any competing LLM.
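
For a sense of how developers actually consume these models, the sketch below calls Gemini 1.5 Flash through the Google AI Python SDK (google-generativeai). It is a minimal illustration under stated assumptions, not a definitive integration: the API key is assumed to come from Google AI Studio, and the prompt is a placeholder.

import google.generativeai as genai

# Authenticate with an API key issued by Google AI Studio (placeholder value).
genai.configure(api_key="YOUR_API_KEY")

# Select the fast, cost-efficient model tier announced at I/O 2024.
model = genai.GenerativeModel("gemini-1.5-flash")

# A simple text-only request; thanks to the long context window, the prompt
# could just as easily include hundreds of pages of source material.
response = model.generate_content(
    "Summarize the key generative AI announcements from Google I/O 2024 "
    "in three bullet points."
)
print(response.text)

The same generate_content call can also accept image, audio, and video inputs alongside text, which is where the native multimodality mentioned above comes into play.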

Moreover, Google launched a new Gemini model customized for its iconic search platform. By end-2024, users should expect an upgraded version of Google’s AI Overviews, a feature whose new capabilities mean the world’s most popular search engine will change irrevocably. AI search summaries will be rolled out worldwide by end-2024, altering how people use the internet and clashing with a traditional commercial model based on advertising revenues. The search business is existential for Google; it is the company’s raison d’être. No wonder the GenAI revolution caught the company by surprise, despite Google having given the world some of the field’s best engineers, including the creators of the transformer architecture. Instead of a list of links, users will see an AI-generated overview with summarized responses, pushing the links further down the page, where they are less likely to receive traffic.

Other announcements included upgrades to Google’s family of open Gemma models and to the watermarking tool SynthID, which will now be applied across text and video, as well as the new generative video model Veo. Even in the incredibly fast-moving world of GenAI, some of these announcements are remarkable, but Google seems to lack a certain confidence, which undermines its ambitious goals. The company is taking the right approach by nurturing the developer community and maintaining a varied portfolio of tools built by a long-standing brain trust of researchers, but it should promote the performance and capabilities it offers developers with greater vigor. The implications of the AI-driven upgrades to its search platform are also huge, demonstrating how heavily the company is invested in the future of AI.

