Google Cloud released its new Automotive AI Agent on Monday and has named the Mercedes-Benz CLA as the first car model to offer it later this year. The Agent will enable Mercedes’ MBUX Virtual Assistant to perform a wider array of conversational functions with the vehicle’s passengers.
We got our first look at Mercedes’ next-generation assistant a year ago, at CES 2024, though the company did not reveal at the time which large language model underpinned its capabilities. The new assistant differs from the existing MBUX assistant, which can activate around two dozen in-car commands and provide information sourced from ChatGPT and Bing. While the current-generation assistant can be activated by saying “Hey, Mercedes,” it functions more like Siri or Google Assistant than ChatGPT’s Advanced Voice Mode, offering static responses rather than conversational replies.
Google’s Agent is built atop the Gemini LLM using Vertex AI and is geared specifically to “allow automakers to create highly personalized and intuitive in-car agents that go beyond current vehicle voice control,” per the company’s announcement post. The Agent supports both multimodal and multilingual inputs and can answer follow-up questions. In Google’s example, the AI will be able to tell drivers whether there are any Italian restaurants nearby, then offer up reviews of a given establishment and even name its most popular dish. The system is reportedly robust enough to handle multi-turn dialog with users and remember details from previous conversations.
The new MBUX assistant will reportedly pull “fresh and factual information” from Google Maps in near real time to offer “comprehensive and personalized information” about more than 250 million points of interest worldwide and current traffic conditions. “At Mercedes-Benz, we seek to offer our customers an exceptional digital experience,” said Ola Källenius, CEO of Mercedes-Benz Group AG, in a press statement. “Our partnership with Google Cloud will further enhance in-car navigation, combining sophisticated location data with generative AI.”
Mercedes plans to roll out the new MBUX assistant to additional models in the future but has not yet specified which ones.
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…