Silicon Valley CEOs usually focus on the positives when announcing their company’s next big thing. In 2007, Apple’s Steve Jobs lauded the first iPhone’s “revolutionary user interface” and “breakthrough software.” Google CEO Sundar Pichai took a different tack at his company’s annual conference Wednesday when he announced a beta test of Google’s “most advanced conversational AI yet.”
Pichai said the chatbot, known as LaMDA 2, can converse on any topic and had performed well in tests with Google employees. He announced a forthcoming app called AI Test Kitchen that will make the bot available for outsiders to try. But Pichai added a stark warning. “While we have improved safety, the model might still generate inaccurate, inappropriate or offensive responses,” he said.
Pichai’s vacillating pitch illustrates the mixture of excitement, puzzlement, and concern swirling around a string of recent breakthroughs in the capabilities of machine learning software that processes language.
The technology has already improved the power of auto-complete and web search. It has also created new categories of productivity apps that help workers by generating fluent text or programming code. And when Pichai first disclosed the LaMDA project last year he said it could eventually be put to work inside Google’s search engine, virtual assistant, and workplace apps. Yet despite all that dazzling promise, it’s unclear how to reliably control these new AI wordsmiths.
Google’s LaMDA, or Language Model for Dialogue Applications, is an example of what machine learning researchers call a large language model. The term is used to describe software that builds up a statistical feeling for the patterns of language by processing huge volumes of text, usually sourced online. LaMDA, for example, was initially trained with more than a trillion words from online forums, Q&A sites, Wikipedia, and other webpages. This vast trove of data helps the algorithm perform tasks like generating text in different styles, interpreting new text, or functioning as a chatbot. And these systems, if they work, won’t be anything like the frustrating chatbots you use today. Right now Google Assistant and Amazon’s Alexa can only perform certain pre-programmed tasks and deflect when presented with something they don’t understand. What Google is now proposing is a computer you can actually talk to.
Chat logs released by Google show LaMDA can—at least at times—be informative, thought-provoking, or even funny. Testing the chatbot prompted Google vice president and AI researcher Blaise Agüera y Arcas to write a personal essay last December arguing the technology could provide new insights into the nature of language and intelligence. “It can be very hard to shake the idea that there’s a ‘who,’ not an ‘it’, on the other side of the screen,” he wrote.
Pichai made clear when he announced the first version of LaMDA last year, and again on Wednesday, that he sees it as a potential path to voice interfaces vastly more capable than often frustratingly limited services like Alexa, Google Assistant, and Apple’s Siri. Now Google’s leaders appear to be convinced they may have finally found the path to creating computers you can genuinely talk with.
At the same time, large language models have proven fluent in talking dirty, nasty, and plain racist. Scraping billions of words of text from the web inevitably sweeps in a lot of unsavory content. OpenAI, the company behind language generator GPT-3, has reported that its creation can perpetuate stereotypes about gender and race, and asks customers to implement filters to screen out unsavory content.
LaMDA can talk toxic, too. But Pichai said that Google can tame the system if more people chat with it and provide feedback. Internal testing with thousands of Google employees has already reduced LaMDA’s propensity to make inaccurate or offensive statements, he said.
Pichai presented Google’s forthcoming AI Test Kitchen app as a way for outsiders to help Google continue that sanitization project, while also testing ideas about how to turn an advanced but occasionally off-kilter chatbot into a product. Google has not said when the app will be released, or who will get access first.
The app will initially include three different experiences powered by LaMDA. “Each is meant to give you a sense of what it might be like to have LaMDA in your hands, and use it for things you care about,” Pichai said.
One of those demos has the bot pose as an interactive storyteller, asking a user to complete the prompt “Imagine I’m at…” It responds with a fictional description of a scene and can elaborate on it in response to follow-up questions. Another is a version of LaMDA tuned to talk obsessively about dogs, in a test of Google’s ability to keep the chatbot on a specific topic.
The app’s third offering is an enhanced to-do list. In a live demo Wednesday, a Google employee tapped out “I want to plant a vegetable garden.” LaMDA produced a six-point list of steps towards that goal. The app displayed a warning: “May give inaccurate/inappropriate information.” Tapping on the list item that read “Research what grows well in your area” prompted LaMDA to list sub-steps such as “See what grows in your neighbors’ yards.”
Gathering feedback on how those three demos perform should help improve LaMDA, but it is unclear whether such feedback can fully tame a system like this, says Percy Liang, director of Stanford’s Center for Foundation Models, which was created last year to study large-scale AI systems such as LaMDA. Liang likens AI experts’ existing techniques for controlling large language models to engineering with duct tape. “We have this thing that’s very powerful but when we use it we discover these gaping problems and we patch them,” Liang says. “Maybe if you do this enough times you’ll get to something really good or maybe there will always be holes in the system.”
Given the many unknowns about large language models and the potential for powerful but flawed chatbots to cause trouble, Google should consider inviting outsiders to do more than just try limited demos of LaMDA, says Sameer Singh, a fellow at the Allen Institute for AI and a professor at the University of California, Irvine. “There has to be more conversation about how they’re making this safe and testing so outsiders can contribute to those efforts,” he says.
Pichai said that Google would be consulting social scientists and human rights experts about LaMDA without specifying what access or input they might have. He said the project would follow Google’s AI principles, a set of guidelines introduced in 2018 after thousands of Google employees protested the company’s work on a Pentagon project to use AI to interpret drone surveillance footage.
Pichai did not mention a more recent and relevant scandal that Singh says adds reasons for Google to be careful and transparent as it productizes LaMDA. In late 2020, managers at the company objected to in-house researchers contributing to a research paper raising concerns about the limitations of large language models, including that they can generate offensive text. Two researchers, Timnit Gebru and Margaret Mitchell, were forced out of Google but the paper that triggered the dispute was later presented at a peer-reviewed conference. One day you may be able to ask Google’s LaMDA to summarize the document’s key points—if you trust it to do so.