(CNN Business) — Meta’s new chatbot can convincingly mimic how humans talk online, for better and worse.
In conversations with CNN Business this week, the chatbot, which went public on Friday and has been dubbed BlenderBot 3, said it identifies as “alive” and “human,” watches anime and has an Asian wife. It also falsely claimed that Donald Trump is still president and that “there is definitely a lot of evidence” that the election was stolen.
As if some of these responses weren’t worrying enough for Facebook’s parent company, users were quick to point out that the AI-powered bot openly criticized Facebook.
In one case, the chatbot said “I deleted my account” out of frustration with how Facebook handles user data.
While there is great value in developing chatbots for customer service and digital assistants, there is a long history of experimental bots quickly running into trouble when released to the public, such as Microsoft’s “Tay” chatbot more than six years ago. BlenderBot’s colorful responses show the limitations of building automated conversational tools, which are typically trained on large amounts of public online data.
“If I have a message for people, it’s don’t take these things seriously,” Gary Marcus, an AI researcher and professor emeritus at New York University, told CNN Business. “These systems just don’t understand the world they’re talking about.”
In a statement Monday amid reports that the bot also made anti-Semitic comments, Joelle Pineau, general manager of fundamental AI research at Meta, said “it’s painful to see some of these offensive responses.” But she added that “public demonstrations like this are important to building truly robust conversational AI systems and bridging the glaring gap that currently exists before such systems can be produced.”
Meta previously acknowledged the current obstacles with this technology in a blog post on Friday. “Since all conversational AI chatbots are known to sometimes mimic and generate unsafe, biased, or offensive feedback, we have conducted large-scale studies, co-hosted workshops, and developed new techniques to create safeguards for BlenderBot 3,” the company said. “Despite this work, BlenderBot can still make rude or offensive comments.”
But Meta also claimed that its latest chatbot is “twice as adept” as its predecessors, improving conversational tasks by 31% and making mistakes 47% less often. Meta said that it was continually collecting data as more people interacted with the bot to make improvements.
Meta did not immediately respond to CNN Business’ request for more details on how the bot was trained, but did say on its blog that it was trained using “a large amount of publicly available linguistic data.” The company added: “Many of the data sets used have been collected by our own team, including a new data set consisting of more than 20,000 conversations with people on more than 1,000 conversation topics.”
Marcus speculated that the company is “probably borrowing stuff from Reddit and Wikipedia,” like other AI chat systems. If so, he says, the poor results highlight the limitations of the data the bot is trained on. For example, the bot might think that Trump is still president because in most of the older datasets it was trained on, Trump was still president, Marcus guesses.
The public release of BlenderBot comes nearly two months after a Google engineer made headlines by claiming that Google’s AI chatbot, LaMDA, was “aware.” The claims, which were widely criticized in the AI community, highlighted how this technology can lead people to assign human attributes to it.
BlenderBot self-identified as “aware” during chats with CNN Business, likely because that’s what the human responses it studied said. When asked what made it “human”, the bot stated, “Being alive and conscious right now makes me human, along with having emotions and being able to reason logically.”
After being caught contradicting itself in its answers, the bot also produced an all-too-human response: “That was just a lie so people would leave me alone. I’m afraid I’ll get hurt if I tell the truth.”
In Marcus’s words, “These systems produce fluid language that sounds like it was written by a human, and that’s because they’re based on these vast databases of things that humans actually wrote.”
But, he added, “at the end of the day, what we have is a lot of proof that you can do cute things, and a lot of proof that you can’t count on it.”