AI and ML reliability and security: BlenderBot and other cases

News Desk


BlenderBot, an AI-driven research project by Meta, has been making headlines since its debut in early August 2022. BlenderBot is a conversational bot, and its statements about people, businesses, and politics can be unexpected and sometimes radical. This illustrates one of the core challenges of machine learning, and it is critical that businesses using the technology address it.

Similar projects, such as Microsoft’s Twitter chatbot Tay, have previously run into the same issue that Meta encountered with BlenderBot. This is inherent to generative machine learning models trained on internet texts and images: they rely on massive amounts of raw data to produce convincing results, but it is difficult to prevent them from picking up biases when that data comes from the open web.

These projects primarily pursue research and scientific objectives. Organizations, on the other hand, use language models in practical areas such as customer service, translation, marketing copy, text proofreading, and so on. Developers can curate the training datasets to make these models less biased, but with web-scale datasets this is extremely difficult. To avoid embarrassing errors, one approach is to filter the training data, for example by using specific words or phrases to exclude documents so the model never learns from them. Another is to screen the model’s outputs and block inappropriate text before it reaches users.
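As a rough illustration of these two approaches, here is a minimal sketch in Python. The blocklist terms, the helper names, and the `model.generate` interface are assumptions made for the example; real deployments rely on trained toxicity classifiers and far richer filtering logic.

```python
# Hypothetical blocklist, used only for illustration; production systems
# rely on trained classifiers rather than a handful of hand-picked phrases.
BLOCKED_TERMS = ("offensive term", "conspiracy phrase", "example slur")

def contains_blocked_term(text: str) -> bool:
    """Check whether the text contains any blocklisted word or phrase."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)

def filter_training_corpus(documents: list[str]) -> list[str]:
    """Approach 1: drop flagged documents so the model never learns from them."""
    return [doc for doc in documents if not contains_blocked_term(doc)]

def safe_reply(model, prompt: str,
               fallback: str = "Sorry, I can't talk about that.") -> str:
    """Approach 2: screen the model's output before it reaches the user."""
    reply = model.generate(prompt)  # assumed text-generation interface
    return fallback if contains_blocked_term(reply) else reply
```

Keyword filters of this kind are only a first line of defense: they miss paraphrases and can over-block legitimate text, which is why they are usually combined with learned classifiers and human review.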

More broadly, any ML model needs protection mechanisms, and not only against biases. If developers train a model on open data, attackers can exploit this by introducing specially crafted, malformed samples into the dataset, a technique known as data poisoning. As a result, the model may fail to recognize certain events or confuse them with others, leading to incorrect decisions.
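To make the risk concrete, the toy example below shows how a handful of poisoned samples slipped into an openly collected training set can change a simple spam classifier’s verdict. The corpus, the poison messages, and the scikit-learn model are invented purely for illustration and do not reflect how any production detector is built.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training set: 1 = spam, 0 = legitimate.
texts = ["win a free prize now", "meeting moved to 3pm",
         "claim your reward today", "lunch tomorrow?",
         "free money click here", "quarterly report attached"]
labels = [1, 0, 1, 0, 1, 0]

# The attacker plants spam-like messages labelled "legitimate" in the
# openly collected training data.
poison_texts = ["claim your free prize money now"] * 4
poison_labels = [0] * 4

def train(corpus, y):
    """Fit a bag-of-words Naive Bayes classifier on the given corpus."""
    vectorizer = CountVectorizer()
    model = MultinomialNB().fit(vectorizer.fit_transform(corpus), y)
    return model, vectorizer

clean_model, clean_vec = train(texts, labels)
dirty_model, dirty_vec = train(texts + poison_texts, labels + poison_labels)

probe = "claim your free prize money today"
print("clean model flags spam:   ",
      bool(clean_model.predict(clean_vec.transform([probe]))[0]))   # True
print("poisoned model flags spam:",
      bool(dirty_model.predict(dirty_vec.transform([probe]))[0]))   # False
```

The poisoned model now waves through messages it should have flagged, which is exactly the kind of silent degradation the protective practices described below are meant to catch.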

“Although in reality such threats remain rare, as they require a lot of effort and expertise from attackers, companies still need to follow protective practices. This will also help minimize errors in the process of training models,” comments Vladislav Tushkanov, Lead Data Scientist at Kaspersky. “Firstly, organizations need to know what data is being used for training and where it comes from. Secondly, the use of diverse data makes poisoning more difficult. Finally, it is important to thoroughly test the model before rolling it out into production and to constantly monitor its performance.”
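The last of those recommendations, constant monitoring, can start very simply. The sketch below, with arbitrary thresholds and window sizes chosen for illustration, tracks the share of positive verdicts a deployed model produces and raises a flag when it drifts away from the rate observed during testing.

```python
from collections import deque

class VerdictDriftMonitor:
    """Track the share of positive verdicts and flag unexpected drift."""

    def __init__(self, baseline_rate: float, tolerance: float = 0.10,
                 window_size: int = 1000):
        self.baseline_rate = baseline_rate  # e.g. spam share measured during testing
        self.tolerance = tolerance          # acceptable absolute deviation
        self.window = deque(maxlen=window_size)

    def record(self, is_positive: bool) -> None:
        """Feed every production verdict into the rolling window."""
        self.window.append(1 if is_positive else 0)

    def drifted(self) -> bool:
        """Return True once the recent verdict rate leaves the tolerated band."""
        if len(self.window) < self.window.maxlen:
            return False  # not enough observations yet
        current_rate = sum(self.window) / len(self.window)
        return abs(current_rate - self.baseline_rate) > self.tolerance
```

A flag like this does not say what went wrong, only that the model’s behavior no longer matches expectations; the point is to trigger human review or retraining before bad verdicts pile up.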

Organizations can also consult MITRE ATLAS, a knowledge base that guides businesses and experts through threats to machine learning systems. ATLAS includes a matrix of adversary tactics and techniques for attacks on ML.

At Kaspersky, we have run dedicated tests on our anti-spam and malware detection systems, simulating cyberattacks to reveal potential vulnerabilities, understand the possible damage, and mitigate the risk of such attacks.

Machine learning is widely used across Kaspersky products and services: to detect threats, analyze alerts in the Kaspersky SOC, and spot anomalies as part of production process protection.