INTRODUCTION TO NEURAL NETWORK HACKING

May 22, 2020

Sogeti Labs

The principles of artificial neural networks have been around since the 1950s, but only in the last decade have they truly revolutionized the world of AI, thanks to the availability of huge datasets and computing power.

The last decade has also seen significant growth in cyber threats, including large-scale organized groups and state-sponsored actors with considerable financial and technical resources. Neural networks can be a valuable target for these attackers, given their growing use in critical applications and an inherent property of their design that enables a class of exploits called adversarial attacks.

Hacking neural networks

A neural network is a complex system that takes an array of numbers as an input and computes another array from it. The nature of the input/output vectors depends on the task. Here are a few examples:

“Does this picture contain a cat?”

A network answering this question takes an image (a 2D array of pixel values) as its input and outputs a single number between 0% and 100% that represents the estimated probability that a cat appears in the image.

“What language is spoken in this audio?”

In this case, the neural network is fed a time-frequency spectrum array, and produces a vector of detection scores for a set of predetermined languages.
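The two examples above can be sketched in a few lines of Python. This is only an illustration of the "array in, array out" idea: the single-layer models, the weights, the tiny 2x2 image and the three-language list are all made-up assumptions, far smaller than any real network.

```python
import math

def dense(inputs, weights, biases):
    # One fully connected layer: each output is a weighted sum of all inputs.
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def sigmoid(z):
    # Squashes any number into the range (0, 1), read here as a probability.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    # Turns a list of raw scores into positive values that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# "Does this picture contain a cat?"
# Hypothetical 2x2 grayscale image, flattened to 4 pixel values in [0, 1].
image = [0.1, 0.8, 0.7, 0.2]
cat_weights = [[1.5, -2.0, 0.5, 1.0]]   # made-up values, not a trained model
cat_bias = [0.1]
cat_probability = sigmoid(dense(image, cat_weights, cat_bias)[0])

# "What language is spoken in this audio?"
# Hypothetical 3-value spectrum and one score per candidate language.
languages = ["English", "French", "Dutch"]
spectrum = [0.3, 0.9, 0.4]
language_weights = [[0.2, 1.0, -0.5],
                    [1.1, -0.3, 0.4],
                    [-0.6, 0.5, 0.9]]   # made-up values
language_biases = [0.0, 0.1, -0.1]
language_scores = softmax(dense(spectrum, language_weights, language_biases))

print(f"cat probability: {cat_probability:.1%}")
for name, score in zip(languages, language_scores):
    print(f"{name}: {score:.1%}")
```

A real network stacks many such layers with non-linear functions in between, but the contract stays the same: numbers go in, numbers come out.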

Artificial neural networks have an important property called differentiability: the output varies smoothly with the input, so a small modification of the input only results in a slightly different output, and the direction in which each input value pushes the output (the gradient) can be computed. Hackers can exploit this property by carefully applying many imperceptible modifications to their input, each of which pushes the network’s output in the same direction. When all these modifications are applied to the input data together, their effects add up and the network produces an unexpected result.
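A minimal sketch of that accumulation effect follows. It uses a gradient-sign perturbation, a well-known technique that the article itself does not name, applied to a hypothetical single-layer "cat detector" with 1000 inputs. The weights, the bias, the uniform gray image and the budget of 0.004 per pixel are all assumptions chosen so the arithmetic is easy to follow.

```python
import math

def predict_cat(pixels, weights, bias):
    # Toy model: weighted sum of the pixels squashed into a probability.
    z = sum(x * w for x, w in zip(pixels, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def gradient_wrt_input(pixels, weights, bias):
    # For this model, d(probability)/d(pixel_i) = p * (1 - p) * weight_i.
    p = predict_cat(pixels, weights, bias)
    return [p * (1.0 - p) * w for w in weights]

def sign(value):
    return (value > 0) - (value < 0)

def perturb(pixels, weights, bias, epsilon):
    # Move every pixel by at most epsilon in the direction that lowers
    # the cat probability, then clip back to the valid range [0, 1].
    grad = gradient_wrt_input(pixels, weights, bias)
    return [min(1.0, max(0.0, x - epsilon * sign(g)))
            for x, g in zip(pixels, grad)]

# Hypothetical model and input (not a trained network).
n_pixels = 1000
weights = [0.5 if i % 2 == 0 else -0.5 for i in range(n_pixels)]
bias = 1.0
image = [0.5] * n_pixels

original = predict_cat(image, weights, bias)
adversarial_image = perturb(image, weights, bias, epsilon=0.004)
adversarial = predict_cat(adversarial_image, weights, bias)
largest_change = max(abs(a - b) for a, b in zip(image, adversarial_image))

print(f"original cat probability:    {original:.2f}")
print(f"adversarial cat probability: {adversarial:.2f}")
print(f"largest change to any pixel: {largest_change:.4f}")
```

With these made-up numbers, no pixel moves by more than 0.004 on a 0 to 1 scale, yet the 1000 nudges add up and the cat probability drops from about 0.73 to about 0.27. Attacks on real networks obtain the gradient through automatic differentiation rather than a hand-written formula, but the principle is the same.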

Take the following two images:

As a human, you see absolutely no difference between the two, both of which clearly show a cat. However, some pixels in the image on the right have been altered by a very small amount (less than 0.5%), which is more than enough to fool a targeted neural network into classifying this innocent kitten as a dangerous snake.

As you can imagine, adversarial attacks are a growing concern for makers of machine learning applications, where being targeted by a malicious actor can have harmful consequences. Mitigating the risk is hard, because differentiability is essential to the neural network: training relies on it.

Adversarial attacks in the wild

Several of these attacks have been successfully conducted on real-world systems. Thankfully, the attackers were well-intentioned researchers who reported the vulnerabilities to the manufacturers and performed the exploits only as proofs of concept.

One major example was published in 2018 by Kevin Eykholt, Ivan Evtimov and colleagues regarding automated recognition of road signs. With an 87.5% success rate, they managed to trick a state-of-the-art neural network into classifying a stop sign as a speed limit sign by placing four stickers at carefully selected positions.

The future of adversarial attacks

The history of adversarial attacks is very recent, with the first research papers published in 2014. Efforts to develop defenses have been under way since 2017, but each one adds development time and maintenance complexity for whoever builds the model. As security is often the first feature sacrificed under a tight budget, we can expect most models to remain vulnerable until a universal and simple solution is found.

All in all, threats associated with neural networks will keep growing at an even faster pace than neural networks themselves. Models therefore have to be treated as security-critical assets from the development stage onward in order to avoid costly incidents.

About the author

SogetiLabs gathers distinguished technology leaders from around the Sogeti world. It is an initiative explaining not how IT works, but what IT means for business.


