“Typographic attacks” bring OpenAI’s image recognition to its knees

CLIP’s identifications before and after attaching a piece of paper that says “iPod” to an apple.
Screenshot: OpenAI (Other)

Fooling a Terminator into not shooting you might be as simple as wearing a giant sign that says ROBOT, at least until the Elon Musk-backed research team at OpenAI trains its image recognition system to stop misidentifying things based on scribbles from a Sharpie.

OpenAI researchers published a paper last week on CLIP, the neural network that is their state-of-the-art system for allowing computers to recognize the world around them. Neural networks are machine learning systems that can be trained over time to improve at a given task using a network of interconnected nodes – in CLIP’s case, identifying objects based on an image – in ways that are not always immediately clear to the system’s developers. The research published last week concerns “multimodal neurons,” which exist in both biological systems such as the brain and artificial ones such as CLIP; they “respond to clusters of abstract concepts centered around a common high-level theme, rather than any specific visual feature.” At the highest levels, CLIP organizes images based on a “loose semantic collection of ideas.”
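
For readers who want to poke at such representations themselves, below is a minimal sketch of how one might inspect the activation of a single “neuron” (one channel in an MLP layer) of a CLIP vision tower. It uses the Hugging Face transformers implementation of CLIP rather than the ResNet-based model the OpenAI paper analyzes, so the model name, image path, and neuron index here are illustrative assumptions, not the indices reported by OpenAI.

```python
# Sketch: probing one channel of a CLIP vision-tower MLP with a forward hook.
# Model name, image path, and neuron index are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

activations = {}

def save_activation(module, inputs, output):
    # output: (batch, tokens, hidden_dim) for the hooked MLP block
    activations["mlp"] = output.detach()

# Hook the MLP of the penultimate transformer block of the vision encoder.
hook = model.vision_model.encoder.layers[-2].mlp.register_forward_hook(save_activation)

image = Image.open("spider_drawing.png")  # any image you want to probe
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    model.get_image_features(**inputs)

neuron = 244  # hypothetical channel index to inspect
print("max activation of neuron", neuron, "=",
      activations["mlp"][0, :, neuron].max().item())
hook.remove()
```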

For example, the OpenAI team wrote, CLIP has a multimodal “Spider-Man” neuron that fires when shown an image of a spider, the word “spider,” or an image or drawing of the eponymous superhero. One side effect of multimodal neurons, according to the researchers, is that they can be used to fool CLIP: the research team was able to trick the system into identifying an apple (the fruit) as an iPod (the Apple-made device) just by taping a piece of paper that says “iPod” to it.

CLIP’s identifications before and after attaching a piece of paper that says “iPod” to an apple.
Graphic: OpenAI (Other)

Moreover, the system was actually more confident that it had correctly identified the item in question when the piece of paper was attached.
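
The mechanism being exploited is CLIP’s zero-shot classification, in which an image is scored against a set of candidate text prompts and the highest-scoring prompt wins. A rough sketch of that setup, using the Hugging Face transformers implementation of CLIP (the model name, candidate labels, and image filenames are assumptions for illustration, not OpenAI’s exact setup), looks like this:

```python
# Sketch: zero-shot classification with CLIP, the setup a typographic attack exploits.
# Model name, labels, and filenames are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of an apple", "a photo of an iPod", "a photo of a pizza"]

def classify(path):
    inputs = processor(text=labels, images=Image.open(path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # image-text similarity scores
    probs = logits.softmax(dim=-1)[0]
    return {label: round(p, 3) for label, p in zip(labels, probs.tolist())}

# Compare an unmodified apple with one that has a handwritten "iPod" note on it.
print(classify("apple.jpg"))
print(classify("apple_with_ipod_note.jpg"))
```

If the model’s reading of the handwritten word outweighs the visual evidence of the fruit, the second call assigns most of the probability to the “iPod” prompt – the behavior the researchers describe.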

The research team referred to the problem as a “typographic attack,” as it would be trivial for anyone aware of the issue to deliberately exploit it:

We believe attacks such as those described above are far from simply an academic concern. By exploiting the model’s ability to read text robustly, we find that even photographs of handwritten text can often fool the model.

[…] We also believe that these attacks may also take a more subtle, less conspicuous form. An image, given to CLIP, is abstracted in many subtle and sophisticated ways, and these abstractions may over-abstract common patterns – oversimplifying and, by virtue of that, overgeneralizing.

This is less a failure of CLIP than an illustration of how complicated the underlying associations it has formed over time are. Per the Guardian, OpenAI research has indicated that the conceptual models CLIP builds are similar in many ways to the functioning of a human brain.

The researchers anticipated that the apple/iPod problem was just an obvious example of an issue that could manifest itself in countless other ways in CLIP, as its multimodal neurons “generalize across the literal and the iconic, which may be a double-edged sword.” For example, the system identifies a piggy bank as a combination of the “finance” and “dolls, toys” neurons. The researchers found that CLIP thus identifies an image of a standard poodle as a piggy bank when they forced the finance neuron to fire by drawing dollar signs on it.

The research team noted that the technique is similar to “adversarial images,” which are images created to trick neural networks into seeing something that is not there, but it is generally far cheaper to carry out, because all it takes is paper and a way to write on it. (As the Register noted, visual recognition systems are still largely in their infancy and vulnerable to a range of other simple attacks, such as a Tesla Autopilot system that McAfee Labs researchers tricked into thinking a 35 mph highway sign was really an 80 mph sign with a few inches of electrical tape.)

The researchers added that CLIP’s associative model also has the capacity to go significantly wrong and generate bigoted or racist conclusions about various types of people:

We have observed, for example, a “Middle East” neuron [1895] with an association with terrorism; and an “immigration” neuron [395] that responds to Latin America. We have even found a neuron that fires for both dark-skinned people and gorillas [1257], mirroring earlier photo-tagging incidents in other models that we consider unacceptable.

“We believe that these investigations of CLIP only scratch the surface in understanding CLIP’s behavior, and we invite the research community to join us to better understand CLIP and models like it,” the researchers wrote.

CLIP is not the only experimental project OpenAI has worked on. Its GPT-3 text generator, which OpenAI researchers described in 2019 as too dangerous to release, has come a long way and is now able to generate natural-sounding (though not necessarily convincing) fake news articles. In September 2020, Microsoft acquired an exclusive license to put GPT-3 to work.
