Attacking AI by using its own weight matrices against it
Just saw this: https://identi.ca/renatocan/note/NAtsLZcKSiqSqXBvXITsvg
"Slight Street Sign Modifications Can Completely Fool Machine Learning Algorithms http://spectrum.ieee.org/cars-that-think/transportation/sensors/slight-street-sign-modifications-can..."
I was just talking to a smart bunch of SMU students about this yesterday. It's a subtle but powerful attack. At a recent meeting of machine learning scientists, it became clear to everyone in the audience that, without a theory of learning (a real mathematical framework explaining how to optimally design architectures for a given goal) or an alternative to gradient-descent-based learning updates, this attack currently cannot be defended against automatically. It's a trial-and-error race to understand why it works and how to protect against it.
This attack is not new, and it was inevitable that someone would find a way to slightly alter street signs to mess with the image-recognition net. Now the arms race is on. A theory of learning, or an alternative to gradient descent, would greatly aid the defenders.
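To make the "using its own weight matrices against it" point concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one standard adversarial attack of this family. Everything here is illustrative: the toy linear-softmax model, the weights, and the epsilon value are all made up for the example; real attacks target deep networks, but the mechanism is the same — backpropagate the loss gradient through the model's own weights to the *input*, then nudge the input in the direction that increases the loss.

```python
import numpy as np

# Hypothetical toy model: a single linear layer + softmax classifier.
# The FGSM-style attack perturbs the input along the sign of the loss
# gradient, which is computed using the model's own weight matrix W.

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3))   # weight matrix: 10 input features -> 3 classes
b = np.zeros(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_and_input_grad(x, y):
    """Cross-entropy loss and its gradient w.r.t. the *input* x."""
    p = softmax(W.T @ x + b)
    loss = -np.log(p[y])
    dz = p.copy()
    dz[y] -= 1.0            # dL/dz = p - onehot(y)
    return loss, W @ dz     # chain rule back through z = W.T x + b

x = rng.normal(size=10)                    # a clean input
y = int(np.argmax(softmax(W.T @ x + b)))   # model's current prediction

eps = 0.5                                  # illustrative perturbation budget
_, g = loss_and_input_grad(x, y)
x_adv = x + eps * np.sign(g)               # one FGSM step: push loss uphill

print("clean prediction:      ", y)
print("adversarial prediction:", int(np.argmax(softmax(W.T @ x_adv + b))))
```

Because the loss of this toy model is convex in the input, the single FGSM step is guaranteed to increase it; on a deep net the increase is only first-order, which is part of why defending against such perturbations is so hard without a deeper theory.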
>> Charles Stanhope:
“This sort of approach may be an enhancement: "Neurogenesis model beats deep learning in changeable environments".”
This is neat. Indeed, when we discussed this at a recent particle physics and machine learning workshop, I raised the point that human neural nets are not so easy to fool. But we do get fooled. Think of the famous optical illusion that can look either like two faces about to kiss or like a chalice, depending on how you look at it. We can certainly confuse our wetware. But ideas like the one you shared seem crucial for resilience.