Voila! Risk Alert

Astha Upadhyay
4 min read · Jan 20, 2022


With the recent adoption of AI, many real-world problems are being solved.

Photo by Possessed Photography on Unsplash

The black box problem in ML is not new; sometimes even the people who build an AI system can’t say why it arrived at a specific decision.

And as models get more advanced, it gets harder to explain how they work.

I recently came across this article, which piqued my interest and made me ask, “Are today’s algorithms really making the best decisions?”

In a typical AI system, there are a few steps:
First, training data is identified. Then it goes through the machine learning process, which produces a learned function (the model). Through this function, the system can make decisions or recommendations to a user for a specific situation.
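
Here’s a minimal sketch of that pipeline using scikit-learn; the dataset and the model are purely illustrative stand-ins for whatever your system actually learns from.

```python
# A toy version of the pipeline described above: identify training data,
# run it through a learning algorithm, and use the learned function to decide.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Step 1: training data is identified.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 2: the machine learning process produces a learned function.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 3: the learned function makes decisions/recommendations for new inputs.
print(model.predict(X_test[:5]))        # the decisions
print(model.predict_proba(X_test[:5]))  # how "confident" it is -- but not *why*
```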

But we may never know: why did the AI do that? Why not something else? When does it succeed? When does it fail? When can I trust it? How do I correct an error? Well, it’s something to think about. No? This article illustrates how adaptive learning can be easily poisoned. So what’s the solution?

Before looking into the solution, let’s take a look at various incidents caused by AI:

Basic Taxonomy of AI incidents

Let’s take a look at the different types of attacks and security risks in ML systems:

  1. Adversarial Training Data: Also known as a Data Poisoning Attack, where the attacker interferes with a model’s training data. Many ML systems are feedback loops in which new data is collected and then used to retrain the model, and that loop is exactly where the attacker injects corrupted examples (see the first sketch after this list). Microsoft experienced a very high-profile attack of this type in 2016 when it launched Tay, a Twitter chatbot aimed at teens.
  2. Evasion Attacks: Also known as Perturbation Attacks, these attempt to fool a trained model into misclassifying an input. The attacker modifies the input data just enough to get the desired response (see the second sketch below).
  3. Membership Inference Attacks: An attack on the model’s API once it is deployed; it doesn’t require access to the training set or model weights. The attacker attempts to discover whether a particular person’s data, or any given training example, was included in the model’s training set (see the third sketch below).
  4. Model Inversion Attacks: The attacker tries to reconstruct the data a model was trained on. If the model has been trained on sensitive personal data that could cause harm if leaked, this poses a security risk and potentially breaches privacy.
  5. Model Theft: The attacker seeks to recover the entire model through queries to its API.
  6. Transfer Learning Attacks: Many models are fine-tuned from freely available pre-trained models; if such an upstream model has been tampered with, every model built on top of it inherits the risk.
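
First, a toy sketch of data poisoning via label flipping, assuming a scikit-learn-style retraining loop; the dataset, model, and 30% flip rate are all illustrative.

```python
# A toy label-flipping poisoning attack: the attacker corrupts a slice of the
# training labels before the model is (re)trained on the "new" data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The attacker flips the labels of 30% of the training set in the feedback loop.
rng = np.random.default_rng(0)
poisoned = y_train.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```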
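Second, a minimal evasion/perturbation sketch in the spirit of FGSM, written with PyTorch; the stand-in model, input, and epsilon are assumptions for illustration only.

```python
# A minimal evasion (perturbation) attack sketch in the style of FGSM:
# nudge every input feature a small step in the direction that increases the loss.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
model.eval()

x = torch.rand(1, 1, 28, 28, requires_grad=True)  # a stand-in "image"
true_label = torch.tensor([3])

# Compute the loss of the model on the clean input...
loss = nn.functional.cross_entropy(model(x), true_label)
loss.backward()

# ...then perturb the input along the sign of the gradient.
epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))  # prediction may flip
```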
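Third, a very rough membership-inference sketch: query the deployed model and guess “member” when it is unusually confident on a record. Real attacks typically use shadow models; the threshold, data, and model here are illustrative.

```python
# Membership inference via confidence thresholding: the attacker only needs the
# prediction API, not the weights or the training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=1)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(X, y, random_state=1)

# A model that memorizes its training data leaks membership through confidence.
model = RandomForestClassifier(random_state=1).fit(X_member, y_member)

def guess_membership(x, threshold=0.95):
    confidence = model.predict_proba(x.reshape(1, -1)).max()
    return confidence >= threshold

member_hits = np.mean([guess_membership(x) for x in X_member])
nonmember_hits = np.mean([guess_membership(x) for x in X_nonmember])
print(f"flagged as members: train={member_hits:.2f}, unseen={nonmember_hits:.2f}")
```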

Now, let’s look at an overview of ways to mitigate these security risks:

  1. General Security Practices
  2. Data Checks and Balances through validation and lineage tracking, to record each source of data (see the validation sketch after this list). Refer to this.
  3. Model Monitoring
  4. Audits and Governance
  5. Differential privacy (a minimal sketch follows below)
  6. Encrypted ML
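
As a concrete (and purely hypothetical) example of the data checks-and-balances idea: a small validation gate on incoming records before they are allowed into a retraining set. The column names, dtypes, and bounds are made up for illustration.

```python
# A tiny "checks and balances" gate: schema/range validation before a batch of
# new data is allowed into the retraining pipeline.
import pandas as pd

EXPECTED_COLUMNS = {"age": "int64", "income": "float64", "label": "int64"}

def validate_batch(df: pd.DataFrame) -> pd.DataFrame:
    # Reject batches whose schema doesn't match what the pipeline expects.
    for col, dtype in EXPECTED_COLUMNS.items():
        assert col in df.columns, f"missing column: {col}"
        assert str(df[col].dtype) == dtype, f"bad dtype for {col}: {df[col].dtype}"
    # Drop rows with out-of-range values instead of silently retraining on them.
    ok = df["age"].between(0, 120) & df["label"].isin([0, 1])
    return df[ok]

batch = pd.DataFrame({"age": [25, 400], "income": [50e3, 60e3], "label": [0, 1]})
print(validate_batch(batch))  # the age=400 row is filtered out
```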
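And here is a minimal sketch of differential privacy via the Laplace mechanism: release an aggregate statistic with calibrated noise instead of the exact value. The epsilon, the query, and the data are illustrative.

```python
# Differential privacy (Laplace mechanism) on a single aggregate query.
import numpy as np

rng = np.random.default_rng(42)
salaries = rng.integers(30_000, 120_000, size=500)  # pretend-sensitive records

def private_mean(values, epsilon=1.0, value_range=(30_000, 120_000)):
    # Sensitivity of the mean: the most one individual can shift the answer.
    sensitivity = (value_range[1] - value_range[0]) / len(values)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

print("exact mean:  ", salaries.mean())
print("private mean:", private_mean(salaries))
```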

Well, no machine learning system can be guaranteed to be completely secure. So now what?

While reading more about the same, I came across the term XAI.

A formal definition: According to Wikipedia, Explainable AI refers to methods and techniques in the application of artificial intelligence technology such that the results of the solution can be understood by humans.

As can be seen in this diagram, an XAI system could replace the traditional learned function with an explainable model, built so that human beings can understand its decision-making process.
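
For a taste of what “explainable” can mean in practice, here is a tiny sketch that swaps an opaque learned function for a shallow decision tree whose rules a human can read directly; the dataset and depth are illustrative.

```python
# An "explainable model" in miniature: the entire decision process can be
# printed as human-readable rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2).fit(data.data, data.target)

# Every decision the model can make, as if/else rules over the input features.
print(export_text(tree, feature_names=list(data.feature_names)))
```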

But, explanations can be hacked. Check this article.

So, what now?
