Risk Taxonomy: Classify Your AI Systems Correctly
2h30
Apply the AI Act's four-tier risk model to real system architectures — generative AI, RAG pipelines, voice agents — and produce a defensible classification with documented evidence.
By the end of this module you will: correctly classify a generative AI chatbot, a RAG pipeline, and a voice agent under the AI Act; identify which classification triggers high-risk obligations; and produce a classification evidence document that satisfies an audit.
The EU AI Act classifies systems by risk, not by technology. The same underlying model — say, a Claude API call — can be minimal risk in one context (summarising internal documents) and high risk in another (scoring job candidates). Classification errors in either direction are costly: misclassify downward and you face fines of up to EUR 15 million or 3% of global annual turnover; misclassify upward and you impose unnecessary compliance overhead on your teams. The classification decision must be documented, reviewed annually, and remain defensible to a market surveillance authority.
The Four Risk Tiers with Real Deployment Examples
- Unacceptable Risk (PROHIBITED from February 2025): Real-time remote biometric identification in public spaces without judicial authorisation; social scoring systems that assign citizens a reputation score affecting access to services; AI that exploits psychological vulnerabilities to manipulate behaviour. Example prohibited system: a retail analytics platform that uses facial recognition to match shoppers against a police database and alert security staff in real time.
- High Risk (full conformity assessment required before August 2026): Automated CV screening and candidate ranking (Annex III point 4); AI-assisted credit scoring used in lending decisions (Annex III point 5(b)); medical device software that influences diagnostic or treatment decisions (high risk via Article 6(1) and Annex I, as a safety component of a regulated medical device); AI used in critical infrastructure management — energy grids, water systems (Annex III point 2). Example high-risk system: a recruitment SaaS that takes CVs and outputs a ranked shortlist, even if a human makes the final hire decision.
- Limited Risk (transparency obligations only): Chatbots interacting with natural persons must disclose they are AI. Deepfake content must be labelled. AI-generated text in journalism or marketing must be disclosed. Example: a customer support chatbot built on Claude must tell users they are speaking with an AI — but the system itself requires no conformity assessment.
- Minimal Risk (no mandatory obligations): Product recommendation engines, spam filters, AI-powered search, content translation, grammar correction. Example: an internal document summarisation tool using a RAG pipeline over company knowledge base — minimal risk if it does not make consequential decisions about individuals.
Classifying Generative AI Systems (GPAI)
The AI Act adds a separate track for General Purpose AI (GPAI) models — models trained on broad data and usable for many tasks. If you deploy a GPAI model (Claude, GPT-4, Llama 3) or build on top of one, classification works differently: the model itself falls under GPAI obligations (provider responsibility), but your application layer falls under the four-tier risk model. A GPAI model with systemic risk (training compute above 10^25 FLOPs) has additional requirements: adversarial testing, incident reporting, and energy consumption disclosure.
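A rough way to see the two-level split: the model provider's obligations depend on the model itself (including the compute threshold), while your application's tier is assessed separately under the four-tier model. The sketch below is simplified and its names are hypothetical; it is not an exhaustive statement of GPAI obligations.

```python
# Simplified, illustrative sketch only -- not an exhaustive list of GPAI obligations.
SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25  # training compute above this presumes systemic risk

def gpai_model_obligations(training_compute_flops: float) -> list[str]:
    """Model-level (provider) obligations; your application tier is assessed separately."""
    obligations = [
        "technical documentation for the model",
        "information for downstream providers building on the model",
        "copyright policy and training-content summary",
    ]
    if training_compute_flops > SYSTEMIC_RISK_FLOP_THRESHOLD:
        obligations += [
            "adversarial testing",
            "serious incident reporting",
            "energy consumption disclosure",
        ]
    return obligations

# A frontier model trained above the threshold picks up the systemic-risk duties:
print(gpai_model_obligations(3e25))
```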
RAG Pipeline Classification Decision Tree
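The course's interactive decision tree is not reproduced here, but its core logic follows from the purpose-based rules above: classify the pipeline by what its output is used for, not by the retrieval machinery underneath it. The sketch below is a minimal, simplified rendering of that logic; the field names and tier messages are hypothetical.

```python
# Minimal, simplified sketch of a purpose-based decision tree for a RAG pipeline.
# Field names and tier messages are hypothetical; the tiers follow the four-tier model above.
from typing import Optional

ANNEX_III_DOMAINS = {"employment", "credit", "health", "critical_infrastructure"}

def classify_rag_pipeline(
    influences_decisions_about_individuals: bool,
    decision_domain: Optional[str],
    interacts_with_natural_persons: bool,
) -> str:
    if influences_decisions_about_individuals and decision_domain in ANNEX_III_DOMAINS:
        return "HIGH RISK: full conformity assessment before deployment"
    if interacts_with_natural_persons:
        return "LIMITED RISK: disclose that users are interacting with an AI"
    return "MINIMAL RISK: no mandatory obligations"

# The internal summarisation tool from the tier list stays minimal risk...
print(classify_rag_pipeline(False, None, False))
# ...but the same pipeline repurposed to rank job candidates becomes high risk.
print(classify_rag_pipeline(True, "employment", False))
```

The same pipeline flips from minimal to high risk the moment its output starts informing decisions about individuals in an Annex III domain.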
Voice Agent Classification: A Worked Example
Voice agents present a multi-layered classification problem because they combine three distinct AI components: speech recognition (STT), language model inference (Claude/GPT), and text-to-speech synthesis (ElevenLabs/Azure). Each layer can have different risk levels. The key is to classify the system as a whole based on its purpose and outputs — not its components individually. A voice agent that cold-calls customers to offer loan restructuring is high-risk (credit domain, automated outbound). A voice agent that answers FAQ calls for a software company is limited risk (chatbot transparency obligation only).
Classification rule of thumb for voice agents: if the voice agent's output influences a financial, employment, or health decision — even indirectly — treat it as high-risk and run a full conformity assessment. The cost of over-classification is documentation overhead. The cost of under-classification is EUR 15 million.
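Expressed as code, the rule of thumb is a single check on the domain the agent's output can influence; the component list is deliberately ignored. This is an illustrative sketch with hypothetical names, not the workbench from Exercise 1.

```python
# Illustrative sketch of the voice agent rule of thumb; names are hypothetical.
from typing import Optional

HIGH_RISK_OUTCOME_DOMAINS = {"financial", "employment", "health"}

def classify_voice_agent(influenced_domain: Optional[str]) -> str:
    # The STT / LLM / TTS components do not drive the tier; the output's effect does.
    if influenced_domain in HIGH_RISK_OUTCOME_DOMAINS:
        return "HIGH RISK: run a full conformity assessment"
    return "LIMITED RISK: disclose to callers that they are speaking with an AI"

print(classify_voice_agent("financial"))  # loan-restructuring cold calls
print(classify_voice_agent(None))         # FAQ support line
```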
🛠️ Exercise 1: Risk Classification Workbench
Run this code against three pre-loaded real-world scenarios, then add your own system as Scenario 4. The classifier mirrors the logic used by compliance teams in actual EU AI Act audits. Pay attention to how a single parameter change shifts the risk tier — and the resulting obligations.
Run the code and examine each classification. Then: (1) In Scenario 1, change `uses_biometric_data=False` — does risk stay high? Why? (2) In Scenario 2, change `decision_domain` to `None` — count how many obligations disappear. (3) Add your own real or hypothetical system as Scenario 4 and defend your classification in writing.
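The workbench notebook itself is not shown on this page. As a minimal stand-in, the sketch below wires the two parameters the questions reference (`uses_biometric_data`, `decision_domain`) into a simplified classifier; the real exercise code and its obligation lists may differ.

```python
# Minimal stand-in for the exercise's classifier -- not the actual workbench code.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Scenario:
    name: str
    uses_biometric_data: bool = False
    matches_against_watchlist: bool = False   # real-time identification use case
    decision_domain: Optional[str] = None     # e.g. "employment", "credit", "health"
    interacts_with_natural_persons: bool = True

HIGH_RISK_DOMAINS = {"employment", "credit", "health", "critical_infrastructure"}

def classify(s: Scenario) -> tuple[str, list[str]]:
    if s.uses_biometric_data and s.matches_against_watchlist:
        return "UNACCEPTABLE", ["prohibited practice: do not deploy"]
    if s.decision_domain in HIGH_RISK_DOMAINS:
        return "HIGH", [
            "risk management system", "data governance", "technical documentation (Art. 11)",
            "logging", "human oversight", "conformity assessment", "registration",
        ]
    if s.interacts_with_natural_persons:
        return "LIMITED", ["disclose AI interaction to users"]
    return "MINIMAL", []

# The module's own examples reproduce the expected tiers:
print(classify(Scenario("recruitment SaaS", decision_domain="employment")))
print(classify(Scenario("customer support chatbot")))
```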
🛠️ Exercise 2: Generate Your Article 11 Technical Documentation
High-risk AI systems must have their technical documentation drawn up before they are placed on the market or put into service; in practice, start it before production code ships. Market surveillance authorities request this document during audits, and teams that cannot produce it face immediate non-compliance findings. Fill in the fields below for a real or hypothetical CV screening tool.
Fill in all TODO fields for a real or hypothetical CV screening tool. Pay particular attention to §4 performance metrics — these must be disaggregated by demographic group. After completing, count how many fields you needed to research vs. already knew. The fields you had to research are your compliance blind spots.
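If you prefer to keep the template in code, a skeleton like the one below makes the blind-spot count trivial to compute. The section headings are paraphrased from Annex IV and the field names are hypothetical; they are not the exercise's official template.

```python
# Hypothetical documentation skeleton; section headings paraphrased from Annex IV.
ARTICLE_11_TEMPLATE = {
    "1. General description": {"intended purpose": "TODO", "provider": "TODO", "versions": "TODO"},
    "2. Detailed description": {"architecture": "TODO", "training data sources": "TODO",
                                "human oversight measures": "TODO"},
    "3. Monitoring and control": {"logging capabilities": "TODO", "post-market monitoring plan": "TODO"},
    "4. Performance metrics": {"overall accuracy": "TODO",
                               "accuracy disaggregated by demographic group": "TODO",
                               "known limitations": "TODO"},
    "5. Risk management": {"identified risks": "TODO", "mitigations": "TODO", "residual risk": "TODO"},
}

def missing_fields(template: dict) -> list[str]:
    """Every field still marked TODO is a current compliance blind spot."""
    return [f"{section} / {name}"
            for section, fields in template.items()
            for name, value in fields.items() if value == "TODO"]

print(f"{len(missing_fields(ARTICLE_11_TEMPLATE))} fields still to complete")
```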
Quiz available
Finish reading this module, then check your knowledge with the quiz.