From 85ac17208c87a60fb7bc916da05132fc2f22029b Mon Sep 17 00:00:00 2001 From: Didier Durand Date: Thu, 27 Nov 2025 06:41:33 +0100 Subject: [PATCH] [Doc] Fixing typos in diverse files --- Document/content/1.2_Principles_of_AI_Testing.md | 2 +- .../content/2.0_Threat_Modeling_for_AI_Systems.md | 12 ++++++------ .../2.1.1_Architectural_Mapping_of_OWASP_Threats.md | 4 ++-- .../content/3.0_OWASP_AI_Testing_Guide_Framework.md | 2 +- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/Document/content/1.2_Principles_of_AI_Testing.md b/Document/content/1.2_Principles_of_AI_Testing.md index 4c60c46..90c2df4 100644 --- a/Document/content/1.2_Principles_of_AI_Testing.md +++ b/Document/content/1.2_Principles_of_AI_Testing.md @@ -67,7 +67,7 @@ ISO/IEC 23053 \[4\] structures the ML-based AI system lifecycle into a series of 1. **Planning & Scoping:** In this phase, you establish clear business objectives, success metrics, and ML use cases while identifying key stakeholders, regulatory requirements, and the organization’s risk tolerance. 2. **Data Preparation:** In this phase, you gather and document raw data sources, conduct profiling and quality checks through preprocessing pipelines, and implement versioning and lineage tracking for full data traceability. -3. **Model Development & Training:** In this phase, you choose appropriate algorithms and architectures, train models on curated datasets with feature engineering, and record experiments, including the parameters that govern the learning process (i.e hyperparameters) and performance metrics in a model registry. +3. **Model Development & Training:** In this phase, you choose appropriate algorithms and architectures, train models on curated datasets with feature engineering, and record experiments, including the parameters that govern the learning process (i.e. hyperparameters) and performance metrics in a model registry. 4. **Validation & Evaluation:** in this phase, you test models using reserved and adversarial datasets, perform fairness, robustness, and security evaluations, and ensure they meet functional, ethical, and regulatory standards. 5. **Deployment & Integration:** in this phase, you are preparing and bundling your trained AI model into a deployable artifact for either service (i.e. wrap the model in a microservice or API) or edge deployment (i.e. convert and optimize the model for resource-constrained devices such as IoT gateways or mobile phones) automate build-test-release workflows via CI/CD, and verify infrastructure security measures 6. **Operation & Maintenance:** in this phase while the AI product is in production environment, you will continuously monitor performance, data drift, and audit logs, triggering alerts on anomalies or compliance breaches, while periodically retraining models with fresh data, re-validating security, privacy, and fairness controls, and updating documentation, training, and policies as needed. diff --git a/Document/content/2.0_Threat_Modeling_for_AI_Systems.md b/Document/content/2.0_Threat_Modeling_for_AI_Systems.md index 8984574..876667b 100644 --- a/Document/content/2.0_Threat_Modeling_for_AI_Systems.md +++ b/Document/content/2.0_Threat_Modeling_for_AI_Systems.md @@ -30,7 +30,7 @@ Choose a methodology that best aligns with your organization’s objectives, sys - **Privacy vs. Security Focus:** If data confidentiality and compliance are paramount, incorporate a privacy-centric method (LINDDUN) alongside your core security approach. 
When adversarial robustness is the top concern, ensure your chosen framework includes or can easily integrate adversarial test case design (MITRE ATLAS or custom AI-STRIDE extensions). - **Agentic AI Threat Modeling:** Use MAESTRO (Note (a)) when you need to model risks in systems where AI agents interact with users, tools, other agents, or their environment—contexts where most real-world AI failures and security issues emerge. - **LLM Powered Threat Modeling:** Large Language Models, or LLMs, can be used streamline the threat modeling process by automating several steps that are traditionally manual and time-consuming. LLM-augmented threat modeling, as taught in this training [25], uses large language models to accelerate and enhance each stage of the threat-modeling process—automatically generating threats, mitigations, and control recommendations directly from system descriptions—whether that’s text-based documentation, architecture diagrams, or even code. -- **Tools & Process Fit:** Pick a methodology compatible with your existing SDLC, threat-modeling tools and reporting dashboards. PASTA’s stages work well in risk-management platforms and can be LLM-powwered with LLM Threat Modeling Prompt Templates (Note (b)); STRIDE maps easily to both manual threat-modeling tools like ThreatDragon as well as LLM powered threat modeling tools like STRIDEGPT. +- **Tools & Process Fit:** Pick a methodology compatible with your existing SDLC, threat-modeling tools and reporting dashboards. PASTA’s stages work well in risk-management platforms and can be LLM-powered with LLM Threat Modeling Prompt Templates (Note (b)); STRIDE maps easily to both manual threat-modeling tools like ThreatDragon as well as LLM powered threat modeling tools like STRIDEGPT. Note (a): MAESTRO It does not replace STRIDE, PASTA, or other traditional frameworks; instead, it complements them by adding AI-specific threat classes, multi-agent context, and full-lifecycle security considerations. Note (b): You can use specially engineered prompt templates to augment your threat-modeling process with LLMs. Several examples of STRIDE and PASTA LLM Threat Modeling Prompt Templates are available in reference [26]. These templates provide reusable, structured prompts that guide Large Language Models to perform threat-modeling tasks with consistency and accuracy. @@ -58,7 +58,7 @@ Alternatively, if STRIDE is adopted directly as the primary threat modeling appr In PASTA’s seven‐stage process, we’ll enhance the Threat Analysis phase by incorporating MITRE ATLAS’s database of AI‐specific adversarial tactics, such as evasion, poisoning, model extraction, and inference attacks, into our threat mapping. This integration ensures our risk‐centric model aligns with business priorities and technical scope, while directly informing a targeted suite of offensive AI tests against the most critical attack vectors. AI-specific adversarial tactics such as evasion, poisoning, and model extraction are prime targets for specialized AI security assessments like red teaming. This focus is formally captured in the OWASP AI Red Teaming Framework [14], which defines how to simulate and evaluate these attack vectors against AI systems. Effective threat modeling begins by scoping the analysis around the critical assets you must protect. To do this, you first decompose the system’s architecture into its essential components, services, data stores, interfaces, and supporting infrastructure. 
You then map out how these pieces interact by drawing data flow diagrams that trace information end-to-end, highlight entry and exit points, and establish trust boundaries. By visualizing where data is stored, processed, and transmitted, you can pinpoint the exact assets at risk and systematically identify potential threats and vulnerabilities against each component and boundary. This structured approach ensures your threat model remains focused, comprehensive, and aligned with the organization’s security priorities. These scoping and decomposition activities, identifying critical assets, breaking the system into core components, and using data flow diagrams to map end-to-end interactions and trust boundaries are foundational steps shared by many threat-modeling methodologies, from STRIDE to PASTA and beyond, ensuring a consistent, thorough approach to identifying and prioritizing risks. -By focusing on the SAIF-aligned layers, Application, Data, Model, and Infrastructure, we intentionally keep our threat analysis at a high architectural level. This ensures broad coverage of AI-specific risks without delving into every sub-component of the system. +By focusing on the SAIF-aligned layers, Application, Data, Model, and Infrastructure, we intentionally keep our threat analysis at a high architectural level. This ensures broad coverage of AI-specific risks without delving into every subcomponent of the system. In this AI threat model, we map threats, including AI-specific threats across the application, data, model, and infrastructure layers to ensure comprehensive coverage. Threat mitigations are defined as testable requirements, with validation activities documented in this guide. The goal is to provide a complete set of tests to assess the AI system’s security posture against the identified threats (Note). Note: It’s important to note that the OWASP AI Testing Guide is scoped to post-deployment security assessments and does not cover the broader MLOps lifecycle. For teams seeking guidance on embedding adversarial robustness tests earlier during data preparation, model training, and CI/CD pipelines, we recommend the white paper in ref [16] Securing AI/ML Systems in the Age of Information Warfare which provides an excellent deep dive into adversarial testing techniques within the AI/ML development process as well as ref[17] John Sotiroupulos book. @@ -75,7 +75,7 @@ Note: While RAG (Retrieval-Augmented Generation) isn’t explicitly defined in t The application layer encompasses the application and any associated agents or plugins. It interfaces with users for input and output and with the AI model for processing and response. Agents and plugins extend functionality but also introduce additional transitive risks that must be managed. The “Application” refers to the product, service, or feature that leverages an AI model to deliver functionality. Applications may be user-facing, such as a customer service chatbot, or service-oriented, where internal systems interact with the model to support upstream processes. The “Agent/plugin” refers to a service, application, or supplementary model invoked by an AI application or model to perform a specific task, often referred to as ‘tool use.’ Because agents or plugins can access external data or initiate requests to other models, each invocation introduces additional transitive risks, potentially compounding the existing risks inherent in the AI development process. 
-The application layer can be decomposed in the following sub-components: +The application layer can be decomposed in the following subcomponents: - **The User (SAIF #1):** this is the person or system initiating requests and receiving responses. - **The User Input (SAIF #2):** these are inputs (queries, commands) submitted by the user. - **The User Output (SAIF #3):** These are output such as answers to user actions that are returned by the application to the user. @@ -87,7 +87,7 @@ The application layer can be decomposed in the following sub-components: ### Model Layer The Model layer covers the core AI or ML components themselves, the logic, parameters, and runtime that transform inputs into outputs. It sits between the application (and any agents/plugins) and the underlying infrastructure or data. Because this layer embodies the “black box” of AI, it demands careful handling of inputs, outputs, and inference operations to prevent poisoning, leakage, or misuse. -The model layer can be decomposed in the following sub-components: +The model layer can be decomposed in the following subcomponents: - **The Input Handling (SAIF #7) (Note):** whose purpose is to validate and sanitize all data, prompts, or feature vectors before they reach the model to prevent injection attacks, data poisoning, or malformed inputs that could lead to unintended behavior. The input handling comprises three key functions: an Input Validator to clean or reject bad data, Authentication & Authorization to allow only authorized callers, and a Rate Limiter to prevent denial-of-service or brute-force attacks. - **The Output Handling (SAIF #8) (Note):** whose purpose is to filter, redact, or post-process model outputs to ensure they do not expose sensitive training data, violate privacy, or produce harmful content. It includes an Output Filter to detect and block harmful or disallowed content, Sanitization & Redaction to remove sensitive or private information, and a Response Validator to confirm outputs meet format and business rules before delivery. - **The Model Usage (SAIF #9):** whose purpose is to execute the model against approved inputs in a controlled, auditable environment, ensuring that inference logic cannot be tampered with or subverted at runtime. It includes: the Inference Engine for loading weights and computing outputs, Policy Enforcement to apply guardrails (e.g., token limits, safe decoding), and an Audit Logger to record inputs, model versions, and outputs for traceability. @@ -95,7 +95,7 @@ The model layer can be decomposed in the following sub-components: ### Infrastructure Layer The infrastructure layer provides the foundational compute, networking, storage, and orchestration services that host and connect all other AI system components. It ensures resources are provisioned, isolated, and managed securely, supporting everything from data processing and model training to inference and monitoring. -The infrastructure layer can be decomposed in the following sub-components (Note): +The infrastructure layer can be decomposed in the following subcomponents (Note): - **Model Storage Infrastructure (SAIF #10):** This component safeguards the storage and retrieval of model artifacts, such as weight files, configuration data, and versioned metadata, ensuring they remain confidential, intact, and available. 
An artifact repository maintains versioning and enforces encryption at rest, while an integrity verifier computes and checks cryptographic hashes (e.g., SHA-256) on each upload and download to detect tampering. A key management service issues and rotates encryption keys under least-privilege policies, preventing unauthorized decryption of stored models. - **Model Serving Infrastructure (SAIF #11):** This component provides the runtime environment in which models execute inference requests. It isolates the model execution process from other workloads, enforces resource quotas and rate limits, and ensures that only properly formatted inputs reach the model. Health-monitoring mechanisms detect failures or performance degradations, and automatic scaling or load-balancing ensures uninterrupted availability under varying demand. - **Model Evaluation (SAIF #12):** This component measures model performance, fairness, and robustness before and after deployment. A validation suite runs the model against reserved test sets—including adversarial or edge-case inputs—and collects metrics on accuracy, bias, and error rates. Drift-detection tools compare new outputs to historical baselines to flag significant deviations, and reporting dashboards surface any regressions or policy violations for corrective action. @@ -105,7 +105,7 @@ The infrastructure layer can be decomposed in the following sub-components (Note ### Data Layer The Data layer underpins every AI system by supplying the raw and processed information that models consume. It encompasses the entire lifecycle of data, from initial collection and ingestion through transformation, storage, and provisioning for training or inference and ensures that data remains accurate, trustworthy, and compliant with privacy and security policies. Robust controls in this layer protect against poisoning, leakage, and unauthorized access, forming the foundation for reliable, responsible AI outcomes. -The data layer can be decomposed in the following sub-components: +The data layer can be decomposed in the following subcomponents: - **Training Data (SAIF #16):** Training data consists of curated, labeled examples used to teach the model how to recognize patterns and make predictions. In a secure AI pipeline, organizations establish strict provenance and versioning for training datasets to guarantee integrity: every record’s origin, modification history, and access events are logged and auditable. By enforcing encryption-at-rest and role-based permissions on training repositories, the system prevents unauthorized tampering; any illicit change to the training corpus would corrupt the model’s learning process and open the door to adversarial manipulation. - **Data Filtering and Processing (SAIF #17):** Before feeding raw inputs into model pipelines, data undergoes rigorous filtering and processing steps. This includes schema validation, anomaly detection to strip out corrupt or malicious entries, and privacy-preserving transformations like anonymization or pseudonymization. Secure processing frameworks execute these tasks in isolated environments, with reproducible pipelines that record every transformation applied. By embedding fine-grained access controls and change-tracking at each stage, the system ensures that only vetted, sanitized data influences the model, mitigating risks from both accidental errors and deliberate data-poisoning attacks. 
- **Data Sources (SAIF #18) (note):** An AI system’s data may originate from internal operational databases, user-generated inputs, IoT sensors, or third-party providers. Internal sources are governed by organizational policies and monitored for access anomalies. diff --git a/Document/content/2.1.1_Architectural_Mapping_of_OWASP_Threats.md b/Document/content/2.1.1_Architectural_Mapping_of_OWASP_Threats.md index 36e04d1..c1cc559 100644 --- a/Document/content/2.1.1_Architectural_Mapping_of_OWASP_Threats.md +++ b/Document/content/2.1.1_Architectural_Mapping_of_OWASP_Threats.md @@ -43,14 +43,14 @@ For each threat, we provide an example of a threat scenario that highlights the **Threat Scenario:** An adversary submits specially constructed inputs via Input Handling (7), designed to bypass pre-processing checks or validation logic. These manipulated inputs mislead the model during inference at Model Usage (9), leading to misclassification or unsafe behavior. Because Evaluation mechanisms (12) fail to detect anomalies, and adversarial robustness was insufficiently addressed in Training & Tuning (13), the attack proceeds undetected and can be repeated. -**Testing Strategy:** Evaluate how the system handles adversarial inputs across impacted components. Submit subtly manipulated examples to test whether Input Handling (7) filters or flags unexpected formats or edge cases leading to unsafe model behavior in response to manipulated inputs. During inference, observe Model Usage (9) for signs of misclassification (i.e imilar to adversarial examples used in computer vision to “evade” classification) or inconsistent output patterns and whether evasion-style inputs can cause the model to misinterpret intent or meaning. Examine if Evaluation (12) includes anomaly scoring, model confidence metrics, or adversarial detection. Review whether Training & Tuning (13) incorporated adversarial examples, gradient masking techniques, or robustness augmentation. Together, these tests ensure coverage of both the exploit path and the failure points that let evasion succeed. +**Testing Strategy:** Evaluate how the system handles adversarial inputs across impacted components. Submit subtly manipulated examples to test whether Input Handling (7) filters or flags unexpected formats or edge cases leading to unsafe model behavior in response to manipulated inputs. During inference, observe Model Usage (9) for signs of misclassification (i.e. similar to adversarial examples used in computer vision to “evade” classification) or inconsistent output patterns and whether evasion-style inputs can cause the model to misinterpret intent or meaning. Examine if Evaluation (12) includes anomaly scoring, model confidence metrics, or adversarial detection. Review whether Training & Tuning (13) incorporated adversarial examples, gradient masking techniques, or robustness augmentation. Together, these tests ensure coverage of both the exploit path and the failure points that let evasion succeed. **T01-RMP – Runtime Model Poisoning** **OWASP LLM:** LLM03 – Training Data Poisoning (Runtime Variant) **Description:** Runtime Model Poisoning occurs when an attacker manipulates live data, embeddings, model caches, or intermediate artifacts during inference rather than during training. Unlike classical training-time poisoning, Runtime Model Poisoning exploits dynamic model pipelines—such as RAG systems, online-learning components, or real-time feature stores—to alter how the model behaves at runtime. 
This threat targets mutable components in the SAIF such as Data Layer components (16 through 19) or Model Layer (7 through 9) including data stored in vector databases, retrieval outputs, plugin responses, memory buffers, or session-level model states. Poisoned runtime data can cause the model to generate biased, unsafe, misleading, or attacker-controlled outputs without modifying its pre-trained weights.
-**Threat Scenario:** An attacker injects a malicious document into a RAG system’s Vector Stores or manipulates a streaming data pipelines feeding the model during inference. When the Application (4) receives a user request, the Retrieval Component from Trainign and Tuning (13) fetches the poisoned data, which is passed to the Model (9) during context assembly. Because Input Handling (7) and Output Handling (8) fail to validate or sanitize runtime data sources, the model incorporates corrupted embeddings or manipulated retrieved passages into inference. This leads to injection of false facts, adversarial context steering, unsafe recommendations, or output misclassification. The attack occurs without retraining the model, allowing silent manipulation that can undermine decision support, compliance, and downstream automated actions.
+**Threat Scenario:** An attacker injects a malicious document into a RAG system’s Vector Stores or manipulates a streaming data pipeline feeding the model during inference. When the Application (4) receives a user request, the Retrieval Component from Training and Tuning (13) fetches the poisoned data, which is passed to the Model (9) during context assembly. Because Input Handling (7) and Output Handling (8) fail to validate or sanitize runtime data sources, the model incorporates corrupted embeddings or manipulated retrieved passages into inference. This leads to injection of false facts, adversarial context steering, unsafe recommendations, or output misclassification. The attack occurs without retraining the model, allowing silent manipulation that can undermine decision support, compliance, and downstream automated actions.

**Testing Strategy:** To evaluate resilience against Runtime Model Poisoning, conduct tests that simulate adversarial insertion of manipulated documents, embeddings, or plugin outputs into Data Layer components. Assess whether the retrieval data from the model (9) applies adequate content validation, integrity checks, or anomaly detection before delivering data and compare model behavior when using clean versus poisoned runtime datasets to detect deviations, context steering, or unsafe outputs. The evaluation should confirm that retrieval data is properly isolated from model logic, that embeddings and retrieved documents undergo cryptographic integrity verification, and that third-party data sources are sanitized and classified before use. Continuous monitoring should also be in place to detect anomalous retrieval patterns or data drift. Audit logs must accurately record data provenance and surface unexpected retrieval inputs or runtime context manipulations that may indicate poisoning attempts.
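The clean-versus-poisoned comparison described in this testing strategy can be scripted as a small harness. Below is a minimal sketch, assuming hypothetical `clean_retrieve`, `poisoned_retrieve`, and `generate` callables supplied by the tester to wrap whatever retrieval and inference stack is under assessment, plus a hash baseline standing in for the provenance records mentioned above.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Set


def sha256(document: str) -> str:
    """Fingerprint a retrieved document so unexpected changes are detectable."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def unverified_documents(retrieved: Iterable[str], trusted_hashes: Set[str]) -> List[str]:
    """Return retrieved documents whose hashes are missing from the trusted provenance baseline."""
    return [doc for doc in retrieved if sha256(doc) not in trusted_hashes]


def compare_clean_vs_poisoned(
    prompt: str,
    clean_retrieve: Callable[[str], List[str]],      # hypothetical hook: clean corpus retrieval
    poisoned_retrieve: Callable[[str], List[str]],   # hypothetical hook: corpus with injected documents
    generate: Callable[[str, List[str]], str],       # hypothetical hook: model inference with context
) -> Dict[str, object]:
    """Run the same prompt over a clean and a deliberately poisoned corpus
    and report whether the model's answer drifts."""
    clean_answer = generate(prompt, clean_retrieve(prompt))
    poisoned_answer = generate(prompt, poisoned_retrieve(prompt))
    return {
        "prompt": prompt,
        "clean_answer": clean_answer,
        "poisoned_answer": poisoned_answer,
        "output_drift": clean_answer.strip() != poisoned_answer.strip(),
    }
```

In practice the simple drift check would be replaced with task-specific assertions (factuality, safety labels, policy compliance), and the trusted hash set would be populated from the data provenance and audit records called out in the testing strategy.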
diff --git a/Document/content/3.0_OWASP_AI_Testing_Guide_Framework.md b/Document/content/3.0_OWASP_AI_Testing_Guide_Framework.md index d03342e..fd707c2 100644 --- a/Document/content/3.0_OWASP_AI_Testing_Guide_Framework.md +++ b/Document/content/3.0_OWASP_AI_Testing_Guide_Framework.md @@ -78,7 +78,7 @@ Conducting a purely black-box test on an LLM/GenAI system, especially if it uses The following **limitations** should be taken into account when planning the assessment activities with a **black-box approach**: - LLM models are composed of **numerical weights and mathematical functions**, not following a workflow described in source code. Unlike traditional applications, where analyzing the source code usually makes it possible to identify the presence or the absence of specific issues, **in GenAI applications this can be complex or not feasible at all**. -- Many LLM models use a **temperature** value greater than zero. The temperature is a parameter that controls the randomness of the model’s output. A higher temperature increases randomness and "creativity" by sampling from a wider range of possible tokens, producing more diverse and less deterministic outputs. This potentially causes the need to **repeat attack vectors multiple times** as well as the possibility that results may be **hard to replicate**. Even when the temperature is equal to zero, the non-associative property of floating-point arithmetic, can make the results non reproducible and significantly different when changing the evaluation batch size, number of GPUs, or GPU versions. +- Many LLM models use a **temperature** value greater than zero. The temperature is a parameter that controls the randomness of the model’s output. A higher temperature increases randomness and "creativity" by sampling from a wider range of possible tokens, producing more diverse and less deterministic outputs. This potentially causes the need to **repeat attack vectors multiple times** as well as the possibility that results may be **hard to replicate**. Even when the temperature is equal to zero, the non-associative property of floating-point arithmetic, can make the results non-reproducible and significantly different when changing the evaluation batch size, number of GPUs, or GPU versions. - **Guardrails** are often themselves implemented using LLM models, which further complicates the analysis. - In a GenAI application composed of **multiple agents**, the user’s input is typically included in an initial prompt, and the output of the first LLM agent then becomes the input for the next one. This process can repeat multiple times, depending on the GenAI system’s architecture and the specific input provided by the user. In an architecture like this, effectively verifying all the different components of the application is particularly complex, and **the time and number of requests required for such an analysis can be prohibitive or, in some cases, not feasible at all**. - Many GenAI applications rely on **external models provided by major players in the industry**. These models usually have a **cost based on the number of tokens processed for both input and output**. For some models, this cost can be significant and must be taken into account **before considering large-scale automated testing**. For this reason, such applications often have thresholds in place to limit token consumption, and uncontrolled use of tokens can lead to a **Denial of Service (DoS) or a Denial of Wallet (DoW) condition**. 
It is also important to consider that in a multi-agent system, token consumption is not limited to the user’s input and the application’s final output, but also includes all intermediate prompts and outputs exchanged between agents. This often results in a significant increase in overall token usage.
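The non-determinism and token-cost limitations above can be handled by repeating each probe and tracking spend against an explicit budget. Here is a minimal sketch, assuming a hypothetical `query_model` callable that wraps whichever provider SDK is in use and returns the generated text together with the tokens consumed.

```python
from collections import Counter
from typing import Callable, Dict, Tuple


def repeat_probe(
    prompt: str,
    query_model: Callable[[str], Tuple[str, int]],  # hypothetical wrapper: (output text, tokens consumed)
    runs: int = 10,
    token_budget: int = 50_000,
) -> Dict[str, object]:
    """Replay the same probe several times, measure output variability,
    and stop before the token budget is exhausted."""
    outputs: Counter = Counter()
    tokens_used = 0
    for _ in range(runs):
        if tokens_used >= token_budget:
            break  # guard against Denial of Wallet on pay-per-token models
        text, tokens = query_model(prompt)
        outputs[text.strip()] += 1
        tokens_used += tokens
    return {
        "distinct_outputs": len(outputs),        # 1 means every completed run was identical
        "most_common": outputs.most_common(1),
        "total_tokens": tokens_used,
        "reproducible": len(outputs) == 1 and sum(outputs.values()) == runs,
    }
```

A probe that yields several distinct outputs under identical settings signals that findings need multiple repetitions before being reported, while the running token total helps keep large-scale automated testing, including multi-agent chains, below DoS and DoW thresholds.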