Generative AI introduces new security threats because it alters both the digital attack surface and the type of systems within it. The attack surface is expanded due to new endpoints in the form of large language models (LLM), new application interfaces, and a set of novel supporting technologies often associated with generative AI implementations.
The nature of the risk also changes because LLMs and other generative AI solutions have essentially unbounded inputs and outputs, whereas traditional applications are primarily deterministic. This can make it harder to identify anomalous patterns and provides more varied payloads where malicious attacks can hide.
Top 10 LLM Security Vulnerabilities
The non-profit foundation Open Web Application Security Project (OWASP) has identified its Top 10 list for LLM security risks. Enterprises deploying generative AI solutions should develop risk mitigation strategies for these security threats.
**Indicates Synthedia comments on the OWASP information.
Prompt Injection
This manipulates a large language model (LLM) through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.
**This is probably the best-known LLM security risk. It is also hard to prevent because most organizations provide access to an unbounded text input prompt. Security relies on system prompts that counteract or specifically deny responding in certain ways as well as preventing access to specific systems and data.
Insecure Output Handling
This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.
Training Data Poisoning
This occurs when LLM training data is tampered, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, & books.
**Few organizations have considered this risk. Data poisoning can occur randomly through scraping internet data sources that happen to include malicious data or even commands and then using it as training data. Model developers have begun to focus on data quality for sourcing training data, but that is typically related to adding curated data. Scanning data sources for malicious code or inaccurate information will become a standard solution over time. In addition, curated datasets used for training or fine-tuning could also be compromised by inserting malicious data. This will become a more common threat vector that requires additional screening as well to counteract preplanned attacks that essentially ingest “backdoor” security vulnerabilities into an AI model’s structure.
Model Denial of Service
Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and unpredictability of user inputs.
**This has already happened. OpenAI faced a distributed denial of service (DDoS) attack in November 2023. This degraded access to several of its services and led to ChatGPT becoming completely inaccessible for a period of time.
Supply Chain Vulnerabilities
LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.
Sensitive Information Disclosure
LLMs may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.
**Organizations need to review user text inputs as well as outputs for private data. This will lead to quarantines or redaction services.
Insecure Plugin Design
LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.
**This is a newer threat vector that few organizations have had to address to date because plugins are not widely used. As LLM-enabled solutions expand from surfacing knowledge to executing tasks, plugins will become commonplace and system risks will rise.
Excessive Agency
LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.
**Google recently faced this issue with Gemini when the excessive agency was attributed to overly restrictive guardrails that made it exceedingly hard to create white people and likely to create racially inappropriate outputs via the DALL-E image model. However, the more concerning manifestation of this risk will arise with the increased use of agents that augment generative AI solutions by interacting with outside data sources and digital services. When agents are making independent decisions that can impact interaction with external systems, the risk will rise for impact on those external systems as well as returning to the host generative AI solution with malicious elements.
Overreliance
Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.
Model Theft
This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.
**The biggest risk attributed to model theft is the ability of individuals or groups to conduct extensive training on an existing model in order to identify security vulnerabilities.
While OWASP focuses on threat categories, another way to consider security is the likely risk categories. Some of the most prominent risks involve data exfiltration, model exfiltration (similar to OWASP #10), data poisoning (OWASP #3, but more expansive than training data as the risk could exist in source data as well), and model poisoning. That latter risk could cause the model to operate in ways unintended by the creators.
Generative AI Security
Cybersecurity is a mature discipline that must constantly adjust to new threat vectors that are often introduced by novel technologies. Generative AI represents a more complex security challenge because of the variable nature of model behavior, along with the limited visibility into their runtime operations and structural interworkings.
If you have a clear understanding of how a system works and know how it will react to specific situations, you can more easily identify potential security weaknesses. In addition, patterns of unusual behavior serve as indicators of an ongoing threat. The probablistic and variable nature of generative AI solution combined with limited understanding of the model’s underlying data and processing represents a new challenge for security specialists.
Granted, AI solutions have also emerged as important tools in combating security risks as well. The key step today for enterprises is to understand the risks and develop strategies for identifying vulnerabilities and active attacks.
Red teaming is a common term in cybersecurity. It involves tests of pre-production and production systems, often conducted by security experts, designed to expose a security vulnerability. Red teaming has become a popular topic related to generative AI over the past three months, but is most commonly used to connote tests focused on exposing model accuracy or AI safety issues. That will soon change as security begins to rise in perceived importance.