Computer security | Technology Tales

OWASP Top 10 for Large Language Model Applications

21^st January 2024

OWASP stands for Open Web Application Security Project, and it is an online community dedicated to web application security. They are well known for their Top 10 Web Application Security Risks and late last year, they added a Top 10 for
Large Language Model (LLM) Applications.

Given that large language models made quite a splash last year, this was not before time. ChatGPT gained a lot of attention (OpenAI also has had DALL-E for generation of images for quite a while now), there are many others with Anthropic Claude and Perplexity also being mentioned more widely.

Figuring out what to do with any of these is not as easy as one might think. For someone more used to working with computer code, using natural language requests is quite a shift when you no longer have documentation that tells what can and what cannot be done. It is little wonder that prompt engineering has emerged as a way to deal with this.

Others have been plugging in LLM capability into chatbots and other applications, so security concerns have come to light, so far, I have not heard anything about a major security incident, but some are thinking already about how to deal with AI-suggested code that others already are using more and more.

Given all that, here is OWASP's summary of their Top 10 for LLM Applications. This is a subject that is sure to draw more and more interest with the increasing presence of artificial intelligence in our everyday working and no-working lives.

LLM01: Prompt Injection

This manipulates an LLM through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.

LLM02: Insecure Output Handling

This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences such as Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), Server-Side Request Forgery (SSRF), privilege escalation, or remote code execution.

LLM03: Training Data Poisoning

This occurs when LLM training data are tampered, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behaviour. Sources include Common Crawl, WebText, OpenWebText and books.

LLM04: Model Denial of Service

Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and the unpredictability of user inputs.

LLM05: Supply Chain Vulnerabilities

LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.

LLM06: Sensitive Information Disclosure

LLMs may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.

LLM07: Insecure Plugin Design

LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences such as remote code execution.

LLM08: Excessive Agency

LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.

LLM09: Overreliance

Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.

LLM10: Model Theft

This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.

Turning off seccomp sandbox in vsftpd

21^st September 2013

Within the last week, I set up a virtual web server using Arch Linux to satisfy my own curiosity, since the DIY nature of Arch means that you can build up exactly what you need without having any real constraints put upon you. Something that didn't surprise me about this was that it took me more work than the virtual server that I created using Ubuntu Server, yet I didn't expect Proftpd to be missing from the main repositories. Though the package can be found in the AUR, I didn't fancy the prospect of dragging more work on myself, so I went with vsftpd (Very Secure FTP Daemon) instead. In contrast to Proftpd, this is available in the standard repositories and there is a guide to its use in the Arch user documentation.

However, while vsftpd worked well just after installation, connections to the virtual FTP soon failed with FileZilla began issuing uninformative messages. In fact, it was the standard command line FTP client on my Ubuntu machine that was more revealing. It issued the following message that let me to the cause after my engaging the services of Google:

500 OOPS: priv_sock_get_cmd

With version 3.0 of vsftpd, a new feature was introduced, and it appears that this has caused problems for a few people. That feature is seccomp_sandbox and it can be turned off by adding the following line in /etc/vsftpd.conf:

seccomp_sandbox=NO

That solved my problem, and version 3.0.2 of vsftpd should address the issue with seccomp sandboxing anyway. In case, this solution isn't as robust as it should be because seccomp is not supported in the Linux kernel that you are using, turning off the new feature still needs to be an option, though.