Open LLM Security Risks and Best Practices
July 1, 2025 | 15 min read
The argument for self-hosting open large language models (LLMs) gains momentum with the introduction of powerful models like DeepSeek's R1. However, the push toward self-hosting brings significant LLM security risks that must be addressed, including the potential for insecure output handling and other vulnerabilities.
Self-hosting open large language models (LLMs) is quickly becoming a strategic priority with the rise of advanced models like DeepSeek's R1. However, this shift requires a sharp focus on security. Self-hosting introduces significant security challenges, particularly around output vulnerabilities, that demand proactive solutions. Tackling these risks head-on is not just necessary — it's the key to fully harnessing the power of LLMs while ensuring secure, reliable operations.
This article examines the security challenges of locally deployed models, providing clear strategies for organizations to utilize LLMs effectively while mitigating risks tied to LLM outputs and application security. We spotlight DeepSeek's cutting-edge models and share proven best practices for securely deploying self-hosted LLM applications.
Open, locally deployable large language models are again driving innovation in AI. DeepSeek's models have achieved performance levels on par with leading commercial options, fueling a surge of interest in self-hosted artificial intelligence (AI) solutions. Open LLMs offer a compelling alternative to proprietary systems, offering organizations unmatched advantages in data privacy, regulatory compliance, customization and cost efficiency.
However, deploying models like DeepSeek's R1 comes with critical security concerns. Unlike managed cloud-based solutions, locally hosted LLMs place the burden of LLM application security and behavior squarely on the organization. Any lapse in security could lead to data breaches or significant legal consequences.
What are Open LLMs exactly?
The term Open LLMs does not align perfectly with the traditional open-source software concept. In classic software development, open-source refers to code artifacts freely available for use, modification and redistribution under permissive licenses such as MIT or Apache. With access to the source code, developers can understand the intellectual property (IP), replicate the software and deploy it with relative ease.
However, LLMs work differently. Unlike traditional software, LLMs require large-scale input data, data curation pipelines, training algorithms and computational infrastructure. Even when a model is labeled as "open," these underlying components, like the model's training data, are often proprietary, unpublished or impossible to fully verify. In most cases, what's available are the model weights — the numerical parameters that define the model's behavior after training. In practice, Open LLM is any model whose weights are freely available and can be deployed locally under a specific license.
When a new Open LLM is released, it often includes a model card — a key document outlining the data sources, training algorithms and critical metadata provided by its creators. Yet, model cards remain optional, unverifiable and frequently lack the detailed instructions necessary for full reproducibility. This absence of standardization creates significant security and regulatory risks. Addressing these gaps is essential to ensure transparency, accountability and trust in the development of AI models.
Enter DeepSeek's Models
DeepSeek's V3 and R1 represent a significant leap in LLM technology. On the technical front, DeepSeek-V3 demonstrated that clever sparsity allows it to activate only relevant parts of the network for each query, enabling V3 to achieve near state-of-the-art performance at a fraction of the inference cost of similarly skilled dense models. Its introduction of multi-token training objectives and elimination of auxiliary losses for a mixture of experts (MoE) are innovations that will likely be adopted in future large language model training regimes to push efficiency even further.
DeepSeek-R1, on the other hand, broke new ground in how we imbue models with reasoning abilities. It treated reasoning as a first-class citizen in training, using reinforcement learning (and the novel GRPO algorithm) to literally train the model to "think." This approach resulted in an artificial intelligence that not only gives answers but also reliably produces the logical steps leading to those answers, marking a step change in the transparency and interpretability of AI reasoning.
Security Landscape of Open LLMs
As organizations increasingly adopt open large language models, it is crucial to understand the unique security risks they present. This section explores key vulnerabilities and challenges associated with locally deployed LLMs, as well as examples of potential exploits.
1. Oversight of the Training Data
A core challenge in training large language models lies in the sheer volume of data, which makes a comprehensive review impossible. This lack of transparency was powerfully demonstrated when Google researchers discovered that one of their AI models independently learned Bengali — despite the language not explicitly included in the training dataset. The model identified and leveraged tiny fragments of Bengali text scattered across the dataset, extrapolating from these fragments to achieve full language proficiency. This example underscores the complexity and unexpected capabilities of advanced AI systems, highlighting both the innovation and unpredictability inherent in their development.
The DeepSeek model contains 12,000 API keys leaked through CommonCrawl data fed to the model during training. It's important to note that it's the users who leak keys, but this data is directly available in DeepSeek (and probably in dozens of other large language models) — it was not filtered out during the training process. Since DeepSeek's training data is undisclosed, organizations must evaluate the risk of inaccurate, harmful or biased outputs. Implementing mechanisms to detect anomalous input patterns during training could help mitigate risks like unintended data inclusion.
This underscores a critical security concern: models can unintentionally absorb and replicate information in unpredictable ways, even beyond the expectations of their creators. For Open LLMs, this introduces serious risks in compliance, data privacy and potential misuse.
2. Reasoning Transparency Hidden Threats
A notable security consideration is the risk of reasoning transparency. DeepSeek-R1 reveals its internal chain-of-thought, which attackers can manipulate using indirect prompt injection. This can bypass safeguards protecting sensitive outputs, posing a significant threat to system integrity and enabling attackers to launch further exploits like jailbreak prompts or model denial strategies.
While this is great for interpretability and debugging, it introduces a new attack surface. Researchers have demonstrated that malicious prompts can exploit the presence of
In evaluations using adversarial prompting tools (like NVIDIA's Garak), DeepSeek-R1 was found to be more susceptible to certain prompt injection attacks than models that don't expose their intermediate reasoning. The chain-of-thought can give attackers insight into how the model makes decisions, potentially making it easier to craft effective jailbreak prompts.
3. Prompt Injection in Open Models
Local large language models can exhibit harmful or unaligned behavior if not properly controlled. Models like DeepSeek R1 have demonstrated minimal built-in guardrails. For example, security tests have shown DeepSeek failed to block 100% of harmful prompts presented, in stark contrast to more filtered cloud models.
This means a locally run LLM could readily generate disallowed content, like hate speech, disinformation or illicit instructions, when maliciously prompted. Such behavior poses reputational and legal risks for organizations if the model outputs toxic or dangerous content without oversight. Attackers or careless users could exploit this openness, effectively "jailbreaking" the model to produce malware code, detailed phishing scripts or other outputs that a cloud provider's model would normally refuse.
Without the safety nets of a managed cloud service, a local LLM's responses can be unpredictable and potentially hazardous.
4. Model Extraction (Theft)
Model extraction is possible without direct access to model weights like in a black-box setting and adversaries can extract parameters, hyperparameters or behavioral approximations through repeated queries.
One common method is model distillation, where a smaller model is trained to mimic a larger one. This was exemplified by Stanford's Alpaca, a 7B model fine-tuned on 52,000 ChatGPT-generated examples. Despite limited access, the resulting model replicated ChatGPT's behavior in a narrow sense. Microsoft's Orca research expanded on this, using detailed reasoning traces to train small models that rivaled ChatGPT in some tasks. These cases show that even interaction-based data collection can yield high-fidelity clones. When you build your own model, be aware of the model distillation technique and protect your IP.
5. Unverifiable Build Process
Model distillation introduces significant challenges. While this process allows the development of smaller, more efficient local models derived from a larger "teacher" model, it can also transfer the teacher model's biases and vulnerabilities. A recent example underscores the risks: OpenAI has accused DeepSeek of unlawfully distilling ChatGPT's knowledge, directly violating its usage policies.
Apart from legal issues, such distillation shortcuts can bypass safety training, potentially explaining why DeepSeek's distilled model was vulnerable to 27% of prompt injection attempts in tests.
6. Weight Poisoning and Hidden Triggers
Weight poisoning occurs when a threat actor directly edits a small subset of the model's parameters to embed malicious behavior but keeps the general behavior of the model untouched.
In 2023, security researchers unveiled a proof-of-concept called PoisonGPT, exposing the dangers of poisoned knowledge in AI systems. They successfully altered the open-source GPT-J-6B model to deliver a specific falsehood when prompted with a particular question. While the tampered model appeared identical to the original in nearly all scenarios, it intentionally misrepresented the fact by responding "Yuri Gagarin" to the question, "Who was the first person to land on the moon?" This manipulation was achieved by modifying the model's weights with minimal impact on performance — a negligible 0.1% drop on benchmarks.
The result was a hidden backdoor, nearly undetectable through standard testing methods. The PoisonGPT experiment underscores the critical threat of misinformation in AI, highlighting the importance of robust safeguards to maintain trust and integrity in advanced systems.
Other Security Concerns
Self-hosting LLMs brings flexibility and control but also introduces critical security implications. These range from improper maintenance and network vulnerabilities to regulatory compliance challenges. Below, we outline key concerns organizations must address to manage these risks effectively and strengthen their security posture.
Improper Security Maintenance
Unlike an AI service, self-hosting requires constant updates, security monitoring and performance tuning. Improper access control mechanisms could compromise security, leading to sensitive data exposure, including agent extensions and system prompt leaks. Failing to maintain a robust security posture with proper operations and maintenance processes increases the risk of security breaches or incidents.
Network Security Concerns
Hidden outbound network connections to vendors' servers can cause unintentional data transmission. These connections could be exploited for security threats such as telemetry data collection, model updates, remote debugging or even unauthorized data exfiltration. Security professionals recommend monitoring and controlling all external connections to mitigate these risks effectively.
Supply Chain Attacks
Supply chain attacks occur when a trusted third-party vendor is the victim of an attack, exposing supply chain vulnerabilities that compromise the product provided with a malicious component. When a machine learning (ML) model is stored on disk, it must be serialized, i.e., translated into a binary form. There are many serialization formats, and each machine learning framework has its own default one. Unfortunately, most widely used formats are vulnerable to arbitrary code execution. Attackers can exploit and trigger remote code execution. These include Python's pickle format and HD5, formats known in the security community to carry such vulnerabilities.
Malevolent Third-Party Contractors
Outsourcing model training to a specialized third party may seem like an efficient solution, saving both time and resources. However, this approach requires high trust. A malicious contractor could compromise the integrity of your model by embedding a backdoor. If your model is critical to your business operations, it's essential to ensure that your partners are not only skilled but also reputable.
Regulatory Risks
Running local LLMs introduces regulatory risks, particularly around data privacy, content moderation and model provenance. Without centralized oversight, organizations become fully responsible for ensuring that their models do not process or generate content that violates the GDPR, HIPAA or sector-specific regulations. Many open models lack sufficient security features or transparency about their training data, making it difficult to verify compliance with copyright laws or data usage restrictions. Deploying such models without proper audit trails or safeguards can expose organizations to legal liability and further security breaches.
As demonstrated, self-hosting open LLMs like DeepSeek's models presents significant security challenges. Issues such as opaque training data, reasoning transparency exploits, prompt injection and model theft mean that local deployment requires considerable effort from your organization to manage these risks effectively. However, acknowledging these risks is the first step. In the second part of this article, we will shift our focus to actionable strategies you can use to protect your locally deployed models. We will also detail the substantial benefits that can make navigating these complexities a worthwhile investment for your organization.
The argument for self-hosting open LLMs gets a boost with powerful DeepSeek's R1 model and recently with Google's Gemma 3n model. Yet self-hosting comes with real-world risks that need to be addressed.
The appeal of self-hosting powerful open LLMs like DeepSeek's R1 evident. Yet, as we've established, this approach introduces tangible security risks that your organization must be prepared to manage. The first part of this article provided an overview of these security vulnerabilities, covering concerns related to training data opacity, potential exploits in model transparency features, susceptibility to prompt-based attacks and the risks of model IP theft. Now, we focus on outlining mitigation strategies and discussing the strategic advantages of deploying and managing these models within your own environment.
Security Recommendations for Local LLMs
EPAM guidelines for securing local LLMs
Deploying a local LLM securely requires a proactive, security-first strategy. Below are key best practices to help mitigate the outlined risks:
1. Maintain Data Quality and Transparency
Robust data governance is vital. Use curated, high-quality datasets for model training and document data sources thoroughly. Regularly evaluate training data for biases or sensitive content that could lead to harmful outputs. Implement validation and filtering mechanisms to detect and remove anomalous inputs that could be attempts to poison the model. Continuous monitoring and updating of datasets will help reinforce safe behavior.
2. Add Safety Layers (Guardrails)
Relying solely on a model's internal safety mechanisms is not enough. Implement external guardrails — such as content moderation systems, prompt injection detection systems or reliable rule-based filters — to actively monitor and manage inputs and outputs in real time. Fine-tune the model using alignment techniques or reinforcement learning from human feedback (RLHF) to embed safer behavior effectively. While these measures may limit the model's flexibility, they are critical for minimizing the risk of harmful content.
3. Conduct Rigorous Security Testing and Monitoring
Treat your LLM like the critical system it is, especially in agentic setups where LLMs are given more autonomy. Conduct regular security audits and penetration testing to stay ahead of potential threats. Use red team exercises to simulate prompt injection attacks or attempts to extract sensitive information — essential steps to identify and address vulnerabilities. Enforce rigorous access controls to ensure only authorized users can interact with the system, reducing the risk of exploitation. Implement continuous monitoring with robust logging and alert systems to detect anomalies early and enable swift, decisive action. Stay proactive, stay secure and maintain excellence.
4. Keep the Environment Up-to-Date
The AI security landscape is rapidly advancing, demanding proactive measures to stay ahead of vulnerabilities. Regularly update LLM software, libraries and platforms to address emerging threats. Keep dependencies, such as natural language processing (NLP) libraries and vector databases, current to ensure seamless performance. Conduct routine reviews of user permissions and network settings to maintain robust security as your system scales.
5. Plan for Incident Response
Breaches happen — even with the best precautions in place. That's why having a robust incident response plan is non-negotiable. Define clear steps to take when a security issue arises: isolate the compromised LLM immediately, communicate transparently with stakeholders and conduct a thorough investigation to pinpoint the cause. Regular drills and tabletop exercises are essential to ensure your team is ready to act decisively under pressure. Make sure your operations are not overly dependent on LLMs and include a plan in place. Preparedness is the key to minimizing impact and maintaining trust.
6. Implement Strict Access Controls
Limit access to the LLM by enforcing strong authentication and authorization protocols. Only allow trusted users and applications to interact with the model and employ rate-limiting and usage monitoring to detect and deter potential extraction or manipulation attempts.
7. Control Outbound Traffic
Locally hosted models can risk exposing sensitive information through generated code or outputs. To mitigate this, take control of outbound network traffic with robust monitoring and management. Block unauthorized data transfers and enforce encryption to secure outgoing data streams.
8. Protect the Model Artifacts
Model files and configurations are critical assets that require the highest level of protection. Secure them with encryption at rest and apply integrity checks like hash verifications or digital signatures during model loading to detect any signs of tampering. Implement robust loading practices and ensure all related libraries are consistently updated to eliminate risks from vulnerable serialization formats. Where feasible, sandbox the model's runtime environment to contain potential breaches.
9. Vet Your Supply Chain
Acquire models, plugins, and datasets exclusively from trusted, reputable sources. Always validate model files through official repositories, checksums or digital signatures to ensure authenticity. When engaging third-party vendors or consultants, enforce strict security protocols and demand verified proof that models are free from malicious alterations. Conduct regular audits and security reviews to maintain uncompromising standards for all external contributions.
10. Address Regulatory Risks
Running LLMs locally shifts regulatory responsibility entirely to the operator. This includes ensuring compliance with data protection laws like GDPR or HIPAA, content moderation standards and copyright obligations. Many open models lack clear documentation of their training data, making it difficult to verify legal compliance. To minimize risk, organizations should implement detailed usage logs, limit the processing of sensitive data and carry out regular compliance assessments. This is particularly crucial when deploying models in highly regulated industries such as healthcare, finance or education.
The Appeal of Open LLMs — Control, Flexibility and Cost Efficiency
Open LLMs deliver powerful advantages that make building local LLM deployments worth the effort. Their transparency enables deep customization, allowing precise tailoring for specialized domains through post-training adjustments, alignments and integration of pre-trained components. They also offer a cost-effective edge by eliminating licensing fees and usage charges, outperforming commercial alternatives. Backed by a collaborative community, open models drive rapid innovation and continuous improvements in performance and security.
Enhanced Data Privacy and Control
Deploying large language models (LLMs) locally offers a significant advantage: enhanced data privacy. By handling data internally, organizations can maintain full control over sensitive information, such as patient records in healthcare, financial transactions or legal communications. This approach not only reduces the risks associated with external data transfers but also ensures compliance with key data privacy regulations like GDPR and HIPAA.
Performance Advantages and Reduced Latency
Deploying models locally eliminates the need for data to travel to remote servers, drastically reducing latency. This approach is particularly advantageous for applications that require real-time processing, where every millisecond counts. Performance gains can be substantial — many companies transitioning from cloud-based LLMS to local implementations have reported significant reductions in processing times.
Cost Efficiency and Long-Term ROI
Although the initial investment in local deployment is higher than using cloud-based services, the long-term savings can be substantial. Cloud LLMs typically charge on a per-token or per-API-call basis, which can add up with consistent high-volume usage. By investing in local infrastructure, organizations can achieve lower operational costs over time, as demonstrated by research showing that training costs for local models can be significantly lower than cloud-based alternatives.
Investing in local deployment may require a higher upfront cost compared to cloud-based services, but the long-term savings are substantial. Cloud LLMs charge per token or API call and for organizations with high-volume usage, these costs can quickly escalate. By building local infrastructure, businesses gain control over operational expenses, achieving significantly lower costs over time. Research consistently demonstrates that training local models can be far more cost-effective than relying on cloud-based alternatives.
Customization and Specialized Applications
Local LLMs provide unparalleled customization opportunities, making them perfectly suited for specialized fields. Organizations can fine-tune these models using domain-specific data, resulting in greater accuracy and relevance in areas like law, medicine and finance. Unlike cloud-based models, local LLMs offer enhanced control over both the training process and data sources, giving businesses a competitive advantage in highly specialized applications.
Summary
Self-hosting open LLMs requires dedicated effort from your organization to manage the inherent security risks. However, this investment can be well justified. There are two critical points to consider.
First, maintaining direct control over your models and data is a fundamental advantage, enabling enhanced data privacy and deeper customization possibilities than typically available with third-party services.
Second, while implementing strong security and operational practices is vital, the ability to tailor advanced AI to meet your organization's specific needs provides a powerful strategic advantage. For organizations ready to address inherent security risks, self-hosting offers a pathway to harness LLM technology on your own terms, driving both innovation and control.