Published: February 15, 2026 · Last Updated: February 15, 2026
If you want to run Large Language Models (LLMs) like Mistral, Phi, Llama 3, Gemma, DeepSeek, or GPT-OSS on your own VPS, Ollama is one of the easiest and most developer-friendly solutions available today.
In this complete guide, I will show you how to:
- Install Ollama on Ubuntu VPS
- Run it safely alongside Laravel, MySQL, Apache, and N8N
- Connect it to a subdomain
- Secure it using Let’s Encrypt SSL
- Authenticate requests
- Fix common errors (including OOM killer issue)
- Choose the right LLM for an 8GB VPS
- Integrate Ollama with N8N workflows
This guide is production-focused and tested on a Contabo Ubuntu server running Apache, Docker, MySQL, Composer, PHP, and self-hosted N8N.
Server Configuration Used
- Ubuntu VPS (Contabo)
- 8GB RAM
- 3 CPU cores
- Apache
- MySQL
- Docker
- 2 Laravel applications
- 1 self-hosted N8N instance
If your configuration is similar, this guide will work perfectly.
Step 1: Can We Run GPT-OSS-20B or 120B on an 8GB VPS?
Before installation, let’s talk about physics.
❌ GPT-OSS-120B
Requires 60GB+ RAM (even quantized). Not possible.
❌ GPT-OSS-20B
Needs at least 12–16GB RAM. Will crash on 8GB.
✅ Recommended for 8GB VPS
You should use:
- Mistral 7B (Q4) – Balanced
- Llama 3 8B (tight but possible)
- Gemma 7B
- Phi (Best lightweight choice)
If your VPS also runs Laravel + MySQL + N8N, the safest choice is:
👉 Phi
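These recommendations follow from simple arithmetic: a Q4-quantized model stores roughly half a byte per parameter, plus runtime overhead. A rough sketch of that math (the 0.57 bytes/parameter and 0.5GB overhead figures are ballpark assumptions, not exact numbers):

```shell
# Ballpark RAM needed for a Q4-quantized model:
# ~0.57 bytes per parameter (weights + quantization scales) + ~0.5GB overhead.
# Both constants are rough assumptions; real usage grows with context length.
estimate_gb() { awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 0.57 + 0.5 }'; }
estimate_gb 7    # Mistral 7B: ~4.5GB, tight next to Laravel + MySQL + N8N
estimate_gb 2.7  # Phi (2.7B): ~2.0GB, comfortable on an 8GB box
```

Run the same estimate for a 20B model and you land near 12GB, which is why GPT-OSS-20B cannot fit alongside your other services.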
Step 2: Install Ollama Using Docker (Recommended)
We isolate Ollama so it does not affect running applications.
docker run -d \
  --name ollama \
  -p 127.0.0.1:11434:11434 \
  -e OLLAMA_NUM_THREADS=3 \
  -e OLLAMA_MAX_LOADED_MODELS=1 \
  -v ollama:/root/.ollama \
  --restart unless-stopped \
  ollama/ollama
Important:
- Bound to 127.0.0.1, so the API is not publicly exposed
- Limited to one loaded model at a time
- Capped at 3 CPU threads
Step 3: Pull a Model
For lightweight production setup:
docker exec -it ollama ollama pull phi
Check installed models:
docker exec -it ollama ollama list
Step 4: Test Ollama Locally
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi",
    "prompt": "Explain SSL simply",
    "stream": false
  }'
If you do not add the Content-Type header, the request may hang without error.
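With "stream" set to false, the reply arrives as a single JSON object. If you have jq installed, you can pull out just the generated text; the canned sample below shows the response shape without needing a live server:

```shell
# A non-streaming /api/generate reply is one JSON object; jq extracts the text.
# Canned sample reply (so the shape is visible without a running server):
reply='{"model":"phi","created_at":"2026-02-15T00:00:00Z","response":"SSL encrypts traffic between browser and server.","done":true}'
printf '%s' "$reply" | jq -r '.response'
# With a live server, pipe the curl command above into the same filter:
#   curl -s ... | jq -r '.response'
```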
Common Error 1: llama runner process terminated: signal: killed
Error:
{"error":"llama runner process has terminated: signal: killed"}
Cause: Linux OOM Killer (Out Of Memory).
Check using:
dmesg | grep -i kill
Solution:
- Use a smaller model (Phi)
- Add swap
- Avoid 7B models on a shared 8GB production server
Add swap (optional):
fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab
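After running the swap commands above, you can confirm the swap is active by reading the kernel's own counters:

```shell
# Confirm swap is active: /proc/meminfo reports SwapTotal/SwapFree in kB.
awk '/^SwapTotal|^SwapFree/ {print $1, $2, $3}' /proc/meminfo
swapon --show   # lists active swap devices/files (empty output = no swap)
```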
Step 5: Create Subdomain for Ollama
Add A record:
ollama.yourdomain.com → VPS IP
Step 6: Configure Apache Reverse Proxy
Enable the required modules (rewrite is needed for the HTTP-to-HTTPS redirect below):
a2enmod proxy
a2enmod proxy_http
a2enmod headers
a2enmod rewrite
systemctl restart apache2
Create /etc/apache2/sites-available/ollama.conf
<VirtualHost *:80>
    ServerName ollama.yourdomain.com
    RewriteEngine On
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>
SSL virtual host:
<IfModule mod_ssl.c>
<VirtualHost *:443>
    ServerName ollama.yourdomain.com

    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:11434/
    ProxyPassReverse / http://127.0.0.1:11434/
    RequestHeader set X-Forwarded-Proto "https"

    <Location />
        AuthType Basic
        AuthName "Restricted Ollama"
        AuthUserFile /etc/apache2/.ollama_htpasswd
        Require valid-user
    </Location>

    SSLCertificateFile /etc/letsencrypt/live/ollama.yourdomain.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/ollama.yourdomain.com/privkey.pem
    Include /etc/letsencrypt/options-ssl-apache.conf
</VirtualHost>
</IfModule>
Enable the site and reload Apache:
a2ensite ollama.conf
systemctl reload apache2
If Apache refuses to reload because the certificate files do not exist yet, complete Step 7 first.
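The <Location> block above expects /etc/apache2/.ollama_htpasswd, which we have not created yet. One way to generate it (the username "admin" and password are placeholders; htpasswd from apache2-utils works too, and the openssl fallback below produces the same APR1 format):

```shell
# With apache2-utils installed:
#   htpasswd -cb /etc/apache2/.ollama_htpasswd admin 'change-me'
# Fallback using openssl's htpasswd-compatible APR1 hash; written to the
# current directory here so no sudo is needed, then moved into place:
printf 'admin:%s\n' "$(openssl passwd -apr1 'change-me')" > .ollama_htpasswd
cat .ollama_htpasswd
# mv .ollama_htpasswd /etc/apache2/.ollama_htpasswd
```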
Step 7: Install SSL (Let’s Encrypt)
apt install certbot python3-certbot-apache -y
certbot --apache -d ollama.yourdomain.com
Apache cannot load the SSL virtual host until these certificate files exist, so run Certbot before reloading it. Certbot can also generate the HTTPS virtual host for you (saved as ollama-le-ssl.conf).
Test:
curl -u username:password https://ollama.yourdomain.com/api/tags
Step 8: Connect Ollama with N8N
In N8N:
- Add HTTP Request Node
- Method: POST
- URL: https://ollama.yourdomain.com/api/generate
- Authentication: Basic Auth
- Headers: Content-Type: application/json
Body:
{
  "model": "phi",
  "prompt": "Summarize {{ $json.content }}",
  "stream": false
}
Access response using:
{{$json["response"]}}
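If the content being summarized contains quotes or newlines, a hand-assembled JSON body becomes invalid. N8N's expression editor handles the escaping for you, but when reproducing the request from the shell for debugging, jq's --arg does the same job. A sketch with a hypothetical content string:

```shell
# Build the request body safely; jq escapes quotes and newlines inside $c.
content='He said "hello" on
two lines'
body=$(jq -n --arg c "$content" \
  '{model:"phi", prompt:("Summarize: " + $c), stream:false}')
printf '%s\n' "$body"
# Send it (hypothetical credentials):
#   curl -s -u admin:change-me https://ollama.yourdomain.com/api/generate \
#     -H "Content-Type: application/json" -d "$body"
```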
Monitoring RAM Usage on VPS
Check memory:
free -m
Live monitoring:
apt install htop
htop
Docker container usage:
docker stats
Top memory processes:
ps aux --sort=-%mem | head -15
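The commands above are interactive; for a check you can run from cron, read MemAvailable directly. A minimal sketch (the 1GB threshold is an arbitrary example, tune it to your workload):

```shell
# Warn when available memory drops under 1GB (1048576 kB).
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
echo "available: $((avail_kb / 1024)) MB"
if [ "$avail_kb" -lt 1048576 ]; then
  echo "WARNING: low memory - a large model request may trigger the OOM killer"
fi
```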
Removing Large Models (If Needed)
docker exec -it ollama ollama rm mistral
Restart container:
docker restart ollama
Performance Expectations (8GB VPS)
Phi:
- 1.5–2GB RAM usage
- Fast responses
- Stable for production
Mistral 7B:
- 4–5GB RAM
- Risky on a shared server
Production Recommendations
If AI becomes core to your business:
- Move Ollama to a separate VPS (16GB+)
- Or use a GPU server
- Or adopt a hybrid setup (local + cloud)
Never run 20B+ models on an 8GB shared production server.
How to Completely Uninstall Ollama from VPS
If you want to remove Ollama completely from your server, follow the steps below carefully. This process removes the Docker container, models, Apache configuration, SSL certificate, and authentication setup.
Step 1: Stop the Ollama Container
docker stop ollama
Step 2: Remove the Container
docker rm ollama
Step 3: Remove Ollama Docker Volume (Deletes All Models)
docker volume rm ollama
This step frees disk space used by downloaded models such as Phi or Mistral.
Step 4: Remove Ollama Docker Image (Optional)
docker rmi ollama/ollama
Step 5: Disable Apache Virtual Hosts
a2dissite ollama.conf
a2dissite ollama-le-ssl.conf
systemctl reload apache2
Step 6: Delete SSL Certificate (Optional)
certbot delete --cert-name ollama.yourdomain.com
Step 7: Remove Authentication File
rm /etc/apache2/.ollama_htpasswd
Step 8: Verify Ollama is Fully Removed
ss -tulnp | grep 11434
If nothing is returned, Ollama has been completely removed from your VPS.
Also Read: Complete VPS Setup Guide for Laravel, PHP, Apache & MySQL on Ubuntu
Frequently Asked Questions (FAQs)
1. Can I run GPT-OSS-20B on an 8GB VPS?
No. GPT-OSS-20B requires at least 12–16GB RAM. On an 8GB VPS, the Linux OOM killer will terminate the process.
2. Which LLM is best for an 8GB VPS?
Phi is the safest and most stable model for shared production servers running Apache, Laravel, MySQL, or N8N.
3. Why does my curl request return no output?
This usually happens because the Content-Type: application/json header is missing in the request.
4. What does “signal: killed” error mean?
This error means the Linux Out Of Memory (OOM) killer terminated the Ollama process due to insufficient RAM.
5. Should I expose Ollama directly to the public internet?
No. Always bind Ollama to 127.0.0.1 and use Apache reverse proxy with SSL and authentication for security.
6. Can I run Ollama on my local machine?
Yes. You can install Ollama using the official installer or Docker and access it via http://localhost:11434.
Final Thoughts
Ollama is powerful and easy to deploy, but hardware limits matter. If configured properly with Docker isolation, reverse proxy, SSL, and authentication, you can safely run LLMs on your VPS without affecting Laravel or N8N applications.
Always monitor RAM usage and choose models wisely.
If you need help setting up Ollama, N8N automation, or secure VPS architecture, feel free to contact me.
Happy Building 🚀
