How to Install and Secure Ollama on VPS (Ubuntu + Apache) – Complete Step-by-Step Guide


If you want to run Large Language Models (LLMs) like Mistral, Phi, Llama 3, Gemma, DeepSeek, or GPT-OSS on your own VPS, Ollama is one of the easiest and most developer-friendly solutions available today.

In this complete guide, I will show you how to:

  • Install Ollama on Ubuntu VPS
  • Run it safely alongside Laravel, MySQL, Apache, and N8N
  • Connect it to a subdomain
  • Secure it using Let’s Encrypt SSL
  • Authenticate requests
  • Fix common errors (including the OOM killer issue)
  • Choose the right LLM for an 8GB VPS
  • Integrate Ollama with N8N workflows

This guide is production-focused and tested on a Contabo Ubuntu server running Apache, Docker, MySQL, Composer, PHP, and self-hosted N8N.


Server Configuration Used

  • Ubuntu VPS (Contabo)
  • 8GB RAM
  • 3 CPU cores
  • Apache
  • MySQL
  • Docker
  • 2 Laravel applications
  • 1 self-hosted N8N instance

If your configuration is similar, everything in this guide should map directly to your setup.


Step 1: Can We Run GPT-OSS-20B or 120B on an 8GB VPS?

Before installation, let’s talk about physics.

❌ GPT-OSS-120B

Requires 60GB+ RAM (even quantized). Not possible.

❌ GPT-OSS-20B

Needs at least 12–16GB RAM. Will crash on 8GB.

✅ Recommended for 8GB VPS

You should use:

  • Mistral 7B (Q4) – Balanced
  • Llama 3 8B (tight but possible)
  • Gemma 7B
  • Phi (Best lightweight choice)

If your VPS also runs Laravel + MySQL + N8N, the safest choice is:

👉 Phi


Step 2: Install Ollama Using Docker (Recommended)

Running Ollama in Docker keeps it isolated from the applications already on the server.

docker run -d \
  --name ollama \
  -p 127.0.0.1:11434:11434 \
  -e OLLAMA_NUM_THREADS=3 \
  -e OLLAMA_MAX_LOADED_MODELS=1 \
  -v ollama:/root/.ollama \
  --restart unless-stopped \
  ollama/ollama

Important:

  • Bound to 127.0.0.1, so the API is not publicly exposed
  • Limited to one loaded model at a time
  • Capped at 3 CPU threads
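
To confirm the container is running and that it responds on localhost, run a quick check (standard Docker and curl commands; Ollama's HTTP API includes a /api/version endpoint):

docker ps --filter name=ollama
curl http://127.0.0.1:11434/api/version

The curl call should return a small JSON object with the Ollama version.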

Step 3: Pull a Model

For lightweight production setup:

docker exec -it ollama ollama pull phi

Check installed models:

docker exec -it ollama ollama list

Step 4: Test Ollama Locally

curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi",
    "prompt": "Explain SSL simply",
    "stream": false
  }'

If you do not add the Content-Type header, the request may hang without error.
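
Because "stream": false returns the whole answer as a single JSON object, you can extract just the generated text with jq (assuming jq is installed, e.g. apt install jq -y):

curl -s -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "phi", "prompt": "Explain SSL simply", "stream": false}' \
  | jq -r '.response'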


Common Error 1: llama runner process terminated: signal: killed

Error:

{"error":"llama runner process has terminated: signal: killed"}

Cause: Linux OOM Killer (Out Of Memory).

Check using:

dmesg | grep -i kill
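
If dmesg output is restricted on your VPS, the kernel log via journalctl (on systemd-based Ubuntu) shows the same OOM events:

journalctl -k | grep -i "out of memory"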

Solution:

  • Use a smaller model (Phi)
  • Add swap
  • Avoid 7B models on a shared 8GB production server

Add swap (optional):

fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab
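
Verify the swap is active:

swapon --show
free -m

You should now see the 4GB of swap in the free -m output.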

Step 5: Create Subdomain for Ollama

Add A record:

ollama.yourdomain.com → VPS IP
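
Once DNS has propagated, confirm the record resolves to your server (dig ships with the dnsutils package on Ubuntu):

dig +short ollama.yourdomain.com

It should print your VPS IP.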

Step 6: Configure Apache Reverse Proxy

Enable the required modules (the port-80 vhost below uses mod_rewrite for the HTTPS redirect):

a2enmod proxy
a2enmod proxy_http
a2enmod headers
a2enmod rewrite
systemctl restart apache2

Create /etc/apache2/sites-available/ollama.conf:

<VirtualHost *:80>
    ServerName ollama.yourdomain.com
    RewriteEngine On
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

SSL virtual host (if you let Certbot generate it in Step 7, it will be ollama-le-ssl.conf; either way, it needs the proxy and auth directives below):

<IfModule mod_ssl.c>
<VirtualHost *:443>
    ServerName ollama.yourdomain.com

    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:11434/
    ProxyPassReverse / http://127.0.0.1:11434/

    RequestHeader set X-Forwarded-Proto "https"

    <Location />
        AuthType Basic
        AuthName "Restricted Ollama"
        AuthUserFile /etc/apache2/.ollama_htpasswd
        Require valid-user
    </Location>

    SSLCertificateFile /etc/letsencrypt/live/ollama.yourdomain.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/ollama.yourdomain.com/privkey.pem
    Include /etc/letsencrypt/options-ssl-apache.conf

</VirtualHost>
</IfModule>
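
The <Location /> block references a password file that does not exist yet. Create it with htpasswd from the apache2-utils package (replace youruser with a username of your choice), then enable the site and reload Apache:

apt install apache2-utils -y
htpasswd -c /etc/apache2/.ollama_htpasswd youruser
a2ensite ollama.conf
apachectl configtest
systemctl reload apache2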

Step 7: Install SSL (Let’s Encrypt)

apt install certbot python3-certbot-apache -y
certbot --apache -d ollama.yourdomain.com

Test (use the Basic Auth credentials you created with htpasswd):

curl -u username:password https://ollama.yourdomain.com/api/tags

Step 8: Connect Ollama with N8N

In N8N:

  1. Add HTTP Request Node
  2. Method: POST
  3. URL: https://ollama.yourdomain.com/api/generate
  4. Authentication: Basic Auth
  5. Headers: Content-Type: application/json

Body:

{
  "model": "phi",
  "prompt": "Summarize {{ $json.content }}",
  "stream": false
}

Access response using:

{{ $json.response }}
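
Before building out the workflow, it is worth replaying the same call from a shell to confirm that the proxy, SSL, and credentials work end to end (youruser:yourpassword is a placeholder for your htpasswd credentials):

curl -u youruser:yourpassword -X POST https://ollama.yourdomain.com/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "phi", "prompt": "Say hello", "stream": false}'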

Monitoring RAM Usage on VPS

Check memory:

free -m

Live monitoring:

apt install htop
htop

Docker container usage:

docker stats

Top memory processes:

ps aux --sort=-%mem | head -15

Removing Large Models (If Needed)

docker exec -it ollama ollama rm mistral

Restart container:

docker restart ollama

Performance Expectations (8GB VPS)

Phi:

  • 1.5–2GB RAM usage
  • Fast responses
  • Stable for production

Mistral 7B:

  • 4–5GB RAM
  • Risky on a shared server

Production Recommendations

If AI becomes core to your business:

  • Move Ollama to a separate VPS (16GB+)
  • Or use a GPU server
  • Or run a hybrid setup (local + cloud)

Never run 20B+ models on an 8GB shared production server.


How to Completely Uninstall Ollama from VPS

If you want to remove Ollama completely from your server, follow the steps below carefully. This process removes the Docker container, models, Apache configuration, SSL certificate, and authentication setup.

Step 1: Stop the Ollama Container

docker stop ollama

Step 2: Remove the Container

docker rm ollama

Step 3: Remove Ollama Docker Volume (Deletes All Models)

docker volume rm ollama

This step frees disk space used by downloaded models such as Phi or Mistral.

Step 4: Remove Ollama Docker Image (Optional)

docker rmi ollama/ollama

Step 5: Disable Apache Virtual Hosts

a2dissite ollama.conf
a2dissite ollama-le-ssl.conf
systemctl reload apache2

Step 6: Delete SSL Certificate (Optional)

certbot delete --cert-name ollama.yourdomain.com

Step 7: Remove Authentication File

rm /etc/apache2/.ollama_htpasswd

Step 8: Verify Ollama is Fully Removed

ss -tulnp | grep 11434

If nothing is returned, Ollama has been completely removed from your VPS.



Frequently Asked Questions (FAQs)

1. Can I run GPT-OSS-20B on an 8GB VPS?

No. GPT-OSS-20B requires at least 12–16GB RAM. On an 8GB VPS, the Linux OOM killer will terminate the process.

2. Which LLM is best for an 8GB VPS?

Phi is the safest and most stable model for shared production servers running Apache, Laravel, MySQL, or N8N.

3. Why does my curl request return no output?

This usually happens because the Content-Type: application/json header is missing in the request.

4. What does “signal: killed” error mean?

This error means the Linux Out Of Memory (OOM) killer terminated the Ollama process due to insufficient RAM.

5. Should I expose Ollama directly to the public internet?

No. Always bind Ollama to 127.0.0.1 and use Apache reverse proxy with SSL and authentication for security.

6. Can I run Ollama on my local machine?

Yes. You can install Ollama using the official installer or Docker and access it via http://localhost:11434.
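
For reference, the official Linux installer is a one-line script from ollama.com (macOS and Windows have native installers):

curl -fsSL https://ollama.com/install.sh | sh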


Final Thoughts

Ollama is powerful and easy to deploy, but hardware limits matter. If configured properly with Docker isolation, reverse proxy, SSL, and authentication, you can safely run LLMs on your VPS without affecting Laravel or N8N applications.

Always monitor RAM usage and choose models wisely.

If you need help setting up Ollama, N8N automation, or secure VPS architecture, feel free to contact me.

Happy Building 🚀