Author

Francis Ndungu


How to Run DeepSeek V4 AI Model with Ollama on Ubuntu 26.04

06 May, 2026

Introduction

Running large language models locally on your own hardware gives you control over your data and eliminates dependency on third-party APIs. Ollama simplifies the process of downloading, managing, and executing AI models on Linux systems. The tool handles complex configuration and hardware optimization automatically, making local AI deployment accessible without deep machine learning expertise. This approach works well for developers who need offline access to powerful models or want to integrate language AI into their applications without recurring costs.

This guide shows you how to install Ollama and run the DeepSeek V4 model on Ubuntu 26.04.

Prerequisites

Before you start:

Deploy an Ubuntu 26.04 Cloud GPU server with at least:
- GPU: NVIDIA H100 (80 GB), or a newer NVIDIA GH200 or B200 with higher memory bandwidth
- System RAM: 64 GB minimum, 128 GB preferred for CPU offload
- Free storage: 160 GB minimum, 200 GB or more recommended to hold GGUF weights, cache, and logs
- CPU: 16 vCPUs or more to assist with offload and preprocessing

If you don't have an Ubuntu 26.04 GPU server, sign up with Vultr and get up to $300 worth of free credit to test the Vultr platform.

Connect to your GPU server through SSH. Replace 192.168.0.1 with your server's public IP address.

Windows
- Use PuTTY to connect to your server.

Linux or Mac
- Run the following command in your shell.

console

$ ssh username@192.168.0.1

Create a non-root user with sudo privileges. Read our guide on How to Create a Non-Root Sudo User on Ubuntu 24.04. You'll use this user's account to run the commands in this guide.
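
You can check the hardware minimums listed in the prerequisites from the shell before you install anything. The following is a minimal sketch, not an official tool: `meets_minimum` and `preflight` are hypothetical helper names, the thresholds are copied from the list above, and the GPU query assumes the NVIDIA driver (and therefore `nvidia-smi`) is already installed.

```bash
# Preflight sketch: compare this server against the minimums in the prerequisites.
# meets_minimum and preflight are illustrative helpers, not part of Ollama.

# Print "ok" or "LOW" for a labelled integer value against its required minimum.
meets_minimum() {
    local label="$1" actual="$2" required="$3"
    if [ "$actual" -ge "$required" ]; then
        echo "ok: $label = $actual (minimum $required)"
    else
        echo "LOW: $label = $actual (minimum $required)"
        return 1
    fi
}

preflight() {
    local vcpus ram_gb disk_gb
    vcpus=$(nproc)
    # MemTotal is reported in kB; convert it to whole GB.
    ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    # Free space on the root filesystem, in whole GB.
    disk_gb=$(df --output=avail -BG / | tail -n 1 | tr -dc '0-9')
    meets_minimum "vCPUs" "$vcpus" 16
    meets_minimum "RAM (GB)" "$ram_gb" 64
    meets_minimum "Free disk (GB)" "$disk_gb" 160
    # List each GPU and its memory if the NVIDIA driver is installed.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    else
        echo "nvidia-smi not found: install the NVIDIA driver first"
    fi
}
```

Call `preflight` after defining both functions. A server sold as 64 GB usually reports slightly less usable memory, so treat a LOW result close to the threshold as a prompt to double-check the plan, not as a hard failure.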

Install Ollama on Ubuntu 26.04

Ollama provides an automated installation script that sets up the service and its dependencies. The script downloads the Ollama binaries to /usr/local, creates a dedicated ollama user, and registers a systemd service.

Refresh your system’s package information index.

console

$ sudo apt update

Download and run the Ollama installation script.

console

$ curl -fsSL https://ollama.com/install.sh | sh

Output:

>>> Installing ollama to /usr/local
>>> Downloading Linux amd64 bundle
>>> Creating ollama user
>>> Adding current user to ollama group
>>> Creating ollama systemd service
>>> The Ollama API is now available at :::11434

Verify the Ollama service is active.

console

$ systemctl status ollama

Output:

● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; preset: enabled)
     Active: active (running) since Tue 2026-05-05 14:30:00 UTC; 10s ago
   Main PID: 12345 (ollama)
      Tasks: 8 (limit: 18979)
     Memory: 25.8M
        CPU: 1.2s

Press Ctrl + C to return to the shell prompt.
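
If you install Ollama from a provisioning script, the service may take a moment to start accepting connections. The sketch below polls the API root until it answers. `wait_for_ollama` is a hypothetical helper name, and the default URL assumes the API is listening on port 11434 of the local machine.

```bash
# wait_for_ollama is an illustrative helper, not part of Ollama.
# It retries an HTTP request until the server answers or the attempts run out.
wait_for_ollama() {
    local url="${1:-http://localhost:11434}" attempts="${2:-30}" tried=0
    until curl -fsS --max-time 2 "$url" >/dev/null 2>&1; do
        tried=$((tried + 1))
        if [ "$tried" -ge "$attempts" ]; then
            echo "Ollama did not respond at $url after $attempts attempts" >&2
            return 1
        fi
        sleep 1
    done
    echo "Ollama is responding at $url"
}
```

For example, `wait_for_ollama http://localhost:11434 60` waits for up to 60 attempts and returns a non-zero status if the server never answers, which lets a calling script stop early.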

Pull the DeepSeek V4 Model

Ollama uses a centralized library of pre-trained models. The pull command downloads the model weights to your local system so you can run them without an internet connection.

Download the DeepSeek V4 model using Ollama. The download size is approximately 8 GB. The duration depends on your connection speed.

console

$ ollama pull deepseek-v4

Output:

pulling manifest 
pulling 44c83434e9c9: 100% ▕███████████████████████████████████████████████▏ 7.8 GB/7.8 GB
pulling e78cd803b886: 100% ▕███████████████████████████████████████████████▏ 620 B/620 B
pulling 799afb18aed3: 100% ▕███████████████████████████████████████████████▏ 152 B/152 B
verifying sha256 digest 
writing manifest 
success

List all downloaded models to confirm DeepSeek V4 is present.

console

$ ollama list

Output:

NAME                    ID              SIZE      MODIFIED
deepseek-v4:latest      a1b2c3d4e5f6    7.8 GB    11 seconds ago
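
In a script, you may want to pull the model only when it is missing. The following sketch reads the output of `ollama list` and checks the NAME column. `model_installed` is a hypothetical helper, and it assumes the column layout shown in the output above.

```bash
# model_installed is an illustrative helper, not part of Ollama.
# It reads `ollama list` output on stdin and succeeds when the given model
# appears in the first column, either as "name" or as "name:tag".
model_installed() {
    local wanted="$1"
    awk -v m="$wanted" '
        NR > 1 { split($1, parts, ":"); if ($1 == m || parts[1] == m) found = 1 }
        END    { exit (found ? 0 : 1) }
    '
}
```

A typical use is `ollama list | model_installed deepseek-v4 || ollama pull deepseek-v4`, which skips the download when the model is already present.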

Run the DeepSeek V4 Model Interactively

After pulling the model, you can start an interactive chat session directly in your terminal. This method lets you test prompts and explore model behavior in real time.

Start the DeepSeek V4 model.

console

$ ollama run deepseek-v4

Output:

>>> Send a message (/? for help)

Type a prompt to test the model. For example, ask a general knowledge question.

Who wrote "One Hundred Years of Solitude"?

Output:

One Hundred Years of Solitude was written by Gabriel García Márquez, a Colombian author who received the Nobel Prize in Literature in 1982.

Type /bye to exit the interactive session when you finish.
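
The `ollama run` command also accepts the prompt as an argument, in which case it prints the response and exits instead of opening a chat session. The sketch below wraps that form for use in scripts. `ask_model` is a hypothetical helper, and `OLLAMA_CMD` is an illustrative override for dry runs, not an Ollama setting.

```bash
# ask_model is an illustrative wrapper for one-shot, non-interactive prompts.
# OLLAMA_CMD is a made-up variable that lets you substitute another command
# (for example, echo) to preview what would be executed.
ask_model() {
    local model="$1" prompt="$2"
    "${OLLAMA_CMD:-ollama}" run "$model" "$prompt"
}
```

For example, `ask_model deepseek-v4 "List three uses of a local LLM."` prints the answer to standard output, so you can redirect it to a file or pipe it into another command.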

Keep the DeepSeek V4 Model Running as a Service

Ollama runs an API server on your local machine, allowing other applications to send requests. By default, the server listens on port 11434.

Check the API server status.

console

$ curl http://localhost:11434/api/tags

Output:

{"models":[{"name":"deepseek-v4:latest","modified_at":"2026-05-05T14:32:00.123456789Z","size":7800000000}]}
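
To use this response in a script, you can extract only the model names. The sketch below relies on the compact JSON layout shown in the output above and uses standard text tools only. `list_model_names` is a hypothetical helper; a JSON-aware tool such as jq is more robust if it is installed.

```bash
# list_model_names is an illustrative helper, not part of Ollama.
# It reads an /api/tags response on stdin and prints one model name per line.
# It assumes compact JSON with no spaces around the colon after "name".
list_model_names() {
    grep -o '"name":"[^"]*"' | cut -d '"' -f 4
}
```

For example, `curl -s http://localhost:11434/api/tags | list_model_names` prints each installed model on its own line.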

Test the DeepSeek V4 Model with a REST API Call

You can interact with DeepSeek V4 programmatically using standard HTTP requests. This approach is useful for integrating the model into your scripts or applications.

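Send a completion request using /api/generate. Set "stream" to false to receive the full response as a single JSON object instead of a stream of partial responses.

console

$ curl -X POST http://localhost:11434/api/generate -d '{
  "model": "deepseek-v4",
  "prompt": "Explain what a large language model is in one sentence.",
  "stream": false
}'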

Output:

{"model":"deepseek-v4","created_at":"2026-05-05T14:35:00Z","response":"A large language model is an AI system trained on vast amounts of text to understand and generate human-like language.","done":true}
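
When the prompt comes from a variable, quotes inside it can break a hand-written JSON body. The following sketch builds the request body with basic escaping and sends it with "stream" set to false so the server returns one JSON object. `json_escape`, `build_generate_payload`, and `generate` are hypothetical helpers, and the escaping covers only backslashes and double quotes, not newlines or other control characters.

```bash
# Illustrative helpers for calling /api/generate from a script.

# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Newlines and other control characters are NOT handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body for a single, non-streamed completion request.
build_generate_payload() {
    local model="$1" prompt="$2"
    printf '{"model":"%s","prompt":"%s","stream":false}' \
        "$(json_escape "$model")" "$(json_escape "$prompt")"
}

# Send the request to the local Ollama API and print the raw JSON response.
generate() {
    curl -sS http://localhost:11434/api/generate \
        -H 'Content-Type: application/json' \
        -d "$(build_generate_payload "$1" "$2")"
}
```

For example, `generate deepseek-v4 'Define "tokenizer" in one sentence.'` sends a valid request even though the prompt contains double quotes.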

Uninstall Ollama (Optional)

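If you need to remove Ollama from your system for any reason, stop the service and delete the files that the installation script created. Piping install.sh to sh with extra arguments runs the installer again, so remove the components manually.

Stop and disable the Ollama service.

console

$ sudo systemctl stop ollama
$ sudo systemctl disable ollama

Remove the systemd unit file, the Ollama binary, and its bundled libraries.

console

$ sudo rm /etc/systemd/system/ollama.service
$ sudo rm "$(which ollama)"
$ sudo rm -rf /usr/local/lib/ollama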

Delete the local models directory to free up storage space.

console

$ sudo rm -rf /usr/share/ollama

Conclusion

In this guide, you have installed Ollama on Ubuntu 26.04, pulled the DeepSeek V4 model from the official library, and run the model interactively in your terminal. You also learned how to access the model through the built-in API server for integration into scripts. Now that you have DeepSeek V4 running locally, consider building a chat interface with a web framework like FastAPI or Streamlit to create your own self-hosted AI assistant.