8/27/2024

Integrating Ollama with Docker Swarm

In the ever-evolving world of cloud computing, the combination of container orchestration systems like Docker Swarm with powerful tools such as Ollama is becoming increasingly popular. Many developers and companies are discovering the benefits of running their Large Language Model (LLM) applications in a containerized and orchestrated environment. In this blog post, we will take a deep dive into the steps and best practices for integrating Ollama with Docker Swarm.

What is Docker Swarm?

Docker Swarm is Docker's native clustering and orchestration tool that allows you to manage a cluster of Docker engines as a single virtual host. This clustering is crucial for ensuring high availability and horizontal scaling of applications. Here's a quick look at some of the features that make Docker Swarm an essential tool in modern DevOps:
  • Simple setup & management: Easily set up a Swarm cluster and manage services through an intuitive command-line interface.
  • Load balancing: Automatically distribute incoming requests between containers to provide optimal resource utilization.
  • Scaling capabilities: Easily scale services up or down according to demand.
  • Declarative service model: Simply declare the desired state of your service, and Docker Swarm handles the rest.
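To make the declarative model and scaling concrete, here is a minimal sketch; the service name and image are placeholders for illustration only, not part of the Ollama setup described later:

```bash
# Create a service and declare that two replicas should always be running
docker service create --name demo-web --replicas 2 nginx:alpine

# Change the desired state; Swarm adds or removes tasks to match it
docker service scale demo-web=5

# Inspect current vs. desired state of all services
docker service ls

# Clean up the demo service
docker service rm demo-web
```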

What is Ollama?

Ollama is an innovative platform designed to make it easy to run Large Language Models (LLMs) locally. It provides a powerful command-line interface (CLI) to manage and run models, including pulling and serving them as needed. Best of all, Ollama is built specifically to work seamlessly with Docker, allowing users to take advantage of containerization while managing complex LLM workloads.
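As a quick illustration of that CLI (the model name below is just an example; use whichever model you need):

```bash
# Download a model from the Ollama library
ollama pull llama3

# Run a one-off prompt against the model
ollama run llama3 "Summarize what Docker Swarm does in one sentence."

# List models available locally
ollama list

# Start the Ollama API server (listens on port 11434 by default)
ollama serve
```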

Using Ollama with Docker: Getting Started

Before we embark on the journey of integration, let’s ensure you have the required tools installed:
  • Docker: Make sure Docker is installed on your system. You can get it from docker.com.
  • Docker Swarm: Initialize your Swarm cluster with the `docker swarm init` command (a quick multi-node example follows this list).
  • Ollama: The Ollama tool can be installed directly on your local environment. Visit ollama.ai for the installation guide.
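If you are starting from scratch, a typical multi-node setup looks roughly like this; the advertise address is a placeholder for your manager node's IP:

```bash
# On the manager node
docker swarm init --advertise-addr 192.168.0.3

# Print the command that worker nodes should run to join the cluster
docker swarm join-token worker

# After the workers have joined, verify the cluster from the manager
docker node ls
```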

Setting Up Your First Ollama Service in Docker Swarm

To set up Ollama in your Docker Swarm environment, we will create a Docker Compose file. This `docker-compose.yml` file will define the services required to run an Ollama API and its associated web UI.
Here’s an example `docker-compose.yml`:

```yaml
version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    tty: true
    restart: always
    ports:
      - 11434:11434
      - 53:53
    volumes:
      - ollama:/root/.ollama
    environment:
      - "OLLAMA_HOST=0.0.0.0"
      - "OLLAMA_ORIGINS=http://localhost,https://localhost,http://127.0.0.1,https://127.0.0.1,http://0.0.0.0,https://0.0.0.0"
    deploy:
      placement:
        constraints:
          - node.hostname == YOUR_NODE_NAME # Change needed
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: "NVIDIA-GPU"
                value: 2 # Change needed

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - 3000:8085
    environment:
      - "PORT=8085"
      - "OLLAMA_API_BASE_URL=http://192.168.0.3:11434/api" # Change needed
      - "WEBUI_AUTH=false"
      - "WEBUI_NAME=Open WebUI"
    restart: always

  watchtower:
    image: containrrr/watchtower
    container_name: watchtower
    environment:
      - "WATCHTOWER_CLEANUP=true"
      - "WATCHTOWER_INCLUDE_STOPPED=false"
      - "WATCHTOWER_TIMEOUT=30s"
      - "WATCHTOWER_SCHEDULE=0 * * * * *" # Change needed: set your preferred cron schedule
      - "WATCHTOWER_HTTP_API_METRICS=true"
      - "WATCHTOWER_HTTP_API_TOKEN=nnnnn-nnnn-nnnn-nnnn" # Change needed
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

volumes:
  ollama: {}
  open-webui: {}
```
This file contains three services: `ollama` (the Ollama API), `open-webui` (for the web interface), and `watchtower` (to automatically update these containers). Here’s what each section does:
  • Ollama API: Runs the Ollama container and exposes necessary ports.
  • Open Web UI: This service depends on the Ollama API, providing a web-based interface to interact with it.
  • Watchtower: Monitors your running containers and updates them if a new image is available.

Deploying the Services

Once your `docker-compose.yml` is ready and configured, you can deploy it to your Swarm cluster. Use the command:

```bash
docker stack deploy -c docker-compose.yml ollama_stack
```
The `docker stack deploy` command deploys your services in the defined stack, allowing you to control them collectively.
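After deploying, you can confirm what the stack created. Note that Swarm prefixes service names with the stack name, so the Ollama service becomes `ollama_stack_ollama`:

```bash
# List the services that belong to this stack
docker stack services ollama_stack

# Show the individual tasks (containers) and which node each is scheduled on
docker stack ps ollama_stack
```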

Monitoring Your Ollama Service

To monitor the status of your deployed services, you can use:

```bash
docker service ls
```

This command lists all services in your Swarm, allowing you to check if they are running smoothly. To see the logs for a particular service, use:

```bash
docker service logs <service_name>
```
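If a service shows fewer replicas than expected, the task history usually explains why; `--no-trunc` keeps the full error message visible:

```bash
# Show task history for the Ollama service, including failure reasons
docker service ps --no-trunc ollama_stack_ollama
```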

Best Practices

When running your Ollama integration in Docker Swarm, consider the following best practices:
  1. Resource Management: Define CPU and memory limits in your `docker-compose.yml`, and increase or decrease them as needed based on observed performance (see the snippet after this list).
  2. Data Persistency: Use Docker volumes to ensure your data persists across container restarts or updates.
  3. Network Optimization: Utilize Docker overlay networks for enhanced service networking across multiple nodes.
  4. Security: Implement Docker Secrets and Configs to store sensitive information (like API keys) securely.
  5. Automated Backups: Set up routines that back up your Ollama models and configurations regularly.
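As a sketch of the first point, Swarm-mode resource limits and reservations go under the service's `deploy` key; the numbers below are arbitrary starting values, not tuned recommendations:

```yaml
services:
  ollama:
    # ... image, ports, volumes as shown earlier ...
    deploy:
      resources:
        limits:
          cpus: "4.0"     # Hard cap on CPU usage
          memory: 16G     # Task is killed if it exceeds this
        reservations:
          cpus: "2.0"     # Scheduler only places the task on nodes with this much free
          memory: 8G
```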

Troubleshooting Common Issues

Service Not Starting

If your services don't start as expected, check the logs for detailed errors:

```bash
docker service logs ollama_stack_ollama
```
Common reasons include misconfigured environment variables or placement constraints that no node in the Swarm can satisfy.

GPU Not Detected

Ensure that the `nvidia-container-runtime` is installed and configured correctly. Check your Docker configuration and the compatibility of your CUDA drivers with the images you are using.
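A quick way to check whether Docker itself can see the GPU, independent of Swarm (the CUDA image tag is only an example; pick one that matches your driver version):

```bash
# Should print the same table as running nvidia-smi on the host
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Confirm the NVIDIA runtime is registered with the Docker daemon
docker info | grep -i runtime
```

Note that the `generic_resources` reservation in the compose file above typically also requires the GPU to be advertised to Swarm via a `node-generic-resources` entry in the node's `/etc/docker/daemon.json`.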

Network Issues

If you are experiencing connectivity issues between containers, make sure they are on the same overlay network in Swarm.
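For example, you can create a dedicated overlay network and attach both services to it; the network name here is arbitrary:

```bash
# Create an attachable overlay network for the stack
docker network create --driver overlay --attachable ollama_net

# Check which networks a running service is attached to
docker service inspect ollama_stack_ollama --format '{{json .Spec.TaskTemplate.Networks}}'
```

In the compose file, you would then reference `ollama_net` under a top-level `networks:` key and in each service's `networks:` list.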

Why Use Ollama with Docker Swarm?

The key advantages of integrating Ollama with Docker Swarm include:
  • Scalability: Easily scale your language models based on demand, ensuring performance during peak periods.
  • Resilience: Deploy multiple replicas of your containers, providing high availability even if one goes down.
  • Simplicity: The integration lets developers focus on building applications by removing much of the complexity of deployment.

Conclusion

Integrating Ollama with Docker Swarm enables robust management of LLM applications while leveraging the capabilities of containerization. By following the guidelines above and incorporating best practices, you can achieve an efficient and effective deployment of your machine-learning models. If you haven’t tried it yet, be sure to explore Arsturn, which offers a no-code solution to create conversational chatbots, enhancing your audience engagement effortlessly!
With Arsturn, you can take your applications a step further, enabling meaningful connections in less time. Discover how easy it can be to create custom chatbots that fit your brand’s needs today!
Stay tuned for more updates on integrating AI technologies with container orchestration!

Copyright © Arsturn 2024