8/27/2024

Setting Up Ollama with Apache Airflow

In the rapidly evolving world of AI & data processing, pairing a local LLM runtime like Ollama with an orchestration platform like Apache Airflow lets you schedule, monitor & scale your model runs. This post guides you through the entire setup process so you can leverage the best of both platforms. Let's dive in!

What is Ollama?

Ollama is a framework for running large language models such as Llama 3.1, Mistral, and Gemma 2 locally. Ollama lets developers harness these models without extensive cloud infrastructure, making it well suited to local, self-hosted deployments.

Why Ollama?

  • Local Execution: Run models on your own machine, cutting the latency & cost that come with cloud APIs.
  • Wide Model Support: The ability to run different models covers a range of applications, from chatbots to complex analytical tasks.
  • Easy Install & Management: Install from Ollama’s official website and manage your models with simple CLI commands, as shown below.
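
To make that last point concrete, here are the day-to-day model-management commands (the model names are only examples):

```bash
# Download a model from the Ollama registry
ollama pull llama3.1

# List the models installed locally
ollama list

# Start an interactive session with a model
ollama run llama3.1

# Remove a model you no longer need
ollama rm mistral
```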

What is Apache Airflow?

Apache Airflow is a platform designed to programmatically author, schedule & monitor workflows. It's utilized extensively in data engineering environments to make sure tasks are executed in a specific order, handle dependencies & scale workflows based on required resources.

Why Use Airflow?

  • Scalability: Perfect for managing workflows that involve extensive data processing tasks.
  • Dynamic Pipeline Generation: You can create complex workflows with simple Python code through Directed Acyclic Graphs (DAGs); see the minimal sketch after this list.
  • Monitoring: Keep track of task status & execution through an intuitive web interface.
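
To make dynamic pipeline generation concrete, here is a minimal two-task DAG sketch; the DAG name, task names & commands are placeholders for illustration:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_pipeline",        # hypothetical name
    start_date=datetime(2024, 5, 6),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting")
    process = BashOperator(task_id="process", bash_command="echo processing")

    # The >> operator tells Airflow that process runs only after extract succeeds.
    extract >> process
```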

Combining Ollama & Airflow

Combining the automation of Apache Airflow with the language capabilities of Ollama creates a powerful AI-driven pipeline. With Ollama, you can run language models locally, & with Airflow, you can orchestrate those runs along with data dependencies and scheduling.

Step-by-Step Guide to Setting Up Ollama with Apache Airflow

Prerequisites

  1. Ensure you have Python installed.
  2. Install Docker; this guide uses it to run Apache Airflow in a container (Ollama itself is installed natively in Step 1).
  3. You should have Airflow installed on your machine or server, or plan to deploy it with Docker as in Step 2; a pip-based option is sketched after this list.
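
If you would rather run Airflow directly on your machine instead of in Docker, the standard pip install with a constraints file is one option (adjust the Airflow & Python versions, here 2.6.0 & 3.8, to match your environment):

```bash
# The constraints file pins Airflow's dependencies to tested versions.
pip install "apache-airflow==2.6.0" \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.6.0/constraints-3.8.txt"
```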

Step 1: Installing Ollama

For macOS:

You can install Ollama by downloading the macOS app and unzipping it:

```bash
curl -fsSL -o ollama-darwin.zip https://ollama.com/download/ollama-darwin.zip
unzip ollama-darwin.zip
```

Then move the unzipped Ollama app into your Applications folder and launch it.

For Windows:

Download the Windows Installer directly from the Ollama website.

For Linux:

Use the following command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
After installation, verify it by running a model:

```bash
ollama run llama3.1
```
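
Ollama also runs a local HTTP server (on port 11434 by default), which is useful later when Airflow tasks need to reach it over the network instead of through the CLI. A quick sanity check:

```bash
# Ask the local Ollama server for a completion.
# "stream": false returns one JSON object instead of a token stream.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```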

Step 2: Setting Up Apache Airflow

  1. Using Docker: You can quickly deploy an Airflow instance using Docker. Create a `docker-compose.yaml` file:

```yaml
version: '3'
services:
  airflow:
    image: apache/airflow:2.6.0
    restart: always
    ports:
      - "8080:8080"
    environment:
      - AIRFLOW__CORE__EXECUTOR=LocalExecutor
    volumes:
      - ./airflow:/opt/airflow
    command: airflow webserver
```

  2. Run the following command in the terminal where the `docker-compose.yaml` file resides:

```bash
docker-compose up -d
```
  3. Access the Airflow dashboard by navigating to http://localhost:8080.
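
Note that this compose file is intentionally minimal. On a fresh setup, the Airflow metadata database typically needs to be initialized and an admin user created before you can log in; assuming the airflow service name from the file above, something like the following should work:

```bash
# One-time initialization of Airflow's metadata database
docker-compose run --rm airflow airflow db init

# Create a web UI user (these credentials are examples only)
docker-compose run --rm airflow airflow users create \
  --username admin --password admin \
  --firstname Admin --lastname User \
  --role Admin --email admin@example.com
```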

Step 3: Connecting Ollama in Airflow Tasks

Now that we have set up both Ollama & Apache Airflow, the next step is to write tasks in Airflow that execute Ollama commands.

Create a Basic DAG

Create a file called `ollama_dag.py` in the dags directory of your Airflow environment:
```python
from airflow import DAG
from airflow.operators.bash import BashOperator
import datetime

default_args = {
    'owner': 'airflow',
    'start_date': datetime.datetime(2024, 5, 6),
}

with DAG('ollama_dag', default_args=default_args, schedule_interval='@daily') as dag:
    # ollama run takes the prompt as a positional argument;
    # this prompt is a placeholder for whatever your pipeline needs.
    run_llama = BashOperator(
        task_id='run_llama',
        bash_command='ollama run llama3.1 "Summarize the latest data trends."',
    )
```
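
The BashOperator approach assumes the ollama binary is available on the Airflow worker's PATH. If you are instead reaching the Ollama server over HTTP (as in the verification step earlier), a PythonOperator variant is a minimal sketch; the DAG name, prompt & URL are illustrative:

```python
import datetime
import json
import urllib.request

from airflow import DAG
from airflow.operators.python import PythonOperator


def ask_ollama():
    # POST to Ollama's generate endpoint (default port 11434).
    payload = json.dumps({
        "model": "llama3.1",
        "prompt": "Summarize the latest data trends.",  # placeholder prompt
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The completed text lives in the "response" field.
    print(body["response"])


with DAG(
    'ollama_http_dag',
    start_date=datetime.datetime(2024, 5, 6),
    schedule_interval='@daily',
    catchup=False,
) as dag:
    run_llama_http = PythonOperator(
        task_id='run_llama_http',
        python_callable=ask_ollama,
    )
```

Drop either file into your dags folder, refresh the Airflow UI at http://localhost:8080, and the DAG should appear, ready to trigger.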

Copyright © Arsturn 2024