Running Ollama Models via CLI: A Comprehensive Guide
Zack Saadioui
8/27/2024
If you're diving into the world of large language models, particularly with Ollama, you’re in for a treat! Ollama offers a simple yet powerful Command Line Interface (CLI) that allows you to run various models, including Llama 3.1, Mistral, and more. This tutorial will guide you through the entire process from installation to execution of models, ensuring you maximize the potential of this robust tool.
What is Ollama?
Ollama is an innovative platform that allows you to run large language models locally. With a plethora of models available, it’s an awesome option for developers and enthusiasts looking to experiment with natural language processing. The ollama/ollama repository has gained a lot of traction, boasting over 86.7k stars!
Getting Started
Before we jump into running models, let's cover the basics of getting Ollama set up on your system. Depending on your operating system, Ollama can be installed in a few different ways: macOS and Windows users can download the installer from the official site, while Linux users can use the one-line install script shown below.
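On Linux, for example, the official install script handles everything in a single command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```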
You can even run Ollama using the official Ollama Docker image. Just pull the image with:

```bash
docker pull ollama/ollama
```
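Pulling the image only downloads it; per the official Docker instructions, you then start the container (the CPU-only variant is shown here) and expose the default API port:

```bash
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

Once it's running, you can run models inside the container with `docker exec -it ollama ollama run llama3.1`.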
The Power of the CLI
Once you have Ollama set up, you can start running models with just a few commands. Let's have a look at some basic commands you might find useful. You can see everything the Ollama CLI offers by simply typing `ollama` in your terminal. Here are some commands you'll want to get acquainted with:
Running a model:

```bash
ollama run llama3.1
```

This command launches the Llama 3.1 model instantly.
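Running the command drops you into an interactive chat session. The exchange below is illustrative (model output will vary), but `/bye` is the real built-in command for exiting the session:

```bash
ollama run llama3.1
>>> Why is the sky blue?
The sky appears blue because of Rayleigh scattering of sunlight...
>>> /bye
```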
Model Library
Ollama provides an extensive model library where you can find various models with differing parameters and sizes. Here’s a sneak peek at some of the models you can utilize:
| Model | Parameters | Size | Command |
|---|---|---|---|
| Llama 3.1 | 8B | 4.7GB | `ollama run llama3.1` |
| Llama 3.1 | 70B | 40GB | `ollama run llama3.1:70b` |
| Llama 3.1 | 405B | 231GB | `ollama run llama3.1:405b` |
| Phi 3 Mini | 3.8B | 2.3GB | `ollama run phi3` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Gemma 2 | 9B | 5.5GB | `ollama run gemma2` |
RAM Requirements
Before you start running models, be aware of the RAM requirements:
- To run 7B models, you need at least 8GB of RAM.
- For 13B models, you need 16GB.
- And for models like the 33B, you'll need 32GB of RAM!
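Not sure how much memory your machine has? A quick check on Linux (macOS users can run `sysctl hw.memsize` instead) is:

```bash
free -h
```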
Customizing Models
One of the neat features of Ollama is its ability to customize models with ease. Let's go through some steps on how to customize a model:
Importing GGUF Models
If you have a model in a GGUF format, you can import it using the following procedure:
Create a file named `Modelfile` with a `FROM` instruction pointing to your model's local filepath. For example:

```plaintext
FROM ./vicuna-33b.Q4_0.gguf
```
Create the model in Ollama:

```bash
ollama create example -f Modelfile
```
Finally, run your model with:

```bash
ollama run example
```
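You can also pass a prompt directly as an argument to get a one-shot answer instead of an interactive session (the prompt text here is just an example):

```bash
ollama run example "Summarize what the GGUF format is in one sentence."
```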
Customizing Prompts
You can also customize prompts for the models within the Ollama library. For instance, here's how to tailor the `llama3.1` model. First, pull it:

```bash
ollama pull llama3.1
```

Create a `Modelfile`:

```plaintext
FROM llama3.1

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the customized model:

```bash
ollama create mario -f ./Modelfile
ollama run mario
```
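To verify that your customization took effect, you can print the stored Modelfile back out via the `--modelfile` flag of the CLI's `show` command:

```bash
ollama show mario --modelfile
```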
CLI Commands Reference
Here are a few essential CLI commands to manage your models effectively:
Create a Model:

```bash
ollama create mymodel -f ./Modelfile
```

Pull a Model:

```bash
ollama pull llama3.1
```

Remove a Model:

```bash
ollama rm llama3.1
```

Copy a Model:

```bash
ollama cp llama3.1 my-model
```

List Your Models:

```bash
ollama list
```
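For a quick sanity check, `ollama list` shows every model you have locally. The rows below are illustrative placeholders, but the columns match the CLI's output:

```bash
ollama list
# NAME               ID      SIZE      MODIFIED
# llama3.1:latest    <id>    4.7 GB    2 days ago
# mario:latest       <id>    4.7 GB    5 minutes ago
```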
REST API Integration
Ollama also supports a REST API to generate responses. Here's how you could generate a response using `curl`:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?"
}'
```
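By default, the `generate` endpoint streams the answer back as a series of JSON objects. If you'd rather receive a single complete response, the documented `stream` parameter turns streaming off:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```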
Moreover, you can use the chat endpoint to interact more dynamically:
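A minimal chat request follows the `/api/chat` format, sending a list of role-tagged messages:

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
```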
Using the Ollama CLI not only makes running models straightforward, but it also opens the door to a level of customization rarely found on other platforms. Plus, it's lightweight, has no cloud dependency, & your data stays private.
Speaking of engaging your audience, if you’re looking to take your engagement to the next level with ChatGPT-powered AI, check out Arsturn! With Arsturn, you can instantly create custom ChatGPT chatbots for your website, enhancing engagement & conversion rates.
No credit card required & you can get started instantly! Whether you’re a business owner, influencer, or content creator, bringing conversational AI to your audience has never been easier.
Conclusion
Understanding how to run Ollama models through the CLI is a valuable skill, especially for those wanting to harness the power of AI. With customizable models, an easy-to-use CLI, and the ability to pull various powerful models, your possibilities are vast. Dive in, experiment, & enjoy the richness that Ollama offers!
FAQs
What data formats does Ollama support? Ollama supports multiple data formats including `.pdf`, `.txt`, and `.csv`. You can easily upload documents or provide web links for your chatbot.
Which language models does Ollama use? The model library is continually updated with top open models such as Llama 3.1, Mistral, Phi 3, and Gemma 2, so you can run state-of-the-art models locally at no per-token cost.
How do I integrate Ollama with my website? Integration is super simple! Run the Ollama server locally and call its REST API (see the `curl` examples above) from your site's backend.
So grab Ollama today & Arsturn to create engaging experiences that your audience will love!