8/27/2024

Running Ollama Models via CLI

If you're diving into the world of large language models, particularly with Ollama, you’re in for a treat! Ollama offers a simple yet powerful Command Line Interface (CLI) that allows you to run various models, including Llama 3.1, Mistral, and more. This tutorial will guide you through the entire process from installation to execution of models, ensuring you maximize the potential of this robust tool.

What is Ollama?

Ollama is an innovative platform that allows you to run large language models locally. With a plethora of models available, it’s an awesome option for developers and enthusiasts looking to experiment with natural language processing. The ollama/ollama repository has gained a lot of traction, boasting over 86.7k stars!

Getting Started

Before we jump into running models, let's cover the basics of getting Ollama set up on your system. Depending on your operating system, Ollama can be installed in a few different ways:

macOS

  1. Download the macOS app from ollama.com/download.

Windows Preview

  1. Download and run OllamaSetup.exe from ollama.com/download.

Linux

For Linux users, the install command is super simple. Run:
curl -fsSL https://ollama.com/install.sh | sh
If you prefer, you can reference the Manual install instructions for more details.
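
Once the script finishes, it's worth a quick sanity check. A minimal sketch, assuming the installer started the Ollama server (the version string will vary):

```bash
# Confirm the CLI is on your PATH and print its version
ollama -v

# The server listens on port 11434 by default; the root endpoint
# should reply with "Ollama is running"
curl http://localhost:11434
```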

Docker

You can even run Ollama using the official Ollama Docker image. Just pull the image with:
docker pull ollama/ollama
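Pulling the image only downloads it. To actually serve models, you start a container and run commands inside it. Here's a minimal CPU-only sketch based on the official image (the volume and container names are just examples):

```bash
# Start the Ollama server, persisting downloaded models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Run a model inside the running container
docker exec -it ollama ollama run llama3.1
```

For NVIDIA GPU support, you would additionally install the NVIDIA Container Toolkit and pass --gpus=all to docker run.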

The Power of CLI

Once you have Ollama set up, you can start running models with just a few commands. Typing ollama by itself in your terminal prints the list of available subcommands and their usage. Here are some commands you'll want to get acquainted with:
  • Running a model:
    ollama run llama3.1
    This command downloads Llama 3.1 on first use and then drops you into an interactive chat session; a couple of handy variations follow below.
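
For instance, you can pass a prompt directly instead of opening the interactive session (the prompt text here is just an example):

```bash
# One-shot: pass the prompt as an argument and print the reply to stdout
ollama run llama3.1 "Explain what a quantized model is in two sentences."

# Inside an interactive session, /? lists the built-in commands and /bye exits it
```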

Model Library

Ollama provides an extensive model library where you can find various models with differing parameters and sizes. Here’s a sneak peek at some of the models you can utilize:
| Model      | Parameters | Size  | Command                  |
|------------|------------|-------|--------------------------|
| Llama 3.1  | 8B         | 4.7GB | ollama run llama3.1      |
| Llama 3.1  | 70B        | 40GB  | ollama run llama3.1:70b  |
| Llama 3.1  | 405B       | 231GB | ollama run llama3.1:405b |
| Phi 3 Mini | 3.8B       | 2.3GB | ollama run phi3          |
| Mistral    | 7B         | 4.1GB | ollama run mistral       |
| Gemma 2    | 9B         | 5.5GB | ollama run gemma2        |

RAM Requirements

Before you start running models, be aware of the RAM requirements:
  • To run 7B models, you need at least 8GB of RAM.
  • For 13B models, you need 16GB.
  • And for models like the 33B, you'll need 32GB of RAM!
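
If you're not sure how much memory is available, a quick check before pulling a large model can save you a failed run (the command differs by OS):

```bash
# Linux: show total and available memory
free -h

# macOS: total physical memory in bytes
sysctl hw.memsize
```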

Customizing Models

One of the neat features of Ollama is its ability to customize models with ease. Let's go through some steps on how to customize a model:

Importing GGUF Models

If you have a model in GGUF format, you can import it using the following procedure:
  1. Create a file named Modelfile containing a FROM instruction that points to your model's local filepath. For example:
    FROM ./vicuna-33b.Q4_0.gguf
  2. Create the model in Ollama:
    ollama create example -f Modelfile
  3. Finally, run your model with:
    ollama run example
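
To confirm the import worked, you can list your local models and inspect what Ollama stored (assuming you named the model example as above):

```bash
# The new "example" entry should appear alongside its size
ollama list

# Print the Modelfile Ollama generated for the imported model
ollama show example --modelfile
```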

Customizing Prompts

You can also customize the prompts of models from the Ollama library. For instance, here's how to tailor the llama3.1 model. First, pull it:
ollama pull llama3.1
Then create a Modelfile:
```
FROM llama3.1

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```
Next, create and run the customized model:
ollama create mario -f ./Modelfile
ollama run mario
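
Temperature and the system message are only two of the knobs; the Modelfile format accepts other PARAMETER keys as well. A sketch with a couple of commonly tuned ones (the values are illustrative, not recommendations):

```
FROM llama3.1

# context window size, in tokens
PARAMETER num_ctx 4096

# nucleus-sampling cutoff
PARAMETER top_p 0.9

SYSTEM """
You are a concise technical assistant. Keep answers under three sentences.
"""
```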

CLI Commands Reference

Here are a few essential CLI commands for managing your models:
  • Create a model:
    ollama create mymodel -f ./Modelfile
  • Pull a model:
    ollama pull llama3.1
  • Remove a model:
    ollama rm llama3.1
  • Copy a model:
    ollama cp llama3.1 my-model
  • List your models:
    ollama list
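
A few more subcommands are worth knowing (exact availability depends on your Ollama version):

```bash
# Show a model's details (parameters, template, license, ...)
ollama show llama3.1

# List models currently loaded into memory
ollama ps

# Start the Ollama server manually if it isn't already running
ollama serve
```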

REST API Integration

Ollama also exposes a REST API for generating responses. Here's how you could generate a response using curl:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?"
}'
Moreover, you can use the chat endpoint for multi-turn conversations:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
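Both endpoints stream the reply back as a series of JSON objects by default. If you'd rather receive one complete JSON payload, set "stream": false (shown here for the generate endpoint):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```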

Why Go for Ollama?

Using the Ollama CLI not only makes running models straightforward, it also opens the door to a level of customization that's rarely found on other platforms. Plus, it's lightweight, has no cloud dependency, & your data stays private on your own machine.
Speaking of engaging your audience, if you’re looking to take your engagement to the next level with ChatGPT-powered AI, check out Arsturn! With Arsturn, you can instantly create custom ChatGPT chatbots for your website, enhancing engagement & conversion rates.
No credit card required & you can get started instantly! Whether you’re a business owner, influencer, or content creator, bringing conversational AI to your audience has never been easier.

Conclusion

Understanding how to run Ollama models through the CLI is a valuable skill, especially for those wanting to harness the power of AI. With customizable models, an easy-to-use CLI, and the ability to pull a range of powerful models, your possibilities are vast. Dive in, experiment, & enjoy the richness that Ollama offers!

FAQs

  1. What model formats does Ollama support?
    Besides the models in its own library, Ollama can import models you already have locally, such as GGUF files, via a Modelfile (see Importing GGUF Models above).
  2. Which language models does Ollama use?
    Ollama runs open models locally, including Llama 3.1, Mistral, Phi 3 Mini, and Gemma 2, and its model library is continually updated with new releases.
  3. How do I integrate Ollama with my website or application?
    Use the REST API: the Ollama server listens on http://localhost:11434, and the /api/generate and /api/chat endpoints shown above can be called from any backend.
So grab Ollama today, & pair it with Arsturn to create engaging experiences that your audience will love!

Copyright © Arsturn 2024