Sick of Cloud-Based AI? Here's How to Use Ollama as a Cheap, Local Coding Assistant
Zack Saadioui
8/12/2025
Hey everyone, hope you're doing well. Let's talk about something that's been on my mind a lot lately: AI coding assistants. We've all seen them, & we've probably all used them. Tools like GitHub Copilot are pretty magical, not gonna lie. But honestly, I've started to get a little antsy about sending every single line of my code, including proprietary client work, up to the cloud. Plus, the subscription fees, while not back-breaking, just feel like another one of those little paper cuts to my bank account each month.
And what happens when the internet decides to take a vacation? Your expensive AI assistant clocks out, too. It’s frustrating, right?
Turns out, there’s a pretty awesome solution that’s been gaining a TON of traction: running your own AI coding assistant locally. I’m talking about Ollama. It’s this fantastic tool that lets you download & run powerful large language models (LLMs) right on your own machine. No internet required after the initial setup, no monthly fees, & your code stays COMPLETELY private.
I've been going down the rabbit hole with this for a while now, & I’m here to tell you, it’s a game-changer. It’s like having a brilliant coding partner who lives on your laptop, works offline, & doesn't charge you rent. So, grab a coffee, & let's get into how you can set up your own cheap, local, & seriously powerful coding assistant with Ollama.
Why Even Bother With a Local AI?
I get it. Setting things up yourself sounds like work. Is it really worth it when you can just pay a few bucks & have something that works out of the box? For me, the answer is a resounding YES. Here's the thing:
Total Privacy & Security: This is the big one. When you use Ollama, your code never leaves your computer. You're not sending snippets, files, or entire projects to a third-party server. For anyone who works with sensitive client data or unreleased products, or who just values their privacy, this is HUGE. A 2024 study actually highlighted that while 83% of firms are using AI for code, security is a massive concern. With a local setup, you're in control.
It’s FREE (Mostly): Ollama itself is open-source & free. The models you download are free. The only "cost" is the electricity to run your computer. Compared to the recurring subscription fees of cloud services, you're saving real money every single month. One developer I read about mentioned saving at least $20-30 a month, which adds up!
Works Offline, Works Anywhere: On a plane? In a coffee shop with spotty Wi-Fi? No problem. Your AI assistant is right there with you. Since everything runs on your machine, you are completely independent of an internet connection. This is a massive win for productivity & reliability.
Speed & No Latency: There's no network lag. Your requests don't have to travel across the globe to a server farm & back. The response is nearly instantaneous, limited only by the speed of your own hardware. It feels incredibly snappy.
Endless Customization: This is where it gets REALLY fun. You're not stuck with a one-size-fits-all model. You can experiment with dozens of different open-source models to find the one that perfectly suits your coding style & workflow. You can even fine-tune a model on your own codebase to create a true expert on your project. We'll get into that later, but there's a small taste of the lighter-weight kind of customization just below.
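To make that concrete, here's a minimal sketch using Ollama's Modelfile feature, which wraps an existing model with your own system prompt & settings. The model name "my-go-reviewer", the prompt, & the temperature value are just illustrations, & we won't actually download llama3 until the setup steps below:

# Wrap llama3 with a project-specific persona via an Ollama Modelfile
cat > Modelfile <<'EOF'
FROM llama3
PARAMETER temperature 0.2
SYSTEM "You are a senior Go code reviewer. Prefer the standard library & table-driven tests."
EOF

# Build the custom model & ask it something
ollama create my-go-reviewer -f Modelfile
ollama run my-go-reviewer "What should I watch for when reviewing error handling in Go?"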
Of course, it's not all sunshine & rainbows. The main trade-off is that you need decent hardware. The models do the heavy lifting on your machine, so you'll need a good amount of RAM. As a rough guide, 8 GB will run the small models & 16 GB is comfortable for the 8B-parameter models we'll use below. But honestly, most modern developer laptops are more than capable.
Getting Started: Your Step-by-Step Setup Guide
Alright, let's get our hands dirty. Setting this up is actually way easier than you might think. Here’s the game plan: we'll install Ollama, then get it hooked into VS Code with a brilliant extension called Continue.
Step 1: Install Ollama
First things first, you need to get Ollama running. It’s a super simple process.
On macOS: If you use Homebrew (which you probably do), just open your terminal & run:
brew install ollama
On Linux: Pop open your terminal & run this command:
curl -fsSL https://ollama.com/install.sh | sh
On Windows: Just head over to the Ollama website & download the installer. It’s a standard setup wizard.
Once it's installed, you can verify it's working by opening a terminal & typing ollama --version. You should see the version number pop up. Pretty cool, right? Ollama now runs as a quiet background service on your machine.
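Because it runs in the background, Ollama also exposes a local HTTP API (on port 11434 by default), which is what editor integrations like the one we're about to set up talk to. A couple of optional sanity checks, assuming you haven't changed the defaults:

# The background service listens on localhost:11434 & should reply "Ollama is running"
curl http://localhost:11434

# If nothing is listening (say the service didn't start), you can run it manually
ollama serve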
Step 2: Download Your First Coding Model
Ollama is the engine, but now you need some fuel. That fuel comes in the form of AI models. There are a TON to choose from, but for our first foray into coding, let's grab a couple of solid, well-rounded choices.
I recommend starting with two models: one for general-purpose chat & another lightweight one specifically for fast code autocomplete.
For Chat & Complex Questions: Llama 3 is Meta's open-weight powerhouse. It's fantastic for explaining code, brainstorming ideas, & generating complex functions. Let's grab the 8B (8 billion parameter) version, which is a great balance of performance & resource usage.
ollama pull llama3
For Fast Autocomplete: For the instant, as-you-type suggestions, you want something small & speedy. StarCoder2:3b is perfect for this. It's trained specifically on code & is light enough to not slow down your editor. (We'll give both models a quick smoke test right after this.)
ollama pull starcoder2:3b
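Here's that quick smoke test. The prompt is just an arbitrary example, & the commands assume the pulls above finished cleanly; swap in anything you like:

# See which models are downloaded locally
ollama list

# Fire a one-off prompt without starting an interactive session
ollama run llama3 "Write a Python function that checks whether a string is a palindrome."

# Or drop into an interactive chat (type /bye to exit)
ollama run llama3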
You can find a massive library of other models on the Ollama website. We'll talk more about picking the right model later on.
Step 3: Integrate with VS Code using Continue
This is where the magic really happens. We'll use a free, open-source VS Code extension called Continue to bring our local models directly into our editor.
Open VS Code.
Go to the Extensions view (Ctrl+Shift+X).
Search for "Continue" & install the one by Continue.dev.
You'll see a new icon in your activity bar. Click on it.
Now, we just need to tell Continue to use our local Ollama models instead of some cloud service.
In the Continue sidebar, click the gear icon in the bottom-right corner. This will open a config.json file.
Delete everything in there & paste this configuration: