8/12/2025

Unleash Your Personal AI Coding Army: Using Local Ollama Models in VS Code's Agent Mode

Hey there! So, you've probably been hearing a TON about AI in software development. It's everywhere, from fancy cloud-based services to a million little extensions promising to write all your code for you. But here's the thing that always bugged me: sending my code, my company's proprietary stuff, off to some third-party server? It just felt… iffy. Not to mention the costs that can rack up.
Turns out, there's a pretty amazing solution that's been gaining a lot of traction: running large language models (LLMs) locally on your own machine. We're talking full control, total privacy, & the ability to work completely offline. A recent study even highlighted that while LLMs are being rapidly adopted in software development, concerns around data sensitivity are a major issue for developers. Running models locally is the perfect answer to that.
And it gets even better. We're not just talking about simple code completion anymore. We're talking about full-on "agent mode" directly in your VS Code editor. Imagine telling an AI, "Hey, build me a basic blog application with a React frontend & a Node.js backend," & then watching it create the files, write the code, run terminal commands to install dependencies, & even try to fix its own errors.
That's not science fiction anymore. It's what you can do RIGHT NOW with a combination of two incredible open-source tools: Ollama & the Continue extension for VS Code.
In this guide, I'm going to walk you through EVERYTHING you need to know to set this up. We'll go from zero to having your own personal AI coding assistant that runs entirely on your machine. It’s a game-changer, honestly.

Why Go Local? The Perks of Running Your Own AI

Before we dive into the nitty-gritty, let's just quickly cover why this is such a big deal. The buzz around local LLMs isn't just for hobbyists; it's a serious shift for developers & businesses.
  • Privacy First: This is the big one. When you run an LLM locally with Ollama, none of your code, prompts, or data ever leaves your machine. For anyone working with sensitive or proprietary code, this is a non-negotiable benefit. No more worrying about your intellectual property being used to train some other company's model.
  • Speed & Offline Capability: Cloud-based AI tools are at the mercy of your internet connection & their server load. A local model runs on your own hardware, meaning it can be incredibly fast & it works perfectly even when you're on a plane or your internet goes out.
  • Cost-Effective: Those API calls to cloud services can get expensive, especially if you're using them heavily. Running your own models costs you nothing beyond the hardware you already own & a bit of electricity, which typically works out MUCH cheaper if you're a heavy user.
  • Ultimate Customization: This is where it gets really fun. You can download & switch between dozens of different open-source models to find the perfect one for your needs. Want a model that's a genius at Python? There's one for that. Need one that's better at creative writing for your documentation? You can get that too. You have full control over your AI toolkit.

Step 1: Getting the Engine Running with Ollama

First things first, we need to install Ollama. Think of Ollama as a manager for your local AI models. It makes it ridiculously easy to download, run, & switch between different LLMs right from your terminal.
The installation is super straightforward. Just head over to the Ollama website & download the installer for your operating system (macOS, Windows, or Linux).
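By the way, if you're on Linux or you just prefer the terminal, Ollama also has a one-line install script. As of this writing it looks like the command below, but it's worth double-checking on the Ollama site in case it changes:

    curl -fsSL https://ollama.com/install.sh | sh

Once the installer finishes, Ollama runs as a local background service & exposes an API on your machine (port 11434 by default), which is what editor tools like Continue will talk to later.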
Once it's installed, you can open up your terminal & pull your first model. This is like downloading an app for your AI assistant. Let's start with a couple of solid choices. I recommend getting a good all-around chat model & a model specifically for code.
A great starting point is Llama 3 8B for general chat & reasoning, & DeepSeek Coder 6.7B for code completion & generation. DeepSeek Coder is a fantastic model that's specifically trained for programming tasks.
To download them, just run these commands in your terminal:
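(Quick note: the model tags below are what Ollama's library uses at the time of writing; if a pull fails, check the model's page on the Ollama site for the current tag name.)

    # general chat & reasoning model
    ollama pull llama3:8b

    # code completion & generation model
    ollama pull deepseek-coder:6.7b

Each pull downloads a few gigabytes, so give it a minute. When they're done, run ollama list to confirm both models are installed, or ollama run llama3:8b to chat with one right in your terminal as a quick sanity check.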
