Unleash Your Personal AI Coding Army: Using Local Ollama Models in VS Code's Agent Mode
Hey there! So, you've probably been hearing a TON about AI in software development. It's everywhere, from fancy cloud-based services to a million little extensions promising to write all your code for you. But here's the thing that always bugged me: sending my code, my company's proprietary stuff, off to some third-party server? It just felt… iffy. Not to mention the costs that can rack up.
Turns out, there's a pretty amazing solution that's been gaining a lot of traction: running large language models (LLMs) locally on your own machine. We're talking full control, total privacy, & the ability to work completely offline. Developer surveys keep pointing at the same tension: LLMs are being rapidly adopted in software development, but concerns around data sensitivity remain a major sticking point. Running models locally is the perfect answer to that.
And it gets even better. We're not just talking about simple code completion anymore. We're talking about full-on "agent mode" directly in your VS Code editor. Imagine telling an AI, "Hey, build me a basic blog application with a React frontend & a Node.js backend," & then watching it create the files, write the code, run terminal commands to install dependencies, & even try to fix its own errors.
That's not science fiction anymore. It's what you can do RIGHT NOW with a combination of two incredible open-source tools: Ollama & the Continue extension for VS Code.
In this guide, I'm going to walk you through EVERYTHING you need to know to set this up. We'll go from zero to having your own personal AI coding assistant that runs entirely on your machine. It’s a game-changer, honestly.
Step 1: Getting the Engine Running with Ollama
First things first, we need to install Ollama. Think of Ollama as a manager for your local AI models. It makes it ridiculously easy to download, run, & switch between different LLMs right from your terminal.
The installation is super straightforward. Just head over to the Ollama website & download the installer for your operating system (macOS, Windows, or Linux).
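If you're on Linux (or just prefer the terminal), Ollama also publishes a one-line install script. This is the command as shown on ollama.com at the time of writing; it's always worth glancing at the script before piping anything from the internet into your shell:

```shell
# Download & run Ollama's official install script (Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Sanity-check the install
ollama --version
```

On macOS & Windows the downloaded installer also sets up the `ollama` command-line tool & starts the background server for you, so the verification step at the end works the same way on every platform.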
Once it's installed, you can open up your terminal & pull your first model. This is like downloading an app for your AI assistant. Let's start with a couple of solid choices. I recommend getting a good all-around chat model & a model specifically for code.
A great starting point is Llama 3 8B for general chat & reasoning, & DeepSeek Coder 6.7B for code completion & generation. DeepSeek Coder is a fantastic model that's specifically trained for programming tasks.
To download them, just run these commands in your terminal:
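Assuming the model tags on the Ollama library haven't changed since this was written (tags do get updated, so check the library page on ollama.com if a pull fails), the downloads look like this:

```shell
# Pull the general chat & reasoning model (several GB, so give it a minute)
ollama pull llama3:8b

# Pull the code-focused model
ollama pull deepseek-coder:6.7b

# Confirm both models are installed locally
ollama list
```

Once a model is pulled, you can sanity-check it right from the terminal with `ollama run llama3:8b` & just start chatting before you ever touch VS Code.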