8/11/2025

Create Better AI Art: How to Use a Local LLM to Generate Stable Diffusion Prompts

Hey everyone, let's talk about something that's been a game-changer for my AI art workflow: using a local Large Language Model (LLM) to write my Stable Diffusion prompts. If you've ever found yourself staring at a blank prompt box, wondering how to translate the epic scene in your head into words that an AI will understand, then this is for you.
Honestly, learning to write good prompts can be a bit of a slog. It’s not just about describing what you want; it’s about understanding the weird, specific language that image generation models like Stable Diffusion speak. You've got to think about the subject, sure, but also the style, the lighting, the camera angle, the composition, the level of detail... it's a lot.
Turns out, you can offload a huge chunk of that mental effort to another AI. By using a local LLM, you can give it a simple idea, and it'll spit back a beautifully detailed, complex prompt that's ready to go. It's like having a creative partner who's an expert in prompt engineering. And the best part? It all runs on your own machine, so it's private & completely free.

Why Bother with a Local LLM?

So, why go through the trouble of setting up a local LLM when you could just use something like ChatGPT online? There are a few pretty compelling reasons.
First off, privacy. Let's be real, sometimes you might be generating images that are a little... out there. Maybe you're exploring some niche artistic styles or creating character concepts that you're not ready to share with the world. When you use a local LLM, none of your prompts ever leave your computer. It's your own private creative sandbox.
Second, it's completely free. Once you have the models downloaded, you can generate prompts to your heart's content without worrying about API costs or subscription fees. For a hobby that can already get expensive with GPU upgrades, free is a VERY good price.
Third, it's about control & customization. You can choose from a huge variety of open-source LLMs, each with its own personality & strengths. You can even fine-tune a model on your own data, like your favorite prompts from the past, to create a generator that perfectly matches your style.
And finally, it's just plain cool. There's something incredibly satisfying about having your own private AI ecosystem running on your machine, with models talking to each other to create art. It feels like living in the future.

The Tools of the Trade

Before we dive in, let's get familiar with the key pieces of software we'll be using.
  • Stable Diffusion Web UI (Automatic1111 or Forge): This is the classic, go-to interface for running Stable Diffusion. It's incredibly powerful, has a massive community, & supports a ton of extensions that add new features. We'll look at a couple of extensions that make it easy to integrate an LLM.
  • ComfyUI: This is a more advanced, node-based interface for Stable Diffusion. It looks a little intimidating at first, but it offers unparalleled flexibility & control over your image generation pipeline. We'll explore a workflow that uses a local LLM within ComfyUI.
  • Ollama: This is a fantastic tool that makes it ridiculously easy to download & run a wide variety of open-source LLMs locally. It handles all the complicated setup for you & lets you interact with the models through a simple command-line interface or an API.

Getting Started: The Easy Way with Automatic1111

If you're new to this whole concept, the easiest way to get your feet wet is by using an extension in Automatic1111. Let's look at a couple of options.

Using the GPT-2 Prompt Generator Extension

This is probably the simplest method of all. GPT-2 is an older model, but it's small, runs easily on most systems, & the version this extension uses has been fine-tuned specifically on Stable Diffusion prompts. This means it's pretty good at expanding a simple idea into a more detailed prompt.
  1. Installation: In your Automatic1111 WebUI, go to the "Extensions" tab, then the "Available" sub-tab. Click the "Load from:" button & search for "GPT2" or "prompt generator". It should pop right up. Click "Install" & then reload your UI.
  2. Usage: Once it's installed, you'll see a new "Prompt Generator" tab. The interface is pretty straightforward. You can just type a starting phrase, like "a photo of an astronaut," into the prompt box.
  3. Settings: You'll have a few options to play with.
    • Temperature: This controls how creative or "random" the model is. Higher values mean more unexpected results.
    • Top K: This limits the model to sampling from only the K most likely next tokens at each step; lower values make the output more focused & predictable.
    • Repetition Penalty: This discourages the model from repeating the same words or phrases.
    • Max Length: This sets the maximum number of tokens (words or parts of words) in the generated prompt.
    • Punctuation: I'd recommend checking the "use punctuation" box, as it makes the output much easier to read.
Once you've got your settings dialed in, just click the generate button. The extension will spit out a few suggestions, and you can send your favorite one over to the main txt2img tab to generate your image.
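If you'd rather play with these same knobs outside the WebUI, it only takes a few lines of Python. Here's a minimal sketch using Hugging Face's transformers library; the model name is one popular community GPT-2 fine-tune for Stable Diffusion prompts (an assumption on my part, so substitute whichever checkpoint your extension actually installs):

```python
# pip install transformers torch
from transformers import pipeline

# "Gustavosta/MagicPrompt-Stable-Diffusion" is one well-known GPT-2
# fine-tune for SD prompts; swap in your preferred checkpoint.
generator = pipeline("text-generation", model="Gustavosta/MagicPrompt-Stable-Diffusion")

results = generator(
    "a photo of an astronaut",   # your starting phrase
    max_length=77,               # token cap, like the extension's Max Length
    do_sample=True,              # enable sampling so temperature/top_k apply
    temperature=1.1,             # higher = more creative / chaotic
    top_k=50,                    # only sample from the 50 most likely tokens
    repetition_penalty=1.2,      # discourage repeated words & phrases
    num_return_sequences=3,      # a few suggestions to pick from
)

for r in results:
    print(r["generated_text"], "\n")
```

Same idea as the extension, just scriptable: run it in a loop, save the outputs you like, and you've got the beginnings of a dataset for the fine-tuning section later in this post.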

Upping the Ante with a ChatGPT Extension

If you want a bit more power than GPT-2 can offer, you can use an extension that connects to a more powerful model like ChatGPT. Now, this method isn't technically fully local if you're using the OpenAI API, but it's integrated right into your workflow.
  1. Installation: This extension usually isn't in the official list, so you'll have to install it from a URL. Go to the "Extensions" tab, then "Install from URL". Paste in the GitHub repository URL for the ChatGPT utilities extension (a quick search will find it). Install it & reload the UI.
  2. API Key: You'll need an OpenAI API key for this to work. You can get one from the OpenAI website. Once you have it, go to the "Settings" tab in your WebUI, find the "ChatGPT Utilities" section, and paste in your key.
  3. Usage: This extension works a bit differently. It's a script that you select from the "Script" dropdown at the bottom of the txt2img or img2img tabs. It comes with several pre-made templates to do things like expand your prompt, add color suggestions, or even just improve what you've already written. You can even see what the LLM generated in the console or below your final image, which is great for learning.
This is a great middle-ground. You get the power of a state-of-the-art LLM with the convenience of it being right inside your Stable Diffusion interface.
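Under the hood, extensions like this are mostly just wrapping a chat-completion call. If you're curious (or want to script it yourself), here's a rough sketch using the official openai Python library; the system prompt and the model choice are my own illustrative picks, not necessarily what any particular extension uses:

```python
# pip install openai  (and set the OPENAI_API_KEY environment variable)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = (
    "You are a Stable Diffusion prompt engineer. Expand the user's idea into "
    "a single detailed prompt covering subject, style, lighting, camera, and "
    "composition. Reply with the prompt only, as comma-separated phrases."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat model works; pick for cost vs. quality
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "a knight in a dark forest"},
    ],
    temperature=0.9,
)

print(response.choices[0].message.content)
```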

The Pro-Level Workflow: ComfyUI & Ollama

Alright, now for the really fun stuff. If you're ready to dive into the deep end and have ultimate control, this is the setup for you. We're going to use Ollama to run a powerful local LLM and then pipe its output into ComfyUI.
This might seem complicated, but it's actually a very logical workflow. Ollama will act as our "prompt brain," and ComfyUI will be our "image artist."

Step 1: Installing & Running a Local LLM with Ollama

Ollama is a dream come true for anyone who's been intimidated by the process of setting up local LLMs.
  1. Download Ollama: Head over to the Ollama website and download the installer for your operating system.
  2. Install a Model: Once Ollama is running, open up your terminal or command prompt. To download and run a model, you just need one simple command. For our purposes, a great model to start with is one specifically designed for Stable Diffusion prompts. A user on Medium recommended brxce/stable-diffusion-prompt-generator, which is based on the powerful Mistral model. To run it, just type:

    ollama run brxce/stable-diffusion-prompt-generator

    The first time you do this, it will download the model, which might take a few minutes. After that, it will load it up and you'll be dropped into a chat-like interface.
  3. Test it out: Try giving it a simple prompt, like "a knight in a dark forest." It should come back with a much more detailed and evocative prompt, ready for Stable Diffusion.
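One more thing worth knowing before we move on: besides the chat interface, Ollama exposes a local REST API (on port 11434 by default), and that's what we'll lean on later to wire it into ComfyUI. Here's a quick sketch of the same "knight in a dark forest" test, done over the API instead of the terminal:

```python
# pip install requests  (Ollama must be running locally)
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "brxce/stable-diffusion-prompt-generator",
        "prompt": "a knight in a dark forest",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])  # the generated Stable Diffusion prompt
```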
Another great option recommended by Reddit users is qwen2.5:3b. It's a small but very capable model. A user shared a fantastic template for getting great results from it. You can send it a prompt like this:
"I have a basic prompt for generating stable diffusion images and I need it to be enhanced for better visual results. The original prompt is: ''. Can you refine and expand this prompt to be more descriptive and evocative, aiming for a word count between 40 and 60 words? Return enhanced prompt in square brackets e.g.[enhanced prompt]. Your response should begin with "[An image of" and end with "]"."
This level of instruction is how you get the BEST results from an LLM. You're not just telling it what you want, you're telling it how you want it.
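To make that template reusable, you can wrap it in a tiny script that sends it to Ollama and pulls the enhanced prompt back out of the square brackets. A minimal sketch; the template text is the Reddit user's (lightly parameterized), while the enhance() helper and everything around it are my own illustration:

```python
import re
import requests

TEMPLATE = (
    "I have a basic prompt for generating stable diffusion images and I need "
    "it to be enhanced for better visual results. The original prompt is: "
    "'{prompt}'. Can you refine and expand this prompt to be more descriptive "
    "and evocative, aiming for a word count between 40 and 60 words? Return "
    "enhanced prompt in square brackets e.g. [enhanced prompt]. Your response "
    'should begin with "[An image of" and end with "]".'
)

def enhance(prompt: str, model: str = "qwen2.5:3b") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": TEMPLATE.format(prompt=prompt), "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    text = resp.json()["response"]
    # Grab whatever the model put between the square brackets.
    match = re.search(r"\[(.+?)\]", text, re.DOTALL)
    return match.group(1).strip() if match else text.strip()

print(enhance("a knight in a dark forest"))
```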

Step 2: Setting up the ComfyUI Workflow

Now that we have our LLM running, we need to connect it to ComfyUI. A great tutorial from SDXL Turbo AI outlines a powerful workflow using a Python script to bridge the gap.
The basic idea is this:
  1. You have a Python script that defines the core elements of your scene (subject, mood, physical aspects).
  2. This script sends those elements to your local LLM (running via Ollama's API).
  3. The LLM uses those elements to generate a detailed "scene prompt."
  4. The Python script then takes that generated prompt and feeds it into your ComfyUI workflow.
  5. ComfyUI generates the image.
This might sound like a lot of moving parts, but it's incredibly powerful. You can create different Python scripts for different types of images. For example, one for generating character portraits, another for sweeping landscapes.
The ComfyUI workflow itself would be based on a JSON file. This file defines all the nodes and connections. You'd have a node for loading your checkpoint model, a node for your positive and negative prompts, a sampler node, and so on. The key is that the "positive prompt" node would be filled in by the output from your Python script.
This kind of setup separates the creative ideation from the technical image generation. You can focus on the high-level concepts in your Python script and let the LLM and ComfyUI handle the nitty-gritty details.
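To make this concrete, here's a stripped-down sketch of what that bridge script can look like. It assumes you've exported your workflow in ComfyUI's API format (via Save (API Format) in the UI), that ComfyUI is listening on its default port 8188, and that the node ID "6" and the model choice are stand-ins for your own:

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
COMFY_URL = "http://127.0.0.1:8188/prompt"  # ComfyUI's queue endpoint

# 1. Define the core elements of the scene.
elements = {
    "subject": "a lone lighthouse keeper",
    "mood": "melancholy, stormy",
    "physical": "weathered stone tower, crashing waves",
}

# 2. Ask the local LLM to turn the elements into a full scene prompt.
llm_request = (
    "Write a detailed Stable Diffusion prompt for this scene. "
    f"Subject: {elements['subject']}. Mood: {elements['mood']}. "
    f"Physical details: {elements['physical']}. Prompt only, no commentary."
)
resp = requests.post(
    OLLAMA_URL,
    json={"model": "qwen2.5:3b", "prompt": llm_request, "stream": False},
    timeout=120,
)
scene_prompt = resp.json()["response"].strip()

# 3. Load the exported workflow & patch the positive-prompt node.
with open("workflow_api.json") as f:
    workflow = json.load(f)
workflow["6"]["inputs"]["text"] = scene_prompt  # "6" = your CLIPTextEncode node ID

# 4. Queue the patched workflow; ComfyUI generates the image.
requests.post(COMFY_URL, json={"prompt": workflow}, timeout=30)
print("Queued with prompt:", scene_prompt)
```

Swap out the elements dictionary (or load it from a file) and you've got those different scripts for portraits, landscapes, and whatever else you dream up.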

The Power of AI in Business Communication

It's pretty amazing to see how different AI models can work together to create something new. This same principle of using AI to handle complex, time-consuming tasks is revolutionizing the business world, too. Think about customer service. Just like we're using an LLM to generate prompts, businesses are using AI chatbots to handle customer inquiries.
For a lot of companies, building and training these kinds of tools sounds just as complicated as setting up a ComfyUI workflow. But it doesn't have to be. This is where a platform like Arsturn comes in. Arsturn helps businesses create their own custom AI chatbots, trained on their specific data, without needing to write a single line of code. These chatbots can be added to a website to provide instant customer support, answer questions about products or services, and engage with visitors 24/7. It's all about using AI to automate the repetitive stuff so that humans can focus on the more creative, high-level work.
Just as a local LLM can free you from the tedium of prompt writing, an AI chatbot can free up a support team to handle the truly complex customer issues that require a human touch.

Fine-Tuning Your Own Prompt Generator

If you're feeling REALLY adventurous, you can even fine-tune your own LLM to become the ultimate prompt generator, perfectly tailored to your unique style. The Medium article on this topic mentioned that you could train a GPT-2 model on your own datasets.
Imagine you have a folder with hundreds of your favorite, most successful prompts that you've generated over the months. You could use that data to fine-tune a base model like GPT-2. The fine-tuned model would learn the specific patterns, keywords, and structures that you prefer.
The process is a bit too involved to detail here, but it generally involves:
  1. Gathering your data: You'd need a clean dataset of text files, with each file containing one of your prompts.
  2. Choosing a base model: A smaller model like GPT-2 or a 3-billion-parameter version of Qwen or Llama is a good place to start.
  3. Using a training script: There are many open-source scripts available that can help you fine-tune a model on your custom data.
This is definitely an advanced topic, but it's the pinnacle of a customized workflow. You'd have your own personal AI assistant that knows your style inside and out.
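For the brave, here's roughly what that looks like with Hugging Face's transformers library. This is a minimal sketch, assuming your prompts live in a single prompts.txt file (one prompt per line) and that GPT-2 small is your base model; a real fine-tune will want more data cleaning, more epochs, and ideally a GPU:

```python
# pip install transformers datasets torch
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# One of your favorite prompts per line in prompts.txt.
dataset = load_dataset("text", data_files={"train": "prompts.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sd-prompt-gpt2", num_train_epochs=3,
                           per_device_train_batch_size=8, save_strategy="epoch"),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("sd-prompt-gpt2")  # point your generation script at this folder
```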

Putting It All Together: A Final Word

So, there you have it. A deep dive into using a local LLM to supercharge your Stable Diffusion prompting. We've gone from the simple, beginner-friendly extensions in Automatic1111 to a sophisticated, highly customizable workflow in ComfyUI with Ollama.
The key takeaway here is that you don't have to be a prompt engineering genius to create amazing AI art. By leveraging the power of a local LLM, you can automate the most tedious parts of the process and spend more of your time on what really matters: your creative vision.
And honestly, this is just the beginning. The world of open-source AI is moving at an incredible pace. The models are getting smarter, the tools are getting easier to use, and the possibilities are expanding every single day.
For businesses looking to harness this kind of power for their own needs, the barrier to entry is also getting lower. Platforms like Arsturn are making it possible for any company to build a no-code AI chatbot trained on their own data. This helps them boost conversions and provide personalized customer experiences, building meaningful connections with their audience. It's the same core idea: using AI to create a more efficient and personalized experience.
I hope this was helpful! It's a really exciting space, and I'd love to hear what you think. Let me know if you give any of these methods a try. Happy generating!

Copyright © Arsturn 2025