How to Analyze Your Manuscript with a Private, Local LLM
Zack Saadioui
8/12/2025
So, you’ve poured your heart & soul into a book manuscript. It’s your baby. You know it inside & out. Or do you?
Here's the thing: after months, or even years, of writing & rewriting, it’s SO easy to lose perspective. You're too close to it. You can't see the forest for the trees. Are the character arcs consistent? Is the pacing off in the second act? Does that plot twist really land, or is it a bit of a dud?
Traditionally, you'd rely on beta readers, critique partners, or a developmental editor to get this kind of feedback. & while those are all still SUPER valuable, there’s a new player in town that can give you a completely different kind of insight: a local Large Language Model (LLM).
Imagine having a private, super-intelligent assistant that has read your entire book in seconds & can answer any question you have about it. That's what we're talking about here. Not just a fancy spell checker, but a tool for deep analysis that runs entirely on your own computer, keeping your precious manuscript safe & sound.
It might sound like something out of science fiction, but honestly, it’s more achievable than you might think. We're going to walk through how you can feed your book manuscript to a local LLM for analysis. It’s a bit of a process, but pretty cool once you get it going.
Why Go Local? The Big Deal About Privacy
First up, why a local LLM? Why not just paste your chapters into ChatGPT?
Two words: privacy & control.
Your manuscript is your intellectual property. Uploading it to a cloud-based AI service means you're sending your work to a third-party server. While companies have their privacy policies, for many writers, that’s a non-starter. A local LLM runs entirely on your own machine. Your manuscript never leaves your hard drive. It’s 100% private.
Plus, running it locally means no subscription fees &, once a model is downloaded, no reliance on an internet connection. It’s your own personal AI, customized for your needs.
The Two Main Paths: RAG vs. Fine-Tuning
Okay, so how do we actually get the LLM to "read" your book? There are two main methods: Retrieval-Augmented Generation (RAG) & fine-tuning.
Let's break them down because they're pretty different.
Fine-Tuning: Think of this like teaching the LLM a new skill or style. You'd take a base model & "retrain" it on your manuscript. The goal here isn't just to make it aware of the content, but to make it sound like you. It’s about absorbing your unique voice, tone, & writing style.
Pros: Can be amazing for generating text in your specific style or for creating a chatbot that sounds just like one of your characters. It truly internalizes the information.
Cons: It's technically complex, requires a TON of computational power (think beefy, expensive GPUs), & can be overkill if you just want to analyze your existing text. There's also a risk of "catastrophic forgetting," where the model gets so good at your style that it forgets how to do other things well.
Retrieval-Augmented Generation (RAG): This is the method we're going to focus on, & for good reason. RAG is like giving the LLM an open-book test. You're not changing the model itself. Instead, you're providing it with your manuscript as a knowledge base that it can pull from in real-time.
Pros: MUCH easier to set up. It's fantastic for "chatting with your document." You can ask specific questions about the plot, characters, or themes, & the LLM will retrieve the relevant passages to form its answer. It's also less resource-intensive & you can easily update the knowledge base without retraining anything.
Cons: The LLM won't necessarily adopt your writing style. It's more of an expert on your book's content than an imitator of its style.
For most authors who want to analyze their manuscript, RAG is the way to go. It's the most practical & effective way to get started.
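Just to demystify it, here's a toy sketch of the RAG loop in Python. It assumes you've installed the official ollama package (pip install ollama) & pulled an embedding model (ollama pull nomic-embed-text) alongside llama3. The chunks & the question here are made-up placeholders:

import ollama

# Pretend these are chunks of your manuscript (more on chunking below).
chunks = [
    "Chapter 1: Mara finds a hidden letter in the attic.",
    "Chapter 2: Mara confronts her brother about the letter.",
]

def embed(text):
    # Turn text into a vector so we can compare meaning numerically.
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a, b):
    # Similarity between two vectors: closer to 1.0 means "about the same thing."
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

chunk_vectors = [embed(c) for c in chunks]

question = "How does Mara learn about the letter?"
q_vec = embed(question)

# Retrieval: grab the chunk closest in meaning to the question.
best = max(range(len(chunks)), key=lambda i: cosine(q_vec, chunk_vectors[i]))

# Generation: hand that chunk to the LLM as context for its answer.
reply = ollama.chat(model="llama3", messages=[{
    "role": "user",
    "content": f"Passage from my manuscript:\n{chunks[best]}\n\nQuestion: {question}",
}])
print(reply["message"]["content"])

That's the whole trick: embed the chunks, embed the question, retrieve the closest match, & pass it to the model as context. The tools we set up below do exactly this for you behind the scenes, at scale.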
What You'll Need: The Hardware
Alright, let's talk about the gear. Running an LLM locally isn’t something you can do on a ten-year-old laptop. It's computationally intensive, so you need a decent machine.
GPU (Graphics Processing Unit): This is the MOST important component. LLMs rely heavily on the parallel processing power of GPUs. NVIDIA cards are generally preferred because of their CUDA technology, which is the standard for AI work. You'll want a card with at least 8GB of VRAM (Video RAM). Honestly, for a smoother experience, 12GB or even 16GB is better. The NVIDIA RTX 3060 (12GB) is a good budget-friendly option, while something like the RTX 4060 Ti (16GB) gives you more breathing room.
RAM (System Memory): 16GB of system RAM is the absolute minimum. 32GB is a much more comfortable target.
CPU (Central Processing Unit): A modern processor with support for AVX2 is a must. Most CPUs from the last several years will have this. Think Intel Core i9 or AMD Ryzen 9 for strong performance.
Storage: A fast SSD or NVMe drive will make a big difference in how quickly models & documents load. You'll also need a good chunk of free space, as some of these models can be several gigabytes in size.
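Not sure what card you've got? On a PC with an NVIDIA GPU, one terminal command reports the model & total VRAM:

nvidia-smi --query-gpu=name,memory.total --format=csv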
If you have a recent Apple computer with an M-series chip (M1, M2, M3), you're in a good spot. Their unified memory architecture is actually really efficient for this kind of work, & you can often get away with less VRAM than you'd need on a PC.
The Step-by-Step Guide to Analyzing Your Manuscript
Ready to dive in? Here's how to get this all set up using a popular & relatively user-friendly stack: Ollama & Open WebUI.
Step 1: Install Ollama
Ollama is a fantastic tool that makes it incredibly easy to download, manage, & run open-source LLMs on your computer.
Head to the Ollama website & download the installer for your operating system (macOS, Linux, or Windows).
Run the installer. It will set up Ollama as a background service on your machine.
To check that it's working, open your terminal (or Command Prompt on Windows) & type:
ollama --version

You should see the version number pop up.
Step 2: Download a Language Model
Now for the fun part. You need to choose an LLM to run. There are tons of great open-source options available through Ollama. For analyzing a manuscript, a model that's good at instruction following & reasoning is a great choice.
A few solid starting points are:
Llama 3: Meta's flagship open-source model. It's a fantastic all-rounder.
Mistral: Known for being very fast & efficient.
Gemma: Google's family of open-source models.
Let's grab Llama 3. In your terminal, run this command:
ollama pull llama3
This will download the Llama 3 model to your computer. It might take a little while depending on your internet speed, as the file is a few gigs. You can browse all the available models on the Ollama library page.
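Once the download finishes, you can sanity-check it right in the terminal:

ollama run llama3

Ask it anything, & type /bye when you're done. If it answers, you're ready for a nicer interface.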
Step 3: Set Up a User-Friendly Interface with Open WebUI
While you can chat with your model directly in the terminal, it's not the best experience. A web interface makes everything SO much easier. Open WebUI is a great-looking, open-source chat interface that works perfectly with Ollama.
The easiest way to get this running is with Docker. If you don't have Docker Desktop installed, you'll need to grab it from the Docker website first.
Once Docker is installed & running on your machine, open your terminal & run this single command (it's the standard one from the Open WebUI docs):

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

It looks a bit intimidating, but it's just telling Docker to download the Open WebUI image, run it in the background, & make it accessible on your computer.
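If you want to double-check that it's up, run:

docker ps

You should see an open-webui container listed & running.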
After it's finished, open your web browser & go to http://localhost:3000.
You'll be greeted with a sign-up screen. Just create a local account, & you're in! You now have your own private, self-hosted version of a ChatGPT-like interface.
Step 4: Prepare Your Manuscript (The "Chunking" Part)
This is a CRUCIAL step. You can't just feed a 300-page manuscript to an LLM all at once. It's too much information for the model's "context window" (its short-term memory). We need to break it down into smaller, digestible pieces. This process is called chunking.
The goal is to create chunks that are small enough to be processed but large enough to contain meaningful context.
Here are a few ways you can approach this:
Fixed-Size Chunking: The simplest method. You just decide on a chunk size (say, 500 words) & split the text. The downside is that you might split a sentence or a key idea right in the middle.
Paragraph or Sentence Splitting: A better approach is to split the text by paragraphs or sentences. This keeps related ideas together. Libraries like NLTK or spaCy in Python can do this automatically.
Recursive Chunking: This is a smarter method. You try to split by paragraphs first. If a paragraph is still too big, you then split it by sentences, & so on. This maintains the semantic structure of your text as much as possible.
For our purposes, let's keep it simple. The easiest way to start is to save your manuscript as a plain text (.txt) file. You can then manually (or with a simple script, like the sketch below) split it into smaller files, maybe one file per chapter, or even split chapters into a few smaller text files.
The key is to have your manuscript in a format that the WebUI can digest. Open WebUI supports formats like .txt, .pdf, & .doc, so you have some flexibility.
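Here's what that simple script might look like in Python: a minimal sketch that cuts a plain-text manuscript into roughly 500-word chunks without breaking paragraphs apart. The file names (manuscript.txt, chunk_001.txt, & so on) are placeholders for your own:

from pathlib import Path

MAX_WORDS = 500  # rough chunk size; tune to taste

text = Path("manuscript.txt").read_text(encoding="utf-8")
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

chunks, current, count = [], [], 0
for para in paragraphs:
    words = len(para.split())
    # Start a new chunk if adding this paragraph would blow the budget.
    if current and count + words > MAX_WORDS:
        chunks.append("\n\n".join(current))
        current, count = [], 0
    current.append(para)
    count += words
if current:
    chunks.append("\n\n".join(current))

for i, chunk in enumerate(chunks, start=1):
    Path(f"chunk_{i:03}.txt").write_text(chunk, encoding="utf-8")
print(f"Wrote {len(chunks)} chunk files.")

Because it only splits at paragraph boundaries, no sentence gets chopped in half; a single paragraph longer than the limit simply becomes its own oversized chunk.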
Step 5: Load Your Manuscript & Start Analyzing!
This is where it all comes together.
In the Open WebUI interface, you'll see a chat input field at the bottom.
Next to it, there's a paperclip icon or a '+' button. Click on that & select "Upload Files."
Upload your manuscript file (or one of your chunked files).
The WebUI will process the document, using the RAG method we talked about. It's now part of the LLM's accessible knowledge.
Now, you can start asking questions! Treat the LLM as your personal research assistant who has memorized your book. Here are some ideas to get you started:
Character Analysis:
"Trace the character arc of [Character Name] throughout the manuscript. Where are the key turning points for them?"
"What is [Character Name]'s primary motivation? Provide quotes from the text to support your answer."
"Are there any inconsistencies in how [Character Name] is portrayed?"
Plot & Structure:
"Summarize the main plot points of Chapter 5."
"Where does the inciting incident occur?"
"Identify any potential plot holes or unresolved threads."
"How is the theme of 'betrayal' explored in the second act?"
Pacing & Flow:
"Is the pacing in the first three chapters too fast or too slow?"
"Identify sections where the narrative momentum seems to lag."
Dialogue & Voice:
"Analyze the dialogue of [Character A] versus [Character B]. Are their voices distinct enough?"
"Find all instances of passive voice in the manuscript."
The possibilities are pretty much endless. You can get incredibly granular. The more specific your questions, the more insightful the answers will be.
Taking It a Step Further: Building a Custom Author Assistant
Once you get comfortable with this setup, you can start thinking about even cooler applications. For instance, you could create a dedicated chatbot for your author brand. Imagine having a chatbot on your website that can answer fan questions about your books, characters, & world-building.
This is where a platform like Arsturn comes in. Arsturn helps businesses create custom AI chatbots trained on their own data. You could feed it all your books, author interviews, & world-building notes. Then, Arsturn could power a chatbot on your site to engage with your readers 24/7, providing instant, accurate answers about your fictional universe. It's a fantastic way to build a deeper connection with your audience. You could even use it to automate parts of your business, like answering common questions about book availability or upcoming releases, freeing you up to do what you do best: write.
A Few Final Thoughts
Look, setting up a local LLM isn't a one-click process. It takes a bit of technical wrangling. But the power & privacy it gives you as an author are, in my opinion, COMPLETELY worth it.
You get a tireless, objective partner that can help you see your manuscript in a whole new light. It won't replace the need for human editors & beta readers—their subjective, emotional feedback is priceless. But as a tool for deep, analytical, & private textual analysis, it’s a game-changer.
So, if you've got a manuscript & a reasonably powerful computer, give it a shot. You might be surprised by what you discover in your own words.
Hope this was helpful! Let me know what you think.