8/13/2025

So, GPT-5 is Choking on Your PDFs. Here’s the Real Reason Why & How to Fix It.

Hey there. If you've landed here, you're probably pretty frustrated. You've got this shiny new AI, GPT-5, which is supposed to be the most advanced system yet, but when you try to give it a simple PDF, it just… fails.
Maybe it tells you it can't read the file. Maybe it analyzes the first two pages & then makes everything else up. Or maybe it just gives you a vague "unknown error" that sends you down a rabbit hole of reloading the page. Honestly, it’s infuriating. I've been there, and I've spent a ton of time digging into what's REALLY going on.
Turns out, it's not just you. There's a whole mess of reasons why this is happening, and it's a lot deeper than just a simple "bug." It's a combination of technical limitations, platform rules, & the messy nature of PDFs themselves.
So, let's get into it. I'm going to break down everything you need to know to fix this problem, from the quick first-aid fixes to the long-term strategies for making sure your documents get read, every single time.

Part 1: Your First-Aid Kit for PDF Upload Errors

Before we get into the nitty-gritty, let's try to get you a quick win. A lot of the time, the "unknown error" when uploading a PDF is caused by something simple on your end or a temporary glitch. Run through this checklist first.

1. Check Your Browser (Seriously)

This sounds almost TOO simple, but it’s the most common culprit. Modern web apps like ChatGPT are complex, & they can get tripped up by your browser's settings.
  • Update Your Browser: If you're running an old version of Chrome, Firefox, or Safari, update it. NOW. Compatibility issues are a huge source of errors.
  • Clear Your Cache & Cookies: Your browser stores data to load websites faster, but this data can get corrupted. Go into your browser settings, find "Clear browsing data," & wipe the cache & cookies. This gives ChatGPT a fresh start.
  • Disable Extensions: That ad blocker, grammar checker, or theme you installed? It could be interfering with ChatGPT's code. Try disabling all your browser extensions & then uploading the PDF again. If it works, you can turn them back on one by one to find the offender.
  • Try Incognito Mode: An incognito or private window usually runs with extensions disabled & a clean cache. It's a quick way to test if the problem is with your browser's setup.

2. Inspect the PDF Itself

If the browser isn't the problem, the next suspect is the file you're trying to upload. GPT-5 can be surprisingly picky.
  • Is it HUGE? There’s a hard file size limit. While some sources say it's 10 MB, the official platform limit for most files, including PDFs, is actually 512 MB. If your PDF is bigger than that, it's an automatic fail. Look for an online PDF compressor to shrink it down.
  • Is it Password-Protected? If your PDF requires a password to open, GPT-5 can't access it. It's that simple. You'll need to use a tool like Adobe Acrobat or an online PDF unlocker to remove the password protection before you upload it.
  • Is it Corrupted? Sometimes, a file just gets damaged during a download or transfer. If you can, try re-downloading the PDF from its original source. A corrupted file is unreadable to you & to an AI.
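
If you'd rather not eyeball these three checks by hand, here's a minimal sketch of a pre-upload check in Python. It only uses the standard library, and it works on heuristics I'm assuming are good enough for a first pass: a well-formed PDF starts with "%PDF-" and ends with an "%%EOF" marker, and an encrypted one carries an "/Encrypt" entry. The function name `preflight` is made up for this example. It won't catch every damaged file, and an "/Encrypt" hit doesn't always mean you need a password to open it.

```python
import os

MAX_BYTES = 512 * 1024 * 1024  # the 512 MB platform limit quoted above


def preflight(path):
    """Return a list of likely problems with the PDF at `path` (empty if none found)."""
    problems = []

    if os.path.getsize(path) > MAX_BYTES:
        # No point reading a file that is already over the limit.
        return ["file is larger than 512 MB"]

    with open(path, "rb") as f:
        data = f.read()

    # A well-formed PDF starts with "%PDF-" and ends with an "%%EOF" marker.
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header (not a PDF, or corrupted)")
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF marker (possibly truncated)")

    # Heuristic only: encrypted PDFs reference an /Encrypt dictionary.
    if b"/Encrypt" in data:
        problems.append("contains an /Encrypt entry (likely password-protected)")

    return problems
```

Call it as `preflight("report.pdf")`. An empty list means none of these three checks found anything, not that the upload is guaranteed to work.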

3. Check OpenAI's Status

Sometimes, the problem isn't you or your file—it's them. OpenAI's servers handle a mind-boggling amount of traffic, & they can get overwhelmed or have temporary outages.
  • Visit the OpenAI Status Page: Before you pull your hair out, check status.openai.com. They post real-time information about any ongoing issues. It’s not uncommon to see "degraded performance" for file uploads. If there's an issue, the only fix is to wait it out.
  • Clear Your Chat History: Some users have reported that clearing out old conversations in ChatGPT can help resolve weird, persistent bugs. It’s like tidying up its workspace.
If you’ve gone through this whole checklist & it’s STILL not working, then it’s time to move on to the deeper, more permanent solutions. The problem isn't a glitch; it's a limitation of the system itself.

Part 2: The Deep Dive - Why GPT-5 is Actually Failing

Okay, so the quick fixes didn't work. This is where most people give up. But if you understand why it's failing, you can build a workflow that almost never fails. The errors you're seeing aren't random; they're symptoms of very specific constraints built into the ChatGPT platform.
Here's the thing: the file upload limits aren't actually determined by the GPT-5 model itself. They are rules imposed at the platform level. This means the same constraints apply whether you’re using GPT-5, GPT-4o, or any other model in the interface.

The REAL Culprit: Token Limits & The Context Window

This is, without a doubt, the #1 reason GPT-5 "stops reading" your PDF. You might be aware of the file size limit (512 MB), but the token limit is FAR more important.
Here’s how it works: an AI doesn't "read" a document like a human. It breaks the text down into "tokens" (which are roughly words or parts of words). Every model has a maximum number of tokens it can hold in its "memory" at one time. This memory is called the context window.
And here are the context window sizes for GPT-5:
  • Free Users: 8,000 tokens
  • Plus Subscribers: 32,000 tokens
  • Pro Subscribers: 128,000 tokens
Let's put that in perspective. A single page of a text-heavy academic paper can be 500-800 tokens. If you're a free user, you could hit your 8K token limit with just 10-16 pages of a dense PDF. The model will literally load the first part of the document, max out its context window, & have ZERO knowledge of anything that comes after.
This is why you see it confidently answer questions about the first few pages & then completely hallucinate or ignore the rest. It's not being lazy; it's like its short-term memory is full. Your 50-page, 2 MB PDF might be well under the file size limit, but it could easily be 40,000 tokens long, making it impossible for a Plus user to process in one go.
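
To make that arithmetic concrete, here's a small Python sketch using the numbers from this section. The 500-800 tokens-per-page range is the rough figure quoted above, not a measurement of your document, and the function names are just for illustration. It also ignores the tokens your own prompts & the model's replies use up, so real capacity is lower.

```python
import math

# Context window sizes quoted above, in tokens.
CONTEXT_WINDOWS = {"free": 8_000, "plus": 32_000, "pro": 128_000}


def estimate_tokens(pages, tokens_per_page=800):
    """Rough token count for a document, assuming a fixed tokens-per-page rate."""
    return pages * tokens_per_page


def pages_that_fit(plan, tokens_per_page=800):
    """How many whole pages fit in the plan's context window."""
    return CONTEXT_WINDOWS[plan] // tokens_per_page


def chunks_needed(pages, plan, tokens_per_page=800):
    """How many context-window-sized pieces the document would need."""
    return math.ceil(estimate_tokens(pages, tokens_per_page) / CONTEXT_WINDOWS[plan])


print(pages_that_fit("free", 800))   # 10
print(pages_that_fit("free", 500))   # 16
print(estimate_tokens(50))           # 40000
print(chunks_needed(50, "plus"))     # 2
```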

Platform Quotas: The Annoying Overlords

On top of the context window, OpenAI has other limits to manage server load.
  • Upload Frequency: If you're a Plus user, you can upload up to 80 files every 3 hours. Free users get a measly 3 uploads per day. If you’re experimenting and uploading a lot, you might just be hitting a time-based wall.
  • Total Storage: Your account has a total storage limit of 10 GB. This includes all the files you've ever uploaded. If you hit this cap, you have to go back & delete old files before you can upload new ones.
  • Message Limits: Let's not forget the message caps. For GPT-5, Plus users are limited to 80 messages every 3 hours. This isn't directly related to the PDF reading, but it adds to the overall feeling of being constrained.

The PDF Format is Just… Awful for AI

Honestly, PDF is a terrible format for data extraction. It was designed for printing, to preserve a visual layout. It wasn't designed for machines to read. A PDF can contain all sorts of things that trip up an AI:
  • Scanned Images of Text: If your PDF is a scan of a physical document, it's just a picture of words. The AI can't read it unless it first runs Optical Character Recognition (OCR) to convert the image into actual text. ChatGPT has some OCR capability, but it's not perfect & can fail on low-quality scans.
  • Complex Formatting: Tables, columns, headers, footers, charts, & diagrams can completely confuse the AI's reading order. It might read straight across two columns, mixing up sentences, or it might skip over text embedded in a chart.
  • Embedded Fonts & Weird Encoding: Sometimes, the text in a PDF isn't stored in a straightforward way. This can lead to the AI extracting gibberish or missing text entirely.
So when you combine the strict token limits with the messy, inconsistent nature of PDFs, you have a perfect recipe for failure.
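
One way to catch these problems before the AI does is to look at whatever text you managed to get out of the PDF & ask a blunt question: is there enough of it, & is it mostly normal characters? Here's a rough Python sketch of that idea. The thresholds (200 characters, 90% readable) are my own guesses, & the function names are hypothetical. A scanned PDF with no text layer will usually trip the "too short" check, while broken font encoding tends to trip the ratio check.

```python
import string

# Characters we expect in ordinary extracted text.
_EXPECTED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")


def readable_ratio(text):
    """Fraction of characters that are expected ASCII or letters in any script."""
    if not text:
        return 0.0
    ok = sum(1 for ch in text if ch in _EXPECTED or ch.isalpha())
    return ok / len(text)


def looks_garbled(text, min_ratio=0.9, min_chars=200):
    """Flag text that is suspiciously short or mostly non-text characters."""
    return len(text.strip()) < min_chars or readable_ratio(text) < min_ratio
```

If `looks_garbled(extracted_text)` returns True, that's a hint to try OCR or a different extraction tool rather than uploading as-is.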

Part 3: The Ultimate PDF Workflow - How to Make it Work Every Time

Now that we know why it's failing, we can build a process that addresses the root causes. The goal is to stop fighting the system & instead give the AI the one thing it loves: clean, simple text.

The Golden Rule: Convert to Plain Text

This is the single most effective thing you can do. By converting your PDF to a plain text file (.txt), you solve MOST of the problems we've talked about.
  • It strips out all the confusing formatting (columns, tables, etc.).
  • It makes the file much smaller, since images, fonts, & layout data get dropped.
  • It allows you to see EXACTLY what the AI will see.
How to do it:
  • Simple PDFs: For a PDF that is already text-based (you can click & drag to select the text), you can often just do a "Save As..." in Adobe Acrobat & choose "Text (Plain)." Or, you can just copy all the text & paste it into a text editor like Notepad (Windows) or TextEdit (Mac).
  • Scanned PDFs (The OCR Step): If your PDF is an image, you need an OCR tool. There are great free online options, or you can use features built into tools like Google Drive. Just upload the PDF to Google Drive, right-click, and choose "Open with > Google Docs." It will automatically perform OCR & give you the editable text. It's surprisingly accurate.
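
If you have a pile of text-based PDFs & would rather script the conversion, one option is the `pdftotext` command-line tool that ships with Poppler. That's my suggestion, not something the steps above depend on, & it assumes you've already installed Poppler on your machine. The sketch below just wraps the command in Python. The helper names `build_command` & `pdf_to_text` are made up for this example. Note that `pdftotext` does NOT do OCR, so scanned PDFs still need the OCR step.

```python
import shutil
import subprocess


def build_command(pdf_path, txt_path):
    """Build the pdftotext command line (UTF-8 output, default reading order)."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, txt_path]


def pdf_to_text(pdf_path, txt_path):
    """Convert a text-based PDF to plain text; raises if pdftotext is missing or fails."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler first")
    subprocess.run(build_command(pdf_path, txt_path), check=True)
    return txt_path
```

Usage would look like `pdf_to_text("report.pdf", "report.txt")`. Open the resulting .txt file & skim it before uploading, so you see exactly what the AI will see.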

The "Chunking" Method for Long Documents

Okay, so you've converted your 100-page report to a text file. Great! But it's probably still way over the 32K token limit for a Plus user. The solution? You have to "chunk" it.
This means breaking the document down into smaller, logical sections that each fit within the context window.
  1. Split the Text: Don't just split it randomly. Break it up by chapter or section. This keeps the context within each chunk coherent. Create separate text files: chapter_1.txt, chapter_2.txt, etc.
  2. Upload & Analyze in Pieces: Start one new conversation with GPT-5 & feed it the chunks one at a time, in order. For the first chunk, you can say something like: "I'm going to give you a large document in several parts. This is Part 1 of 5. Please read it, summarize the key findings, & wait for the next part. Do not make any overall conclusions until I tell you I've uploaded the final part. Here is Part 1:"
  3. Synthesize at the End: Once you've fed it all the parts, you can ask it to perform the final analysis based on everything it's learned in the conversation. "That was the final part. Now, considering all the parts I've given you, please [your original request here]." The per-part summaries matter here: by this point the raw text of the early parts may have dropped out of the context window, so the summaries are what carry those parts forward.
It's more work, yes, but it's a reliable way to get around the token limits & ensure the entire document gets analyzed.
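
If splitting by hand sounds tedious, here's a minimal Python sketch of the chunking idea. Two caveats up front. First, it splits on blank lines between paragraphs rather than on chapters, which is a simplification of the "split by section" advice above. Second, it estimates tokens at roughly 4 characters each, which is a common rule of thumb for English text & not an exact count. The function names & the 6,000-token default are assumptions for illustration.

```python
def estimate_tokens(text):
    # Rough rule of thumb (an assumption): about 4 characters per token.
    return len(text) // 4


def chunk_text(text, max_tokens=6000):
    """Group paragraphs into chunks that stay under max_tokens (by the estimate)."""
    chunks, current, current_tokens = [], [], 0
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        tokens = estimate_tokens(para)
        # Close the current chunk if this paragraph would push it over budget.
        if current and current_tokens + tokens > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += tokens
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_prompts(chunks):
    """Prefix each chunk with a 'Part i of N' instruction."""
    total = len(chunks)
    prompts = []
    for i, chunk in enumerate(chunks, start=1):
        header = (
            f"This is Part {i} of {total}. Please read it, summarize the "
            "key findings, & wait for the next part."
        )
        prompts.append(f"{header}\n\n{chunk}")
    return prompts
```

A single paragraph that is bigger than the budget still ends up as its own oversized chunk, so check the output before pasting. Leave yourself headroom too: the prompt text & the model's replies also count against the context window.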

Part 4: When You Need Something More Reliable - The Business Case

Let's be honest. The workflow I just described is fine for a one-off research project. But if you're a business, you can't afford to waste time copy-pasting text files & hoping an AI gets it right.
Imagine you want to provide 24/7 customer support based on your product manuals. Or you want to give your team an internal chatbot that can instantly answer questions from your company's knowledge base. Are you going to ask your customers or your employees to "chunk" their questions into multiple prompts? Absolutely not.
This is where the limitations of a general-purpose tool like ChatGPT become a business liability. You need something built for the job.
This is exactly why we built Arsturn. It’s a platform designed from the ground up to solve this specific problem. Instead of fighting with a consumer-grade tool, Arsturn helps businesses create custom AI chatbots trained exclusively on their own data—including all those tricky PDFs, Word documents, & website content.
Here’s the thing: when you're dealing with lead generation, customer engagement, or internal training, you need reliability & accuracy above all else. With Arsturn, you can build a no-code AI chatbot that:
  • Ingests Your Documents Properly: It's designed to parse & understand business documents, not just snippets of conversation.
  • Provides Instant, Accurate Support: Your customers can ask a question & get an answer drawn directly from your knowledge base, 24/7. No more waiting for a human agent for simple queries.
  • Boosts Conversions & Engagement: By providing immediate, personalized answers on your website, you can engage visitors, answer their sales questions, & guide them toward making a purchase. It's like having a perfect sales assistant always on duty.
A tool like GPT-5 is an amazing, general-purpose powerhouse. But for a business that needs to build a meaningful, reliable connection with its audience through conversational AI, you need a specialized solution.

Hope this was helpful!

I know how maddening it can be when a tool doesn't work the way it's supposed to. The key with GPT-5 & PDFs is to understand its limitations—the token windows, the platform quotas, & the messiness of the PDF format—and work with them, not against them.
For your personal projects, the convert-to-text & chunking method is your best bet. For any serious business application, you'll save yourself a world of headaches by using a tool that's actually designed for the task.
Let me know what you think. Have you found any other workarounds? I'm always curious to hear what's working for people.

Copyright © Arsturn 2025