8/12/2025

How to Build Async MCP Tools That Don't Block Your Entire Workflow

Hey everyone, let's talk about something that drives developers NUTS. You've got this awesome workflow going, your AI agent is humming along, and then… everything grinds to a halt. The UI freezes. The whole process is stuck, waiting for some long-running task to finish. It’s like hitting a massive traffic jam on the information superhighway, & honestly, it’s a total buzzkill.
Turns out, the culprit is often a "blocking" operation. This is when a piece of code, like a tool your AI is using, has to wait for something to finish before it can move on. Think about it – waiting for a big file to download, a database to return a query, or an external API to respond. While your code is waiting, it’s not doing anything else. It's just… stuck. This is a HUGE problem, especially when you're trying to build responsive & efficient applications.
But here's the thing: it doesn't have to be this way. We can build tools that work asynchronously. This is a fancy way of saying they can start a task, and then let the rest of your program continue doing other things while that task runs in the background. When it’s done, it lets you know. No more traffic jams.
In this guide, I’m going to walk you through how to build async MCP (Model Context Protocol) tools that DON’T block your entire workflow. We’ll get into what MCP is, why async is so dang important, & then we’ll get our hands dirty with some actual code.

First Off, What's this "MCP" Thing?

Before we dive into the async magic, let's quickly touch on what MCP even is. The Model Context Protocol (MCP) is a standardized way for large language models (LLMs) to interact with external systems. Think of it as a universal remote for your AI. Instead of building custom integrations for every single tool or API you want your LLM to use, you can expose them through MCP. This makes it WAY easier to give your AI new capabilities, like searching databases, calling APIs, or even sending emails.
FastMCP is a popular Python framework for building these MCP servers & clients. It’s super intuitive & lets you create tools from your Python functions with very little boilerplate. And the best part? It has first-class support for asynchronous operations.

The Big, Annoying Problem: Blocking I/O

So, why is "blocking" such a dirty word in modern development? It all comes down to I/O, or Input/Output operations. Most of the "waiting" our applications do is I/O-bound. Here are some common examples:
  • Network requests: Calling an external API to get weather data, stock prices, or user information.
  • Database queries: Fetching or writing data to a database.
  • File operations: Reading from or writing to a large file on disk.
In a traditional, synchronous world, when your code makes one of these requests, the thread executing that code just stops & waits. It's blocked. If you’re running a web server, that thread can't handle any other incoming requests. If it’s a desktop app, the UI freezes. It’s incredibly inefficient. This is especially true for I/O-heavy applications, where you might have tons of these operations happening at once.
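Here’s a tiny sketch of that problem in plain Python. The `time.sleep` plays the role of a blocking database lookup (the names `fetch_user` & the 0.2-second delay are made up for illustration):

```python
import time


def fetch_user(user_id: int) -> dict:
    """Stand-in for a blocking database lookup."""
    time.sleep(0.2)  # the thread is stuck here, doing nothing useful
    return {"id": user_id, "name": f"user-{user_id}"}


start = time.perf_counter()
users = [fetch_user(i) for i in range(3)]
elapsed = time.perf_counter() - start
# The three 0.2 s lookups run back to back, so the total is roughly 0.6 s,
# & the thread can't do anything else the entire time.
```

Three independent lookups, yet the total wait is the sum of all of them. That’s the traffic jam.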
Imagine a customer service chatbot on a website. If a user asks a question that requires looking up their order in a database, a blocking approach would mean the chatbot is completely unresponsive to other users until that database query finishes. Not a great experience, right? This is where a platform like Arsturn becomes so valuable. By using no-code AI chatbots trained on your own data, you can provide instant answers 24/7. And because it's built with modern, non-blocking architecture in mind, it can handle thousands of conversations at once without breaking a sweat.

The Fix: Asynchronous Programming & Non-Blocking I/O

Asynchronous programming is the hero of our story. It allows us to perform these I/O operations without blocking the main thread. Here’s the gist of how it works.
When you make an async I/O request, instead of the thread waiting around, it hands off the request to the operating system or runtime. Then, the thread is immediately freed up to do other work. It can go handle another user's request, update the UI, or whatever else needs to be done.
Once the I/O operation is complete (the file has been read, the API has responded), the operating system notifies your application, & the runtime resumes your code right where it left off. In Python's asyncio this happens on a single-threaded event loop; in runtimes like .NET, a thread from the thread pool picks up the continuation. Either way, it’s like ordering food at a restaurant with a buzzer. You place your order, they give you a buzzer, & you can go sit down & chat with your friends instead of standing at the counter waiting. When your food is ready, the buzzer goes off, & you go pick it up. That's non-blocking I/O in a nutshell!
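The buzzer flow can be sketched with Python’s built-in `asyncio`. Three simulated 0.2-second I/O waits overlap, so the whole batch finishes in roughly 0.2 seconds instead of 0.6 (again, `fetch_user` & the delay are illustrative stand-ins):

```python
import asyncio
import time


async def fetch_user(user_id: int) -> dict:
    """Stand-in for a non-blocking database lookup."""
    await asyncio.sleep(0.2)  # the event loop runs other tasks during this wait
    return {"id": user_id, "name": f"user-{user_id}"}


async def main() -> list:
    # Start all three lookups at once & wait for them together.
    return await asyncio.gather(*(fetch_user(i) for i in range(3)))


start = time.perf_counter()
users = asyncio.run(main())
elapsed = time.perf_counter() - start
# All three 0.2 s waits overlap, so the total is about 0.2 s, not 0.6 s.
```

Same three lookups, one-third of the wall-clock time. That’s the scalability win in miniature.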
This approach has some HUGE advantages:
  • Improved Responsiveness: Your application stays responsive & doesn't freeze, even when performing long-running tasks.
  • Increased Scalability: A single thread can handle many concurrent operations, meaning you can serve more users with less hardware.
  • Better Resource Utilization: Your CPU isn't sitting idle while waiting for I/O. It's always busy doing useful work.

Let's Build Some Async MCP Tools!

Alright, enough theory. Let's get to the fun part: building our own non-blocking MCP tools using FastMCP. It's surprisingly easy.

Setting Up Your Environment

First, you’ll need to install FastMCP. You can do this with a simple pip command:
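```shell
pip install fastmcp
```

That pulls in the FastMCP package from PyPI along with its dependencies. (A virtual environment is a good idea, as with any Python project.)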

Copyright © Arsturn 2025