Daniel Villa

Building a Retrieval-Augmented System for Real-World Use

Dec 17, 2025

One of the most important ideas I’ve been learning in my MSIS journey is this:

Large language models are powerful, but they don’t know your data.

That’s where Retrieval-Augmented Generation (RAG) comes in.

Instead of relying only on a model’s training data, a RAG system retrieves relevant documents and feeds them into the model at runtime. This makes responses more accurate, grounded, and useful for real-world applications.

I built a prototype RAG system to better understand how this works under the hood.

What I explored:

  • Breaking documents into chunks for retrieval
  • Embeddings and similarity search
  • How context is selected before sending to the model
  • The tradeoff between chunk size, overlap, and accuracy
  • Why retrieval quality matters more than model size in many cases
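The steps above can be sketched end to end in plain Python. This is a toy illustration, not a production pipeline: a real system would use a trained embedding model and a vector store, while here a simple bag-of-words vector stands in for embeddings so the mechanics of chunking, similarity search, and context selection are easy to follow. All documents, sizes, and names below are made up for illustration.

```python
# Minimal, dependency-free sketch of the RAG retrieval step:
# chunk -> embed -> rank by similarity -> select context -> build prompt.
import math
import re
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks; overlap preserves context across boundaries."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real system uses dense vectors."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the top k as context."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Pretend this is text pulled from an organization's scattered documents.
docs = (
    "The annual report shows program spending rose 12 percent. "
    "Meeting notes from March cover the grant deadline and budget review. "
    "The compliance policy requires records to be kept for seven years."
)

chunks = chunk(docs, size=12, overlap=4)
context = retrieve("how long must records be kept?", chunks, k=1)
prompt = (
    "Answer using only this context:\n"
    f"{context[0]}\n\n"
    "Question: how long must records be kept?"
)
```

The chunk-size and overlap numbers are where the tradeoff from the list shows up: smaller chunks retrieve more precisely but can cut answers in half, and overlap is the hedge against that.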

Why this matters

In nonprofit and organizational settings, knowledge is often scattered:

  • Google Drive folders
  • PDFs
  • Internal documents
  • Reports and meeting notes

A RAG system can turn that into something searchable and actionable.

Instead of:

“Where is that report?”

You get:

“Here’s the answer, and here’s exactly where it came from.”
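That "here's where it came from" part falls out of retrieval almost for free: if each chunk carries its source alongside its text, the answer can cite it. A hedged sketch, with made-up file names, assuming the top hit has already been found by similarity search:

```python
# Carrying source metadata through retrieval so answers can cite their origin.
# File names and contents are invented for illustration.
from dataclasses import dataclass

@dataclass
class Chunk:
    source: str   # where the text came from, e.g. a file name
    text: str

library = [
    Chunk("2024_annual_report.pdf", "Program spending rose 12 percent year over year."),
    Chunk("march_meeting_notes.docx", "The grant application is due April 15."),
]

def answer_with_citation(question: str, hits: list[Chunk]) -> str:
    # In a real system, `hits` comes from similarity search and the answer
    # from the LLM; here we just format the grounded response.
    top = hits[0]
    return f"{top.text} (source: {top.source})"

print(answer_with_citation("When is the grant due?", [library[1]]))
```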

Where I see this going

I’m especially interested in applying this to:

  • Grant writing support
  • Policy and compliance documentation
  • Internal knowledge bases for small teams

This project helped me move beyond just using AI tools to understanding how they actually work.

And that shift—from user to builder—is where things start to get interesting.