I was getting inconsistent answers from my knowledge base bot and it was driving me nuts. Here is what actually fixed it:
1. Smaller chunks.
I was uploading entire documents. Broke them into sections of 300-500 words each and accuracy jumped immediately. The retrieval system works better when it can grab a focused chunk instead of scanning a 10-page PDF for the one paragraph that matters.
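If you want to roll this yourself, here is a minimal sketch of that kind of chunker. The 400-word default and the paragraph-boundary splitting are my assumptions for illustration, not any particular platform's API:

```python
# Minimal word-count chunker: packs paragraphs into chunks of at most
# max_words, breaking on paragraph boundaries so chunks stay coherent.
def chunk_document(text, max_words=400):
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        # flush the current chunk if adding this paragraph would overflow it
        if count + words > max_words and current:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

One caveat: a single paragraph longer than `max_words` still becomes its own oversized chunk, so very long paragraphs need a second splitting pass.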
2. Clear headers.
Retrieval leans on headers to locate relevant chunks. If your doc is one long block of text, the system struggles to isolate what it needs. Add clear section headers that describe what each block covers. Think of it like labeling folders: the clearer the label, the faster the system finds the right one.
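Here is a rough sketch of one way to carry those labels into retrieval, assuming markdown-style `## ` headers. Prepending the header to each chunk's text is a common trick so the embedded text itself carries the label; the function name and chunk shape are my own, not a specific library's:

```python
# Split a markdown doc on "## " headers and prepend each header to its
# chunk so the text that gets embedded carries the section label.
def chunks_with_headers(markdown_text):
    chunks = []
    header, body = "Untitled", []
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            if body:  # flush the previous section
                chunks.append({"header": header,
                               "text": header + "\n" + "\n".join(body).strip()})
            header, body = line[3:].strip(), []
        else:
            body.append(line)
    if body:
        chunks.append({"header": header,
                       "text": header + "\n" + "\n".join(body).strip()})
    return chunks
```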
3. Test with real questions.
I had been testing with the questions I expected people to ask. Once I started testing with the weird questions my actual users ask, I found all the gaps. A question like “what is your refund policy” is easy. But “can I get my money back if I only used it once” is what real people actually type.
Bonus tip: if you are getting answers that are close but not quite right, check your relevance cutoff settings. The difference between 0.3 and 0.5 is wild. Lower means more results but potentially less relevant ones; higher means fewer but more precise matches. I have been testing different values this week and landing around 0.4 for most use cases.
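For intuition, here is a tiny sketch of what that cutoff does, using made-up 2-d vectors and plain cosine similarity. The numbers are fabricated purely to show how 0.3 vs 0.5 changes the result set:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, cutoff):
    # score every chunk, keep only those at or above the cutoff,
    # and return them best-first
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    return sorted((s, t) for s, t in scored if s >= cutoff)[::-1]

# Toy "embeddings": refund scores ~0.91, shipping ~0.41, about ~0.21
CHUNKS = [
    ("refund policy", [0.9, 0.4]),
    ("shipping info", [0.4, 0.9]),
    ("about us",      [0.2, 0.95]),
]
```

With a cutoff of 0.3 the query pulls back two chunks (refund plus the loosely related shipping one); at 0.5 only the refund chunk survives. Same index, very different answers fed to the model.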
Anyone else building knowledge base bots? What has been your biggest accuracy challenge? Would love to compare notes.
We dig into this kind of stuff regularly in the Pickaxe Pro Builders community if you want to go deeper.