We’ve rolled out a new update to the Token Length section in the Pickaxe Builder, giving you much more visibility and control over how your AI tool uses its context window.
Now, the Token Length section includes a Token Allocation Breakdown that shows exactly how your agent’s context window is being used.
This breakdown gives you a clear view of how tokens are distributed across input length, output length, prompt, user memories, uploaded documents, knowledge base, and the memory buffer.
You can also see:
The percentage of the model’s context window each section is consuming
The estimated maximum cost per interaction
How much of the model’s total context capacity is being used at a glance
In addition, you can now directly adjust token allocations for User Memories, Uploaded Documents, and the Knowledge Base, giving you more control over how context is prioritized in your Pickaxe.
This makes it much easier to understand why a Pickaxe behaves the way it does, tune performance, and balance cost versus context depth, especially for more advanced or document-heavy tools.
You’ll find this in the Configure tab of the Pickaxe Builder under Token Lengths. As always, we’d love to hear your feedback or suggestions for further improvements.
Hello, could you tell me a bit more about Context Allocation? I would like to understand how it works and what actually changes when I allocate more or fewer tokens.
I just want to get a clearer picture of how adjusting this affects the bot’s behavior, memory, and responses, so I can set it up correctly. Thanks in advance!
Thanks for asking! It’s something many people new to AI wonder about.
The simplest way to think about Context Allocation is that it controls how much the bot can “keep in mind” at one time. The AI has a limited working space when it generates a reply. That space is called the context.
Tokens are just a unit that measures how much text fits into that space. You don’t need to worry about the technical details. More tokens = more room.
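If you want a rough feel for how much text a token represents, a common rule of thumb is about 4 characters of English per token. Here is a tiny sketch using that heuristic; real tokenizers vary by model, so treat this as an approximation only.

```python
# Rough rule of thumb: ~4 characters of English per token.
# Real tokenizers differ by model; this is only an estimate.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

print(estimate_tokens("More tokens = more room."))  # → 6
```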
When you allocate more tokens to something:
The bot can remember and reference more information at once
Conversations feel more consistent and aware
It’s especially helpful for longer chats or tools that rely on memory or documents
The main trade-off is that higher token allocations can increase cost and sometimes make responses a bit slower or less focused if too much information is competing for attention.
When you allocate fewer tokens:
Older details drop out sooner
The bot focuses more on the most recent message
Responses are often shorter and faster
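The "older details drop out sooner" behavior can be sketched like this: with a fixed token budget, the newest messages are kept first and older ones fall out once the budget is spent. The function and the ~4-characters-per-token estimate below are illustrative, not how Pickaxe actually trims context.

```python
# Sketch: a smaller token budget means older messages leave the context first.
# Uses a rough ~4-chars-per-token estimate, not a real tokenizer.
def trim_history(messages: list[str], token_budget: int) -> list[str]:
    kept, used = [], 0
    for msg in reversed(messages):        # walk newest-first
        cost = max(1, len(msg) // 4)
        if used + cost > token_budget:
            break                         # older messages fall out of context
        kept.append(msg)
        used += cost
    return list(reversed(kept))           # restore chronological order

history = ["I like hiking.", "My name is Sam.", "What gear do I need?"]
print(trim_history(history, 10))
# → ['My name is Sam.', 'What gear do I need?']  (the oldest message is dropped)
```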
In everyday terms:
More memory tokens help the bot remember user preferences and past details
More document or knowledge tokens help it use more of your uploaded content at the same time
Lower allocations can reduce cost, but with less depth
You’re not changing how “smart” the bot is. You’re simply deciding what it should pay attention to most.
If you’re just getting started, the default settings work well for most cases. You can always fine-tune later as your tool grows. If you’d like help choosing settings for a specific use case, I’m happy to help.