What’s the best practice for setting up a “dynamic” knowledge base that updates from an external source?

  • A way to keep my Pickaxe KB in sync automatically whenever I update the Google Doc
  • Ideally, I’d love to just point Pickaxe at a Doc URL or use a webhook/API so the system pulls the latest content on a schedule
  • If anyone has built a pipeline or script (e.g., using the Google Docs API, Apps Script, or Pickaxe ingestion API) to automate this, I’d greatly appreciate examples or guidance

Currently, with no API access from Pickaxe, one efficient way would be to create a database in Notion > publish the Notion DB to a custom domain > add the published Notion DB URL to your Pickaxe’s knowledge base.


Is this something that will be implemented natively in the months to come?

I’ve tried this, and had a similar post about it last week. Unfortunately it doesn’t work, nor does it work with published links from Craft docs.


There is not an easy way to add a file to the Knowledge Base that is automatically updated/re-indexed.

However, some users have their own external database (for example, a CRM on Airtable or a proprietary MongoDB database) and connect it as an Action. That means the Pickaxe will query the database anew each time the Action is triggered. It will not access the database via our Knowledge Base system (which uses RAG), but directly through your Action.
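To make the Action idea concrete, here is a minimal sketch of the two halves such an endpoint would need: building the Airtable REST call, and flattening the JSON response into plain text for the bot. The base ID, table name, token, and the `{Question}` field are all hypothetical placeholders, not anything Pickaxe or Airtable provides by default.

```python
import urllib.parse
import urllib.request

# Hypothetical credentials -- substitute your own Airtable base ID,
# table name, and personal access token.
AIRTABLE_BASE = "appXXXXXXXXXXXXXX"
AIRTABLE_TABLE = "FAQ"
AIRTABLE_TOKEN = "patXXXXXXXXXXXXXX"

def build_request(search_term: str) -> urllib.request.Request:
    """Build the Airtable REST call the Action would make for each user query."""
    # SEARCH(...) is an Airtable formula function; {Question} is an
    # assumed field name in the hypothetical table.
    formula = f"SEARCH(LOWER('{search_term}'), LOWER({{Question}}))"
    query = urllib.parse.urlencode({"filterByFormula": formula})
    url = f"https://api.airtable.com/v0/{AIRTABLE_BASE}/{AIRTABLE_TABLE}?{query}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {AIRTABLE_TOKEN}"}
    )

def format_records(payload: dict) -> str:
    """Flatten Airtable's JSON response into plain text the bot can read."""
    lines = []
    for record in payload.get("records", []):
        fields = record.get("fields", {})
        lines.append(" | ".join(f"{k}: {v}" for k, v in fields.items()))
    return "\n".join(lines)
```

Your Action endpoint would call `urllib.request.urlopen(build_request(query))` and return the formatted text; because this runs on every trigger, the data is always fresh rather than cached.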


Thank you all, I will look into this.

Another way would be a timed Make scenario that checks your DB (Airtable or Google Sheets) whenever a user input comes through as a payload via a Make webhook.


Hey Ned - Great responses in this thread. Out of curiosity…

  1. Has Pickaxe updated its systems since you responded here to allow for more production-ready, built-in dynamic knowledge systems that always pull fresh data rather than a cached version?

  2. If I were pointing a Pickaxe to a database, would you recommend a Notion database or an Airtable database, or does it not really matter because Pickaxe will essentially treat them the same?


Hey @michaeldsimmons long time. Nice to see you on here!

  1. Pickaxe has not directly updated the system since I last responded here. However, they did ship Studio API + Make modules that can do the job!

I think a combo of the new Pickaxe Make modules + Studio API is the best workaround for the deprecated native dynamic KB feature from V1.

  2. I recommend connecting to an Airtable database. However, since you’re using Make, both Notion and Airtable have Make modules, and since the data will be returned through Make, Pickaxe will essentially treat them the same. Does that make sense?

Good news! @deepthin @Jim @michaeldsimmons

New Dynamic KB integrations are coming to Pickaxe!


Duuuude! :partying_face:

The timing on this couldn’t be better for me!

Is there more info on this anywhere?

Edit: Nvm, found it :+1:


Just to add, all website knowledge base docs now get updated every day if used in the last 7 days.


Thanks @stephenasuncion . Are there more details on how Notion can be used as a KB source? Which of these examples would be best?

  1. I have a Notion database of links that I want my agent to go out and reference before performing a Perplexity search

  2. I have a database of Notion pages with the KB info directly in those pages, with fields for links and want my agent to ref the page content, then links, then perform a search

etc…?

TIA

I think this is a great conversation and an important topic. Dynamic knowledge base updates would be quite a game changer! Have you seen the project called LEANN? It uses an HNSW index to quickly find relevant documents and then computes RAG-style embeddings on the fly, on demand, for those documents… maybe that’s an approach to interfacing with external knowledge bases?

@stephenasuncion does this mean that if a user adds a URL as a KB document, the website will be scraped again by the Pickaxe KB if it is called upon within the last 7 days?

We currently retrieve the entire page as raw text and split it into chunks. For databases, each row is treated as a separate chunk. Each chunk is then converted into an embedding vector, which is compared against the user’s input.

To answer your question, both examples will work… it just depends on whether the chunks are relevant enough compared to the user input and how they were formatted. That’s why testing your knowledge base documents through the Knowledge Explorer is really helpful. Check out: https://youtu.be/SlWfzz9NRo4?si=muqkVPoLJWN3Wgcc&t=54
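The retrieval scheme described above (one chunk per database row, convert each chunk to a vector, compare against the user’s input) can be sketched like this. The bag-of-words “embedding” is a toy stand-in for the real dense embedding model, and the sample rows are invented:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector standing in for a real embedding model:
    # the actual system converts each chunk to a dense embedding vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_chunks(rows: list[str], user_input: str, top_k: int = 2) -> list[str]:
    """Treat each database row as one chunk; return the rows most similar to the input."""
    q = embed(user_input)
    return sorted(rows, key=lambda r: cosine(embed(r), q), reverse=True)[:top_k]
```

This is also why formatting matters: a row whose text shares little vocabulary (or, for real embeddings, little semantics) with the user’s input simply won’t rank high enough to be retrieved.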


Yep, it will be scraped again once a day… as long as it was used/called in the last 7 days.
