How do we "Dedupe" Knowledge?

Some background (for context :)): I’ve been stress-testing the new RSS import feature. I have 500 blog posts in my RSS feed, and I’m working to import them into the Knowledge section of a pickaxe.

My web host limits concurrent connections to 100, so for 400 of the imported pages, the import contains just one chunk that says `{'message': 'Maximum concurrency allowed 100'}`.

So I imported the feed again. This time it scraped more pages successfully, but it still hit errors, and it re-scraped some pages it had already imported. After a few rounds of this, I have most of my blog posts imported, but I also have a bunch of duplicates.
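As a workaround for the concurrency cap, one option is to split the feed into batches that stay under the host's limit and import each batch separately, rather than hitting all 500 URLs at once. This is a minimal sketch assuming you can preprocess the list of feed URLs yourself before handing them to the importer; the `batched` helper and the batch size are my own, not part of Pickaxe:

```python
from itertools import islice

def batched(urls, max_concurrent=90):
    """Yield lists of at most max_concurrent URLs, so each import
    batch stays safely below a 100-connection host limit."""
    it = iter(urls)
    # islice pulls the next max_concurrent items; an empty chunk ends the loop
    while chunk := list(islice(it, max_concurrent)):
        yield chunk

# Example: 500 feed URLs split into import-sized batches
feed_urls = [f"https://example.com/post-{i}" for i in range(500)]
batches = list(batched(feed_urls, 90))
# 500 URLs at 90 per batch -> 6 batches, the last with 50 URLs
```

Each batch could then be imported (or fed to a scraper) in its own pass, waiting for one to finish before starting the next.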

Question: Is there an automated way to remove duplicates, or to avoid scraping URLs that are already in Knowledge?

Hi, there is currently no automated way to remove duplicates. It would make a nice feature request.
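Until something like that exists in the product, duplicates can be avoided on the client side by filtering the feed against the URLs you have already imported before re-running the import. This is a sketch under the assumption that you can export or list the already-imported URLs yourself; `normalize` and `urls_to_import` are hypothetical helpers, not Pickaxe functions:

```python
from urllib.parse import urlsplit

def normalize(url):
    """Canonicalize a URL so trivial variants (trailing slash,
    host capitalization) compare as equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"

def urls_to_import(feed_urls, already_imported):
    """Return feed URLs not yet in Knowledge, skipping
    duplicates within the feed itself."""
    seen = {normalize(u) for u in already_imported}
    result = []
    for url in feed_urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result
```

Running the import only on `urls_to_import(feed, existing)` after each failed round would avoid re-scraping pages that already made it into Knowledge.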