Nepenthes generates random links that always point back to itself. The crawler follows those new links, and Nepenthes happily returns more and more lists of links pointing back to itself.
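For anyone curious what that loop looks like in practice, here is a minimal sketch of the idea, not Nepenthes' actual code. The function name `maze_page`, the `/maze` prefix, and the trick of seeding the random generator from the URL path are all my own assumptions for illustration.

```python
import hashlib
import random


def maze_page(path: str, n_links: int = 8, prefix: str = "/maze") -> str:
    """Return an HTML page of pseudo-random links that all stay under `prefix`."""
    # Assumption: seed from the requested path so the same URL always yields
    # the same page, which makes the maze look like a stable static site.
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    links = []
    for _ in range(n_links):
        slug = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(10))
        # Every link points back into the maze, never to an outside site.
        links.append(f'<li><a href="{prefix}/{slug}">{slug}</a></li>')
    return "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>"


if __name__ == "__main__":
    print(maze_page("/maze/start", n_links=3))
```

Any web handler that returns `maze_page(request_path)` for everything under the prefix gives the crawler an endless supply of new URLs without storing any state.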
DDoSing yourself to own the bots lol. Kinda joking, but I wonder how well this runs once it’s been going for a few days.
I bet there’s a sweet spot where you can add a delay to each response without the crawler giving up. Kind of a reverse slowloris.
Oh, I made that for my server because I noticed so many bots were probing commonly exposed file directories. It’s nginx and a Python server that just opens a connection and slowly sends out JSON text that looks like it has passwords and secrets until the reverse proxy forcefully closes the connection.
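A rough sketch of that slow-drip idea, under my own assumptions rather than the commenter's actual server: `fake_secrets_json` and `drip` are hypothetical names, and the chunk size and delay are arbitrary defaults.

```python
import json
import secrets
import time
from typing import Callable, Iterator


def fake_secrets_json(n_entries: int = 5) -> str:
    """Build JSON that looks like leaked credentials but is random junk."""
    entries = {
        f"service_{i}": {"user": f"admin{i}", "password": secrets.token_hex(8)}
        for i in range(n_entries)
    }
    return json.dumps(entries, indent=2)


def drip(
    body: str,
    chunk_size: int = 16,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield `body` a few bytes at a time, pausing before each chunk."""
    data = body.encode()
    for start in range(0, len(data), chunk_size):
        sleep(delay)  # the pause is what keeps the bot's connection tied up
        yield data[start:start + chunk_size]


if __name__ == "__main__":
    for chunk in drip(fake_secrets_json(2), delay=0.1):
        print(chunk)
```

Since `drip` is just an iterator of bytes, it could be returned as the body of a streaming response (a WSGI app, for example), with the reverse proxy's timeout deciding when the connection finally gets cut.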
I’m almost certain you could get 80% of the functionality of this service in plain NGINX, maybe a tiny sprinkle of Lua for the randomness. Serving “static” content is cheap. Add a little rate limiting and I gotta imagine you could run this on a very shitty board for a long time.
The real beauty of this is that he’s released it as code you can deploy on your sites. It’s not just a single website he owns that will quickly be blacklisted, it’s a tarpit you can put anywhere.
I also liked these snippets from their site:
“Lastly, optional Markov-babble can be added to the pages, to give the crawlers something to scrape up and train their LLMs on, hopefully accelerating model collapse.”
Let’s say you’ve got horsepower and bandwidth to burn, and just want to see these AI models burn. Nepenthes has what you need:
Don’t make any attempt to block crawlers with the IP stats. Put the delay times as low as you are comfortable with. Train a big Markov corpus and leave the Markov module enabled, set the maximum babble size to something big. In short, let them suck down as much bullshit as they have diskspace for and choke on it.
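To give a feel for what "Markov-babble" means, here is a toy word-level Markov chain. It is only a sketch of the general technique, not the Nepenthes Markov module, and the names `train` and `babble` plus the `max_words` parameter are made up for this example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


def train(text: str) -> Dict[str, List[str]]:
    """Map each word to the list of words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return dict(chain)


def babble(chain: Dict[str, List[str]], max_words: int = 50,
           seed: Optional[int] = None) -> str:
    """Random-walk the chain to produce plausible-looking nonsense.

    Assumes `chain` is non-empty and `max_words` is at least 1.
    """
    rng = random.Random(seed)
    starts = sorted(chain)  # sorted so a fixed seed gives a fixed result
    word = rng.choice(starts)
    out = [word]
    while len(out) < max_words:
        followers = chain.get(word)
        # Dead end (word only appeared last in the corpus): jump to a new start.
        word = rng.choice(followers) if followers else rng.choice(starts)
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the dog sat on the cat"
    print(babble(train(corpus), max_words=12))
```

A bigger training corpus and a larger `max_words` give longer, more convincing garbage, which is the knob the quoted advice is talking about turning up.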
LMAO, rough week for tech bros
Tell me again how imminent awakened AI is.
They created the mandrill maze for AI. Sick.