Crawl Control: The Cookie Conundrum

I set the cookie, but I’m not the one crumbling!

Interactive Fiction Techno-Thriller Text Game

In 'Crawl Control: The Cookie Conundrum', you play as a beleaguered website administrator who must defend their forgejo instance from relentless web crawlers threatening to take down your site. As proxy wars break out in the digital realm, you will navigate a series of high-stakes decisions, balancing user experience against security measures. Your choices will shape the infrastructure of your virtual fortress – from cookies, redirects, and Javascript traps, all while dodging snooping AIs and hormone-driven hackers. Can you outsmart the virtual vultures before they bring your hard work crashing down?

';\n }\n # rest of your nginx config\n\n```\n\n```\nlocation / {\n # needed to still allow git clone from http/https URLs\n if ($http_user_agent ~* \"git/|git-lfs/\") {\n set $bypass_cookie 1;\n }\n # If we see the expected cookie; we could also bypass the blocker page\n if ($cookie_Yogsototh_opens_the_door = \"1\") {\n set $bypass_cookie 1;\n }\n # Redirect to 418 if neither condition is met\n if ($bypass_cookie != 1) {\n add_header Content-Type text/html always;\n return 418 '';\n }\n # rest of your nginx config\n\n```\nlocation / {\n # needed to still allow git clone from http/https URLs\n if ($http_user_agent ~* \"git/|git-lfs/\") {\n set $bypass_cookie 1;\n }\n # If we see the expected cookie; we could also bypass the blocker page\n if ($cookie_Yogsototh_opens_the_door = \"1\") {\n set $bypass_cookie 1;\n }\n # Redirect to 418 if neither condition is met\n if ($bypass_cookie != 1) {\n add_header Content-Type text/html always;\n return 418 '';\n }\n # rest of your nginx config\n\nPreferably run a string replace from \n```\nYogsototh_opens_the_door\n```\nYogsototh_opens_the_door to your own personal Cookie name.\n\nMain advantage, is that it is almost invisible to the users of my website compartively to other solutions like Anubis.\n\n## More detail\n\nNot so long ago I started to host my code to forgejo. There is a promise that in the future it will support federation and forgejo is the same project that is used for codeberg.\n\nThe only problem I had was one day, I discovered that my entire node was down. At first I didn't investigate and just restarted the node. But soon after a few hours, it was down again. Looking at the reason, clearly thousands of requests that looked at every commit which put too much pressure on the system. Who could be so interested in using the web API to look at every commit instead, of… you know, clone the repository locally and explore it. Quickly, yep, like so many of you, I discovered that tons of crawlers that did not respect the \n```\nrobots.txt\n```\nrobots.txt are crawling my forgejo instance until death ensues.\n\nSo I had no choice, I first used a radical approach and blocked my website entirely except from me. But hey, why having a public forge if not for people to be able to look into it time to time?\n\nI then installed Anubis, but it wasn't really for me. It is way too heavy for my needs, not as easy as I would have hoped to configure and install.\n\nThen I saw this article You don't need anubis on lobste.rs using a simple configuration in caddy that should block these pesky crawlers. I made some adjustments to adapt it to nginx. For now, this is working perfectly well, my users are just redirected once, without really noticing it. And they could use forgejo as they could before. And this puts the crawlers away.\n\nThe strategy is pretty basic; in fact, a lot less advanced than the strategy adopted by Anubis. For every access of my website, I just check if the user has a specific cookie set. If not, I redirect the user to a 418 HTML page containing some js code to execute that set this specific cookie and reload the page.\n\nThat's it.\n\nI also tried to return a 302 and add a cookie from the response without javascript, but the crawlers are immune to that second strategy. Unfortunately this means, my website could only be seen if you enable javascript in your browser. I feel this is acceptable. I guess, someday this very basic protection will not be enough and my forgejo instance will break again, and I will be forced to use more advanced system like Anubis or perhaps even iocaine.\n\nI hope this could be helpful, because, I recently saw many discussions on that subject where people were not totally happy to use Anubis, while at least for me, this quick dirty fix does the trick. And I am fully aware that this would be very easy to bypass. But for now, I think the volume is more important than the quality for these crawlers and it may take a while for them to need to adapt. Also, by publishing this, I know if too many people use the same trick, quickly, these crawlers will adapt.","prompt_model":"gpt-4o-mini","image_prompt_model":"replicate:black-forest-labs/flux-schnell","image_prompt_name":"GenerateGameArt-v1","image_prompt_text":"Digital fortress, beleaguered website administrator, web crawlers in relentless pursuit, high-stakes decision-making, balancing user experience and security, proxy wars in the digital realm, cookie defense mechanisms, redirects and Javascript traps, dodging snooping AIs, hormone-driven hackers, virtual vultures circling, tension-filled gameplay, moody ambient lighting, vibrant orange (#FF5733) accents against dark interface, cyberpunk aesthetic, rendered in 4k resolution, inspired by surrealism and digital art, medium: digital painting, artist influence: Syd Mead, focus on dynamic composition and detailed textures, intense atmosphere, interactive fiction elements, engaging narrative storytelling.","image_data":null,"music_prompt_text":null,"music_prompt_seed_image":null,"private":false,"createdAt":"2026-02-08T07:36:13.535Z","updatedAt":"2026-02-08T07:45:05.884Z","UserId":null}, { parent_id: window.ia.params.parent_id, chat_id: window.ia.params.chat_id, action: window.ia.params.action, }); window.ia.handleRadio(45592); window.history.pushState({}, document.title, window.location.pathname); });