Is returning a 403 based on the user agent worth a blog post? Also, can't Bytespider just change their user agent to Byte-Spider? Or, just make their user agent a random string? It will be a forever arms race and require constant code updates to keep chasing that bot by user agent. You're probably better off whitelisting the known user agents and blocking everything else.
Also, does it really require a specific "gem"? This is HTTP request filtering, the router (as in the real router, like the metal box with network cables) can probably do it by itself these days.
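For what it's worth, here is a rough sketch of the allowlist idea as a plain Rack-style middleware with no extra gem. The class name `UserAgentAllowlist` and the patterns in `ALLOWED` are made up for illustration; a real list would depend on the traffic you actually want to serve.

```ruby
# Hypothetical Rack-style middleware: pass requests whose User-Agent matches
# an allowlist and answer 403 for everything else.
class UserAgentAllowlist
  # Assumed example patterns, not a recommended production list.
  ALLOWED = [/Mozilla/i, /Googlebot/i, /bingbot/i].freeze

  def initialize(app, allowed: ALLOWED)
    @app = app
    @allowed = allowed
  end

  def call(env)
    # Rack exposes the request header as HTTP_USER_AGENT; a missing header becomes "".
    user_agent = env["HTTP_USER_AGENT"].to_s

    if @allowed.any? { |pattern| pattern.match?(user_agent) }
      @app.call(env)
    else
      [403, { "content-type" => "text/plain" }, ["Forbidden\n"]]
    end
  end
end
```

In a Rack app this would be wired in with `use UserAgentAllowlist` in `config.ru`. Keep in mind that many crawlers send browser-like strings, so patterns as loose as `/Mozilla/` will still let some bots through; the allowlist only helps as much as its patterns are strict.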
It might not be, but I couldn't find much about the topic, so I figured I'd write it up and share. And you're right that this may be a bit of whack-a-mole, but for now I've cut my bandwidth down, which means I may be able to downgrade my Cloudinary plan to a lower tier. That's a big win for me, since Cloudinary accounts for something like 20-30% of my total operating cost.