12:13:11 not sure if it counts as a loop but I'm seeing some significant crawling of some vietnamese lookalike SEO spam sites for some online casino, all using randomly generated subdomains. examples https://hgy22.churchofisolation.net/ https://1qg24.therealestateintucson.com/ https://on3q3l.mikesapartment.net/ https://o5ps70.pamall.net/ https://8j64.kerstversiering.net/
12:13:40 viewing source for any of those shows a bunch of links to other randomly generated subdomains at each domain, that change each time the page is loaded
12:14:10 Great pickup. arkiver can we get those filtered out please?
12:14:45 https://xrp71.reptilekeepers.net/ is another
12:15:00 cheers
12:19:00 Only hard part is that they have nothing in common except for the .net so will be hard to filter out :(
12:19:44 yeah this had been going on since yesterday
12:19:51 annoying casino website spam loop
12:19:58 has*
12:20:44 they don't even have .net in common
12:20:47 some have .info
12:20:50 or .com even i elieve
12:20:53 believe*
12:21:53 i have a plan, but it'll require some work
12:38:35 I've been collecting quite a few forums recently, would there be interest in adding "recent posts" pages and rss feeds from those to this project?
13:47:29 datechnoman: F for you my friend
13:48:34 Hetzner was pissed with me because I'm a "Brand new customer" to them and I generated an abuse notice and several Spamhaus listings within a week of account ownership
14:02:49 !ig 8of2baxzumf11k98412mrflil /(showthread\.php\?.*(&p=\d+|&mode=(threaded|hybrid))|search\.php)
14:02:57 wrong chat apologies
14:31:20 Don't run this project over wifi lol
14:32:44 "No HTTP response received from tracker" yeah, because I'm saturating the life out of this line
15:32:07 lol those spam domains are great. churchofisolation, reptilekeepers..
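Editor's note: the ignore pattern posted at 14:02:49 can be checked against sample URLs with a short script. This is only a sketch. It assumes the pattern is applied as an unanchored regex search against the full URL, and the `forum.example.com` URLs and the `is_ignored` helper are hypothetical, made up for illustration.

```python
import re

# Pattern copied from the 14:02:49 message, without the command name and job id.
IGNORE_PATTERN = re.compile(
    r"(showthread\.php\?.*(&p=\d+|&mode=(threaded|hybrid))|search\.php)"
)


def is_ignored(url: str) -> bool:
    """Return True if the URL matches the pattern (assumption: unanchored search)."""
    return IGNORE_PATTERN.search(url) is not None


# Hypothetical forum URLs, not taken from the log.
samples = [
    "https://forum.example.com/showthread.php?t=123",
    "https://forum.example.com/showthread.php?t=123&p=4567",
    "https://forum.example.com/showthread.php?t=123&mode=threaded",
    "https://forum.example.com/search.php?do=getnew",
]

for url in samples:
    print("ignored" if is_ignored(url) else "kept   ", url)
```

Under that assumption the plain thread URL is kept, while the per-post (`&p=`), threaded/hybrid view, and search URLs are skipped. Note that the pattern requires a leading `&`, so a URL where `p=` is the first query parameter would not match.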
17:49:50 fix is coming up
20:59:11 nyany I think the spamhaus listings ended up getting me locked for the month as I've never had an issue with any other abuse notices
20:59:19 Hopefully they don't do the same thing to you!
21:50:44 datechnoman: They didn't; I was able to remove my listings
21:51:18 Basically explained that the IP addresses were running a copy of the "URLs" ArchiveTeam project, linked to the wiki
21:51:33 Spamhaus were happy to remove it
21:56:46 Nice. Sounds like you dodged a bullet
22:11:13 Might also help that I'm associated with DroneBL
22:11:14 lol
23:01:23 an update is in
23:01:26 let's see how it goes
23:08:13 if this works that'd be great
23:08:25 then we also have a handy new method of removing difficult stuff in the future
23:09:26 ohhh that is great to hear :D
23:11:18 :)
23:11:23 Fingers crossed!
23:11:27 yeah!
23:11:58 if it works, I'll also start blocking the annoying loop from some days ago with this
23:12:18 the solution implemented back then has a higher chance of losing 'good' URLs than the new method
23:16:32 Code improvements are always welcome :)
23:16:53 well old loop is gone
23:16:55 new one is in
23:17:02 this time with a ton of /news/ URLs :P
23:17:28 hehehe. Nom nom nom
23:17:42 So backlog will increase a lot again :P
23:17:53 not a lot I suspect
23:18:12 I think this is a left over of the loop over a few days (a week?) ago that was not completely fixed
23:18:42 i don't see any signs anymore of the one that was supposedly just fixed!
23:19:23 blegh IA is down. can't check the CDX
23:19:31 (power issue)
23:19:35 :( always when you need it haha
23:19:44 yep
23:19:46 Joys of not running out of DC's
23:19:58 Put cost is a killer
23:21:23 hopefully fixed in a few minutes