00:25:42 rewby - Looks like we are target bound again. I didn't add any additional workers so I guess we filled buneary
00:26:02 :(
01:28:21 probably AB should archive all the .well-known etc stuff too
05:05:05 !a https://transfer.archivete.am/R6JJo/sitemap_urls_january_february_2023.txt
05:05:15 datechnoman: Deduplicating and queuing 254459 items. (for 'https://transfer.archivete.am/R6JJo/sitemap_urls_january_february_2023.txt')
05:05:27 datechnoman: Deduplicated and queued 254459 items. (for 'https://transfer.archivete.am/R6JJo/sitemap_urls_january_february_2023.txt')
05:05:45 More sitemaps to chew on :)
05:46:07 datechnoman: awesome :)
05:46:33 and many of these (except sitemaps) are very low in required resources
05:59:06 yeah they are all pretty small :)
05:59:21 Over the coming week I will be working on more batches including robots.txt files etc
08:38:58 !a https://transfer.archivete.am/J2KuE/goo-gl.2023-02-12-00-17-01.txt
08:44:40 datechnoman: Skipped 27938 invalid URLs: https://transfer.archivete.am/qypPB/goo-gl.2023-02-12-00-17-01.txt.bad-urls.txt (for 'https://transfer.archivete.am/J2KuE/goo-gl.2023-02-12-00-17-01.txt')
08:44:41 datechnoman: Deduplicating and queuing 8801833 items. (for 'https://transfer.archivete.am/J2KuE/goo-gl.2023-02-12-00-17-01.txt')
08:53:51 datechnoman: Deduplicated and queued 8801833 items. (for 'https://transfer.archivete.am/J2KuE/goo-gl.2023-02-12-00-17-01.txt')
11:47:24 oh no, my urls grabber ip has been greylisted by bitninja.. for.. doing a GET on random images once every few minutes? :D
11:47:24 should I try to contest that or do I just roll with it?
11:49:29 some of the user agents of those "bad requests" they blocked look ancient btw, is that intended?
12:00:59 mh, list is two years old, maybe time to update?
12:01:52 Urls is a problematic project that often triggers lists. Bitninja is often tripped here
12:02:31 urls is an "opt-in-only" project due to that, and it has a warning in its description
12:02:33 I am aware and don't care about the ip reputation, more asking from the project point of view if I should ask them to unlist me
12:02:52 or just do nothing
12:19:11 Bitninja are clowns. Unless your hosting provider is complaining, I wouldn't worry about them.
12:20:26 Hetzner used to require a statement for them. Then they would forward the email as "informational" and not require a statement. I honestly haven't seen anything from them in a while so I'm not sure if Hetzner even forwards the complaints anymore
12:32:11 !a https://transfer.archivete.am/MUYL3/telegram.me_urls_processed.txt
12:32:13 datechnoman: Skipped 3 invalid URLs: https://transfer.archivete.am/RoehB/telegram.me_urls_processed.txt.bad-urls.txt (for 'https://transfer.archivete.am/MUYL3/telegram.me_urls_processed.txt')
12:32:14 datechnoman: Deduplicating and queuing 3739 items. (for 'https://transfer.archivete.am/MUYL3/telegram.me_urls_processed.txt')
12:32:15 datechnoman: Deduplicated and queued 3739 items. (for 'https://transfer.archivete.am/MUYL3/telegram.me_urls_processed.txt')
13:02:28 Echoing what Craigle said, bitninja honestly isn't even worth reading past "bitninja" if you see the email
13:02:58 They're absolute clowns who greylist you just for saying hello to the wrong server
13:16:29 thanks :)
22:59:20 !a https://transfer.archivete.am/jRn4X/telegram.me_share_urls.txt
22:59:23 datechnoman: Something went wrong. (for 'https://transfer.archivete.am/jRn4X/telegram.me_share_urls.txt')
22:59:37 oops wrong copy and paste >.<
22:59:41 !a https://transfer.archivete.am/jRn4X/telegram.me_urls.txt
22:59:42 datechnoman: Deduplicating and queuing 15339 items. (for 'https://transfer.archivete.am/jRn4X/telegram.me_urls.txt')
22:59:44 datechnoman: Deduplicated and queued 15339 items. (for 'https://transfer.archivete.am/jRn4X/telegram.me_urls.txt')
23:00:50 Before anyone panics on the name of the file, they are telegram.me share urls, processed to be standard urls for #// consumption