01:15:37 It looks like Read the Docs recently started blocking IA/SPN: https://web.archive.org/web/20241001011425/https://archivebot.readthedocs.io/
01:17:49 bleh
01:18:00 nicolas17: Thanks, reminds me of "snoozing" my phone alarm by setting another alarm then going back to sleep
01:18:03 JAA: Ah
01:18:27 Was ReadTheDocs one of those sites that was getting overloaded by Anthropic and OpenAI?
01:18:36 thuban: grabbed F7410ZCS1AXHB myself now
01:19:05 Yeah: https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/
02:15:17 Hiya. Around a year or two ago, a few of you archived Knowledge Adventure/JumpStart's buckets. As the person who has found the buckets, I just want to say thank you for your service and preserving a part of history. Friends of mine have made amazing discoveries thanks to your help.
02:15:17 I have one thing to ask: Is mirroring an option for Wayback Machine to do? Basically, there are newsletters in media.jumpstart.com/ka/images/newsletter/ that show that the ka folder was part of knowledgeadventure.com. For example, knowledgeadventure.com/images/newsletter/2006-12/reminder_files/main1.jpg leads to a 404, but
02:15:18 media.jumpstart.com/ka/images/newsletter/2006-12/reminder_files/main1.jpg exists.
02:23:59 Hi, I'm the one who ran that one, glad to hear it's useful. :-)
02:24:28 If you mean 'could the WBM pretend that media.jumpstart.com/ka/* snapshots were once at knowledgeadventure.com/*?', then no, that's not possible. I suppose you could simulate it on your side with a userscript or similar though.
02:25:14 Do you know if it's feasable for Wayback Machine to copy files from there to knowledgeadventure.com?
02:26:14 To the actual knowledgeadventure.com site, sure. To the WBM's copy of knowledgeadventure.com, no.
02:27:04 The WBM only displays things as they were fetched from the URL. And these were not fetched from knowledgeadventure.com.
02:27:20 (Copying to the actual site would only work if you're the owner of the domain, obviously.)
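[Editor's note] The client-side workaround suggested at 02:24:28 could look roughly like the sketch below. It is only an illustration: it assumes that every path under knowledgeadventure.com maps to the same path under media.jumpstart.com/ka/ (the log only shows this for the newsletter images), that a partial Wayback Machine timestamp such as "2023" is enough to reach a nearby snapshot, and the function name and host list are made up for the example. It is written in Python for brevity; a real userscript would apply the same rewrite in the browser.

```python
from urllib.parse import urlsplit

# Assumption: these are the hostnames whose old URLs now return 404.
OLD_HOSTS = {"knowledgeadventure.com", "www.knowledgeadventure.com"}
# From the log: the same files were archived under media.jumpstart.com/ka/.
ARCHIVED_PREFIX = "http://media.jumpstart.com/ka"
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url_for(old_url, timestamp="2023"):
    """Map an old knowledgeadventure.com URL to a Wayback Machine URL
    for the equivalent media.jumpstart.com/ka/ path.

    Returns None if the URL is not on one of the old hosts.
    The default timestamp is an assumption, not taken from the log.
    """
    # Accept scheme-less URLs as they were written in the chat.
    if "://" not in old_url:
        old_url = "http://" + old_url
    parts = urlsplit(old_url)
    if parts.hostname not in OLD_HOSTS:
        return None
    archived = ARCHIVED_PREFIX + parts.path
    if parts.query:
        archived += "?" + parts.query
    return f"{WAYBACK_PREFIX}/{timestamp}/{archived}"


if __name__ == "__main__":
    print(wayback_url_for(
        "knowledgeadventure.com/images/newsletter/2006-12/reminder_files/main1.jpg"
    ))
```

This only changes where the reader's own browser or script looks; it does not alter what the WBM lists under knowledgeadventure.com.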
02:28:50 Weren't the content fetched from the s3 buckets and then distributed to media.jumpstart.com, media.schoolofdragons.com, etc? (knowledgeadventure shut down on June 30th, hence the need to preserve the bucket a year or two ago.
02:28:55 Ah yeah, this was the project where I had to download everything 4 times because it was available under different URLs.
02:29:14 oh... rip... sorry.
02:29:17 I archived it under those four different URLs, yeah.
02:29:25 So it should appear under all four in the WBM, too.
02:30:00 E.g. https://web.archive.org/web/20230624035410/http://media1.knowledgeadventure.com/ka/images/newsletter/2006-12/reminder_files/main1.jpg
02:30:16 gotcha, but not for knowledgeadventure.com. oof. I understand now.
02:30:42 If they were still available under the knowledgeadventure.com domain at the time, I wasn't aware of it, and it wasn't grabbed that way. Sadly, we haven't acquired a time machine yet.
02:54:05 oh lawd, MathBlaster... I remember that...
04:46:57 nicolas17: Are you able to access https://www.tupperware.com.co/ ? It's the only Tupperware site I wasn't able to grab with AB (or otherwise) due to geoblocking. Maybe Colombians get secret access to products in advance or something.
04:47:19 works for me from argentina
04:47:35 Nice, can you grab-site it?
04:52:50 Underground Colombian Tupperware Parties
04:58:03 nicolas17: Is this the same site as the main one? https://unete.tupperware.com.co/
04:58:16 funnily that is not restricted
04:58:31 no
04:59:12 "www" seems to be for end customers and "unete" (join) seems to be for resellers?
04:59:22 Huh
04:59:50 I've thrown that one into AB.