01:15:37 <JAA> It looks like Read the Docs recently started blocking IA/SPN: https://web.archive.org/web/20241001011425/https://archivebot.readthedocs.io/
01:17:49 <nicolas17> bleh
01:18:00 <OrIdow6> nicolas17: Thanks, reminds me of "snoozing" my phone alarm by setting another alarm then going back to sleep
01:18:03 <OrIdow6> JAA: Ah
01:18:27 <OrIdow6> Was ReadTheDocs one of those sites that was getting overloaded by Anthropic and OpenAI?
01:18:36 <nicolas17> thuban: grabbed F7410ZCS1AXHB myself now
01:19:05 <JAA> Yeah: https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/
02:15:17 <BrokenTV> Hiya. Around a year or two ago, a few of you archived Knowledge Adventure/JumpStart's buckets. As the person who has found the buckets, I just want to say thank you for your service and preserving a part of history. Friends of mine have made amazing discoveries thanks to your help.
02:15:17 <BrokenTV> I have one thing to ask: Is mirroring an option for Wayback Machine to do? Basically, there are newsletters in media.jumpstart.com/ka/images/newsletter/ that show that the ka folder was part of knowledgeadventure.com. For example, knowledgeadventure.com/images/newsletter/2006-12/reminder_files/main1.jpg leads to a 404, but
02:15:18 <BrokenTV> media.jumpstart.com/ka/images/newsletter/2006-12/reminder_files/main1.jpg exists.
02:23:59 <JAA> Hi, I'm the one who ran that one, glad to hear it's useful. :-)
02:24:28 <JAA> If you mean 'could the WBM pretend that media.jumpstart.com/ka/* snapshots were once at knowledgeadventure.com/*?', then no, that's not possible. I suppose you could simulate it on your side with a userscript or similar though.
02:25:14 <BrokenTV> Do you know if it's feasible for Wayback Machine to copy files from there to knowledgeadventure.com?
02:26:14 <JAA> To the actual knowledgeadventure.com site, sure. To the WBM's copy of knowledgeadventure.com, no.
02:27:04 <JAA> The WBM only displays things as they were fetched from the URL. And these were not fetched from knowledgeadventure.com.
02:27:20 <JAA> (Copying to the actual site would only work if you're the owner of the domain, obviously.)
02:28:50 <BrokenTV> Wasn't the content fetched from the S3 buckets and then distributed to media.jumpstart.com, media.schoolofdragons.com, etc? (knowledgeadventure shut down on June 30th, hence the need to preserve the bucket a year or two ago.)
02:28:55 <JAA> Ah yeah, this was the project where I had to download everything 4 times because it was available under different URLs.
02:29:14 <BrokenTV> oh... rip... sorry.
02:29:17 <JAA> I archived it under those four different URLs, yeah.
02:29:25 <JAA> So it should appear under all four in the WBM, too.
02:30:00 <JAA> E.g. https://web.archive.org/web/20230624035410/http://media1.knowledgeadventure.com/ka/images/newsletter/2006-12/reminder_files/main1.jpg
02:30:16 <BrokenTV> gotcha, but not for knowledgeadventure.com. oof. I understand now.
02:30:42 <JAA> If they were still available under the knowledgeadventure.com domain at the time, I wasn't aware of it, and it wasn't grabbed that way. Sadly, we haven't acquired a time machine yet.
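[Editor's sketch] JAA's suggestion above of simulating the mapping on the user's side could look roughly like the following. It only rewrites a knowledgeadventure.com URL into a Wayback Machine URL for the media.jumpstart.com/ka/ copy, using the path correspondence BrokenTV describes. The function name, the "www." host variant, and the partial timestamp "2023" are assumptions for illustration; whether a snapshot exists for any given path is not guaranteed.

```python
from urllib.parse import urlsplit

# Assumption: knowledgeadventure.com/<path> corresponds to
# media.jumpstart.com/ka/<path>, as in the newsletter example from the chat.
# The "www." variant is a guess and may not have existed.
ORIGINAL_HOSTS = {"knowledgeadventure.com", "www.knowledgeadventure.com"}
MIRROR_PREFIX = "http://media.jumpstart.com/ka"
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_mirror_url(url, timestamp="2023"):
    """Return a Wayback Machine URL for the media.jumpstart.com/ka/ copy
    of a knowledgeadventure.com URL, or None for any other host.

    `timestamp` is a hypothetical default; the Wayback Machine redirects
    a partial timestamp to the nearest available snapshot, if one exists.
    """
    # Accept scheme-less input such as "knowledgeadventure.com/images/..."
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    if parts.hostname not in ORIGINAL_HOSTS:
        return None
    mirrored = MIRROR_PREFIX + parts.path
    if parts.query:
        mirrored += "?" + parts.query
    return f"{WAYBACK_PREFIX}/{timestamp}/{mirrored}"


if __name__ == "__main__":
    print(wayback_mirror_url(
        "knowledgeadventure.com/images/newsletter/2006-12/reminder_files/main1.jpg"
    ))
```

The same rewrite could be wired into a userscript or a browser redirect rule; this does not change what the WBM itself shows under knowledgeadventure.com.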
02:54:05 <steering> oh lawd, MathBlaster... I remember that...
04:46:57 <JAA> nicolas17: Are you able to access https://www.tupperware.com.co/ ? It's the only Tupperware site I wasn't able to grab with AB (or otherwise) due to geoblocking. Maybe Colombians get secret access to products in advance or something.
04:47:19 <nicolas17> works for me from argentina
04:47:35 <JAA> Nice, can you grab-site it?
04:52:50 <nulldata> Underground Colombian Tupperware Parties
04:58:03 <that_lurker> nicolas17: Is this the same site as the main one? https://unete.tupperware.com.co/
04:58:16 <that_lurker> funnily that is not restricted
04:58:31 <nicolas17> no
04:59:12 <nicolas17> "www" seems to be for end customers and "unete" (join) seems to be for resellers?
04:59:22 <JAA> Huh
04:59:50 <JAA> I've thrown that one into AB.