00:07:05 Someone hasn't given this a page even though it was archived by archive team (https://wiki.archiveteam.org/index.php/Pixie_Hollow_Online)
00:09:03 A lot of smaller projects don't get their own page.
00:09:26 Well it was from a big project
00:09:33 Also, first time I hear of that.
00:09:48 Basically it was in a big project and went under everyone's noses
00:10:16 [citation needed]
00:10:20 It was in the Disney cdn go and dolimg archive project
00:10:50 Disney CDN Name is [Go] [Dolimg]
00:11:11 Right, so it wasn't its own project, which would explain why there isn't a page with that name.
00:11:14 Do you have a link?
00:11:35 I'll link it
00:11:43 Oh, I found it, that was in 2013, well before my time.
00:11:46 https://wiki.archiveteam.org/index.php/Games
00:13:42 Can't find anything relevant in IRC logs regarding Disney or dolimg.
00:15:30 I can't find it
00:15:37 I think I lost the link
00:16:44 Here are the cdn urls to prove my point that archive team archived items from the cdn (http://dolimg-60de6c82-be11-98e1-4d6c-c65a234eee95.disney.io/) (http://go-60de6c82-be11-98e1-4d6c-c65a234eee95.disney.io/)
00:17:40 We archive a lot of random things, especially through the URLs project and ArchiveBot. That doesn't mean there was a project for it.
00:17:58 Huh weird I was told it was
00:18:13 Who told you that?
00:18:26 And what did they tell you exactly?
00:18:34 It's been months since it happened I don't remember but they had a purple name
00:19:13 * fireonlive collates the list of purple users
00:19:17 Maybe in your client, users don't have colours on IRC.
00:19:31 Oh
00:19:38 But I see now that you requested those sites specifically in July, right.
00:20:11 Yes, pokechu22 listed those things and ran them through ArchiveBot.
00:20:18 Yes
00:20:26 We don't usually document such small projects.
00:20:50 But it was a large cdn and you guys go nuts for cdns
00:21:02 It was 250-ish GiB?
00:21:08 That's tiny.
:-)
00:21:13 More than 250gb
00:21:18 Not ish
00:21:20 JAA: does Knowledge Adventure CDN have a wiki page? :D
00:21:36 Maybe 300, and I'm going by the figures pokechu22 mentioned at the time for the actual bucket size.
00:21:40 Yeah I thought it wasn't archived someone told me they weren't able to
00:22:00 Reply to nicolas17
00:22:19 nicolas17: Nope, case in point :-)
00:23:08 JAA It was more than 300gb 325gb to my knowledge
00:23:12 GhostyTongue: We archive a couple dozen terabytes per day usually.
00:23:21 Ain't nobody got time for documenting all of that.
00:23:21 Ok I know
00:23:50 Also someone sent me this image i think we need to archive this https://gcdnb.pbrd.co/images/pEH7wdRNi6mR.jpg?o=1
00:24:08 yo is that even supposed to be public?
00:24:16 Idk
00:24:27 that looks like it could be a compromised account
00:24:32 → #codearchiver
00:24:34 https://gitlab.wdi.disney.com/ asks for employee login
00:24:52 Ah yeah, nevermind then.
00:25:05 Oh OK bc I didn't check the url
00:25:13 I should have checked lmao
00:25:21 Nothing public on it.
00:25:21 "if your issue requires urgent and immediate attention, contact the WDI SRE team via PagerDuty email" oh fun, a public email address to wake up on-call staff!
00:25:40 Lmao
00:26:04 send your best memes
00:27:32 Same dude who sent me that gitlab crap sent me https://media.discordapp.net/attachments/1144776785024790578/1185312074658758706/image.png?ex=658f26e4&is=657cb1e4&hm=0bb1023b6b853bb8cfcead952efece4c4169616e0bd1b8a10b36fc075f4a2465&
00:27:44 yikes
00:28:27 Yeah I think it's bad I wasn't even involved whatever he did but he claimed to have access to wonderland, gitlab, jira its wild what the dude claims
04:26:53 Hello, I want to inform you guys about a dead satire anime news site called anime maru, I recently found out its domain has expired, and there appears to be no news pertaining to this on its social media, so I want to bring this to attention in hopes of archiving whatever remnant of it that exists,
04:27:20 Twitter: https://twitter.com/AnimeMaru
04:27:20 nitter: https://nitter.net/AnimeMaru
04:27:28 cas: how long ago did it expire? can you find what IP address it had before it expired?
04:28:09 I don't know how to do that, sorry. But this is the site in question https://www.animemaru.com/
04:28:38 They also have a facebook https://www.facebook.com/p/Anime-Maru-100063523659742/ and patreon https://www.patreon.com/AnimeMaru afaik
04:30:44 ugh seems that expired in february
04:31:24 Doesn't look like it to me.
04:31:28 or hm
04:31:33 It expires in February.
04:32:02 https://completedns.com/dns-history/?domain=animemaru.com says dns10.parkpage.foundationapi.com was added as a nameserver in feb 2023
04:32:16 It resolved to as of last week, but that server doesn't respond.
04:33:28 Oh
04:33:37 It has four NS, and they resolve to two different IPs. Beautiful.
04:34:21 lovely
04:34:26 The one above from ns{1,2}.exonhost.com and from dns1{0,1}.parkpage.foundationapi.com.
04:35:54 maybe WBM can say when the site was last... usable
04:37:07 hmmm so what's the status? Are things bad?
04:38:07 According to the WBM, it was up in April, and I'm seeing at that time in DNSHistory, which is dead.
04:39:57 according to the WBM, it was dead in May, so there's nothing to archive here, the ship has sailed
04:40:32 ofc there's still the facebook/twitter/patreon but
04:40:56 twitter still up: https://twitter.com/animemaru
04:40:57 nitter: https://nitter.net/animemaru
04:41:20 sucks to hear, that's unfortunate
04:41:53 ig the remaining social medias can be archived, perhaps?
04:42:19 facebook unfortunately no, twitter after a fashion
04:42:37 added twitter/@AnimeMaru to my todo
04:42:59 ty
04:43:30 =]
04:44:34 idk what we normally do about patreon, but it looks like there aren't any posts anyway, so maybe just !ao?
04:45:06 sounds reasonable
04:45:15 nice thanks fireonlive
04:45:41 :) welcome
04:49:49 it's a bummer that I was too late in notifying AT about its death, but it is what it is.
04:57:27 Sudden shutdowns without announcements are almost impossible to catch, unfortunately.
05:01:52 yeah I imagine it's difficult to catch such cases on time, if ever
05:07:42 btw what's WBM?
05:07:54 wayback machine - https://web.archive.org
05:11:51 ahhh ok cool
05:16:13 Megame edited Deathwatch (+256, https://www.peepsandcompany.com - Jan 2): https://wiki.archiveteam.org/?diff=51391&oldid=51389
05:26:37 if anyone feels like queuing all (official) youtube channels for all of https://en.wikipedia.org/wiki/Telewizja_Polska#TV_channels in #down-the-tube , please feel free to
05:26:43 else i can have a look later today
05:26:49 AB, DPoS, IA, SPN, WBM; any other acronyms we commonly use?
05:27:03 FOS and the new one
05:27:16 Not used much anymore, but true.
05:27:20 does // in #// count?
05:27:25 sure :3
05:27:35 WARC, arguably
05:27:41 hah yeah
05:27:44 CDX
05:28:00 AT :p
05:28:12 !!
05:28:26 SWH/SCN in #gitgud
05:28:35 BS/OT
05:28:50 yeah maybe like #-bs , etc., that we often use
05:28:59 it could confuse new people
05:29:42 Is CDX actually an acronym? I never figured out what it's supposed to mean.
05:30:16 gitgud has AFN for add forge now as well
05:30:27 well, not really gitgud
05:30:31 but ye :p
05:30:42 (codearchiver/SWH)
05:31:20 * fireonlive um actuallys everyone and introduces the word 'initialism'
05:32:03 "Traditionally, an index for a web archive (WARC or ARC) file has been called a CDX file, probably from Capture/Crawl inDeX (CDX)" ~ https://pywb.readthedocs.io/en/latest/manual/indexing.html
05:32:15 though, that's from webrecorder
05:32:16 so uh
05:32:23 back up a dumptruck full of salt
05:32:24 * JAA slaps fireonlive around a bit with a large trout
05:32:24 ;)
05:32:28 It had to be done.
05:32:32 :D
05:32:34 Yeah, good enough.
05:32:36 they're equivalent over irc, since we pronounce all of them '...' :3
05:32:56 :3
05:34:14 hmm, there's https://loc.gov/preservation/digital/formats//fdd/fdd000582.shtml and https://loc.gov/preservation/digital/formats//fdd/browse_list.shtml but that's a different CDX
05:36:40 * fireonlive asks chatgpt
05:36:44 fireonlive: Actually, if it's 'inDeX', it wouldn't be an initialism. :-P
05:37:06 i more meant the other ones :D
05:37:17 I know, I just had to.
05:37:20 hmm, https://archive-access.sourceforge.net/projects/wayback/apidocs/org/archive/wayback/resourceindex/cdx/format/RedirectURLCDXField.html has no coverage
05:37:21 :3
05:37:44 there's a whole "um, actually" game show, too!
05:41:20 hm, i really can't find a canonical expansion of 'cdx' anywhere!
05:41:46 i asked the main wayback machine guy
05:41:54 nice
05:42:53 thanks arkiver :3
05:42:59 would be cool to get that documented
05:46:57 Even stuff like https://www.cs.odu.edu/~salam/archive-profiling-tpdl-2015-camera-ready.pdf and https://blogs.loc.gov/thesignal/2022/04/candidates-campaigns-and-cdx-files/ / https://blogs.loc.gov/thesignal/2019/01/the-library-of-congress-web-archives-dipping-a-toe-in-a-lake-of-data/ don't seem to expand it
05:48:19 JustAnotherArchivist created Archiveteam:Acronyms (+1104, Created page with "This is a list of topical…): https://wiki.archiveteam.org/?title=Archiveteam%3AAcronyms
05:48:34 quite. neither does the original documentation: http://web.archive.org/web/20031226073353/http://www.archive.org/web/researcher/cdx_file_format.php http://web.archive.org/web/20040815100631/http://www.archive.org/web/researcher/cdx_legend.php
05:48:36 https://www.bird.co/ , electric scooter rental, filed for chapter 11 - potential for it to disappear.
05:49:21 no longer the word
05:51:37 https://archive.org/details/WaybackMachineSetup20020126/page/n11/mode/2up?q=CDX doesn't explain it either
05:52:22 a scanned printed README from 2002, neat :3
05:53:05 oh my it's all perl
05:53:41 > describing the content of the archive
05:53:56 So could be Content inDeX, too.
05:54:57 https://archive.org/details/the-past-web-exploring-web-archives-preprint/page/n111/mode/2up?q=CDX claims Capture Index
05:57:08 interesting, but authority unclear
05:58:14 Yeah
05:59:23 oh neat
05:59:45 oh hey, *the* brewster left a review on that item
06:01:50 https://archive.org/details/@brewster?tab=reviews "digitize these" ~ brewster
06:01:59 :)
06:02:22 oh, https://github.com/internetarchive/cdx-summary also uses "capture index"
06:02:35 ooh
06:03:26 as does https://commoncrawl.org/blog/announcing-the-common-crawl-index by ilya kreymer, who apparently ought to know
06:05:06 >Ilya Kreymer is Lead Software Engineer at Webrecorder Software.
06:05:07 hmmmm
06:05:09 ;)
06:05:13 i know
06:05:50 but he _did_ work at ia on the wayback machine, so...
06:06:15 hmmm
06:06:23 TIL
06:09:04 then again, pywb is a webrecorder project, so why is its documentation so diffident on the subject >:?
06:09:21 >:(
06:09:51 i suppose kreymer (or indeed other ex-iaers) need not have looked at that section personally
06:14:26 Petchea edited Deathwatch (+278, /* 2023 */ China Judgments Online (court…): https://wiki.archiveteam.org/?diff=51393&oldid=51391
06:14:27 Nulldata edited Deathwatch (+215, Added Today's Plan): https://wiki.archiveteam.org/?diff=51394&oldid=51393
06:14:28 JustAnotherArchivist changed the user rights of User:Nulldata
06:15:06 🥳
06:15:07 congrats
06:15:26 ClubBBC TV edited List of lost online videos/list (+409, added alkinboy7500 hd cuz we need to restore…): https://wiki.archiveteam.org/?diff=51395&oldid=49247
06:18:40 https://tvpworld.com and other redirect to https://www.tvp.pl/ now. Any way to still grab them?
06:18:56 JAA, ^
06:20:13 ah :/
06:22:25 so it's happening right now
06:23:29 TVP Info, TVP3, TVP World and TVP Parlament were closed so far
06:23:37 according to wikipedia
06:23:54 all redirect to tvp.pl
06:24:18 crt.sh search for tvpworld.com only shows www.tvpworld.com sadly
06:24:25 Yes, grab everything we can.
06:39:03 https://www.gearrice.com/update/teraleak-thousands-of-old-iphone-games-resurface/
06:39:18 "The manufacturer closed the site in February of the following year, then in March the cache was uploaded by the Wayback Machine Archive Team"
06:39:23 we have a new name
06:41:59 lol
06:46:48 *facepalm*
06:55:02 no no they were uploaded by the notorious archive hacker James Scott
06:55:33 xP
06:57:39 RIP h2ibot
06:57:45 RIP
06:58:52 f
06:58:58 do we have a way to archive an entire domain+outlinks starting at *multiple* input pages on the domain at different levels of dirs?
(IIRC AB !a and !a < aren't suitable)
06:59:39 I was thinking of saving all of the https://www2u.biglobe.ne.jp/ (no index) pages I can find on search engines
06:59:50 nope!
07:02:38 i was just thinking about this use case again the other day. afaict best you could do is sans outlinks using grab-site with `--span-hosts` and `--domains`, and then extract outlinks from the results
07:41:53 I'm guessing the gearrice.com article might have been scraped from somewhere else (or possibly AI-generated) because there's the sentence "To find your way around, a search interface has been put online at this address, which links to applications recorded in the Wayback Machine." but no actual link anywhere I can see.
07:45:52 Also, when viewing the source of the page, there's "" after the text of the article.
07:46:25 lol
07:46:38 Yeah, sounds about right.
07:48:01 ahh lol
10:12:11 "End of an era for electronics giant Toshiba" https://www.bbc.com/news/business-67757333 https://news.ycombinator.com/item?id=38706547
10:12:21 o7
10:15:41 "Bird, once valued at $2.5B, just filed for bankruptcy" https://www.businessinsider.com/bird-silicon-valley-electric-scooter-startup-files-for-bankruptcy-2023-12 https://news.ycombinator.com/item?id=38711808
10:21:30 *poof* another 2.5 billion of value went up in nothing
10:22:12 coulda been better used at IA
10:26:50 brewster's billions
10:27:49 :D
10:33:00 someone on Wikipedia apparently tagged us as "Organizations disestablished in 2023" https://en.wikipedia.org/wiki/Archive_Team
10:34:41 wtf
10:36:20 (if he's the cofounder btw, who are the other cofounders?)
10:37:06 https://en.m.wikipedia.org/wiki/Special:Contributions/
10:37:26 their other changes already reverted
10:37:26 same ip editor has done the same thing to other articles; just revert it
10:37:32 ninja'd
10:37:33 yeah
10:38:01 weird
10:38:17 "fireonlive" doesn't have a wikipedia account sadly
10:39:17 Reverted
10:39:33 =]
10:39:43 tech234a++
10:39:43 -eggdrop- [karma] 'tech234a' now has 1 karma!
14:09:49 https://www.reddit.com/r/homelab/comments/18mlvjn/psa_asrock_racks_motherboard_pages_are_outdated/
14:10:08 bios archive of asrock server mainboards, might be useful to pull
14:13:01 Pulling a dumb dump into my own server
17:34:45 https://hempuli.itch.io/mobile-suit-baba is free for a limited time (~6 days). Is there any way to archive it?
17:38:08 https://web.archive.org/web/20231221173726/https://itchio-mirror.cb031a832f44726753d6267436f3b414.r2.cloudflarestorage.com/upload2/game/2434854/9356105?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=3edfcce40115d057d0b5606758e7e9ee%2F20231221%2Fauto%2Fs3%2Faws4_request&X-Amz-Date=20231221T173637Z&X-Amz-Expires=60&X-Amz-SignedHeaders=host&X-Amz-Signature=cac6efabe2e68295ae39ccf8c095c2c22deaba2c92ef66700b720b9d960ebbee
18:06:32 Thanks. (the first time I tried SPN it got a 403 for some reason, guess I'll try it again to get the other file)
18:11:03 Worked this time. https://web.archive.org/web/20231221180933/https://itchio-mirror.cb031a832f44726753d6267436f3b414.r2.cloudflarestorage.com/upload2/game/2434854/9356047?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=3edfcce40115d057d0b5606758e7e9ee%2F20231221%2Fauto%2Fs3%2Faws4_request&X-Amz-Date=20231221T180917Z&X-Amz-Expires=60&X-Amz-SignedHeaders=host&X-Amz-Signature=a45e6d7cb97e1a88dfbe65e25ec7bb35e8e13b5befdd5be77664a0b9a202ac90
18:23:06 Regarding Co-Founding
18:23:25 There were a set of us, I was just the person who had the idea, and people had all sorts of ideas
18:26:45 Like, I'd count chronomex
18:27:34 (Checking mail)
18:31:48 Hmm.
18:31:53 My mail seems to go back to 2009.
18:34:18 Oh, I switched to Gmail in June of 2009.
18:56:26 I'll put it on the list... find my 2009 mail, find where I talked with Archive Team members, get Co-Founders listed.
19:08:04 Motherfuckers, I am re-installing PINE
19:09:40 hell yeah
19:24:37 https://mastodon.xyz/@johl/111618899554454932 <- "Wikimedia Russia has been dissolved" (the org, not the website)
20:17:39 Can someone please throw https://medium.com/@hyperloop_one into AB? Hyperloop One is shutting down and is scrubbing socials
20:28:36 Medium is a bit of a pain, but I can try
20:29:32 specifically https://medium.com/@hyperloop_one/archive doesn't exist... but https://medium.com/hyperloop-one/archive does so that's probably fine
20:30:16 scribe.rip might do in a pinch
20:31:15 hmm, 403s... does medium also need a special UA?
20:31:40 ah, seems like it's usually done with -u firefox
20:41:19 oh awesome (re: PINE/finding co-founders) :)
20:41:37 -+rss- Beeper – Moving Forward: https://blog.beeper.com/p/beeper-moving-forward https://news.ycombinator.com/item?id=38722246
20:41:42 looks like Beeper has given up
20:41:42 Thanks!
20:42:01 (on iMessage, not completely!)
20:45:51 they released their bridge as open source
20:45:58 maybe gitgud that stuff?
20:48:27 pushed the imessage repo to that chan
20:48:35 well, relayed