00:38:05 JustAnotherArchivist edited Deathwatch (+201, /* 2024 */ Add TinyLetter): https://wiki.archiveteam.org/?diff=51231&oldid=51222
00:45:06 JustAnotherArchivist edited Deathwatch (+59, /* 2024 */ Add DK Find Out!): https://wiki.archiveteam.org/?diff=51232&oldid=51231
01:22:39 heh I started a mediafire worker 25 hours ago
01:22:52 it completed 7 (seven) items
01:23:20 Yes, there isn't much running through that project.
01:24:03 I'll start telegram and see how much concurrency I can get away with on a single core
02:37:24 Over the past week or so (although I only tried on three days), I've managed to grab all the Tesla Roadster PDFs. The online manual thing still remains to be done though. I'll try, but I'm not sure that'll get anywhere.
03:38:46 FireonLive edited Current Projects (+24, invisibly move imgur to the 'Long-term'…): https://wiki.archiveteam.org/?diff=51233&oldid=51173
03:39:13 whomever decided that things on mediawiki must start with a capital letter
03:39:16 🖕right here
03:40:38 WhatsWrongWithCamelCasing?
03:41:46 some things, just, like, start with a lower case letter, man
03:42:37 There's a way of changing that with displaytitle but I don't know whether that's enabled on the archiveteam wiki
03:42:40 (also to all the developers using text-transform on my text i see you, and you're the first into the meat shredders)
03:42:54 oh neat
03:43:03 wait, no, it definitely is, given what's going on at https://wiki.archiveteam.org/index.php/YouTube
03:43:11 perhaps that one's going a bit too far though :)
03:43:37 so bluesky would make a public web interface
03:43:43 that is still not up i guess?
03:44:44 test link.. https://bsky.app/profile/archive.org/post/3kfdxl66l2e25
03:44:54 looks like no
03:44:57 yeah
03:47:02 There is the og:description etc there, though.
03:47:48 FireonLive edited Current Projects (-436, remove expired recently finished projects): https://wiki.archiveteam.org/?diff=51234&oldid=51233
03:49:52 arkiver: was issuu going to be turned into a long-term project?
03:50:16 ah it's off of the tracker now
03:50:52 also oops, wrong channel
03:52:22 speaking of long-term projects, may i ask again about getting googlecrash back up?
03:52:24 fireonlive, pokechu22: Example with display title: https://wiki.archiveteam.org/index.php/Codearchiver
03:52:52 ah :D
03:52:52 fireonlive: likely not no
03:52:54 No way to avoid the capital letter in the URL though.
03:52:57 doesn't fix the url, but nicer
03:53:05 arkiver: ah ok! sounds good
03:53:32 thuban: do we have examples you would want to run through?
03:54:28 not offhand, but i could get some later today
03:54:28 I think lowercase initial letters can be done with a mediawiki configuration change, like Wiktionary, but it's probably not worth it, as it makes /codearchiver and /Codearchiver different pages
03:54:30 This is a very minor issue, but currently the wording of "Current Running Warrior Project" on the wiki page (https://wiki.archiveteam.org/index.php/Template:CurrentWarrior) is annoying to me - it implies that the others are paused
03:54:50 thuban: i'm a little worried about size. it's also not easily downloadable from the Wayback Machine i believe
03:55:11 JAA: https://wiki.archiveteam.org/index.php/User:FireonLive < that category though
03:55:21 * nicolas17 prepares the bonk stick
03:56:28 * JAA adds [[Category:Permanently_horny_users]] to that page.
03:56:42 xD
03:56:45 arkiver: i'm unclear about wbm playback myself, but see previous discussion in this channel
03:56:50 FireonLive edited Current Projects (-220, remove Issuu, finished long ago; move…): https://wiki.archiveteam.org/?diff=51236&oldid=51234
03:56:52 JAA: lmfao
03:58:14 as for size, we could skip dedicated discovery and just do manual queueing? (and/or backfeed from #//, but that might be too big all on its own)
04:00:29 Yeah manual queuing would be great
04:01:16 TheTechRobo: you run the warrior yeah?
04:01:28 fireonlive: ish, why?
04:01:40 can you shoot me a screenshot of the pick a project page
04:02:13 https://lounge.thetechrobo.ca/uploads/a682453523498797/image.png
04:02:17 er put a bit too much on the bottom
04:02:22 thanks
04:02:51 FireonLive edited Template:CurrentWarrior (-11, align wording with warrior wording): https://wiki.archiveteam.org/?diff=51237&oldid=46438
04:03:12 fireonlive: thanks
04:03:15 =]
04:03:53 Is it canonically ArchiveTeam or Archive Team?
04:04:11 i've often wondered that myself
04:04:19 not sure
04:04:19 So have I.
04:04:28 i like the one word version
04:04:31 unfortunately there doesn't seem to be an authoritative answer
04:04:56 The main page has both. lol
04:05:04 Well, there's no real authority here, so getting an authoritative answer is tricky... :-P
04:05:09 if you look at the 2009 'logo'
04:05:10 https://wiki.archiveteam.org/index.php/File:Archiveteam.jpg
04:05:11 always keep 'em on their toes
04:05:15 I personally use ArchiveTeam.
04:05:21 has a space, and the desc. does too
04:05:28 I also use ArchiveTeam
04:05:30 and you can see the uploader there
04:05:32 ye me too
04:05:42 a lot of the older stuff, news coverage, etc, uses "Archive Team", but i've always preferred "ArchiveTeam" as it's more clearly a proper noun
04:05:56 yeah "Archive Team" sounds way too generic IMO
04:06:03 like you're describing it rather than naming it
04:06:05 so do our topics in -bs and -dev
04:06:08 Yeah, marginally less risk of confusion with IA.
04:06:09 (but not #archiveteam)
04:06:16 :D
04:06:21 That's because I set those, I think.
04:06:27 ah :)
04:06:28 Whereas the #archiveteam topic is ancient.
04:06:42 -ot didn't exist when I showed up here, and I think -bs had a different topic.
04:06:54 Right, didn't -bs used to be -ot?
04:06:56 ye "Archive Team: We're not archive.org" sounds like something from early days
04:07:42 TheTechRobo: Very originally yes, I've been told. By the time I joined, it was already the separation we have today, more or less.
04:07:45 the press seems to favour two word variant
04:07:54 looking at https://wiki.archiveteam.org/index.php/In_The_Media
04:08:23 that's probably because that's what the opening paragraph in the main page says
04:08:24 i'm sure they went to our frontpage though for that :p
04:08:26 ye
04:09:20 according to wiki.*'s we're "Archiveteam"
04:09:23 <fireonlive> :D
04:09:49 <TheTechRobo> so current contenders:
04:09:51 <TheTechRobo> - Archive Team
04:09:53 <TheTechRobo> - ArchiveTeam
04:09:55 <TheTechRobo> - Archiveteam
04:10:17 <arkiver> not Archiveteam
04:10:21 <fireonlive> - dick drawn on ballot/wasted vote
04:10:27 <arkiver> i guess we'll claim both Archive Team and ArchiveTeam
04:10:29 <fireonlive> (it seems that always happens)
04:10:39 <arkiver> it's spread out across a ton of articles already
04:10:41 <arkiver> both versions
04:10:51 <arkiver> i usually use Archive Team though
04:10:57 <fireonlive> we could prefer one in our 'style guide' i suppose
04:10:58 <TheTechRobo> arkiver: Right, but we should probably pick one for e.g. Warrior UI?
04:11:01 <TheTechRobo> yeah
04:11:07 <arkiver> maybe
04:11:17 <fireonlive> hm
04:11:22 <arkiver> what triggered this discussion?
04:11:23 <TheTechRobo> maybe ArchiveTeam could be informal and Archive Team could be press? idk
04:11:29 <TheTechRobo> arkiver: me asking :P
04:11:32 <arkiver> ah :P
04:11:39 <arkiver> well let's not make major changes yet
04:11:43 <TheTechRobo> yeah
04:12:19 <fireonlive> but i already have 300 wiki edits in queue
04:12:21 <fireonlive> :P
04:12:29 <fireonlive> jkjk
04:15:00 <JAA> I was going to say <amateur.png>, but it turns out my mass IRC edit two years ago was only ~330 edits: https://wiki.archiveteam.org/index.php?title=Special:Contributions/JustAnotherArchivist&offset=20211031191500&limit=350&target=JustAnotherArchivist
04:15:23 <fireonlive> :P
04:15:35 <fireonlive> lordy lordy
04:15:57 <JAA> Yeah, that was fun.
04:16:04 <JAA> Much of it was automated, plus some manual checking.
04:16:18 <fireonlive> :)
04:16:37 <fireonlive> https://wiki.archiveteam.org/index.php/List_of_lost_online_videos < that's a hard list to maintain
04:16:47 <fireonlive> i think wikipedia probably has a term for it
04:18:43 <fireonlive> hmmm. re: that page deletion proposal forever ago; i guess an !ao of a couple links followed by deletion could work; instead of the "namespace of wikipages locked in time that we are also scared to look at"
04:20:21 <joepie91|m> I assume people here are already aware but Kissinger is dead
04:20:32 <joepie91|m> don't know if that implies any archival work
04:21:52 <JAA> I archived the website, and I'm looking into throwing interviews etc. into #down-the-tube. Otherwise, probably not a whole lot.
04:22:33 <JAA> By the way, 'archiveteam' reference: https://nitter.net/textfiles/status/1156646412756627456
04:23:06 <JAA> :-P
04:23:07 <fireonlive> ooh
04:23:42 <JAA> There are a few more tweets, but there are plenty more with 'Archive Team'.
04:24:55 <fireonlive> wikimedia template to $rand between the two every time it's mentioned?
04:24:58 <fireonlive> :p
04:25:58 * JAA slaps fireonlive around a bit with a large trout
04:26:02 <fireonlive> xD
04:27:49 <pabs> joepie91|m: AB has a job for www, not sure if anyone did subdomain enumeration
04:28:15 * pabs votes ArchiveTeam
04:28:28 * fireonlive votes ArchiveTeam
04:28:40 <pabs> yay nitter.net unbanned me
04:30:03 <JAA> The https://www.henryakissinger.com/ AB job is incomplete due to PerimeterX 403s. I grabbed a separate copy from a residential IP with grab-site.
04:31:36 <fireonlive> https://strawpoll.com/BJnX8W7mPnv vote now!
04:31:51 * fireonlive waits for JAA to inform of the JS-disabled status
04:32:24 <fireonlive> (informal non-binding poll)
04:32:31 <fireonlive> (does not hold up in AT-core court)
04:32:49 <joepie91|m> ah yes, governance(tm)
04:32:54 <pabs> anyone know the status of archiving for Twitter and Facebook? does nitter work? any particular instance?
04:32:54 <fireonlive> :3
04:33:25 <fireonlive> pabs: nitter for twitter, can use a special instance; facebook.. not sure
04:36:36 <JAA> I'm beginning to repeat myself, but...
04:36:38 * JAA slaps fireonlive around a bit with a large trout
04:37:17 <JAA> pabs: Facebook is hell and virtually impossible, sadly, especially since the redesign.
04:37:30 <JAA> Or well, nothing is impossible, but no tooling exists.
04:37:31 <fireonlive> 😅
04:37:39 <fireonlive> hmm, did they kill off mbasic
04:37:42 <JAA> snscrape's Facebook module has been broken for a long time.
04:37:53 <fireonlive> ah, login wall
04:38:07 <fireonlive> (https://mbasic.facebook.com/l)
04:38:19 <fireonlive> (-l)
04:40:46 <fireonlive> i was a good boy and set the options choices to random :3
04:40:51 <fireonlive> so theres no bias ™
06:55:27 <JAA> m.facebook.com still exists, but you don't get far there.
07:29:49 <JAA> Sanqui: I'm only about two thirds done (635 of 930), but I'll send you what I got so far since there's not too much time left. The extraction is somewhat incomplete because there's still no decent WARC tooling and mine broke on a few WARCs. In order to ensure I didn't have to download anything again due to incomplete extraction, I went with a much simpler `grep -Fai -e ulozto.cz -e uloz.to`, so it
07:29:55 <JAA> contains the surrounding HTML, not just the Uloz URLs. After filtering out dupes (mostly from uloz.to pages themselves), here's the 33k results so far: https://transfer.archivete.am/mUisN/webzdarma-uloz-partial.zst
07:29:56 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/mUisN/webzdarma-uloz-partial.zst
07:30:06 <JAA> Bad bot
07:30:09 <JAA> :-)
07:57:21 <fireonlive> i guess you can’t inline a .zst file haha
08:02:44 <that_lurker> well you can, but not with the expected outcome
08:17:01 <fireonlive> JAA: https://transfer.archivete.am/mUisN/webzdarma-uloz-partial.zst
08:17:36 <JAA> Excluded URLs ending with .zst?
08:21:59 <Sanqui> JAA: Awesome, thank you, I expect we'll have our hands full with this but who knows
08:22:01 <project10> Good bot!
08:23:48 <fireonlive> ,, string cat $ATTExcludedExtensions
08:23:48 <eggdrop> ok: *.zst *.gz *.tar.gz *.tar *.tar.xz -0ms-
08:23:52 <fireonlive> JAA: indeed :)
08:24:27 <JAA> :-)
11:28:28 <supercar99> Noob here. Why isn't frogger/blogger the current active project? Isn't it more "urgent" than telegram?
11:37:32 <flashfire42|m> <supercar99> "Noob here. Why isn't frogger/..." <- Cause we're already at capacity for it I think and telegram needs love too I guess
11:39:06 <imer> Yeah, not able to ingest data fast enough as is, don't need more workers on it :)
11:40:06 <flashfire42|m> I think the claims is so high in hopes that when the deadline hits there will still be a bunch of downloaded stuff waiting to upload
11:42:25 <supercar99> Got it, thank you! Initially set warrior to run on blogger but didn't understand why it kept getting stuck (really stuck, warrior dashboard wouldn't even load). Set to the current project (i'll leave it there) and everything went back to smooth!
11:42:54 <flashfire42|m> Yeah we appreciate the workers no matter what.
11:43:10 <flashfire42|m> Telegram does have a giant backlog too so
12:20:16 <qwertyasdfuiopghjkl> ( replying to https://hackint.logs.kiska.pw/archiveteam-bs/20231130#c392586 ) fireonlive: I'm guessing you need to include the "User:" part in the displaytitle thing for it to work.
19:17:03 <fireonlive> ohh maybe!
20:45:48 <Pedrosso> Star Trek related sites:
20:45:48 <Pedrosso> https://transfer.archivete.am/Y13GB/ex-astris-scientia.TXT comes from https://www.ex-astris-scientia.org/misc/statistics.htm
20:45:48 <Pedrosso> https://transfer.archivete.am/16ikxC/ex-astaris-scientia.org-links.htm.txt comes from https://www.ex-astris-scientia.org/links.htm
20:45:48 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/Y13GB/ex-astris-scientia.TXT
20:46:21 <Pedrosso> Hehe, too speedy. Inline for the 2nd one: https://transfer.archivete.am/inline/16ikxC/ex-astaris-scientia.org-links.htm.txt
21:29:31 <Pedrosso> https://transfer.archivete.am/m2r00/ex-astris-scientia%20a_few_others.txt
21:29:31 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/m2r00/ex-astris-scientia%20a_few_others.txt
21:29:33 <Pedrosso> https://transfer.archivete.am/Y8pl8/ex-astris-scientia.org-links-main.cgi-outlinks.txt from https://www.ex-astris-scientia.org/links/main.cgi excluding youtube links
21:38:00 <Pedrosso> has the memory-alpha, memory-beta, memory-gamma, and memory-delta wikis been saved?
21:38:03 <Pedrosso> have*
21:38:12 <Pedrosso> I'll take it to #wikiteam -
21:59:09 <Pedrosso> oh, and I've tried to clean up the https://www.ex-astris-scientia.org/links.htm https://transfer.archivete.am/Vv7vN/ex-astaris-scientia.org-links.htm_cleaned.txt
21:59:10 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/Vv7vN/ex-astaris-scientia.org-links.htm_cleaned.txt
22:03:19 <Pedrosso> and from https://www.angelfire.com/ns/firepit/msfm/links.htm in the cleanup I got these: https://transfer.archivete.am/Vv7vN/ex-astaris-scientia.org-links.htm_cleaned.txt
22:03:20 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/Vv7vN/ex-astaris-scientia.org-links.htm_cleaned.txt
22:06:13 <nulldata> https://www.theverge.com/2023/11/29/23981363/mailchimp-shutting-down-tinyletter
22:10:51 <Pedrosso> Whoops, my last sentence was wrong. I got these https://transfer.archivete.am/IvUrS/estar_trek%20extras.txt from https://www.angelfire.com/ns/firepit/msfm/links.htm
22:10:51 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/IvUrS/estar_trek%20extras.txt
22:11:14 <Lord_Nightmare> https://twitter.com/jiromifune/status/1730157521862902037 is a problem, I have no idea how that can be archived except manually
22:11:14 <eggdrop> nitter: https://nitter.net/jiromifune/status/1730157521862902037
23:47:14 <Pedrosso> can't wait for smell-o-vision and all the smells to archive, haha
23:49:00 <project10> 🤮
23:59:28 <pabs> Lord_Nightmare: AT has a nitter instance we can AB. I threw it in the queue just now