00:00:01 [remind] thuban: upload game atsumaru data
00:01:40 !remindme 1week upload game atsumaru data
00:01:41 -eggdrop- [remind] ok, i'll remind you at 2024-06-07T00:00:00Z
00:01:59 * fireonlive reminds self to fix the regex
00:43:37 I thought you had uploaded it already
00:43:44 on the prev reminder
01:48:07 noooope, still waiting for transfer to finish
02:03:16 https://old.reddit.com/r/DataHoarder/comments/1d48hyv/linus_media_group_is_working_on_digitizing_the/
02:03:23 ...the tapes went to linus tech tips?
02:03:27 ??
02:03:35 ?
02:06:59 Apparently not, just helping with sourcing machinery and logistics and stuff.
02:07:12 ah! great
02:07:32 my bad 😅
02:07:32 At least that's how I'm reading the description in the comments.
02:57:31 So, by the way, fuck those people.
03:00:35 i was hoping they'd send it to proper preservationists/digitizers
03:00:44 it -> the tapes
03:01:05 Yeah, they made it clear, no.
03:01:14 I got SO FUCKING BARRAGED by everyone
03:01:15 :/
03:01:26 Jason Obi-Wan, you're their only hope
03:01:26 oof yeah i bet
03:01:41 I said "go fucking use George Blood, Inc. of Philadelphia, PA"
03:01:50 And they went
03:02:04 Fuck them, there are 5 fully functional machines running in PA with technicians to do it right
03:02:11 But _reeeaasssonnnnsnssss_
03:02:22 -_-
03:02:35 I have so little time any more for people demanding the internet fucking dog-fetch the answer and then they go "ewww" at the right answer.
03:03:46 buy their own random machine what could go wrong
06:09:28 I don't have much love for Linus, but heck they just went full Public Stunt instead of doing what made the most sense. Careful, they couldn't even maintain their zpool right.
06:11:36 Barto: what is this... 'scrub'?
06:11:38 :p
14:18:03 Korean streamer community tgd.kr is shutting down on June 30
14:19:52 thank you for reporting this!
14:29:29 Yzqzss edited 抽屉新热榜 (+58, Image CDN: {{offline}}): https://wiki.archiveteam.org/?diff=52294&oldid=52293
15:26:10 yzqzss, do we need a project for https://dig.chouti.com/ or are you handling it?
16:00:09 Megame: i don't see dig.chouti.com on deathwatch?
16:01:03 oh my dig.chouti.com has a ton of juicy outlinks
16:01:05 we need that
16:02:01 Megame: is it https://wiki.archiveteam.org/index.php/%E6%8A%BD%E5%B1%89%E6%96%B0%E7%83%AD%E6%A6%9C ?
16:03:10 yzqzss: i think i might start a warrior project for dig.chouti.com - what do you think?
16:03:21 the outlinks would go into #// - looks like there's some 40 million of them maybe
16:03:24 to various places
16:05:13 Megame: Yes, I'm working on it. We are currently iterating over its links api at 200q/s
16:05:35 yzqzss: nice, can i have the list of outlinks? :)
16:06:04 yzqzss: can you let me know when you finish? since you already are running, best to let you finish making a copy first - we may also make a copy afterwards for the Wayback Machine
16:06:07 arkiver: yeap
16:06:12 awesome stuff :)
16:07:03 so i guess it would take you ~3 days to finish
16:12:26 https://git.saveweb.org/saveweb/chouti/src/branch/main/tests/links_sample.png
16:13:10 yzqzss: how big are the chunks of IDs you collect each sample of links over?
16:13:15 The x-axis is link id, and the y-axis is the api response size. 0 means the id does not exist, and the line close to 800 means it has been deleted.
16:13:35 ah response size
16:13:50 so more like 20 million outlinks maybe
16:13:55 probably about 40%+ of the ids are invalid (do not exist o deleted)
16:14:16 o->or
16:22:15 do I need to do anything to get my wiki edits approved?
16:22:25 will my changes get lost if someone else also tries to edit the same page?
16:22:56 Oh
16:24:48 Flama12333 edited Deathwatch (+415, …): https://wiki.archiveteam.org/?diff=52295&oldid=52286
16:24:49 Flama12333 edited Talk:Deathwatch (+317, /* https://www.computercraft.info/ */ new section): https://wiki.archiveteam.org/?diff=52296&oldid=51933
16:24:50 Xaft edited List of websites excluded from the Wayback Machine (+50): https://wiki.archiveteam.org/?diff=52297&oldid=52275
16:24:51 Izzint edited ICQmail (+105, add ICQ shutdown notice,…): https://wiki.archiveteam.org/?diff=52298&oldid=30156
16:25:48 Heinrich5991 edited Places to store data (+16, Add requested link): https://wiki.archiveteam.org/?diff=52299&oldid=37347
16:25:49 Heinrich5991 edited Reddit (+191, Add links to the-eye.com reddit archive): https://wiki.archiveteam.org/?diff=52300&oldid=52116
16:25:50 Heinrich5991 edited Discord (+7, /* Active */ Add a note that you can use…): https://wiki.archiveteam.org/?diff=52301&oldid=51482
16:27:23 Sneaky, someone tried to add their spam link to [[Discourse]] while also adding some other forums to the list (though they aren't using Discourse).
16:39:30 thanks JAA
16:39:42 do you know if the changes would get lost if someone else also tried to edit the same page?
16:40:23 heinrich5991: They wouldn't get lost, but they might require manual merging.
16:46:54 fireonlive: interestingly enough, coming back to those tapes, I had the argument come to me that shipping those tapes in the US would be unsafe. Wdyt?
16:47:47 if they ship a tape drive to canada and it gets damaged in transit, they'll have to repair it, or get another
16:48:02 if they ship the tapes to the US and they get damaged in transit, they're fucked
16:48:33 so they didn't want to transport the tapes *anywhere*, they wanted to keep it inside Canada
16:49:05 If the machine gets damaged in transit in a way they don't notice until they start eating tapes, they're also fucked. There are a lot of things that can go wrong.
16:49:35 And I think it was said on one of the WAN shows that the owners of the tapes don't want them to leave Canada
16:50:06 JAA: yeah, but which do you trust more, yourself inspecting some equipment, or fedex throwing the box out of the truck
16:50:06 i'm not seeing where involving linus is a good idea either way
16:50:08 My memory might be fuzzy on this
16:50:37 kiska: they said Vancouver specifically. i got the idea they just don't want them to leave the warehouse or whatever.
16:50:55 steering: As someone who works in the freight business, I wouldn't trust fedex
16:51:04 yeah, exactly
16:51:09 steering: Myself personally bringing the tapes to wherever a suitable machine and people are located.
16:51:14 if anyone is willing to help speed up crawling chouti.com -> https://git.saveweb.org/saveweb/chouti
16:51:27 37550490/42468932 [03:23<5:08:22, 265.83it/s
16:51:29 JAA: sure, but the limit for that from vancouver is like... seattle, maybe portland
16:52:20 How so, unless a fear of flying is involved?
16:52:29 ehh... maybe fair
16:52:38 And even if, road trip time!
16:52:38 that'd be a big carry-on. did you see how big those tapes are?
16:52:55 IDK how many tapes there are. I don't know anything about Reboot, 'fore my time
16:53:41 when does road trip become more dangerous (oops, got in a wreck with the tapes in the back seat) than letting fedex at least fly it?
16:53:51 Heh
16:54:17 Wikipedia says 4 seasons of 12 episodes each
16:54:37 I also heard a mention, BTW, that they want to learn (and share) the process. Which I'm 1000% for, I wanna see that video
16:55:08 I think that's a better reason than being scared of damaging the tapes
16:55:38 True
17:00:53 JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=52302&oldid=52297
17:04:17 JAA: could I get edit permissions without reviews being required? I realize this is what a spammer would ask, so it makes sense to turn it down.
I have some edits on wikipedia https://en.wikipedia.org/wiki/Special:Contributions/Heinrich5991 https://de.wikipedia.org/wiki/Spezial:Beitr%C3%A4ge/Heinrich5991 and a relatively old github account: https://github.com/heinrich5991
17:14:56 Heinrich5991 created Template:Table cell templates (+331, Add list of table cell templates): https://wiki.archiveteam.org/?title=Template%3ATable%20cell%20templates
17:14:57 Heinrich5991 edited Template:Yes (+7, Add links to other table cell templates): https://wiki.archiveteam.org/?diff=52304&oldid=21739
17:14:58 Heinrich5991 edited Template:No (+7, Add links to other table cell templates): https://wiki.archiveteam.org/?diff=52305&oldid=21740
17:14:59 Heinrich5991 edited Template:Maybe (+7, Add links to other table cell templates): https://wiki.archiveteam.org/?diff=52306&oldid=24401
17:15:00 Heinrich5991 edited Discord (-11, /* Self-archival */ Search-Cord's domain is parked): https://wiki.archiveteam.org/?diff=52307&oldid=52301
17:15:01 JustAnotherArchivist changed the user rights of User:Heinrich5991
17:20:28 thanks :)
17:20:57 Heinrich5991 edited Template:Table cell templates (-6, Archiving status templates → Table cell templates): https://wiki.archiveteam.org/?diff=52308&oldid=52303
17:36:16 Composer "Pusu" (https://ja.wikipedia.org/wiki/ツユ; over 1,000,000 YouTube subscribers) has been arrested for a stabbing. https://old.reddit.com/r/TUYU_official/comments/1d4s0b8/tuty_is_now_really_fucked_up_pusu_arrested_for/
17:46:02 Surely that would make it eligible for #down-the-tube
18:31:19 For comments on chouti
18:31:19 https://api.chouti.com/comments/batch.json?ids={comment_id}
18:32:23 yzqzss: Is this bound by IPs or compute?
18:32:54 Iterate from 0 to 56316309+ to archive all.
(ids accepts multiple comment_ids separated by commas)
18:35:04 kiska: not so far, currently 30+ ips are requesting its links API at a total of 325q/s, no bans or degradation
18:35:30 Only solution is to start more chouti_links instances to get more concurrency?
18:37:40 Kiska: if Claimer or Processer is hungry,
18:37:40 pass the BASE_CONCURRENCY (defaults to 10) env variable to make more workers
18:38:18 I'll try 100 on this
18:46:10 chouti_links ETA: 2h30m
18:48:42 I'm happy to help on the chouti thing if you can get a docker image for me to run :P
18:49:57 just added chouti_comments command to https://git.saveweb.org/saveweb/chouti, let's save all comments :)
18:52:33 nstrom: our docker images maintainer was asleep and couldn't trigger the build. lol
18:56:41 https://github.com/IceCodeNew/docker-collections/tree/master/saveweb
18:58:26 https://github.com/IceCodeNew/docker-collections/blob/daa3a79802181423af0137ad7026f5dbde444520/saveweb/Dockerfile.chouti#L58
18:58:26 You may want to change this line to `chouti_comments` command.
19:01:10 > You can run they at the same time, but do not run the same subcommand in parallel on the same ip.
19:01:11 Ooops?
19:01:40 What problems could happen if I was to do so?
19:05:25 Probably nothing will happen, just in case.
19:05:25 (I found out their API is behind Tencent(?) WAF, IIRC, just not very active)
19:05:45 I see :D
19:14:53 Oh wait. Now there is no WAF/CDN in front of their api, api.chouti.com resolved to
19:15:44 Makes sense to me considering they've shut down some services. :)
19:42:36 "yzqzss: i think i might start..." <- (Ah, forgot to reply) Yeah, why not
20:31:23 yzqzss: I think I got the comments grabber up and running on a few boxes, not sure if it helps
20:33:35 https://hexdocs.pm/google_api_content_warehouse/0.4.0/api-reference.html has anyone run archivebot on the google leak?
20:40:40 nstrom: thank you
20:48:10 did we archive a lot of rabbit stuff?
20:48:25 (that weird 'AI' device that's being dragged around and exposed)
20:49:26 Does "the google leak" include everything in https://hexdocs.pm/sitemap.xml mentioning google_api?
20:52:43 'The leak' was in a repo on GitHub, everything else is derived from that, as I understand it.
21:09:12 fireonlive: https://rabbitu.de/
21:10:10 ooh nice
21:10:46 i haven't really watched the videos yet, but coffeezilla did a thing calling them out for basically lying about/faking most of their claims
21:11:22 "$30,000,000 AI Is Hiding a Scam" https://www.youtube.com/watch?v=NPOHf20slZg
21:11:23 "Rabbit Gaslit Me, So I Dug Deeper" https://www.youtube.com/watch?v=zLvFc_24vSM
21:13:00 his channel seems like one of those ones DTT should probably periodically grab, though I don't remember if he was forced to remove anything at the moment
21:13:29 unlike say those scammer exposing channels which have had things taken down/censored (Jim Browning et al.)
21:14:09 getting more informed about the details of this scam doesn't seem like a good use of my time :P
21:14:23 ye :p
21:14:36 just a 'hm maybe we should grab the rabbit stuff' is what flagged me
21:14:51 i think we did some already though
21:34:54 fireonlive: yeah, all of the rabbit stuff that I could find was saved, though https://rabbitu.de wasn't done. https://engineering.rabbit.tech/ is the page with the info about playwright (which I couldn't find via duckduckgo or google, though it looks like google's indexed it now)
21:35:29 pokechu22: ah! awesome
21:35:35 thanks for that :)
21:36:04 JAA started a rabbitu job so we're golden then
21:36:15 :-)
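Editor's sketch for the chouti comments endpoint mentioned at 18:31:19: the log says to iterate comment IDs from 0 to 56316309+ and that the ids parameter takes several comma-separated IDs. The snippet below only builds the request URLs for such a sweep and makes no network calls. The batch size of 50 and the helper name batch_urls are assumptions for illustration; the log does not state how many IDs a single request may carry, and the actual saveweb/chouti tool may well do this differently.

```python
from typing import Iterator

# URL template as posted in the log; the "ids" placeholder name is ours.
BATCH_URL = "https://api.chouti.com/comments/batch.json?ids={ids}"

# Upper bound quoted in the log ("56316309+"); the real maximum keeps growing.
MAX_COMMENT_ID = 56316309


def batch_urls(start: int, stop: int, batch_size: int = 50) -> Iterator[str]:
    """Yield batch.json URLs covering comment IDs start..stop (inclusive).

    batch_size is a guess; the API's real per-request limit is not stated
    in the log.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(start, stop + 1, batch_size):
        last = min(first + batch_size - 1, stop)
        ids = ",".join(str(i) for i in range(first, last + 1))
        yield BATCH_URL.format(ids=ids)


if __name__ == "__main__":
    # Tiny demonstration range instead of the full 0..MAX_COMMENT_ID sweep.
    for url in batch_urls(0, 7, batch_size=3):
        print(url)
```

A real crawl would feed these URLs to a rate-limited fetcher, keeping in mind the note in the log about not running the same subcommand in parallel on the same IP.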