00:12:29 https://old.reddit.com/r/programming/comments/jgub36/youtubedl_just_received_a_dmca_takedown_from_riaa/g9sphpg/?context=3 is scary as hell, they went after the MAINTAINERS and random contributors first?
00:12:39 i guess the RIAA/MPAA was trying to do another popcorn time
00:13:08 but unlike that case, youtube-dl has lots of significant legal use, while popcorn time was basically built from day 1 to stream content illegally
00:13:34 JAA: should we archive that thread?
00:13:55 Lord_Nightmare: I threw it into AB yesterday, but there are a few more replies now.
00:14:12 Done
00:14:33 phihag has also edited a bunch of posts
00:14:51 ... you just told me how to archive a reddit user, it's a bunch of scripts...
00:15:07 ...but i don't have any functional host that a .txt file could be pulled from for archivebot anymore
00:15:31 https://transfer.notkiska.pw/
00:22:52 only the first page of https://old.reddit.com/user/phihag is relevant to recent events, so should i just !ao that?
00:23:28 I'll do that for now, at least...
00:23:55 Hold on, I'll do the whole thing.
00:30:23 Done
00:52:32 Meanwhile on github: https://github.com/github/dmca/pull/8142
00:52:41 bee bee space rollback space username
00:54:49 Yeah, mentioned that in -ot. I dumped the repo earlier.
00:55:35 There was also at least one PR that has been deleted since (8128) and contained a download link. It's still in the repo (for now).
00:56:28 On a completely unrelated note: reminder that Docker Hub will begin to delete 'inactive' images a week from now.
00:59:18 Oh, thought that was in February for some reason
00:59:25 Maybe time to make a channel then
01:01:55 Apparently account holders will be notified of the pending deletion (that wasn't in the original FAQ). No clear mention of when that will happen exactly. 'Account owners will also be notified by email of “inactive” images that are scheduled for deletion.'
01:02:20 Oh, it was in the original FAQ, nevermind.
01:04:38 #failwhale and #dick were suggestions from a discussion a while ago, but I don't think we properly decided. There are a couple users in each.
01:06:36 maybe make it mobydick instead
01:06:43 failwhale is indeed kinda twitter themed
01:06:59 also what is that github dmca pull url supposed to be?
01:07:13 mobydick isn't really a pun though. :-/
01:08:19 i didn't realize it had to be a pun, the two suggested aren't puns either lol
01:08:38 https://github.com/github/dmca/commit/bccf7d0dbfec423c4a967f668be47b6339d15893#r43532899 the 'claiming copyright' on the code makes me wonder if they actually forced phihag or others to sign over their own copyright to their contributions, so that RIAA could go back and DMCA code they now officially own
01:09:46 i don't think the UNLICENSE lets you DO that though
01:09:53 or does it? has it ever been tested in court?
01:10:06 -ot for that discussion please.
01:11:59 mgrandi: Well yeah, not necessarily puns, and I agree that #dick isn't going to end well probably. And yeah, 'failwhale' is definitely too strongly associated with Twitter in my opinion.
01:14:13 i dunno, someone pick something whale themed or dock themed since those are the two primary themes
01:15:03 I guess I submit #undock :D
01:15:50 Or #depier
01:16:35 undock sounds good
01:24:17 on the playstation online store, it seems some of the links are now broken but for now that 'all games' page is still working
01:25:07 i'm not too worried about it because this is all probably public information anyway, but getting the internal IDs of the games is what was requested and will be useful for scraping off of googlecache or similar if people want everything
01:25:31 Anyway I'll let you guys get the name of the chat, then you can invite me into it :D
01:27:19 -purplebot- Deathwatch edited by JustAnotherArchivist (+398, /* 2020 */ Add FurNation.ru) just now -- https://www.archiveteam.org/?diff=45706&oldid=45692
01:28:16 Last one from me #shipwrecked
01:29:53 #texascity1947 :-P
01:39:29 the last one is obscure and tragic, let's use it
02:01:51 JAA: Do you have a particular strategy for uploading youtube channel comments grabbed by the youtube-comments qwarc project to IA? I have some that I'd like to upload, but there are a lot of individual files, so I'm wondering if I should merge them or something
02:04:49 jodizzle: None yet. I have 15.4k from the Joe Rogan scrape, and the plan was to megawarc them. Cf. https://hackint.logs.kiska.pw/archiveteam-bs/20200928#c11409
02:12:59 Hm okay. My use-case is much smaller, 102 .warc.gz files (half of them the logs). But maybe the same concept applies?
02:13:11 Then again, 102 in a single item doesn't sound that bad.
02:13:51 Yeah, 102 is probably fine.
02:15:13 Okay, I'll probably do that then.
02:33:56 so, who here knows about blaseball
02:34:45 I'm considering setting up an as-it-happens scraper for blaseball, there are other sites that have this but they don't seem to have a nice format for the data
02:56:53 this is my face when i wrote an html scraper and then i noticed that it has JSON data in the webpage itself :| :| :|
03:14:04 haha
04:01:30 JAA: est ~2 TB compressed, ~15 uncompressed
07:09:33 been having trouble trying to put up new files on the wiki, I'm getting thrown "Internal error: Server failed to store temporary file." when attempting to upload a (newer) screenshot to stick on the Amazon page
07:09:35 anyone else?
08:28:16 Maybe aws s3 is having issues?
08:34:37 been that way for me for at least a week, tried it once or twice some days back and got the same too
08:53:11 mgrandi: I /think/ several folks from VGPC also scraped the internal IDs (many/most of them are probably in https://serialstation.com/ although I’m not sure how best/easy it is to query them) and the data JSON blobs that are on each page; so I’m not too worried about these either. I was worrying about the individual webpages (for the
08:53:12 Wiki*edia perspective) but you folks here are the real experts of web archiving so I’ll defer to your level of worriness ^_^
08:54:12 yeah, i will be trying to get those, a few of them already seem gone though, but since they are so old it's possible that google cache or WBM already has them
09:23:36 Cool :)
09:24:23 Also, i compiled a list of all URLs currently used on Wikidata at https://bin.privacytools.io/?44aa8d7fc2c777e7#rSiHFE6FoOPTs4fCw+QeJ9tF1l0dFkw2ZNdUz0NTM0U= - that gives 25K URLs if using all the regional stores URLs - it’s not a lot but hope that helps
09:55:49 do you happen to know all the available regions?
(aka their url code)
09:57:46 @Jean-Fred , that is the only thing i'm missing atm
11:16:28 i'm running a wget-at on the enUS urls at least now
11:22:55 mgrandi This is a list of 75 regional URLs: https://justpaste.it/93kgd ; I heard from someone from VGPC that the complete figure is 94 URLs, trying to get that list from them.
11:23:22 geez
11:23:44 yeah if you can get all those language codes i can generate the URLs for those and get what i can
11:25:28 i am noticing that some of the urls don't work with various language codes but work for enUS , hmm
11:34:05 So a Japanese ID will not work with the French store URL, but a European ID should work with the French store, the German store etc. Is that what you mean or am I misunderstanding?
12:04:42 mgrandi Sorry for the wait, got a spreadsheet of all 94 stores - as a CSV at https://catdrop.drycat.fr/r/0xa5N1Cz#Y93zx+Gy78NyECGHhyLZXaauil03Bq4PvAZ51iIrViM= or plaintext at https://bin.privacytools.io/?5fc606162184e753#yMqxw3WrjYGwXJt2F0r/B1yk90coRr9DhjyNLsvf1e4=
12:05:18 Thanks, I'll give those a go when I wake up; the enUS urls are going now
12:05:36 Cheers :)
12:06:04 Hopefully the stores are still around :'(
12:09:38 they are somewhere, cause they are still around if you are on the device
12:10:23 and theoretically they all should be the same with just translation differences
12:14:10 I heard the devices talk to a legacy API, not really the website; and there are some differences between regional stores (typically, the German USK rating will only be on the German-language store, whereas other EU websites will give the PEGI)
12:14:37 But I don’t want to sound ungrateful - I’m super grateful for your help & work :)
12:17:00 yeah, the urls might be in google cache too
12:17:10 also shoutouts to the SKU for untitled goose game being `"UP3971-CUSA23079_00-HONKHONKHONKHONK"`
12:18:39 :D
12:23:00 it is something that probably should be kept track of, since there are so few games honestly
12:25:07 cause they do remove stuff,
obvious examples are P.T., and smaller stuff like the original Nier:Automata SKU (https://www.ps4database.io/view/CUSA04551_00/NP, `UP0082-CUSA04551_00-ANDROIDS20030612`), which was replaced by the Nier:Automata Game of the YoRHa Edition (`UP0082-CUSA04551_00-GOTYORHADIGITAL0`)
12:32:55 but yeah, trying various urls, it's in various states of broken, i dunno what is new and what is existing broken-ness, but at least we have full urls and the metadata about the games for further search
12:33:27 some games go to a blank grey page, others redirect to store.playstation.com, and others work, and i can't seem to find a common thread, their codebase must be a mess lol
12:35:17 thuban: Ah, tiny. Yes, let's grab that.
12:37:27 Anyway, bedtime, I'll ping you when I wake up and look at the data and attempt the other store regions
12:42:09 happy to see some work getting done on the PS4 stuff! :)
12:52:32 JAA: home page documents url format; you got it or want me to generate a list?
12:53:02 Glad to be of service jake
12:53:04 (i have the space to download / upload directly to ia, but not really the bandwidth)
13:03:57 thuban: I don't have time to look into it at the moment. Maybe in early November.
13:11:26 JAA: ok. (fwiw, with dateutils: `dateseq 2011-02-12T00:00:00 1h now -f 'https://data.gharchive.org/%Y-%m-%d-%-H.json.gz'`)
13:13:09 Thanks :-)
18:26:19 -purplebot- Ultraweb.hu edited by Bzc6p (-245, recovered) just now -- https://www.archiveteam.org/?diff=45707&oldid=45640
18:39:11 I'm going to suggest that #failwhale be used for DockerHub - already more people there (8) than any other of the suggested channels
18:39:52 Since there's a lot to do there in a short time, it's best that work on it starts soon
18:49:54 Since most of the key people for a DPoS project aren't in any of the channels yet, the number of users doesn't matter all that much.
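For anyone without dateutils installed, the `dateseq` one-liner quoted at 13:11:26 can be approximated in Python. This is only a rough sketch and not something anyone in the channel ran: the function name `gharchive_urls` is made up for illustration, and treating "now" as UTC is an assumption. The URL pattern and the 2011-02-12 start date are taken from the log; the hour is deliberately not zero-padded, matching `%-H`.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Optional


def gharchive_urls(start: datetime, end: Optional[datetime] = None) -> Iterator[str]:
    """Yield one data.gharchive.org URL per hour from start to end, inclusive.

    Hypothetical helper: both arguments are naive datetimes, and when end is
    omitted the current time in UTC is used (an assumption, see lead-in).
    """
    if end is None:
        end = datetime.now(timezone.utc).replace(tzinfo=None)
    current = start
    while current <= end:
        # current.hour is an int, so it is rendered without zero padding,
        # which mirrors the %-H in the dateseq format string.
        yield f"https://data.gharchive.org/{current:%Y-%m-%d}-{current.hour}.json.gz"
        current += timedelta(hours=1)
```

Calling `gharchive_urls(datetime(2011, 2, 12))` and writing each URL to a text file, one per line, would give a list in the same shape as the `dateseq` output.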
19:11:58 Suggestions that have been made: #failwhale #dick #mobydick #undock #depier #shipwrecked #texascity1947
19:12:29 The term 'failwhale' is strongly associated with Twitter, and '#dick' is probably not a great idea.
19:12:30 what's DPoS?
19:12:51 ivan: Distributed Preservation of Service, aka distributed project with tracker and workers/pipelines.
19:13:30 #dpos? :-)
19:13:37 Sometimes also called 'warrior project', but I don't like that term since the large majority of the work doesn't even involve the warrior (VM).
19:14:06 Er no, you misunderstand. We're looking for a name for a DPoS project for Docker Hub.
19:14:15 ah
19:14:38 (General discussions about the backend, code, etc. are normally in -dev.)
19:15:42 #shipwrecked is ok
19:16:14 #overboard
19:17:00 https://www.onelook.com/reverse-dictionary.shtml?s=sinking+ship
19:17:21 https://www.onelook.com/reverse-dictionary.shtml?s=shipwreck
20:06:19 -purplebot- Coronavirus edited by Wessel1512 (+149, /* Global */) just now -- https://www.archiveteam.org/?diff=45708&oldid=45695
20:14:01 i like #overboard (as the containers are thrown overboard)
20:15:18 like https://i.pinimg.com/originals/51/dc/fe/51dcfec87f633ee2476712baf06f92ec.jpg
20:43:47 JAA: #leakymess ?
20:44:04 for dockerhub :)
20:59:39 https://framapic.org/ an image hoster, will close
20:59:54 2021-01-12
21:03:16 https://framablog.org/wp-content/uploads/2020/03/Planning-fr-v2-fermes.png
21:04:16 mid-2021: framasite, framawiki, framabin
21:08:36 https://frama.site/ <= 3872 sites, 2356 wikis and 4731 pages
23:39:38 Blog post about that: https://framablog.org/2020/03/03/10-bonnes-raisons-de-fermer-certains-services-framasoft-la-5e-est-un-peu-bizarre/ - various things closing in 2020 and 2021
23:39:53 Someone should add them to Deathwatch, preferably someone with better French than me
23:40:33 (That's the link that nico_32's image came from)
23:42:08 Looks like both the 2020 things have been shut down already, actually
23:46:43 nico_32: thanks
23:46:55 I like #failwhale JAA :P
23:47:05 or did we already have one