00:29:00 JAA, for the logs, I suppose URLs only filter could be interesting for quickly finding stuff even faster; I rarely use it but some people might use it~
00:49:11 tech234a: Cool site. Does crxcavator.io have a public list of all the apps/extensions/themes?
00:57:56 lennier1: I'm not sure if they have a public list, but perhaps they could be asked for a list?
01:00:08 https://api.crxcavator.io/v1/scans lists a few of the most recently scanned extension updates
01:02:11 Yeah, I was just playing around with that endpoint. Can't find a pagination parameter though.
01:02:36 The other endpoints require an extension ID.
01:05:01 https://rss.crxcavator.io/ looks kind of S3-like
01:05:08 Well, there's a search endpoint, but it only returns 5 results.
01:05:16 Sample RSS: https://rss.crxcavator.io/hdokiejnpimakedhajhdlcegeplioahd.xml
01:05:29 RSS for extension versions
01:05:41 If it is an S3 bucket could it be listed?
01:06:20 They also list an email address: support⊙ci
01:07:01 Looks like they locked it down. Unless we can find the underlying bucket and that is misconfigured, unlikely.
01:07:58 Alright
01:09:43 JAA: any context on crxcavator?
01:11:12 also ping tech234a ^
01:11:31 arkiver: Google nuking paid extensions in the Chrome Web Store. crxcavator.io is an index and archive of extensions or something like that.
01:12:21 Oh yeah, also nuking Chrome apps after that.
01:12:22 ah for december 1st
01:12:40 Yeah and next year.
01:12:41 June 2022 right
01:13:23 I'll look into getting a project up for it
01:13:25 June 2021 apps on Mac/Windows/Linux. June 2022 apps on Chrome OS.
01:13:37 they did give us plenty of time
01:14:35 so
01:14:39 coming up are the other .ee sites
01:14:47 This is the S3 bucket that has the actual data, but also locked: https://extensions.crxcavator.io/
01:14:52 http://My2020Census.gov is probably changing on the 15th, probably has been picked up by the Wayback Machine a bunch but maybe do an archive bot run of it
01:15:20 looks pretty quiet for the rest of the year
01:15:24 also reddit is still up
01:16:01 Docker Hub and Twitch Sings
01:16:08 Might be some extensions with free trials or in-app purchases going away as soon as December 1. (And fully paid apps, which you could at least get metadata for.)
01:16:57 yeah I'm not completely sure what we'll do on docker hub
01:17:13 they're deleting PBs of data iirc
01:17:32 There must be some process that these sites are using to scan the Chrome store, maybe an API for it?
01:17:49 Yes, 4.5 PB according to their FAQ.
01:18:00 Maybe start with the docker files first?
01:18:06 yeah IA won't store that
01:18:21 we can get metadata at least
01:18:27 yeah and dockerfiles
01:18:30 Since those are the instructions to build said containers
01:18:36 yep
01:18:37 maybe
01:18:38 And you can reverse engineer after that
01:18:49 maybe this is a good chance to get a copy of all docker metadata :)
01:19:02 Do we need 4PB of alpine linux
01:19:21 Maybe also the actual layers for official images and some other popular ones, though I'm not sure if those are affected at all.
01:19:24 let's set up a channel
01:19:26 It'd be good to archive those regardless.
01:19:38 any ideas?
01:19:41 Also, for twitch sings, should I look into archiving comments?
01:20:13 I don't know if there is a CLI way to get twitch vods and clips yet, there is a GUI program at least, but nothing for comments exists anywhere
01:20:29 Failwhale? Lol
01:20:41 A twitter-ism but docker is a whale
01:22:38 :P failwhale sounds ok
01:22:47 You have to set up a free Twitch developer account to use it, but there's this: https://github.com/PetterKraabol/Twitch-Chat-Downloader
01:23:11 JAA: did twitch use websockets for comments?
01:23:26 On live chat, yes. IRC through WebSockets.
01:23:46 not archived that?
01:24:01 Not sure how it works on VOD.
01:24:02 mgrandi: you have a twitch sings example?
01:24:20 Archived chat is JSON, not sure if it's web sockets or just long polling or what
01:24:24 https://success.mirantis.com/api/images/.%2Ferror-a-firewall-is-blocking-file-sharing-between-windows-and-the-containers%2Fimages%2Fimage.png
01:24:32 Docker even has a failwhale icon just for us
01:24:57 nice
01:25:14 I can get one later
01:25:25 But twitch sings are just normal vods and clips I think
01:32:45 #dick ?
01:33:42 uh :P
01:33:48 for twitch?
01:33:53 of docker
01:33:55 or
01:34:07 Well, apparently it's not obvious enough, so I guess not. :-P
01:34:13 I meant for Docker, via Moby Dick.
01:34:42 Is there enough interest for a Chrome Web Store channel? #chromeweblore ?
01:35:17 Surprised #dick wasn't already taken, lol.
01:35:33 ah lol
01:35:46 kinda liked failwhale
01:37:14 Sure, although yeah, it's usually associated with Twitter.
01:38:16 Not sure what happened with that purge of inactive accounts that was planned for late last year and then abandoned after Jason raised a shitstorm over the accounts of dead people that would be lost.
01:49:18 -purplebot- Deathwatch edited by JustAnotherArchivist (-31, Sandboxie website to dead) just now -- https://www.archiveteam.org/?diff=45666&oldid=45664
01:50:40 mgandi: Twitch VOD chat is accessible through "api.twitch.tv/v5/videos/{VIDEO_ID}/comments?content_offset_seconds={SECONDS_SINCE_START}" but requires setting the Client-ID header to "kimne78kx3ncx6brgo4mv6wki5h1ko" which is what web clients use (it's not secret)
01:52:40 I just check the chat messages returned, take the highest offset, and use that to compute the next seconds offset to request. Not sure if there's some other way of getting all messages
01:53:15 mgrandi: ^
01:57:16 Cool, thanks
01:57:34 @JAA: they decided against it and haven't done anything further
01:57:57 Honestly it wouldn't be that hard to do what tumblr does and just rename the accounts and maybe keep a reference to the old username
01:58:23 But they gotta keep up the stereotype that they don't actually think about the stuff they are implementing lol
02:00:32 @benjins: is that their new api, I want to say it's called hydra or something?
02:01:21 No clue, I just poke at stuff in the network inspector until it works
02:01:37 There may have been some changes, but it's worked more or less the same way for a couple years
02:08:59 Ah, they are switching to a new api and apparently it's like nowhere near feature compatible with the old one, dunno how much that had changed
02:21:12 is there a channel for fotoalbum.ee?
02:21:18 the project doesn't have a wiki page
02:22:25 #lookatthisfotograph
13:09:40 mgrandi: Last I heard was 'we won't be doing this until there is a way to memorialise accounts' or something like that, not that they abandoned it entirely.
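[Editor's sketch] The VOD chat paging described at 01:50:40 and 01:52:40 (request a page at an offset, take the highest offset returned, request again from there) could look roughly like the following. The URL template and Client-ID value are taken from the chat; the response field names `comments`, `_id` and `content_offset_seconds`, and the helper names `fetch_page` and `iter_vod_chat`, are assumptions for illustration and are not confirmed by the log. The v5 endpoint may no longer respond, so treat this as a sketch of the paging idea only.

```python
import json
import urllib.request
from typing import Callable, Iterator, List

# URL template and Client-ID as quoted in the chat log.
API = ("https://api.twitch.tv/v5/videos/{video_id}/comments"
       "?content_offset_seconds={offset}")
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"


def fetch_page(video_id: str, offset: float) -> List[dict]:
    """Fetch one page of comments starting at `offset` seconds.

    Assumes the JSON body has a top-level "comments" list (hypothetical).
    """
    req = urllib.request.Request(
        API.format(video_id=video_id, offset=offset),
        headers={"Client-ID": CLIENT_ID},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("comments", [])


def iter_vod_chat(
    video_id: str,
    fetch: Callable[[str, float], List[dict]] = fetch_page,
) -> Iterator[dict]:
    """Yield each comment once, paging by the highest offset seen so far."""
    seen = set()
    offset = 0.0
    while True:
        page = fetch(video_id, offset)
        # Pages overlap at the boundary offset, so drop already-seen IDs.
        new = [c for c in page if c["_id"] not in seen]
        if not new:
            return  # nothing new at this offset: assume end of chat
        for comment in new:
            seen.add(comment["_id"])
            yield comment
        # Next request starts at the highest offset returned.
        offset = max(c["content_offset_seconds"] for c in new)
```

Passing `fetch` in as a parameter keeps the paging logic testable without network access. One known limitation of this approach: if more comments share a single offset than fit in one page, the loop stops early rather than retrieving the rest.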
14:28:18 -purplebot- List of websites excluded from the Wayback Machine edited by Nikchemny (+52, Added this website) just now -- https://www.archiveteam.org/?diff=45667&oldid=45658
19:55:00 how long does it take to get the confirmation email from hackint
19:55:31 Just a minute or two on the registrations I've done.
19:55:49 and there it is
19:56:58 im registered JAA
20:52:18 -purplebot- Coronavirus edited by Wessel1512 (+575, /* Information */) just now -- https://www.archiveteam.org/?diff=45672&oldid=45514