00:00:10 JAABot edited CurrentWarriorProject (+4): https://wiki.archiveteam.org/?diff=49206&oldid=49202
04:02:50 Hi?
04:09:33 Bye.
14:01:31 "Ugh, glad I never tried out Atom..." <- I used atom for a long time, I liked it a lot (still do), but now that development has ceased, you just can’t compete with vscode when you need to actually code stuff :/
14:02:44 Yes, I mean tianya. Looks like my Matrix client was a bit wonky; the messages above the one I sent were not there when I sent my original message...
14:13:30 should I gather a list of thread URLs from tianya, or is this already being worked on?
15:04:42 schwarzkatz|m: feel free to start gathering
15:05:01 Ok
15:05:23 Maakuth|m: do you know where those WARCs are on IA?
15:05:29 of koti.mbnet.fi
15:14:04 arkiver: I don't know if Orldow6^2 uploaded them, but from a quick look they at least seem to be on the transfer site, linked here: https://pad.notkiska.pw/p/mbnet
15:15:02 It doesn't seem like they mentioned an IA URL in #webroasting
15:31:42 schwarzkatz|m: but we will also have discovery through a project
15:32:52 Maakuth|m: ah, is this the stuff for which transfer.archivete.am was used, even though it shouldn't be used for that?
15:33:32 I'm afraid so.
15:35:16 alright, now I know what this is about
15:35:36 ok. let me know if I can be of help
15:36:45 it seems that I have a full set of those tars on my machine too, if some have gone missing from the transfer site for some reason
17:17:07 Codecov, a code coverage metrics company, has been acquired by Sentry: https://about.codecov.io/blog/codecov-is-joining-sentry-heres-what-you-need-to-know/
17:25:23 Hello, I have some resources about BuzzVideo archival over here: https://github.com/yt-dlp/yt-dlp/issues/5330
17:26:18 I also added a bit more metadata to the extractor: https://github.com/upintheairsheep/ytdl-sheep/pull/5/commits/092f88c6f6c2608fbe6a21fb8d2dfa2918deb94a
17:26:47 Question: was BuzzVideo put in ArchiveBot yet?
17:28:44 The above issue contains a BuzzVideo extractor, which could be pulled into the main yt-dlp, and then we could tubeup the whole site's videos.
17:32:37 upintheairsheep: we're not going to tubeup BuzzVideo to IA
17:32:55 Just asking, why?
17:33:11 IA is already being spammed with a ton of tubeup stuff
17:33:14 But you could refer to the extractor to WARC the site's videos
17:33:26 OK, I understand.
17:33:30 BuzzVideo will likely be archived into WARCs in a warrior project
17:33:47 you can join #buzzoff for the upcoming BuzzVideo project
17:35:02 It would be nice for someone to merge the extractor, as I have been banned for OCD-induced issue spamming without verbose logs
17:35:49 archiveteam does not maintain yt-dlp
17:35:55 Is there a way to archive a Facebook profile as a logged-in user? I have a friend who unexpectedly died about a week ago, and I am trying to preserve his legacy. I am friends with him on Facebook, so I see more than the public profile shows. Is it possible for me to crawl his profile as me and then add it to an archive?
17:36:24 Ryz told me to ask about this here.
17:36:27 ChrisWsrn: warcprox?
17:37:26 I do not have any archiving skills yet. Is there a wiki page I can take a look at on warcprox?
17:38:18 ChrisWsrn: you could try https://github.com/gildas-lormeau/SingleFile until someone comes up with a better idea for Facebook.
17:38:18 It generates a static HTML page of the site.
17:38:28 not specifically, but there's a readme here: https://github.com/internetarchive/warcprox
17:38:45 Thanks.
17:38:58 What should I do with these files I collect?
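[Editor's note: a minimal sketch of how the warcprox suggestion above could be used. Assumptions not taken from the chat: warcprox is already installed and running locally (port 8000 and the CA certificate path below are placeholders; see the warcprox README for the actual flags). Traffic sent through the proxy while logged in gets written to WARC files.]

```python
# Sketch: verify a locally running warcprox instance by sending a request
# through it. For a logged-in Facebook profile you would instead point the
# browser's proxy settings at the same address so your session cookies apply.
import requests

PROXY = "http://localhost:8000"  # assumed warcprox listen address

session = requests.Session()
session.proxies = {"http": PROXY, "https": PROXY}
# warcprox intercepts HTTPS with its own CA certificate; point `verify` at
# that CA file (this path is a placeholder) or TLS verification will fail.
session.verify = "./warcprox-ca.pem"

resp = session.get("https://example.com/")
print(resp.status_code, len(resp.content))
# The captured response should now be in the WARC directory warcprox writes to.
```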
17:39:22 (that said, if you're not comfortable with e.g. the command line, you may not find it user-friendly)
17:39:36 Command line I am fine with.
17:39:43 Alternatively, you could manually save each page or the entire feed as a single .mhtml file and upload it to the Internet Archive if that's all you need, and use tubeup with authentication for videos.
17:40:00 Does your friend have any other social media accounts?
17:40:26 you could upload them to the Internet Archive. that won't put them in the Wayback Machine, but it will make them available for download
17:40:58 I'm sorry for your loss. I've never been through that kind of pain myself.
17:41:10 Nobody major died.
17:41:54 You could also download each image by right-clicking it and upload them manually, if that's all you need.
17:42:10 YouTube, which Ryz put in #down-the-tube. He had a blog, which I added manually with https://web.archive.org/save and which I think was added to be crawled by #archivebot. He has a presence on the Doomworld forums and some other forums. I am still looking for other things.
17:42:59 ChrisWsrn: which blog? we might want to run it through #archivebot to ensure a complete copy is made
17:43:28 #down-the-tube seems to only have metadata and comments, did you archive the videos themselves yet?
17:43:40 #down-the-tube archives the videos
17:43:41 Blog is https://ghastlygaming.wordpress.com/
17:43:51 and the videos will become playable in the Wayback Machine
17:43:55 It was just added less than an hour ago
17:44:07 ah good, I see Ryz covered it
17:52:14 arkiver, I just checked the thread count for bbs.tianya. We are looking at roughly 122,358,356 threads...
17:52:33 221 subforums
17:53:13 yeah, and roughly an equal number of accounts
17:53:36 (planning on going after all of tianya.cn)
17:53:42 I have not started crawling anything, but I wrote some documentation on what I gathered so far, should I upload that to the transfer site?
17:53:49 yes please!~
17:53:50 yes please!
17:54:01 let's make a channel for tianya.cn!
17:54:19 anyone have ideas for a tianya channel name?
17:54:57 byenya
17:55:35 endoftheworldclub
17:55:48 (https://en.wiktionary.org/wiki/%E5%A4%A9%E6%B6%AF)
17:56:30 (endoftheendoftheworldclub?)
17:56:41 yeah, it's literally named that in English https://en.wikipedia.org/wiki/Tianya_Club
17:59:09 here's what I've got so far https://transfer.archivete.am/qHTMP/docu-bbs.tianya.md
18:00:09 looks good
18:00:22 the additional -1 is interesting
18:00:43 (hm... can we get /inline/ for .md?)
18:06:04 if they exist, the -1 threads have a small symbol suffixed next to them (I cannot OCR the text) and are always the first thread in a subforum.
18:06:04 without the additional -1, they redirect to it though, so that's good.
18:06:22 like here http://bbs.tianya.cn/list.jsp?item=46&order=1
18:39:30 ?
19:31:00 Dragalia Lost has ended; it might be a good idea to do a final run of the website and of a comic which doesn't seem to be linked from the main page: https://dragalialost.com/sp/en/ https://comic.dragalialost.com/dragalialife/en/
19:33:54 https://twitter.com/DragaliaLostApp (if there is room in the probably overloaded socialbot space)
21:39:19 thuban: /inline/ doesn't do anything special, it just omits the Content-Disposition header to not force a download (so the browser can display the file directly if supported). I'm guessing you mean rendering the Markdown as HTML.
21:39:30 That'd be a bit more complicated.
21:56:20 atom.io is still horribly slow and throwing 500s all the time. I hope it'll get better soon. Can't really archive the packages at the moment.
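[Editor's note: a rough sketch of what walking the Atom package registry could look like, based on the later remark that the API returns almost 36k pages with 30 packages each (matching the footer's "1,078,592 packages & themes"). The endpoint URL and the `page` parameter below are my assumptions, not confirmed in the chat.]

```python
# Sketch: paginate through the (assumed) Atom package registry API,
# https://atom.io/api/packages?page=N, roughly 30 entries per page.
import json
import time
import requests

BASE = "https://atom.io/api/packages"  # assumed listing endpoint

def fetch_all_packages(max_pages=36000, delay=1.0):
    packages = []
    for page in range(1, max_pages + 1):
        for attempt in range(5):
            resp = requests.get(BASE, params={"page": page}, timeout=60)
            if resp.status_code == 200:
                break
            time.sleep(10)          # the site was throwing 500s a lot
        else:
            continue                # give up on this page after 5 tries
        batch = resp.json()
        if not batch:               # empty page: assume we've reached the end
            break
        packages.extend(batch)
        time.sleep(delay)           # be gentle, the site is struggling
    return packages

if __name__ == "__main__":
    pkgs = fetch_all_packages(max_pages=5)  # small test run
    print(len(pkgs), "packages fetched")
    with open("packages-sample.json", "w") as f:
        json.dump(pkgs, f)
```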
21:57:15 just in case, the people over at https://pulsar-edit.dev saved all the packages a while ago.
21:57:58 But did they do it as WARC, so that installing packages inside Atom will still be possible via the WBM after the shutdown?
21:58:04 :-)
21:59:20 Where can I find more details about what they did?
22:00:00 of course not, but at least it's better than nothing.
22:00:00 most likely on the discord server: 7aEbB9dGRT
22:00:23 :-/
22:01:14 *sad J_AA noises*
22:01:31 Yeah, an imperfect mirror is certainly better than nothing. Also, most packages are public GitHub repos, so they could probably still be installed from there as well.
22:12:11 most of the packages are on GitHub, yes. also note that they received a huge amount of spam after announcing that Atom will be deprecated.
22:12:11 ...I just checked, it's at 415k now... the number of actual packages is about 12k. On 2022-08-08 it was 15k.
22:12:11 also, here are some related repos: https://github.com/confused-Techie/AtomPackagesArchive
22:12:11 https://github.com/confused-Techie/atom-package-collection (used to migrate the packages to the Pulsar site)
22:13:17 sorry for not thinking of this earlier, I'm not checking in here enough
22:30:35 Yeah, I saw the spam. It's actually way more than 415k, no idea where that number comes from. The API returns almost 36k pages with 30 packages each, which also matches the number in the footer: '1,078,592 packages & themes'
22:32:05 that's incredible. interesting that they have not preemptively killed the site up until now
22:32:29 Yeah, or made it read-only or whatever.
22:34:10 How did they even handle uploading a theme? Can just anyone do it with no review process?
22:34:21 Or package, rather
22:56:32 I found Ghastly's Twitter (he died last week). What should I do to archive this? https://twitter.com/ghastly310
23:04:16 I also found his Bandcamp account, his Imgur account, his Reddit account, his Twitch account (empty), plus "some accounts" that he would not want archived. What tools should be used to archive these accounts?
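[Editor's note: for the earlier question "What should I do with these files I collect?", the advice was to upload them to the Internet Archive, which makes them downloadable even though they won't appear in the Wayback Machine. Below is a minimal sketch using the `internetarchive` Python library; the item identifier, filenames, and metadata values are placeholders, and IA credentials must already be configured (e.g. via `ia configure`).]

```python
# Sketch: upload collected files (WARCs, .mhtml saves, images, ...) to an
# Internet Archive item so they are preserved and available for download.
from internetarchive import upload

ITEM_ID = "example-personal-web-archive"  # hypothetical item identifier

responses = upload(
    ITEM_ID,
    files=["facebook-profile.warc.gz", "feed.mhtml"],  # placeholder filenames
    metadata={
        "title": "Example personal web archive",
        "mediatype": "data",  # mediatype choice is an assumption, adjust as needed
    },
)
# Each element is a requests.Response; 200 means the file was accepted.
print([r.status_code for r in responses])
```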