04:37:08 Even if it has ads, the original video is still there for archival purposes so it's not the end of the world at that end
09:33:12 i am banned from the-archive and distributed youtube archive, or else i might see what they have going for "important channel lists"
09:33:46 the petty politics is not good for archiving without duplication
13:36:13 immibis: what do you mean?
13:57:30 if the channels can't all be archived, having a list of channels that should be archived but currently aren't is still useful for when someone can do them
15:26:02 immibis: what channels are these and do you have descriptions of them? we also have #down-the-tube, but there are _very_ strict rules for that on the wiki
15:40:24 arkiver: that's what I meant - I'm not aware of any watchlist of channels worth archiving in advance of actually archiving them
15:41:09 my own system has a prioritized list of channels, and works through them at its own rate, with *months* of backlog
15:44:25 > i am banned from the-archive and distributed youtube archive, or else i might see what they have going for "important channel lists"
15:44:26 Nothing
15:49:52 i know that DYA has a spreadsheet covering stuff that is already archived
15:50:03 i'm pretty sure neither has a "want to have" list
15:50:16 probably because they just archive it, instead of putting it on a list
15:51:46 Yeah
16:26:40 !archive https://angrybirds.miraheze.org
16:35:10 lol
16:35:16 "my work here is done"
16:40:00 at least it's the correctish channel :P
18:05:18 well, they did try in #archiveteam after
18:05:20 xP
18:32:12 oh yeah :P
20:45:19 JAA: who runs archivebot pipelines?
20:45:48 we should really archive some hamas (and likely related) sites - but this may have negative implications for whatever IP this is run on
20:45:49 arkiver: Yours truly.
20:46:10 JAA: ah :)
20:46:15 Two machines are my own, the rest are rented by others, and I run everything from there on.
20:46:31 s/rented // (not all are rented servers, actually)
20:47:11 so basically looking for someone who might want to run a temporary archivebot which we can use to archive hamas and related content
21:36:12 Is there an existing project that handles a phpbb forum? Xentax is closing at the end of the year and has a lot of attachments that aren't hosted anywhere else
21:38:02 mgrandi: not specifically, plus (while there's been an ab job for the forums) i'm given to understand that attachments are login-walled
21:38:58 Yeah, that makes it hard for AB right
21:39:22 But I was seeing if maybe I could look at the seesaw project code if one exists for a phpbb forum
21:41:50 sort of--archivebot is technically capable of doing logged-in crawls, but the interface is designed not to allow them to be configured, because as a matter of policy we don't send them to the wbm
21:42:38 grab-site is basically the same internals and would work well for an 'unofficial' crawl if given login cookies (just use the forums igset)
21:45:30 I know past seesaw scrapes do login crawls, dunno if policy has changed
21:46:31 the only one i'm aware of was yahoo groups, and that was agreed on as a special case
21:50:00 I know a few art site scrapes were cause you needed to be logged in to see nsfw art
21:51:45 til, i must not have been around for those
21:52:47 anyway, a grab-site run would be a good start--could dump it on ia as an item if nothing else
21:53:12 I think there's also JAA's qwarc - it *probably* could do logged in stuff if needed
21:53:32 https://github.com/ArchiveTeam/furaffinity-grab
21:53:41 Was one of them
21:54:02 I can see if I can write a script to grab urls, or see if one exists
21:54:48 2015, wow
21:55:26 Yeah, we did a couple projects with accounts, but the most recent one was over 5 years ago I believe.
21:56:10 And generally, such data won't go into the WBM these days.
21:56:25 JAA: yahoo groups was 2019-2020. but that one was... special
21:56:45 Well, yeah, but we didn't create WARCs with accounts there, I believe.
21:57:42 Maybe I'm misremembering, but I think it was only for GMD exports.
21:58:41 i believe so yes
21:58:47 (but same disclaimer here)
21:59:51 the api grab used login cookies and generated warcs https://wiki.archiveteam.org/index.php/Yahoo!_Groups#2019_API_grab https://github.com/ArchiveTeam/yahoo-group-archiver
22:01:18 > from warcio import WARCWriter
22:01:24 *twitch*
22:01:24 :(
22:01:44 'Special' indeed...
22:01:53 oof. >"1/3 is in australia, 1/3 with me, and 1/3 on IA"[12]. As of September 2022 neither of the first two parts have been uploaded.
22:02:18 just the flowchart makes me wanna cry
22:02:24 oh
22:02:29 was this that alternative project?
22:02:46 Yeah
22:02:50 (the state of affairs depicted by the flowchart, not the flowchart itself, it's a very nice flowchart, thank you Doranwen)
22:03:02 sigh
22:03:06 where did those WARCs end up?
22:03:28 that was annoying from what i remember
22:04:06 ah the data was never uploaded?
22:04:19 it should be, but not in the wayback machine due to warcio and login
22:05:05 some of it's been uploaded (but not in wbm afaik), some of it's floating around in limbo
22:05:39 ask marked and/or lennier1 (lennier2?)
22:06:11 they'll upload when they upload, there's been plenty of time
22:06:47 anyway
22:06:57 mgrandi: writing your own script(s) seems like overkill; something wrong with grab-site?
22:07:49 mgrandi: are the links you get to through login actually then downloadable without login?
22:08:11 oh, good question
22:09:07 No I think it needs a cookie but I can find out later
22:09:15 I dunno how grab site works or if I can run that heh
22:09:59 it's basically local archivebot, dashboard and all
22:10:12 https://github.com/ArchiveTeam/grab-site/
22:13:12 Kevidryon2 created "osu!" (+2283, Version 2): https://wiki.archiveteam.org/?title=%22osu%21%22
22:13:13 JustAnotherArchivist moved "osu!" to Osu!: https://wiki.archiveteam.org/?title=Osu%21
22:13:26 I never did figure out for sure what that "1/3 in Australia" was referring to. Did datechnoman run any targets? I was meaning to ask them.
22:15:12 JustAnotherArchivist edited Osu! (-59, Cleanup): https://wiki.archiveteam.org/?diff=50978&oldid=50977
22:30:19 https://forum.xentax.com/download/file.php?id=22167 here is an example file
22:31:20 Looks like you need to be logged in, maybe it does a redirect to the raw file that might not check
22:32:05 oh dammit, they moved the date up
22:34:16 Switchnode edited Deathwatch (-1, /* 2023 */ update xentax with new deadline): https://wiki.archiveteam.org/?diff=50979&oldid=50973
22:34:37 (dec 1 now)
23:20:19 :/
23:35:34 thuban: I didn't create the flowchart, lol - I did contribute to some of the *data* being acquired, and to sorting it all out (still working on that!) - but I'd have to search through logs to see who created the flowchart
23:38:01 I appreciate your dedication <3
23:43:09 oh, my mistake--looks like it was OrIdow6. i hereby redirect my thanks
23:43:11 ditto, though