00:06:02 FTR, I threw vger.kernel.org into AB, it is small (no archives)
01:09:12 Hi everyone: Hi, just wondering if anyone know if the upload of the "telenor home" collection is all different version of the same set of website crawls or if every large collection on archive.org are different? https://archive.org/details/archiveteam_telenor - https://wiki.archiveteam.org/index.php/Telenor
01:18:09 mrcave: I'm pretty sure that each item in that collection contains different data, split so that each item is ~20 GB each
01:18:54 the home.online.no/~joeolavl/ and similar ones are a bit weird but it sounds like those were for individual users that weren't found by the main grab
01:19:09 So if you wanted to download all of the site, you'd need to download the WARCs from all of the items
01:21:16 hey, thanks for the info
01:26:04 I check the individual home.online.no/~joeolavl uploads. I helped out on the home.no grab, but not looked at the files since. now trying to look for band pages, but finding interest in writing a summary of home.no and it content. compared to geo cities, the users was the owner of the Internet subscription, so the pages are often made by adults and
01:26:04 with a close family vibe.. person start pages for the familys+++ only found 1 band page
01:56:47 Have there been any major changes in the last ~6 months? Anything I can help with in the end-of-year rush?
01:59:00 OrIdow6: hi :)
01:59:04 it's not very busy at the moment
01:59:16 mostly we're working on #frogger (Blogger) still, which is almost finished
02:00:08 That's good
02:00:21 And hi
02:03:16 hii
04:12:41 https://deadline.com/2023/12/andre-braugher-dead-homicide-life-on-the-street-brooklyn-nine-nine-actor-1235665513/
04:12:44 "André Braugher Dies: Star Of ‘Homicide: Life On The Street’, ‘Brooklyn Nine-Nine’ & Other Series And Films Was 61"
04:14:43 ...that URL sounds like he died by homicide
04:22:47 oh it does
04:23:55 I feel like they edited the headline after publication due to the same issue, but their system doesn't regenerate the slug in that case.
04:24:15 Yup: https://web.archive.org/web/20231213013032/https://deadline.com/2023/12/andre-braugher-dead-homicide-life-on-the-street-brooklyn-nine-nine-actor-1235665513/
04:24:46 Oh wait no, the isn't the article headline...
04:24:52 <JAA> And that's where the slug is derived from.
04:25:08 <fireonlive> ahh
04:25:43 <fireonlive> at least they have a unique ID in the URL so they can redirect later
04:45:35 <fireonlive> -+rss- Professor in Jordan sues sleuth who exposed citation anomalies: https://retractionwatch.com/2023/11/29/professor-in-jordan-sues-sleuth-who-exposed-citation-anomalies/ https://news.ycombinator.com/item?id=38622057
05:28:02 <DJ> Ello
05:29:54 <DJ> I would like to help archive ponychan, is there anything I can help with or do I just have to download Warrior?
05:30:31 <flashfire42> DJ God forbid I ask. Is something happening to ponychan? or would this be proactive archival?
05:30:52 <DJ> It's apparently shutting down on Jan 7th
05:31:02 <flashfire42> Do you have a source for that at all?
05:31:30 <DJ> Yep, here you go https://www.ponychan.net/chat/res/112453.html, it's on Deathwatch as well.
05:31:41 <DJ> Sorry just remove the comma
05:32:25 <flashfire42> Oh it is too. Ok so I mean depending on the rate limit it may just be an archivebot job. More warrior runners is always great but I am not sure this would be a warrior project cc arkiver maybe?
05:33:12 <fireonlive> hmm depends how many posts it has i suppose
05:33:16 <fireonlive> + media per post
05:33:24 <flashfire42> I dont know a lot about chans or MLP for that matter I tend to avoid both so
05:33:33 <fireonlive> though we do have until the 7th
05:33:36 <JAA> How far back do the posts go anyway? Many image boards continuously purge old posts.
05:34:23 <flashfire42> I was thinking that too JAA thats the way those boards often operate
05:34:28 <fireonlive> ah yes
05:34:42 <fireonlive> /pony/ shows 2023-07-24
05:35:05 <JAA> /oat/ goes back to 2021.
05:35:19 <fireonlive> /chat/ has one from 2023-08-29
05:35:30 <JAA> /fan/ 2015...
05:35:36 <JAA> So I guess it's not very consistent. lol
05:35:53 <fireonlive> hmm.. 11 pages on /oat/
05:36:03 <fireonlive> wonder if it's not very active
05:36:47 <fireonlive> i think imageboards are usually purged based on new threads instead of a timer
05:36:48 <JAA> Older posts still exist. Random example from /pony/: https://www.ponychan.net/pony/res/36833460.html
05:37:04 <fireonlive> oh interesting
05:37:38 <fireonlive> hm that one shows up in catalog still
05:37:43 <fireonlive> https://www.ponychan.net/pony/catalog.html
05:37:48 <fireonlive> so not nesc. pruned yet
05:43:19 <DJ> flashfire42 Alright, do I ask something specific or just go for it?
05:43:53 <flashfire42> If you look above they are discussing possible ways of doing it
05:44:30 <flashfire42> I also just threw it into archivebot just to see
05:44:44 <DJ> Ah okay then, thanks.
05:50:30 <JAA> Yeah, checking some more, those are all 404s. I just happened to check one of the very few that's in the catalog.html. lol
05:50:52 <fireonlive> ah good luck haha
05:52:52 <fireonlive> AB job seems to be going well so far
08:56:20 <angenieux> Hello
08:56:24 <angenieux> Would it be a good idea to rearrange the order of the command line argument of wget-at so that "--lua-script foo.lua" so its easier to see what project a particular process of wget-at is running with htop?
08:58:24 <angenieux> *so that --lua-script part is closer to the front
13:47:04 <foaf> hello guys
13:48:30 <foaf> im trying to decompress one megawarc.warc.zst file but it says that i need the dictionary I tried the script in the warc page but it gives me some errors File "C:\jk.py", line 46, in <module> d = get_dict(fp) ^^^^^^^^^^^^ File "C:\jk.py", line 30, in get_dict p = subprocess.Popen(['unzstd'], stdin = subprocess.PIPE, stdout =
13:48:30 <foaf> subprocess.PIPE, stderr = subprocess.PIPE) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Program Files\Python311\Lib\subprocess.py", line 1026, in __init__ self._execute_child(args, executable, preexec_fn, close_fds, File "C:\Program Files\Python311\Lib\subprocess.py",
13:48:31 <foaf> line 1538, in _execute_child hp, ht, pid, tid = _winapi.CreateProcess(executable, args, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^FileNotFoundError: [WinError 2]
13:48:50 <foaf> i have tried to change unzstd for zstd with no results
13:49:08 <foaf> thanks in advance
14:14:24 <TheTechRobo> Is that script compatible with windows?
14:19:36 <foaf> i dont know what i changed is unzstd to zstd
14:33:16 <TheTechRobo> Make sure it's in the same folder as yhe script
14:33:18 <TheTechRobo> *the
23:45:22 <flashfire42> https://apo.org.au/ were we able to do anything about this?
23:46:37 <JAA> Nope, all attempts got banned very quickly.
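Note on foaf's traceback at 13:48: a `FileNotFoundError: [WinError 2]` raised from `subprocess.Popen(['unzstd'], ...)` generally means Windows could not find an executable with that name, which is a separate problem from the missing dictionary. A minimal sketch of checking for the binary before running the script, assuming a zstd command-line build is installed and reachable via PATH; `find_zstd` and `decompress_command` are hypothetical helper names and are not part of the wiki script.

```python
import shutil


def find_zstd(candidates=("unzstd", "zstd")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        # shutil.which also applies PATHEXT on Windows, so "zstd" matches zstd.exe.
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def decompress_command(exe_path):
    """Build an argv list that decompresses stdin to stdout."""
    # -d selects decompression and -c writes the result to stdout.
    return [exe_path, "-d", "-c"]


if __name__ == "__main__":
    exe = find_zstd()
    if exe is None:
        print("No zstd binary found on PATH; Popen would fail with FileNotFoundError.")
    else:
        print("Would run:", decompress_command(exe))
```

If `find_zstd()` returns `None`, adding the folder that holds the zstd executable to PATH, or passing its full path to `Popen` in place of `'unzstd'`, is one way to get past this particular error.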