02:07:49 2023-09-09 01:52:51 (1.36 MB/s) - Read error at byte 290689826387/484259014419 (Error decoding the received TLS packet.). Retrying.
02:07:58 using wget instead of curl just saved me
02:08:48 it resumed from where it failed
02:10:16 2023-09-09 01:52:39 (940 KB/s) - Read error at byte 38657556051/377308233499 (Error decoding the received TLS packet.). Retrying.
02:10:23 interesting that both downloads failed at similar times
02:52:34 interesting, I've only ever seen TLS errors from wayback machine
03:05:39 yeah this was from archive.org items, not wbm
04:35:47 does anyone know if AB/IA/WBM/something can save magnet links? this band publishes raw video files, audio stem files and fan-made bootleg videos via magnet: links https://kinggizzardandthelizardwizard.com/automation https://kinggizzardandthelizardwizard.com/bootlegger
04:39:17 not in danger (except if seeder counts go down), but stem files are so rarely published that it would be good to have them on IA
05:06:51 https://www.theregister.com/2023/09/08/atari_snaps_up_atariage/
05:34:15 pabs: I was going to say no but it seems actually yes
05:34:34 "If a valid .torrent file is uploaded (e.g. through our Uploader) into an item, when that item is derived, we will instantiate a BitTorrent client (Transmission) and attempt to retrieve the Torrent. If the Torrent is successfully retrieved, its contents will be added to the item. ‘Valid’ in this case means, well-formed and seeded."
05:34:53 "Bonus feature: if you have only a magnet link, and not a Torrent file, you can create a dummy .torrent file by pasting that magnet link into a text file and naming it foo.torrent."
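The "bonus feature" quoted above could be scripted roughly as follows. This is only a sketch: the helper name `write_dummy_torrent` and the validation checks are assumptions made for illustration, not anything the IA help page specifies. The page only says to paste the magnet link into a text file named like foo.torrent, and that is all the code does.

```python
from pathlib import Path
from urllib.parse import parse_qs, urlparse


def write_dummy_torrent(magnet_link: str, dest: Path) -> Path:
    """Write a magnet link into a plain-text file with a .torrent name.

    Hypothetical helper: the checks below are a local sanity check only,
    not a description of what IA itself validates.
    """
    link = magnet_link.strip()
    parsed = urlparse(link)
    if parsed.scheme != "magnet":
        raise ValueError("expected a magnet: link")

    # The "xt" (exact topic) parameter carries the BitTorrent info-hash.
    topics = parse_qs(parsed.query).get("xt", [])
    if not any(t.startswith("urn:btih:") for t in topics):
        raise ValueError("magnet link has no urn:btih: info-hash")

    if dest.suffix != ".torrent":
        raise ValueError("file name must end in .torrent")

    # The file contains nothing but the pasted magnet link.
    dest.write_text(link, encoding="utf-8")
    return dest


# Example (the magnet link and item identifier are placeholders):
#   write_dummy_torrent("magnet:?xt=urn:btih:<info-hash>", Path("foo.torrent"))
# The resulting file would then be uploaded into an item, e.g. with
#   ia upload <identifier> foo.torrent
```

Whether IA accepts a trailing newline or more than one such file per item is not covered by the quoted text, so the sketch writes the bare link only.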
05:35:30 not sure if you can do multiple .torrent files per item this way
05:40:48 nice, so just write to a file and ia upload
05:41:05 worth noting this will not seed the *original* torrents
05:41:22 * pabs not sure about doing this with the current IA upload bandwidth issues hmm
05:41:34 the end result is equivalent to you downloading the torrent and then uploading the files
05:42:15 hmm, guess I should check the torrents are fully seeded before doing this...
05:42:34 it saves *you* a ton of bandwidth
05:42:58 yeah
05:43:24 where did you read that btw?
05:43:47 https://help.archive.org/help/archive-bittorrents/
05:44:07 the first half of this page is all about IA-generated torrents for items
05:44:41 later it talks about using torrents to upload, which is an entirely different feature
05:45:39 thanks, adding some TODO notes
05:57:02 * pabs goes to AB atariage stuff
13:00:46 i had no idea about torrent uploads, that's awesome
13:52:40 It's also mentioned on the wiki: https://wiki.archiveteam.org/index.php/Internet_Archive#Torrent_upload
13:52:57 Used to use it a lot before I got fibre
15:20:46 hi, so there is this publisher that recently posted about possibly being insolvent, depending on how many donations they will receive in the next two weeks. they have a magazine, a shop, etc., 6 domains in total (that i’m aware of). would you be willing to run that through the archivebot? (repost here so it doesn't get lost in #archivebot)
15:20:56 this is their post, but it's german only :/ https://katapult-magazin.de/de/artikel/katapult-ist-insolvent
15:23:01 manu|m: thanks for the report! that looks archivebot-able. can you list the relevant domains?
15:25:28 sure: https://katapult-verlag.de/ - https://katapult-magazin.de/ - https://katapult-mv.de/ - https://katapult-ukraine.com/ - https://www.katapult-shop.de/ - https://katapult.link/
15:26:35 katapult-magazin.de has a bunch of subdomains, but they're all either some internal service with a login at the front or not reachable, and two were one-pagers that I sent to IA via browser extension
15:28:59 I don't think katapult.link needs to process external links, looks like they're only linking to their own stuff on other domains
15:34:40 having looked at the link structure between those domains, i think it'll be best to run a separate archivebot job for each
15:35:12 (possibly with --no-offsite-links on the katapult.link job to avoid duplication)
15:38:25 someone with bot privileges should queue them up shortly, i believe
15:44:49 thanks, i appreciate it :)
15:57:56 That lurker edited GitHub (-44, ghtorrent.org domain is used for casino…): https://wiki.archiveteam.org/?diff=50737&oldid=50426
18:14:25 wondering - how are we on resources for #archivebot nowadays? seems like all jobs are starting immediately. or is there something else that shows we would need more resources?
18:16:35 Right now we seem to be OK - it looks like all of the stuff I queued just now has filled all pipelines (I know because !status says 205 in progress, and around 200 is the limit - the exact number varies because some pipelines only accept specific jobs though), but that stuff should finish shortly. It was struggling a bit when there was a ton of gabon stuff going on but I
18:16:38 think it's OK at the moment
18:18:25 as my operating system teacher used to say, disks are more often full than empty, and are meant to be filled up :-) I think we're okay too. If I want to put in a lot of subdomains they will definitely be queued, and if socialbot comes back for twitter, the same will apply.
18:19:58 (though also it's worth noting that the 200 in progress is misleading as that includes stuff on the cybercontrol pipelines, which have been offline for over a year (see http://archivebot.com/pipelines))
18:20:34 certainly, also how about those jobs there?
18:21:57 I'm not entirely sure what their status is and if they could be resumed if/when the pipelines return
19:32:10 https://twitter.com/tanks404/status/1700171142642757657
19:32:10 nitter: https://nitter.net/tanks404/status/1700171142642757657
19:32:15 "Come end of Oct, Tokyo Lab, one of the largest holders of film material in Japan, will be discarding all film left unclaimed by rights holders. It's a foregone conclusion that anything stuck in "licensing hell" is done-for. This is bad."
19:32:29 https://twitter.com/retroanimechris/status/1700512775536042073
19:32:30 nitter: https://nitter.net/retroanimechris/status/1700512775536042073
19:32:43 "> @nappasan @NFAJ_PR The negatives will be looked into by Councillor Ken Akamatsu. We have not received a commitment to resolve the issue, so please keep an eye on the progress. We would also like to continue to ask for effective approaches from all parties."
19:33:09 arkiver: re: archivebot, we have gotten pretty good at estimating the limit and backing off when we hit it, but that doesn't mean we couldn't use more resources, especially with a more intelligent queuing system, or a proper ability to suspend jobs
21:15:11 arkiver: note that there may be some low-priority stuff that people *aren't* adding to archivebot because they know of the issues uploading to IA
21:16:17 Sanqui: pokechu22: thank you! i do not have anything to offer right now, but wanted to know what the current situation is
21:16:36 nicolas17: i think it's fine for people to queue stuff anyway, we can always start using the offload space if really needed
21:16:39 also JAA ^
21:16:58 I mean where is archivebot data going now?
21:17:11 are we uploading to IA, maybe at reduced speed?
21:17:56 I guess I should also mention that there have been issues where one of the pipelines (I can't remember if it was hel3 or hel4) had issues uploading to IA, while the other one of the two was fine (and it also was able to transfer data to the other pipeline and upload it quickly). I'm not actually sure who's responsible for those pipelines (probably AK because
21:17:58 "ak-was-here-hel4") but it's something weird that's happening
21:21:38 (can't remember which one it is either). Iirc JA_A did some digging and couldn't see an obvious reason. It also didn't seem to happen all the time
21:22:12 (I pay the bill, JAA does all the managing of the AB stuff on the top because I don't know how and they're super nice <3)
21:22:18 it's being uploaded
21:22:22 i believe not at reduced speeds
21:22:28 ah
21:22:38 it's going in, and it's anyway a small portion of what we normally put through to IA so not a huge problem
21:22:47 I'm more familiar with how warrior stuff works and largely blind to archivebot :)
21:22:54 It's worth noting that the issue with hel3 or hel4 is a longstanding one that wasn't caused by the more recent IA issues; it's just mysterious
21:23:42 i bet rewby would have an idea about that, rewby is the expert when it comes to pushing data to IA
21:36:03 Myusernameisanything edited ISP Hosting (+3, I archived the scraped URLS for Claranet…): https://wiki.archiveteam.org/?diff=50738&oldid=50486
21:36:04 Myusernameisanything edited Claranet Netherlands Personal Web Pages (+3, I archived the scraped URLS and now they are in…): https://wiki.archiveteam.org/?diff=50739&oldid=47495
21:37:03 That lurker edited Valhalla (+483, Cerabytes product is like ment for archiving so…): https://wiki.archiveteam.org/?diff=50740&oldid=48930
22:11:23 arkiver: The hel4 upload issues are totally unrelated to anything at IA etc. That machine just has irregular transient issues towards the rsync target for some unknown reason. Nothing weird shows up in mtr etc. when it happens, but rsync transfers run at a couple hundred kB/s. Then it's fine again a few hours later. I asked a few people and nobody had any ideas what I could try to diagnose the issue
22:11:29 either. When I'm around, I route the rsync traffic through another machine at Hetzner Helsinki, and everything works perfectly fine there.
22:12:22 could it be a hardware issue?
22:12:25 On cybercontrol, I mentioned this in #archivebot recently, but it obviously got buried, so repeating for visibility: those pipelines might be a loss since the relevant person has been MIA for a long time now. If nothing happens by the end of the month, they'll be thrown out of the system.
22:12:31 Cc pokechu22 ^
22:12:56 yes anything mentioned in #archivebot will probably be missed by a ton of people, please post it here too or we should have a special AB discussion channel
22:13:25 #archivebot-bot
22:13:52 damn, never fun to have someone go MIA :(
22:13:58 I mark important things with [PSA], and I advise people to set up highlights for that. A separate discussion channel has been brought up several times, there even is one, but it's completely unused, and it could kind of lead to a split brain.
22:14:20 which one?
22:14:23 let's start using it
22:15:18 #archivebot-bs
22:16:20 for the interested: weechat.look.highlight_regex "^\[PSA\]"
22:16:30 that's what i use
22:18:00 i could do that with TL; though it works everywhere
22:18:08 add that to your reasons to hate TL JAA :p
22:32:29 is #archivebot-bs official, then?
22:32:45 no
22:32:47 There's just been discussion about that in there. :-P
22:34:17 so are we going to start using it or no ?_?
22:34:38 My point of view is: discussions in #archivebot are mostly about immediate action, e.g. ignores or starting jobs, and dev stuff can already happen in #archiveteam-dev. PSAs like the above are rare enough that I'm not sure a separate channel makes sense, and I can repeat them here in the future for visibility.
22:35:22 I'll also note that the above was merely a 'this might happen' comment, not a real PSA. If it comes to that, I'll of course explicitly ping everyone with affected jobs, too.
22:35:43 More important PSAs are also usually added to the channel topic.
22:35:54 (Like the 'don't touch the 600001 jobs' one)
22:37:33 i hope long term discussions/planning can be outside of #archivebot
22:37:41 talking about the cybercontrol stuff for example
22:37:52 Well, not much to plan there, sadly.
22:37:53 very job-specific stuff can be in #archivebot i think?
22:38:27 agreed, but i think JAA is correct that it can happen here or in -dev (as appropriate)
22:39:20 JAA: so archiveteam-bs/dev instead of archivebot-bs ?
22:42:33 I think so, at least. If others disagree, I won't oppose a separate channel.
22:43:48 we could decide tomorrow
22:47:17 JAA: Are PSAs posted on the #archiveteam channel?
22:51:42 that_lurker: If it's something sufficiently important, I can do that, yeah.
23:20:22 JustAnotherArchivist edited Deathwatch (+193, /* 2023 */ Add Squat the Planet): https://wiki.archiveteam.org/?diff=50741&oldid=50713
23:23:59 Yeah then I would say the current channels and practices cover the need of a dedicated channel. But I make the occasional memes so the decision is up to someone else :P
23:34:05 archivebot meme, go!
23:34:07 :D
23:40:09 https://lounge.kuhaon.fun/folder/28c3f47d3ea56b97/7yllrg.jpg
23:41:17 x3
23:41:38 Alternatively: right label 'ArchiveBot', left label 'Buttflare'
23:42:46 https://lounge.kuhaon.fun/folder/daed62744f4581d3/7yllyz.jpg
23:43:06 :-)