00:05:26 Yay :-)
00:06:49 moving queues
00:11:36 still seeing outdated project code on my end
00:12:04 yeah, i paused it again while queues are being moved around
00:12:21 i should do that in the proper way (which requires a few extra steps)
00:13:11 ah, right
00:14:04 imer: fixed :)
00:14:07 now properly paused
00:24:11 thanks, probably wasn't necessary though. moving the items shouldn't take too long, right?
00:26:56 no i think it's half way done or so
02:07:31 10ish day eta at this rate
02:07:59 should be speeding up a bit still as well
02:53:01 imer: yeah, due to not getting URLs from special interest pages anymore
02:54:18 mmh, guess if that's what needs to happen so we can keep up that's fine
02:54:34 just a shame there's so much spam :(
02:55:33 imer: that spam is now out of the way
02:55:43 oh cool, nice work :)
02:55:59 well until it pops up in a different shape
02:56:04 we'll deal with it then again
03:06:47 arkiver: there's still stuff left in the queue though to work through, right? that's just not going to queue more things?
03:07:08 https://transfer.archivete.am/MuV5S/2023-11-13_03-07-01.txt for example
03:12:16 * pabs reminds arkiver about the FLOSS planets urls-sources PR :)
03:45:02 we might want to filter some of the common domains, currently that looks like kalkulatorpolityczny.pl and unternehmen-mut.de (cc JAA)
04:03:35 imer: Those two should be gone now.
04:03:55 thanks :)
04:10:40 skalle66.de beachvolleyball2005.de multikodzik.pl aktualnewzory.pl tarmed.pl
04:25:33 Hmm, those are <2% each of the queue currently, compared to the 6-7% each on the previous two.
04:28:49 just looking at what stands out to me currently (> 1/s)
04:31:08 what to do is of course at your discretion, dont have the full picture
04:33:01 1% of the in-memory queue is still 200k URLs, so I'll yeet them.
04:34:29 Done
04:36:33 nice, thanks
13:23:15 rewby: seeing some -1's again
16:13:56 moving secondary to redo, so we can put several million PDFs in secondary for archiving
16:15:40 oooh buddy
16:17:14 :)
16:44:42 my poor cpu
16:46:38 Think mine has melted
16:46:50 RIP
16:47:00 My output has dropped, don't think I can manage as many concurrent when it's 90% pdfs
17:01:08 Ahh yeah seeing mainly -1 now, think we've filled up
17:02:17 AK: the PDFs are not going through now actually
17:02:19 imer: ^
17:02:35 moving items is 25% done only still
17:02:46 yeah
17:03:01 imer: the PDFs will go into secondary, so the regular URLs will still be going through backfeed, meaning the rate of PDFs is not 100%
17:03:10 as in relative rate
20:53:42 !a https://transfer.archivete.am/MYKeH/pdfs.txt
20:55:02 509 MB zstd-compressed
20:55:09 yeah
20:55:18 it's 1.5 GB decompressed
20:55:41 *chuckles* I'm in danger
20:55:54 :P
20:56:00 h2ibot: you'll be fine <3
20:56:53 19.7M PDF URLs
20:56:55 Nice :-)
21:02:28 arkiver: Fixed 123 unprintable URLs: https://transfer.archivete.am/9y6Su/pdfs.txt.not-printable.txt (for 'https://transfer.archivete.am/MYKeH/pdfs.txt')
21:02:29 arkiver: Deduplicating and queuing 19725857 items. (for 'https://transfer.archivete.am/MYKeH/pdfs.txt')
21:03:58 Bloom filter's not going to be forgiving you for this one
21:18:52 arkiver: Deduplicated and queued 19725857 items. (for 'https://transfer.archivete.am/MYKeH/pdfs.txt')
21:19:00 \o/
21:52:22 it's moved to secondary, so it will slowly be eaten
21:52:53 chomp
22:40:40 load average: 266.38
22:42:55 nice
22:43:12 goes down as the targets are clogged :(
22:59:21 optane9 like https://irc.project10.net/uploads/107ed7b0f174b236/Untitled.jpg
23:03:33 The problem with optane9 isn't that it's clogged, but it's eating while on the toilet
23:03:54 He wasn't ready for that joke
23:06:14 :|
23:07:58 :-)