00:49:09 arkiver: It's fine to queue here, right?
00:49:28 I've got ~500k URLs from #burnthetwitch
00:51:18 https://transfer.archivete.am/14vHJu/burnthetwitch-urls
01:12:37 TheTechRobo: woah 500k, we will not DDoS anything with that?
01:12:39 if not, then yes
01:17:00 arkiver: They're mostly Twitch CDN
01:18:22 well let's do it
01:35:38 :D
01:35:48 so many
01:35:55 !a https://transfer.archivete.am/14vHJu/burnthetwitch-urls
01:36:15 fireonlive: Skipped 6 invalid URLs: https://transfer.archivete.am/EgP7W/burnthetwitch-urls.bad-urls.txt (for 'https://transfer.archivete.am/14vHJu/burnthetwitch-urls')
01:36:16 (it is !a right?)
01:36:16 fireonlive: Deduplicating and queuing 504120 items. (for 'https://transfer.archivete.am/14vHJu/burnthetwitch-urls')
01:36:18 there we go
01:36:32 look@em numbers go
01:36:36 pew pew
01:36:42 fireonlive: Deduplicated and queued 504120 items. (for 'https://transfer.archivete.am/14vHJu/burnthetwitch-urls')
01:59:37 https://transfer.archivete.am/vvQ2N/www.ontariotravelguides.com.log looks spammy to me
02:19:37 fireonlive: Thanks, I somehow forgot to queue the list. lol
02:20:12 np :3
02:20:20 :3
02:21:03 Here's the remainder:
02:21:05 !a https://transfer.archivete.am/B4Tyi/burnthetwitch-urls
02:21:09 TheTechRobo: Deduplicating and queuing 78812 items. (for 'https://transfer.archivete.am/B4Tyi/burnthetwitch-urls')
02:21:18 TheTechRobo: Deduplicated and queued 78812 items. (for 'https://transfer.archivete.am/B4Tyi/burnthetwitch-urls')
02:21:39 (obligatory thank you to transfer for decompressing zstd on-the-fly)
02:22:48 transfer++
02:22:48 -eggdrop- [karma] 'transfer' now has 1 karma!
02:42:11 :-)
02:44:44 Zstd is amazing for compression for temporary stuff
02:52:30 s/ for temporary stuff//
02:52:57 :-)
03:11:03 zstd is your friend
03:14:19 the only std you want!
03:35:43 JAA: Eh, I prefer lzip for long-term storage because it has nice data-recovery tools. (see https://www.nongnu.org/lzip/lziprecover.html)
03:35:57 fireonlive: what about libstd?
03:36:13 i'll allow it!
03:36:54 also, lzip has an amazing logo, so there's that
03:37:05 https://lounge.thetechrobo.ca/uploads/84e7a28d4f07de68/image.png Perfection
03:37:48 TheTechRobo: Yeah, but it's also much slower. And I'd argue that if you need to recover corrupted files, your entire storage system is already flawed. :-P
03:38:11 JAA: Isn't AT's mission statement preserving shit "forever"? :P
03:39:02 Lziprecover provides another avenue to recovering data potentially decades into the future.
03:39:03 Check and mate. :P
03:39:42 Hmm, so we should use a compression algorithm that takes forever to compress the data in the first place! *taps forehead*
03:40:04 if you want the ability to recover corrupted files, use par2 to generate parity data
03:40:38 nicolas17: But that's work.
03:40:39 You'll never need to recover from corrupted compressed files if you never get to the compressed output.
03:40:50 JAA: hey don't be mean to lzip like that
03:40:56 What did lzip ever do to you?
03:41:01 I propose `sleep inf && cat`.
03:41:11 JAA: we fixed the storage problem! :P
03:41:15 Yay
03:41:22 zstd + par2 anyone?
03:41:25 anyone?
03:41:25 :3
03:41:34 isn't paq8 even slower? :D
03:41:40 Zpaq ftw
03:41:51 small files but at what cost
03:42:52 your power bill
03:43:20 fireonlive: What about lzip and par2?
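[Editor's note: the "zstd + par2" combination floated above is easy to script. A minimal sketch follows, assuming the zstd and par2 command-line tools are installed; the filename is hypothetical and this is an illustration of the workflow, not anything the channel actually ran.]

```python
import subprocess

# Hypothetical input file for illustration.
src = "burnthetwitch-urls.txt"

# Compress with zstd at a high level; zstd keeps the input file
# by default and writes src + ".zst".
subprocess.run(["zstd", "-19", src], check=True)

# Generate ~10% parity data with par2; this writes src + ".zst.par2"
# plus recovery volumes alongside the archive.
subprocess.run(["par2", "create", "-r10", src + ".zst"], check=True)

# Later, after suspected bit rot:
#   par2 verify burnthetwitch-urls.txt.zst.par2   # detect corruption
#   par2 repair burnthetwitch-urls.txt.zst.par2   # rebuild from parity
```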
03:43:26 (I'm not shutting up about this)
03:43:28 🤔
03:43:38 lzip is amazing and I will *make* you all realise that
03:43:53 xz is ubiquitous
03:44:45 TheTechRobo's lzip torture room
03:45:15 I wonder what the best compression algo is with respect to a very compute-constrained future
03:46:35 * project10 is a party pooper :P
03:48:31 I love how the creator of BARF (http://mattmahoney.net/dc/barf.html) explains how it works in great detail, showing its limitations too (how it stores information in both the decompressor and filename), and then at the bottom says "BARF2 (Oct. 5, 2010) iteratively compresses random data down to 0 bytes and restores it correctly without hiding data in the decompresser or in the filename." and refuses to elaborate
03:50:37 X
03:50:57 Hehe
03:52:12 Ah, it saves the needed information to a separate file. Clever. lol
03:52:27 lol
03:52:38 looks like compression is basically treating the file as a number and subtracting 1
03:52:55 then all you need is knowing how many times you have to run the decompression tool
03:52:57 It's described as 'How to cheat at compression benchmarks' elsewhere on the website, so yeah.
03:53:19 yeah
03:53:51 to compress "hi" to 0 bytes you may need to compress it 0x6869 times, then save that number so you know how many times to decompress it to get "hi" back :P
03:54:57 Ah, lol
03:55:08 "It's evolving, just backwards." https://lounge.thetechrobo.ca/uploads/5c8fadae052c14a2/image.png
06:55:39 .cpp? what year is it :D
11:52:45 interesting, these folks have a database of local news orgs in the USA https://localnewsinitiative.northwestern.edu/ stateoflocalnews⊙ne
11:52:56 they describe it as "proprietary" though :(
16:09:22 -_-
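[Editor's note: to make the "treat the file as a number and subtract 1" cheat from the BARF discussion concrete, here is a minimal Python sketch of the idea (an illustration, not BARF's actual code). It reproduces the 0x6869 step count for "hi" quoted above; the naive integer mapping drops leading zero bytes, which is one of the ways such schemes cheat.]

```python
def compress_once(data: bytes) -> bytes:
    """One 'compression' step: treat the file as a big-endian integer
    and subtract 1. (Naive mapping: leading zero bytes are lost.)"""
    n = int.from_bytes(data, "big")
    if n == 0:
        return b""
    n -= 1
    return n.to_bytes((n.bit_length() + 7) // 8, "big")

def compress_to_zero(data: bytes) -> int:
    """Repeatedly 'compress' until the file is empty; the step count is
    the hidden state that must be stored elsewhere to decompress."""
    steps = 0
    while data:
        data = compress_once(data)
        steps += 1
    return steps

def decompress(steps: int) -> bytes:
    """Reverse the process: `steps` subtractions reached 0, so the
    original integer value is exactly `steps`."""
    return steps.to_bytes((steps.bit_length() + 7) // 8, "big") if steps else b""

assert compress_to_zero(b"hi") == 0x6869  # 26729 rounds, as noted in the log
assert decompress(0x6869) == b"hi"
```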