02:14:45 !a https://transfer.archivete.am/Jf16B/dripr_urls.txt
02:14:48 arkiver: Skipped 1 bad URLs: https://transfer.archivete.am/11mgf4/dripr_urls.txt.bad-urls.txt
02:14:50 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/UgJEj/dripr_urls.txt.not-printable.txt
02:14:51 arkiver: Deduplicating and queuing 35975 items.
02:14:52 arkiver: Deduplicated and queued 35975 items.
02:24:05 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:24:11 arkiver: Invalid command message.
02:27:24 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:27:29 arkiver: Invalid command message.
02:29:03 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:29:09 arkiver: Skipped 31 bad URLs: https://transfer.archivete.am/CKtm5/discord-Anchor.bad-urls.txt
02:29:11 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/vnq32/discord-Anchor.not-printable.txt
02:29:12 arkiver: Deduplicating and queuing 112516 items.
02:29:16 arkiver: Deduplicated and queued 112516 items.
02:29:44 hah check out those ugly URLs https://transfer.archivete.am/CKtm5/discord-Anchor.bad-urls.txt
02:29:58 anyway fixed TheTechRobo :) we can now handle even more bad URLs here
02:34:11 Hmm, the on.aws one could be valid. URLs are not required to have a path component.
02:34:25 Same with http://localhost etc. actually.
02:34:57 true
02:36:08 URL pattern i use here requires at least one dot
02:36:15 before the first /
02:36:36 Yeah, that's fine I think. But the on.aws one is still valid even when requiring a domain with at least two labels.
02:36:46 yeah
02:37:12 I actually had to look it up in the URL standard to make sure, but it's here, point 2 in the path start state: https://url.spec.whatwg.org/#path-start-state
02:37:28 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:37:34 arkiver: Skipped 31 bad URLs: https://transfer.archivete.am/5sAkw/discord-Anchor.bad-urls.txt
02:37:35 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/mFFNm/discord-Anchor.not-printable.txt
02:37:36 arkiver: Deduplicating and queuing 112516 items.
02:37:42 arkiver: Deduplicated and queued 112516 items.
02:37:52 ah IDN encoding problems
02:50:46 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:50:52 arkiver: Skipped 31 bad URLs: https://transfer.archivete.am/BuWnv/discord-Anchor.bad-urls.txt
02:50:53 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/kCnMa/discord-Anchor.not-printable.txt
02:50:54 arkiver: Deduplicating and queuing 112516 items.
02:51:00 arkiver: Deduplicated and queued 112516 items.
02:51:15 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:51:21 arkiver: Skipped 1 bad URLs: https://transfer.archivete.am/ZnQCQ/discord-Anchor.bad-urls.txt
02:51:22 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/WFgAC/discord-Anchor.not-printable.txt
02:51:23 arkiver: Deduplicating and queuing 112546 items.
02:51:30 arkiver: Deduplicated and queued 112546 items.
02:53:25 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:53:32 arkiver: Skipped 1 bad URLs: https://transfer.archivete.am/2TUjc/discord-Anchor.bad-urls.txt
02:53:33 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/ZoSr/discord-Anchor.not-printable.txt
02:53:34 arkiver: Deduplicating and queuing 112546 items.
02:53:38 arkiver: Deduplicated and queued 112546 items.
02:55:34 going to keep in the requirement of a dot in the domain
02:56:04 !a https://transfer.archivete.am/pm9jk/discord-Anchor
02:56:10 arkiver: Skipped 30 bad URLs: https://transfer.archivete.am/12mcVj/discord-Anchor.bad-urls.txt
02:56:12 arkiver: Skipped 1 unprintable URLs: https://transfer.archivete.am/M7vCx/discord-Anchor.not-printable.txt
02:56:13 arkiver: Deduplicating and queuing 112517 items.
02:56:22 arkiver: Deduplicated and queued 112517 items.
02:56:48 JAA: we're adding the / now
02:57:00 inserting actually, one could say 'fixing'
02:57:20 * arkiver is afk for the night
03:00:48 Sounds good, and good night! :-)
03:08:27 `https://sh.rustup.rs](https://sh.rustup.rs/` is probably a bug in the regex
03:08:51 granted, it is not valid formatting, but a lot of people think that it works on discord since it uses mini markdown iirc
05:04:22 Another social media platform to go through for offsite links called 'Minds'; example, https://www.minds.com/RussLeachDraws/ - came from https://www.russleach.com/
05:04:25 arkiver ^
08:24:12 Hmm, what about mining outlinks from links like https://bio.site/momcmasters ? (Came from https://nitter.net/momcmasters ) - considering that stuff like Twitter only allows one URL to link out (whereas in the past, multiple links could be outputted), websites like this would bypass this limit
08:24:35 The originating website that runs those kinds of links is https://biosites.com/
19:35:04 I also thought about that, but they are sooo js hevy most of the time
19:35:13 heavy*
19:35:49 And there are like 100 of them, they can go down any day and links get lost
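
The URL handling discussed earlier in the log (keeping the requirement of a dot in the domain before the first `/`, and inserting the missing `/` when a URL has no path) could be sketched roughly as below. This is only an illustration, not the bot's actual code: the pattern `HOST_WITH_DOT`, the function `normalize`, and the host `example.on.aws` are hypothetical names made up for the example, and the real regex used in the channel is not shown in the log.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical pattern (an assumption, not the bot's regex): an http(s) scheme
# followed by an authority that contains at least one dot before the first "/".
HOST_WITH_DOT = re.compile(r"^https?://[^/?#]*\.[^/?#]+", re.IGNORECASE)


def normalize(url: str) -> Optional[str]:
    """Return the URL with an explicit "/" path, or None if it is rejected."""
    if not HOST_WITH_DOT.match(url):
        # No dot before the first "/", e.g. http://localhost
        return None
    parts = urlsplit(url)
    if parts.path == "":
        # A URL is not required to have a path component, so instead of
        # rejecting it we insert the "/" that was left out.
        parts = parts._replace(path="/")
    return urlunsplit(parts)


if __name__ == "__main__":
    for candidate in (
        "https://example.on.aws",      # hypothetical host, no path
        "https://example.on.aws?x=1",  # no path, but a query string
        "https://example.com/a/b",     # already has a path
        "http://localhost",            # rejected: no dot in the host
    ):
        print(candidate, "->", normalize(candidate))
```

With this approach a path-less URL is kept and completed rather than written to the bad-URLs list, while single-label hosts such as `localhost` are still skipped.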