07:44:29 Exorcism edited Bugzilla (-49, /* Status */): https://wiki.archiveteam.org/?diff=52870&oldid=52869
07:55:06 I'm trying to list a urls.txt of all blog posts on cnblogs.com
09:15:44 Exorcism edited Bugzilla (+46, /* Status */): https://wiki.archiveteam.org/?diff=52871&oldid=52870
09:16:44 Exorcism edited Bugzilla (+29, /* Status */): https://wiki.archiveteam.org/?diff=52872&oldid=52871
09:17:45 Exorcism edited Bugzilla (+23, /* Status */): https://wiki.archiveteam.org/?diff=52873&oldid=52872
09:21:45 Exorcism edited Bugzilla (+22, /* Status */): https://wiki.archiveteam.org/?diff=52874&oldid=52873
09:23:46 Exorcism edited Bugzilla (+36, /* Status */): https://wiki.archiveteam.org/?diff=52875&oldid=52874
09:25:46 Exorcism edited Bugzilla (+34, /* Status */): https://wiki.archiveteam.org/?diff=52876&oldid=52875
09:27:46 Exorcism edited Bugzilla (+32, /* Status */): https://wiki.archiveteam.org/?diff=52877&oldid=52876
09:35:43 Bugzilla++
09:35:43 -eggdrop- [karma] 'Bugzilla' now has 1 karma!
09:40:49 Exorcism edited Bugzilla (+79, /* Status */): https://wiki.archiveteam.org/?diff=52878&oldid=52877
09:45:49 Exorcism edited Bugzilla (+47, /* Status */): https://wiki.archiveteam.org/?diff=52879&oldid=52878
09:48:50 Exorcism edited Bugzilla (+47, /* Status */): https://wiki.archiveteam.org/?diff=52880&oldid=52879
09:49:50 Exorcism edited Bugzilla (+26, /* Status */): https://wiki.archiveteam.org/?diff=52881&oldid=52880
09:55:51 Exorcism edited Bugzilla (+33, /* Status */): https://wiki.archiveteam.org/?diff=52882&oldid=52881
09:57:51 Exorcism edited Bugzilla (+48, /* Status */): https://wiki.archiveteam.org/?diff=52883&oldid=52882
09:57:52 Exorcism edited Bugzilla (+24, /* Status */): https://wiki.archiveteam.org/?diff=52884&oldid=52883
10:03:52 Exorcism edited Bugzilla (+35, /* Status */): https://wiki.archiveteam.org/?diff=52885&oldid=52884
12:27:59 Can we get a "thanks, we know about"... whatever the heck that Japanese thing is
12:45:28 kpcyrd makes a good point in #archiveteam. If the Nazi propaganda YouTube videos are archived, maybe whoever uploads them should request the Internet Archive restrict access so that the IA doesn’t become NaziTube for Germans looking to get around the government ban.
12:46:28 I’m referring to this: in germany a right wing media outlet has been banned today; their websites are already timing out: https://apnews.com/article/germany-far-right-magazine-compact-banned-c284d76eb1a83f7c651299606d31337a
12:53:09 i think they usually go through stuff like that anyways, even if not immediately. i know other videos i've uploaded that are still publicly available elsewhere have been restricted ("may contain harmful content") and put into the de-emphasized Fringe collection
12:54:00 yzqzss: what for, if i may ask?
12:55:27 Exorcism edited Bugzilla (+32, /* Status */): https://wiki.archiveteam.org/?diff=52886&oldid=52885
13:36:45 18 million or so blog posts on cnblogs.com
13:37:15 any more info other than https://www.cnblogs.com/cmt/p/18302049 ?
13:38:23 so big chance of going away
13:38:53 just asking because there's already a job running (1lbcky9haf2j84w3j3vyb0lv3)
13:39:32 c3manu: is AB enough?
13:40:01 in what sense?
13:41:44 (answer is probably "i've got no idea" either way)
14:08:40 Exorcism edited Mailman/2 (+37, /* Status */): https://wiki.archiveteam.org/?diff=52887&oldid=52740
14:10:40 Exorcism edited Mailman/2 (-6): https://wiki.archiveteam.org/?diff=52888&oldid=52887
15:29:54 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52889&oldid=52886
16:17:03 About hp.vector.co.jp...
16:17:17 https://web.archive.org/web/20161012170825/www.vector.co.jp/vpack/author/listpage.html is the latest version of the public homepage list, before they took it down
16:17:25 (but I'd hazard a guess that no/few new ones were made since then)
16:17:49 I don't know if it's a complete list, but the sum of all the hp.vector.co.jp links on that page is a good starting seed for archiving hp.vector.co.jp
16:18:06 I can try and dump them into a more machine-readable format, if that'd help
16:18:20 (since all the authors got shutdown notification emails today, I expect some of them will replace their sites with redirects sooner than later)
16:19:04 Other than that, all the URLs follow the format http://hp.vector.co.jp/authors/VAnnnnnn/ [nnnnnn - 0-9], so checking for the existence of any unlisted pages could be done too, at a slower pace
16:21:40 It's essentially Geocities for Japanese hobbyist software developers, and the pages have a small size limit, so I think AB is sufficient (just a fairly large job due to the sheer number of pages)
16:39:06 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52890&oldid=52889
17:11:12 Exorcism edited Bugzilla (-49, /* Status */): https://wiki.archiveteam.org/?diff=52891&oldid=52890
17:13:12 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52892&oldid=52891
17:31:34 asie: i think that would be helpful, yeah. i'm brute-forcing authors but it's pretty slow on their end
17:35:16 Bzc6p edited Kepfeltoltes.eu (+873, /* Site reconaissance */ list URL types of images): https://wiki.archiveteam.org/?diff=52893&oldid=49819
17:37:16 Bzc6p edited Kepfeltoltes.eu (+108, /* Archiving */ update with 2023 data): https://wiki.archiveteam.org/?diff=52894&oldid=52893
17:43:50 asie: oh sweet, nice they had a list at all :)
18:06:05 Hi, On 2024/12/20, a free web hosting service called hp.vector.co.jp will be shut down. This service is operated by vector.co.jp, a Japanese software distribution service, and is mainly used by old software authors for their websites. hp.vector contains a lot of information about old software, and its disappearance reminds me of Geocities..
18:06:27 I couldn't find an official announcement. vector seems to have emailed the closure only to site registrants. here is a link to a screenshot of the email posted on Twitter.
18:06:36 https://x.com/k_takata/status/1813477938987479357
18:06:36 nitter: https://nitter.poast.org/k_takata/status/1813477938987479357
18:15:28 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52895&oldid=52892
18:17:28 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52896&oldid=52895
18:18:29 Webuser864: this shutdown was a topic previously in this channel, so people should be looking into it
18:18:41 HELP us list all article URLs of cnblogs.com: https://git.saveweb.org/saveweb/cnblogs
18:21:16 lennart: I appreciate you letting me know!
18:21:27 glad to help
18:23:04 thank you :)
18:55:34 "18 million or so blog posts on..." <- We expect it to be less than 18 million
19:01:26 A few years ago they were fined and asked to implement stricter censorship. They hid a lot of posts at that time. Some of them are still hidden now (we don't know what percentage yet)
19:02:18 ah :(
19:02:35 i assume they didn't make a mistake and we can still see them somehow?
19:08:49 "i assume they didn't make a..." <- I didn't dig it out
19:10:44 ah ok
20:04:30 https://git.saveweb.org/saveweb/cnblogs just upload a docker image, feel safe to run :)
20:06:32 uploaded
20:14:19 thuban: here they are -> https://asie.pl/files/hp_vector_urls_20161012.txt
20:16:07 yzqzss: started a container :)
20:16:45 docker++
20:16:45 -eggdrop- [karma] 'docker' now has 1 karma!
20:16:54 docker++
20:16:54 -eggdrop- [karma] 'docker' now has 2 karma!
20:17:07 docker++
20:17:07 -eggdrop- [karma] 'docker' now has 3 karma!
20:19:20 oop, it crashed; but it came back
20:21:49 yzqzss: https://transfer.archivete.am/inline/3fZmf/cnblogs.log
20:23:49 Exorcism edited Bugzilla (+36, /* Status */): https://wiki.archiveteam.org/?diff=52897&oldid=52896
20:24:12 asie: thanks! my brute-force output looks ~identical to that list so far; we'll see whether there's more at the 'end'
20:24:40 know anything about those two differently-formatted slugs?
20:24:45 no
20:25:00 I've stumbled on hp.vector.co.jp occasionally, but I've never been like, an avid user
20:37:16 also just saw fireonlive's panic
20:40:18 I can't get it to run for much more than 2 minutes without seeing a panic
20:42:52 "oop, it crashed; but it came..." <- "EnsureHomepageOK failed for poweredby" These failed tasks will requeue, don't worry
20:49:53 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52898&oldid=52897
21:02:57 sounds good :)
21:32:58 released a fix (v0.2.1)
21:36:32 containrrr/watchtower -Rv gogo
21:37:00 hmm needs a new docker image maybe
21:45:02 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52899&oldid=52898
21:45:07 builds are triggered hourly :)
22:10:18 now at 200 post/s, ETA: <20h
22:18:16 ah :)
22:24:44 woo!
22:25:09 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52900&oldid=52899
22:25:26 yzqzss++
22:25:27 -eggdrop- [karma] 'yzqzss' now has 6 karma!
22:26:26 yzqzss++
22:26:26 -eggdrop- [karma] 'yzqzss' now has 7 karma!
22:43:01 yzqzss++
22:43:03 -eggdrop- [karma] 'yzqzss' now has 8 karma!
22:43:12 Exorcism edited Bugzilla (-49, /* Status */): https://wiki.archiveteam.org/?diff=52901&oldid=52900
22:51:13 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52902&oldid=52901
23:11:17 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52903&oldid=52902
23:38:21 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52904&oldid=52903
23:48:23 Exorcism edited Bugzilla (+0, /* Status */ aborted): https://wiki.archiveteam.org/?diff=52905&oldid=52904
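
Note on the 16:19:04 message: the URL format described there (http://hp.vector.co.jp/authors/VAnnnnnn/ with nnnnnn being digits 0-9) could be enumerated roughly as below. This is only a sketch, assuming every author ID is "VA" plus exactly six zero-padded decimal digits. The helper names `author_url` and `candidate_urls` are hypothetical and not from the log, and it is not the brute-force tooling mentioned at 17:31:34. It only generates candidate URLs and makes no requests.

```python
from itertools import islice
from typing import Iterator

# URL prefix quoted in the log; the six-digit zero-padded ID is an assumption.
BASE = "http://hp.vector.co.jp/authors/"


def author_url(n: int) -> str:
    """Format one candidate author URL, e.g. 42 -> .../VA000042/."""
    if not 0 <= n <= 999_999:
        raise ValueError("author number must fit in six digits")
    return f"{BASE}VA{n:06d}/"


def candidate_urls(start: int = 0, stop: int = 1_000_000) -> Iterator[str]:
    """Lazily yield candidate URLs for author numbers in [start, stop)."""
    for n in range(start, stop):
        yield author_url(n)


if __name__ == "__main__":
    # Print the first three candidates as a seed-list preview.
    for url in islice(candidate_urls(), 3):
        print(url)
```

Any real existence check against these candidates would need to be rate-limited, since the log notes the site is slow to respond.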