05:37:26 https://x.com/steventey/status/1814160924565201177?s=12
05:37:26 nitter: https://nitter.poast.org/steventey/status/1814160924565201177
05:37:32 no. bad. stop.
05:55:44 ^ some rando using the goo.gl shutdown to promote their own "successor"
05:59:02 cantap64⊙gc
05:59:42 suthamidea30⊙gc
06:01:25 cantap: ?
06:14:22 thanks for the list yzqzss
06:14:41 this goo.gl shut down is going to be hugely damaging
06:15:22 yzqzss: how was this list collected?
06:16:02 on goog.gl, note that URLTeam2 project does not create WARCs, anything found there we'll reprocess with a custom project
06:16:14 i'll bring it up early so we have a lot of time to find and queued items to it
06:16:27 URLs redirected to will be stashed for now, not immediately fed to #//
06:21:29 rock the WARC :)
06:23:18 sigh bot is down due to yet another power outage
06:23:52 power outage is not related to IA itself
06:24:02 can't think of a place that has more power outages :/
06:25:24 :( yeah
06:26:06 bot is back
06:28:07 :)
06:28:12 https://x.com/willrandship/status/1814159966028054791?s=12
06:28:13 nitter: https://nitter.poast.org/willrandship/status/1814159966028054791
06:28:29 all the immutable places (including print media) that have used goo.gl...
06:28:41 let's make a goo.gl channel! any suggestions?
06:29:14 there's #googlecrash for all things google-ish
06:29:30 ah, google drive
06:29:37 also i need opinion on the name of this thing... https://en.wikipedia.org/wiki/Google_URL_Shortener is it "Google URL Shortener" or "goo.gl" :P
06:29:50 fireonlive: yeah let's make a new one, this one will get traffic i think
06:30:37 could re-use microsoft's #scroogled https://en.wikipedia.org/wiki/Scroogled :p
06:31:00 i don't think i've ever heard it called 'Google URL Shortener' myself, interesting
06:32:12 (or a more-explicit #screwgled)
06:34:49 #googone
06:37:45 is #sgroogled too far fetched? :P (replace c by g)
06:38:07 i'm not native english speaker...
but i think scroogled and sgroogled would be pronounced the same?
06:48:01 at 62 characters with 0-9a-zA-Z it's only like 570 billion requests
06:48:29 oops
06:48:33 57 billion i mean
06:48:42 not bad
06:48:58 we can scan through that
06:49:41 hmm the translate bot pronounces it as s-groog-led
06:50:12 what will be difficult to archive is the *.app.goo.gl stuff though
06:51:48 wtf another product to the google graveyard in 2 weeks?
06:52:09 looks like the old-style https://goo.gl/Y5VIoG+ and https://goo.gl/Y5VIoG+ links died when they moved to firebase dynamic links
06:52:15 ...
06:52:21 https://goo.gl/Y5VIoG.info for the second one
06:52:42 ?d=1 is nice though
06:52:48 what was the +?
06:52:59 ah nice
06:53:12 iirc it used to show where it resolved to as well as stats?
06:53:38 fireonlive: are we looking at the same stackexchange page? :P
06:53:57 or is https://goo.gl/Y5VIoG that universal
06:54:10 haha yes we are https://webapps.stackexchange.com/questions/54961/how-can-i-find-out-what-a-goo-gl-url-leads-to-without-visiting-it
06:54:21 we are indeed haha
06:54:23 :D
06:55:06 .qr would have been nice, looks like that is dead too already
06:55:37 yeah :(
06:56:44 i guess there's a lot of routing options, so good that d=1 exists.. https://firebase.google.com/docs/dynamic-links/debug .. though unsure how much if any that was used on goo.gl itself
06:57:56 yeah there's a ton of routing options
06:58:00 ...
time to get them all :)
06:58:40 :D
06:58:50 found a nice example at https://f7td5.app.goo.gl/VzgJeH?d=1 (just some random shortened *.app.goo.gl/* URLs, not idea about the page it leads to)
06:59:25 example images.app.goo.gl: https://images.app.goo.gl/edtQ5fqzjr3XryPh7
06:59:32 blegh
06:59:35 ah, wow
07:00:00 yeah lot of character space :/
07:00:25 yeah we can't scan through that
07:00:31 firebase dynamic links also work on custom domains; but that might just be too big too
07:00:40 (since they can be anything arbitrary/programatic)
07:00:54 we'll support it when people want to queue it and try to discover as much as possible, but can't scan through that all
07:01:03 the goo.gl/* URLs alone we can scan through though
07:01:06 ah that sounds good
07:01:42 #goo.gone
07:01:56 And yes sg... and sc... would be pronounced the same
07:02:05 (don't have an example handy, but say bbc.co.uk running one at b.bc or sth)
07:10:45 hmm, are they even shutting down *.apps.goo.gl?
07:14:03 if I open google maps right now, it still gives me links like https://maps.app.goo.gl/W4NXZ2ghG4TcyPpU9 and they are pretty specific about "https://goo.gl/*" in the announcement
07:17:11 Also btw, is anyone aware that the target is dead
07:24:31 dead or full
07:24:38 IA is power outage for a bit so could be backed up
07:25:11 DigitalDragons: hm, it seems to be running on firebase dynamic links too
07:25:32 (https://firebase.google.com/support/dynamic-links-faq)
07:31:21 images.app.goo.gl is another subdomain
07:31:37 and f7td5.app.goo.gl photos.app.goo.gl
07:32:47 hmm, wonder if goo.gle is affected too, I found http://goo.gle/patchz-nomination
07:33:13 also found URLs like https://goo.gl/photos/...
07:35:57 #gonegl
07:43:41 https://goo.gl/maps/Dg1mRbavUHWU9aF5A is also not equivalent to https://maps.app.goo.gl/Dg1mRbavUHWU9aF5A which is lovely...
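A rough check of the keyspace figures from the discussion above: the "57 billion" estimate for goo.gl/* and why the *.app.goo.gl codes were judged too large to scan. This is only a sketch under two assumptions that the log does not confirm: that classic goo.gl codes are exactly 6 characters over 0-9a-zA-Z (consistent with https://goo.gl/Y5VIoG and with the 57 billion figure), and that *.app.goo.gl codes are 17 characters, like the images.app.goo.gl and maps.app.goo.gl examples. The `keyspace` helper is a hypothetical name used for illustration.

```python
import string

# Alphabet mentioned in the log: 0-9, a-z, A-Z (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def keyspace(code_length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct codes of exactly code_length characters."""
    return alphabet_size ** code_length


# Assumption: classic goo.gl codes are 6 characters long (e.g. "Y5VIoG").
googl_codes = keyspace(6)

# Assumption: *.app.goo.gl codes are 17 characters long
# (e.g. "edtQ5fqzjr3XryPh7").
app_codes = keyspace(17)

print(f"goo.gl/*       : {googl_codes:,} candidate codes")
print(f"*.app.goo.gl/* : {app_codes:.2e} candidate codes per subdomain")
```

Under those assumptions the goo.gl/* space is 62^6 = 56,800,235,584 codes, which matches the corrected "57 billion" estimate, while a 17-character space is larger by more than twenty orders of magnitude, hence the focus on discovery rather than scanning for *.app.goo.gl.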
07:44:08 :|
08:02:47 (Replying to fireonlive) Yep the target is full
08:03:54 ah
08:29:45 also I dont think the *.app.goo.gl is affected
08:30:02 "Any developers using links built with the Google URL Shortener in the form https://goo.gl/* will be impacted, and these URLs will no longer return a response after August 25th, 2025."
08:30:48 I dont think they are intending to shut down firebase dynamic links
08:34:52 they are
08:35:23 https://firebase.google.com/support/dynamic-links-faq
08:38:46 wtf
08:38:50 why, just WHY
08:39:56 the necrotic engines of the google graveyard demand sacrifice
08:40:07 <@arkiver> i'm not native english speaker... but i think scroogled and sgroogled would be pronounced the same?
08:40:22 ^ not imo, [c] is /k/ (voiceless) and [g] is /g/ (voiced). i might approximate initial /sgɹ/ as /skɹ/ but would not really regard it as pronounceable
08:44:36 sounds like there isn't a channel yet? (i like #googone; however, consider also: #ruegl)
08:47:56 IPA++
08:47:58 -eggdrop- [karma] 'IPA' now has 1 karma!
08:48:05 😎
09:00:21 maybe for the app.goo.gl crap we can sift through the old project WARCs again to catch them like we have done for imgur and friends#
09:18:57 "yzqzss: how was this list..." <- 1. get a blog's homepage URI via https://feed.cnblogs.com/blog/u/{BlogID}/rss/ () 2. iterate through {URI}?page={page} page by page
09:20:22 https://git.saveweb.org/saveweb/cnblogs/src/branch/main/cmd/cnblogs_posts_list/cnblogs_posts_list.go#L105-L118
09:20:39 Exorcism edited Bugzilla (+4, /* Status */): https://wiki.archiveteam.org/?diff=52941&oldid=52940
09:29:58 thuban: would #scroogled be pronunced screw oogled
10:17:50 hello people
10:18:03 can anyone give me information related to hello_solver123
10:31:06 maybe we should backup the crowdstrike public facing stuff, too incase they go belly-up due to their goofup
11:11:45 hi
11:11:46 can i ask about archiving?
11:23:25 hi! sure, what would you like to know?
11:28:19 homepage service of vector, japanese software distribution site is closing down on 2024/12/20 (jst?)
11:28:19 https://internet.watch.impress.co.jp/docs/yajiuma/1609184.html
11:28:20 https://www.itmedia.co.jp/news/articles/2407/18/news117.html
11:35:02 ah, we already know about that one. thanks! :)
12:21:11 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52942&oldid=52941
12:49:45 fyi: since i didn't really get an answer regarding the hp.vector.co.jp stuff, i just queued the list nullpeta requested in #archivebot: https://files.catbox.moe/ooe4xx.txt
12:50:55 depending on how large this will get it can probably get re-run in december, but i share the concern that some users might replace their sites with redirects to their new locations (which is often enough paired with a rebuild of it)
12:51:04 (or a makeover, rather)
12:51:33 job is currently waiting for the pipeline
12:56:56 I am working on hp.vector.co.jp and will share my results.
12:56:57 Here is the list of URLs that return 200 on http and the script we used to extract them. Feel free to use them! https://files.catbox.moe/ooe4xx.txt https://files.catbox.moe/5ga5db.py
12:58:32 nullpeta: hehe, just told them i queued your list ;)
13:01:11 nullpeta: that's not complete
13:01:24 there was a public homepage list removed after october 2016... it had two URLs which do not fit that scheme
13:01:33 I posted it the other day -> https://asie.pl/files/hp_vector_urls_20161012.txt
13:01:41 not a big deal, but
13:02:01 c3manu: ^
13:03:10 and yeah rerunning it in december makes sense
13:04:25 ah, i must have missed that >.<
13:04:32 asie: Wow, I did not notice this. Excellent point!
13:04:42 the list is from https://web.archive.org/web/20161012170825/www.vector.co.jp/vpack/author/listpage.html
13:05:06 but a bruteforce is good to run, as thuban previously found six pages created after mid-2016 which were not on that page, but fit the VAnnnnnn scheme
13:05:20 the two combined should be fairly comprehensive
13:06:34 The problem is that some URLs return 403 instead of 404. Maybe there is some kind of hidden page. Here is the list of id's that return 403. (sorry it's not sorted!) https://files.catbox.moe/7esoln.txt
13:09:38 or maybe banned users?
13:09:46 some forums return 403 for those
13:09:59 "http://hp.vector.co.jp/authors/VA012227/index.html" used to be a website, at least
13:10:05 so IMO this could be something like deleted accounts
13:13:36 if you can give me a combined list, i'll run that one instead
13:14:30 asie: Indeed, it seems likely. I will report back if I find out anything on this.
13:17:59 aborted nullpeta's list in favor of a combined one
13:18:39 https://asie.pl/files/hp_vector_urls_20161012_plus.txt
13:18:47 combined one
13:22:06 merci :)
13:36:16 I checked and the combined list looks complete! This page (https://web.archive.org/web/20161012170825/www.vector.co.jp/vpack/author/listpage.html) seems to be missing a few sites(eg: VA001028), but it looks like a brute force hit found those as well. Thank you for help.
14:30:41 'ello. I am running a podman based worker and set a http basic auth password that is lost to the sands of time. How can I reset it?
14:32:33 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52943&oldid=52942
14:32:45 Blackb|rd: stop the container and remove it and start from the beginning
14:33:25 I tried using systemctl to stop it, and then deleting it using the podman cmdline, but there was no container anymore
14:33:39 podman system prune -a
14:33:58 Could someone insert https://honz.jp into archivebot?
honz is a Japanese book review site that has already been discontinued and is no longer being updated. The site will be deleted in the near future.
14:34:43 Blackb|rd: Here is the exlination for that command also https://docs.podman.io/en/latest/markdown/podman-system-prune.1.html
14:35:36 So podman system prune -a did delete stuff, and I started the worker again using systemctl, but it still prompts me for a password
14:35:46 nullpeta - queued in AB
14:36:05 nulldata: Thank you!
14:37:11 huh, it somehow got into the state of remembering the username, but with an empty pw, and loging in with the username and no pw worked. It also remembered the concurrency setting "somehow"
14:38:18 when using podman the config presists if I remember correctly, but good if you got in and can set a new password you can remember
14:54:15 yeah, thanks lurker!
15:03:23 akiver: cnblogs will have a warrior project or go to #// ?
16:19:33 Not sure if anyone has posted about this yet? - https://developers.googleblog.com/en/google-url-shortener-links-will-no-longer-be-available/
16:20:09 All google short url's will start returning 404
16:22:01 JaffaCakes118 - Yes, it has been discussed
17:52:06 JAA: Would it be a good idea to have the same topic edits for this channel that are made in #archiveteam when topic are known already. Stuff gets burried here at times too :-)
20:04:41 Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52944&oldid=52943
20:24:50 fireonlive sure, sorry, my bad
20:30:24 Ravenloft: not at all! :) the initial 'hey this is on fire' is perfect for that channel (though not a lot of people read the topic unfortunately)
20:30:30 we just move here for the more nitty-gritty after
20:31:03 (and, soon a dedicated channel for the project itself to contain it in one area)
20:35:12 speaking of, any rate limiting on goo.gl?
20:40:06 i think so, the urlteam code checks for a sorry page: https://github.com/ArchiveTeam/terroroftinytown/blob/7c0093ba8b3622d1f6198188b1dd535e6698bf5d/terroroftinytown/services/googl.py
21:07:40 I propose: #goop.gl
21:09:11 that_lurker: That'll go out of sync immediately. Better to keep it in one place, I think.
21:09:49 true and i'll second goop :-) I was thinking about gone.gl or grave.gl but that sound better
21:27:49 goop.gl is good. it keeps it linked to the shortener, which is good, because we're all but guarantied to have more google related projects
21:31:05 !status 4zne9j84uh36an0c5rscw855p
21:31:08 Not here.
21:33:19 shortening the shortener to #gl might be fitting
21:50:28 that could overlap if something happens to the gl tld and that needs a project
21:54:20 wouldn't be a pun though
21:54:31 (or a joke of some kind)
21:59:26 #goon.gl could be funny too ;-) (https://www.urbandictionary.com/define.php?term=gooning)
22:04:13 🥵
22:39:25 gig.gl not much connection but made me smile