01:29:30 I'd like to request https://www.replaymod.com/ for AB due to limited IA and no AB coverage and https://www.replaymod.com/forum being closed
01:45:40 thuban: I don't believe I'm looking for any server functionality. I just want to preserve the content and presentation style.
01:45:43 https://storage.scenariopla.net/discord_lofigirl_grab1_all_urls.txt - discord attachment links grab, lots of avatars.
01:46:12 https://storage.scenariopla.net/discord_lofigirl_grab1_all_ytvideos.txt - some youtube video IDs from there
01:49:11 ctag: what i said about grab-site and wget should stand then, although people might have other recommendations wrt wget flags (i haven't done this with it in a while)
01:49:42 OK, thank you for the advice. Are you referring to wget or wget-at?
01:50:03 Wait, the distinction was in how they handle warcs. Nevermind.
02:02:10 Pedrosso: threw it in. any subdomains?
02:02:31 not as far as I know. One problem is https://www.replaymod.com/center because I have no clue how to look through that without manually doing search terms
02:03:37 Pedrosso: please send the GitHub to #gitgud
02:04:06 just send the url, no context or !a ?
02:04:21 url + context
02:04:47 (no bot yet, J_A_A will queue it when around)
02:05:46 gotcha
02:12:50 Pedrosso: re replays, we can just generate a URL list and !ao < all of them
02:13:22 Cool, but how'd one generate a URL list for that?
02:15:12 echo https://www.replaymod.com/replay/{1..15567}
02:15:27 or printf or a bash loop
02:16:28 for eg: echo https://www.replaymod.com/replay/{1..15567} | tr ' ' '\n' > www.replaymod.com-replay-1-to-15567.txt
02:16:34 https://transfer.archivete.am/BDSgq/www.replaymod.com-replay-1-to-15567.txt
02:16:34 inline (for browser viewing): https://transfer.archivete.am/inline/BDSgq/www.replaymod.com-replay-1-to-15567.txt
02:18:13 found the max 15567 by binary search - put in 20000 and got a redirect to /center then halved it etc
02:18:53 praise be to devs who use ascending integer ids
02:19:13 hmm, some of the early ones redirect too...
02:19:56 i wonder if there’s some json or something to download as well somewhere
02:20:11 *uses random IDs on everything >:D*
02:20:22 hmm, it uses custom links to download: replaymod://15567
02:24:35 looks like
02:24:43 it's an mcpr file somewhere? https://github.com/ReplayMod/ReplayStudio
02:26:20 Oh, cool! I didn't notice it had enumeration
02:26:26 I should've known
02:26:45 Thank you pabs
02:26:58 brb submitting a patch to them to use uuidv4 for everything
02:27:16 with aggressive rate limits
02:29:26 at least they don't use cloudflare which is nice
02:29:45 pabs: I don't think it is abandoned considering that there are downloads for 1.20.4 up on the site
02:30:32 the site however... Other than the downloads page it is possible
02:31:34 * nicolas17 stabs fireonlive
02:32:10 thank you
02:35:51 Pedrosso: I'm going by the forum topics about how it is discontinued
02:36:01 https://www.replaymod.com/forum/thread/2979
02:36:16 replaymod is awesome and definitely not abandoned
02:36:54 hmmm, they made a release after that
02:37:19 i don't quickly see a way to get the mcpr files?
02:37:28 DigitalDragons: do you know where they're stored/anything about that?
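A minimal Python sketch of the two steps discussed above: building the replay URL list and finding the highest ID by binary search. The `exists` callable is hypothetical and stands in for whatever check is used (in the chat, whether a replay URL redirected to /center). The search assumes every ID up to the maximum exists, which the chat notes is not strictly true for some early IDs, so treat the result as an estimate.

```python
from typing import Callable, Iterator

# Base URL taken from the chat; everything else here is illustrative.
BASE = "https://www.replaymod.com/replay/"


def replay_urls(first: int, last: int, base: str = BASE) -> Iterator[str]:
    """Yield one URL per integer ID in [first, last], like bash {first..last}."""
    for replay_id in range(first, last + 1):
        yield f"{base}{replay_id}"


def find_max_id(exists: Callable[[int], bool], upper: int) -> int:
    """Return the largest ID in [1, upper] for which exists(ID) is true.

    Assumes exists() is true for all IDs up to the maximum and false above it.
    Returns 0 if no ID exists. `exists` is a hypothetical predicate supplied
    by the caller.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if exists(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


if __name__ == "__main__":
    # Write the list one URL per line, as the echo | tr pipeline does.
    with open("www.replaymod.com-replay-1-to-15567.txt", "w") as out:
        for url in replay_urls(1, 15567):
            out.write(url + "\n")
```

With a guess of 20000 as the upper bound, `find_max_id` needs about 15 checks rather than thousands of requests.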
02:40:19 there's https://minio.johni0702.de with an expired cert (23d ago)
02:41:03 I don't, i've only ever used it for my own replays
02:41:47 and it seems like in future iterations they've completely removed these online features
02:42:03 replaymod-1.12.2-2.2.0-b1 still has them
02:42:44 ah ok
03:35:22 Hmm. OK, wget -mpk did its best, but I think I need to do something to account for the site being PHP. Links from the index page (which looks correct!) don't work
03:36:59 I wish I could just flatten it down to HTML pages
03:37:09 No need for PHP, nothing on the site is going to change
03:52:02 just save it to web.archive.org using AB? :)
03:55:20 I'd like to learn how to do that
03:55:40 It's been saved partially and manually on there, but I'd also like to have a copy I can host
03:56:22 I'm their volunteer webmaster, and the organization was started in the 1950s, they have a lot of physical archival documents
03:56:34 Would be neat to keep this copy with that stuff
03:57:01 AB is archive bot, right?
04:02:01 indeed it is :)
04:02:19 not anyone can use it though but you can ask for someone to archive a particular site
04:04:39 Thanks
04:04:47 Oh
04:05:28 I'm guessing there isn't a way to archive it under the original domain?
04:05:51 I'm hosting the copy on a subdomain of my personal stuff
04:08:07 I should have thought about this more before we replaced the website
04:14:34 ctag: not unless the domain is publicly accessible
04:14:49 we can also just do the subdomain
04:15:04 The rehosted version is https://vbas-legacy.berocs.com/
04:15:20 Which was originally http://www.vbas.org
04:15:25 do you still own the original domain?
04:15:48 https://web.archive.org/web/20191118210141/http://www.vbas.org/
04:15:55 Yes, partly
04:15:57 ah, vbas.org is now a new website
04:16:09 It's tied to a company that donates our hosting
04:16:14 Yeah :-/
04:17:58 If it'd help, I could try and schedule a time to revert the vbas.org domain to the old site temporarily.
04:18:09 I believe the files for the old drupal site are still on the server
04:18:48 No, I'm remembering now, one of the reasons we switched was the PHP version updated and broke drupal
04:20:24 www.vbas.org redirects to vbas.org, what about temporarily pointing the www subdomain to your server, running AB and then reverting?
04:21:37 Changing the PHP version back would break the new site, I think
04:21:44 Maybe? I'll look into it
04:21:51 That's a great idea though
04:24:28 alternately, add an obsolete/old/legacy subdomain to vbas.org pointing at your server, save that in AB
04:24:33 or just save the rehosted domain
04:58:49 hello, i wanted to download a track from the artist union archive but couldn't get past the login prompt
04:59:10 could i get some help?
05:04:45 I'm apparently not smart enough to get the redirect working. Will take a look at it this weekend, thank you again for the advice pabs.
05:10:49 ctag: shouldn't be a 301 or iframe but something like www.vbas.org points to your IP then your server is configured to serve the old content at that hostname
05:11:28 (in addition to (or temporarily instead of) vbas-legacy.berocs.com)
06:04:37 https://www.replaymod.com/api/download_file?id=
06:04:45 For downloading the replaymod files
06:11:15 pabs: so that means !ao https://transfer.archivete.am/12p9sk/replaymod_downloads.txt
06:11:16 inline (for browser viewing): https://transfer.archivete.am/inline/12p9sk/replaymod_downloads.txt
06:39:51 !ao < since !ao just downloads that URL
06:40:30 Ah right the < for the file contents
06:42:38 running, getting some weird errors
06:51:11 I see that
07:05:27 Any idea of what the error means?
07:17:47 none, probably some sort of bug in the script
07:32:31 Wow
08:17:05 Hi guys! Trying to find the channel specific for ArchiveTeam Warrior questions. Any hints on how to get there? :)
08:18:53 #warrior
08:19:09 Thanks for stopping by
08:20:25 Thank you! :)
10:38:38 Pedrosso: to be clear, I mean a bug in the PHP script running on the replaymod server. it's clearly throwing PHP warnings before doing any HTTP header output, which means bugs in the script
10:38:54 Ahh
10:39:14 possibly the error is triggered by AB (I can't seem to get the errors here), but it's a bug in the script
10:39:27 I can't get it to trigger either
10:39:58 we can rerun the weird-failure and conn closed ones after it's done and see what happens
13:58:53 pabs: `printf '%s\n' https://www.replaymod.com/replay/{1..15567}` :-)
13:59:01 printf is much better than echo anyway.
15:41:45 Is it me or did YouTube take out the ability to see YouTube accounts' subscription pages now? I started noticing it around this month or last month and thought it might be a bug
15:41:55 To note, public pages~
15:45:33 Either way, that's more lost metadata I fear~
15:47:59 < ctag> Hmm. OK, wget -mpk did its best, but I think I need to do something to account for the site being PHP. Links from the index page (which looks correct!) don't work
15:48:12 i think `-E` handles this
15:49:07 (sorry, this is why i was hoping someone would double-check my flags!)
15:55:19 Yeah, it looks like it might be confirmed, can't see such a thing anymore: https://old.reddit.com/r/youtube/comments/17nkkga/did_youtube_just_seriously_remove_the_channels_tab/
15:55:40 For reference, here's an example of a Channels/public subscriptions page: https://web.archive.org/web/20210624044052/https://www.youtube.com/c/vinesauce/channels
16:21:51 hey all, gamebattles is shutting down in a little over two weeks. There's a ton of info on tournaments, players, and matches. I've had a very slow long term scrape going way before they announced this but I fear it won't finish in time. There seems to be a pretty good rate limit in place and when you hit it they basically ban your ip. I might have
16:21:51 to resort to getting one of those proxy services and just churning through those, but thought it could be a good warrior project too. I'm not really on here frequently but am on discord at cyrlx. Here's a tweet about the shutdown: https://twitter.com/GameBattles/status/1724171598117101830
16:21:52 nitter: https://nitter.net/GameBattles/status/1724171598117101830
16:38:12 Switchnode edited Deathwatch (+0, /* 2024 */ correct gamebattles date): https://wiki.archiveteam.org/?diff=51422&oldid=51394
16:38:45 js hell, this'll be fun.
16:40:25 cyrix: any information you can give us on site/api structure (including your existing scrape program, if you'd care to upload it to transfer.archivete.am) would be helpful
17:52:43 Can eggdrop change its nitter to this one : https://nitter.mint.lgbt/ :3
17:57:28 i agree with cyrix and thuban, that would be a good project to work on.
20:14:05 thuban: Thank you, I'll give wget another shot this afternoon.
20:14:23 The redirect looks to be working from my end! http://www.vbas.org
20:30:02 @thuban I can clean up my scripts and share them there...their id systems are just incrementing integers so I largely just keep going up on certain endpoints
20:31:15 cyrix: sounds great, thank you!
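A rough Python sketch of the approach cyrix describes: walking incrementing integer IDs on an endpoint, with a fixed pause between requests because of the rate limit. `fetch`, `delay_seconds` and `max_misses` are hypothetical names and are not taken from the uploaded scripts; a real `fetch` would perform the HTTP request and return None for a missing ID.

```python
import time
from typing import Callable, Dict, Optional


def crawl_ascending(
    start: int,
    fetch: Callable[[int], Optional[str]],
    delay_seconds: float = 2.0,
    max_misses: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Fetch IDs start, start+1, ... until max_misses consecutive IDs are missing.

    A run of missing IDs is used as the stop signal because gaps in an
    ascending ID space are common; a single miss does not mean the end.
    """
    found: Dict[int, str] = {}
    misses = 0
    current = start
    while misses < max_misses:
        body = fetch(current)
        if body is None:
            misses += 1
        else:
            found[current] = body
            misses = 0  # any hit resets the run of misses
        current += 1
        sleep(delay_seconds)  # pace every request, hit or miss
    return found
```

Passing `sleep` as a parameter keeps the pacing testable without real waiting; the delay value itself is a guess and would need tuning against the site's actual limit.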
20:34:00 JustAnotherArchivist edited Deathwatch (+279, motor-talk.de got a reprieve): https://wiki.archiveteam.org/?diff=51423&oldid=51422
20:41:03 JustAnotherArchivist moved GuteFrage to Gutefrage (Fix capitalisation; although the German phrase…): https://wiki.archiveteam.org/?title=Gutefrage
20:42:02 JustAnotherArchivist created Gutefrage.net (+23, Former official branding of [[gutefrage]]): https://wiki.archiveteam.org/?title=Gutefrage.net
20:44:02 JustAnotherArchivist edited Gutefrage (+112, Fix capitalisation; it's been 'gutefrage.net'…): https://wiki.archiveteam.org/?diff=51427&oldid=51424
20:45:02 JustAnotherArchivist edited Quora (+17, + [[Category:Q&A]]): https://wiki.archiveteam.org/?diff=51428&oldid=49870
20:46:03 Exorcism: It could randomly return either that or https://nitter.x86-64-unknown-linux-gnu.zip/ for extra nerdiness. :-)
20:47:23 thuban just uploaded them:
20:58:03 cyrix: er, link?
21:02:34 Oooo :3
21:19:21 thuban
21:19:22 https://transfer.archivete.am/s0Wrg/README.txthttps://transfer.archivete.am/OultB/gb_scrape_effort.dbhttps://transfer.archivete.am/JDsge/gb_scrape_effort_2_parallel.pyhttps://transfer.archivete.am/I2myk/gb-2-count.csvhttps://transfer.archivete.am/162IIC/gb-api-match-count.csvhttps://transfer.archivete.am/Ryod7/gp-api.majorleaguegaming.com_match_dl_
21:19:23 parallel.py
21:19:23 inline (for browser viewing): https://transfer.archivete.am/inline/s0Wrg/README.txthttps://transfer.archivete.am/inline/OultB/gb_scrape_effort.dbhttps://transfer.archivete.am/inline/JDsge/gb_scrape_effort_2_parallel.pyhttps://transfer.archivete.am/inline/I2myk/gb-2-count.csvhttps://transfer.archivete.am/inline/162IIC/gb-api-match-count.csvhttps://transfer.archivete.am/inline/Ryod7/gp-api.majorleaguegaming.com_match_dl_
21:19:56 er https://transfer.archivete.am/%28/I2myk/gb-2-count.csv,/162IIC/gb-api-match-count.csv,/JDsge/gb_scrape_effort_2_parallel.py,/s0Wrg/README.txt,/OultB/gb_scrape_effort.db,/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py%29.zip
21:19:56 inline (for browser viewing): https://transfer.archivete.am/inline/%28/I2myk/gb-2-count.csv,/162IIC/gb-api-match-count.csv,/JDsge/gb_scrape_effort_2_parallel.py,/s0Wrg/README.txt,/OultB/gb_scrape_effort.db,/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py%29.zip
21:20:04 eek
21:20:47 cyrix: thanks!
21:44:10 You tried, eggdrop. You tried.
21:44:31 :-)
21:45:03 https://transfer.archivete.am/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py
21:45:03 inline (for browser viewing): https://transfer.archivete.am/inline/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py
21:45:10 Er
23:41:38 Exorcism uploaded File:Talktalk-screenshot.png: https://wiki.archiveteam.org/?title=File%3ATalktalk-screenshot.png
23:42:38 Exorcism edited TalkTalk (+33): https://wiki.archiveteam.org/?diff=51430&oldid=51315
23:43:38 Exorcism uploaded File:Xangalogo-large.jpg: https://wiki.archiveteam.org/?title=File%3AXangalogo-large.jpg
23:43:39 Exorcism edited Xanga (+29): https://wiki.archiveteam.org/?diff=51432&oldid=51313
23:53:39 Exorcism uploaded File:Argenteam-screenshot.png: https://wiki.archiveteam.org/?title=File%3AArgenteam-screenshot.png
23:54:39 Exorcism edited ARGENTeaM (+35): https://wiki.archiveteam.org/?diff=51434&oldid=51285