01:29:30 <Pedrosso> I'd like to request https://www.replaymod.com/ for AB due to limited IA and no AB coverage, and https://www.replaymod.com/forum being closed
01:45:40 <ctag> thuban: I don't believe I'm looking for any server functionality. I just want to preserve the content and presentation style.
01:45:43 <ScenarioPlanet> https://storage.scenariopla.net/discord_lofigirl_grab1_all_urls.txt - discord attachments links grab, lots of avatars.
01:46:12 <ScenarioPlanet> https://storage.scenariopla.net/discord_lofigirl_grab1_all_ytvideos.txt - some youtube videos IDs from there
01:49:11 <thuban> ctag: what i said about grab-site and wget should stand then, although people might have other recommendations wrt wget flags (i haven't done this with it in a while)
01:49:42 <ctag> OK, thank you for the advice. Are you referring to wget or wget-at?
01:50:03 <ctag> Wait, the distinction was in how they handle warcs. Nevermind.
02:02:10 <pabs> Pedrosso: threw it in. any subdomains?
02:02:31 <Pedrosso> not as far as I know. One problem is https://www.replaymod.com/center because I have no clue how to look through that without manually doing search terms
02:03:37 <pabs> Pedrosso: please send the GitHub to #gitgud
02:04:06 <Pedrosso> just send the url, no context or !a ?
02:04:21 <pabs> url + context
02:04:47 <pabs> (no bot yet, J_A_A will queue it when around)
02:05:46 <Pedrosso> gotcha
02:12:50 <pabs> Pedrosso: re replays, we can just generate a URL list and !ao < all of them
02:13:22 <Pedrosso> Cool, but how'd one generate a URL list for that?
02:15:12 <pabs> echo https://www.replaymod.com/replay/{1..15567}
02:15:27 <pabs> or printf or a bash loop
02:16:28 <pabs> for eg: echo https://www.replaymod.com/replay/{1..15567} | tr ' ' '\n' > www.replaymod.com-replay-1-to-15567.txt
02:16:34 <pabs> https://transfer.archivete.am/BDSgq/www.replaymod.com-replay-1-to-15567.txt
02:16:34 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/BDSgq/www.replaymod.com-replay-1-to-15567.txt
02:18:13 <pabs> found the max 15567 by binary search - put in 20000 and got a redirect to /center then halved it etc
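The halving search pabs describes can be sketched roughly as below. This is only an illustration, not the commands actually run: it assumes an existing replay answers with HTTP 200 and a missing one with a redirect, and that every ID up to the maximum exists, which may not hold for this site. The names `probe` and `find_max_id` are made up for the sketch.

```shell
#!/bin/sh
# probe ID: succeed if the replay page for ID exists.
# Assumption: existing IDs return 200; missing IDs redirect (3xx) to /center.
# curl is run without -L, so the redirect status itself is reported.
probe() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "https://www.replaymod.com/replay/$1")
    [ "$code" = 200 ]
}

# find_max_id LOW HIGH [PROBE]
# LOW must be an ID that exists and HIGH one that does not.
# Halves the interval until LOW and HIGH are adjacent, then prints LOW.
find_max_id() {
    lo=$1 hi=$2 check=${3:-probe}
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "$mid"; then
            lo=$mid    # mid exists, so the maximum is at or above mid
        else
            hi=$mid    # mid is missing, so the maximum is below mid
        fi
    done
    echo "$lo"
}

# Usage (hits the live site, roughly 15 requests for this range):
# find_max_id 1 20000
```

The third argument lets a different check be swapped in, for example a stub when trying the logic offline.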
02:18:53 <Terbium> praise be to devs who use ascending integer ids
02:19:13 <pabs> hmm, some of the early ones redirect too...
02:19:56 <fireonlive> i wonder if there’s some json or something to download as well somewhere
02:20:11 <fireonlive> *uses random IDs on everything >:D*
02:20:22 <pabs> hmm, it uses custom links to download: replaymod://15567
02:24:35 <fireonlive> looks like
02:24:43 <fireonlive> it's an mcpr file somewhere? https://github.com/ReplayMod/ReplayStudio
02:26:20 <Pedrosso> Oh, cool! I didn't notice it had enumeration
02:26:26 <Pedrosso> I should've known
02:26:45 <Pedrosso> Thank you pabs
02:26:58 <fireonlive> brb submitting a patch to them to use uuidv4 for everything
02:27:16 <fireonlive> with aggressive rate limits
02:29:26 <Terbium> at least they don't use cloudflare which is nice
02:29:45 <Pedrosso> pabs: I don't think it is abandoned considering that there are downloads for 1.20.4 up on the site
02:30:32 <Pedrosso> the site, however... Other than the downloads page, it's possible that it is
02:31:34 * nicolas17 stabs fireonlive
02:32:10 <fireonlive> thank you
02:35:51 <pabs> Pedrosso: I'm going by the forum topics about how it is discontinued
02:36:01 <pabs> https://www.replaymod.com/forum/thread/2979
02:36:16 <DigitalDragons> replaymod is awesome and definitely not abandoned
02:36:54 <pabs> hmmm, they made a release after that
02:37:19 <fireonlive> i don't quickly see a way to get the mcpr files?
02:37:28 <fireonlive> DigitalDragons: do you know where they're stored/anything about that?
02:40:19 <fireonlive> there's https://minio.johni0702.de with an expired cert (23d ago)
02:41:03 <DigitalDragons> I don't, i've only ever used it for my own replays
02:41:47 <Pedrosso> and it seems like in future iterations they've completely removed these online features
02:42:03 <Pedrosso> replaymod-1.12.2-2.2.0-b1 still has them
02:42:44 <fireonlive> ah ok
03:35:22 <ctag> Hmm. OK, wget -mpk did its best, but I think I need to do something to account for the site being PHP. Links from the index page (which looks correct!) don't work
03:36:59 <ctag> I wish I could just flatten it down to HTML pages
03:37:09 <ctag> No need for PHP, nothing on the site is going to change
03:52:02 <pabs> just save it to web.archive.org using AB? :)
03:55:20 <ctag> I'd like to learn how to do that
03:55:40 <ctag> It's been saved partially and manually on there, but I'd also like to have a copy I can host
03:56:22 <ctag> I'm their volunteer webmaster, and the organization was started in the 1950s, they have a lot of physical archival documents
03:56:34 <ctag> Would be neat to keep this copy with that stuff
03:57:01 <ctag> AB is archive bot, right?
04:02:01 <fireonlive> indeed it is :)
04:02:19 <fireonlive> not everyone can use it, though, but you can ask someone to archive a particular site
04:04:39 <ctag> Thanks
04:04:47 <ctag> Oh
04:05:28 <ctag> I'm guessing there isn't a way to archive it under the original domain?
04:05:51 <ctag> I'm hosting the copy on a subdomain of my personal stuff
04:08:07 <ctag> I should have thought about this more before we replaced the website
04:14:34 <pabs> ctag: not unless the domain is publicly accessible
04:14:49 <pabs> we can also just do the subdomain
04:15:04 <ctag> The rehosted version is https://vbas-legacy.berocs.com/
04:15:20 <ctag> Which was originally http://www.vbas.org
04:15:25 <pabs> do you still own the original domain?
04:15:48 <ctag> https://web.archive.org/web/20191118210141/http://www.vbas.org/
04:15:55 <ctag> Yes, partly
04:15:57 <pabs> ah, vbas.org is now a new website
04:16:09 <ctag> It's tied to a company that donates our hosting
04:16:14 <ctag> Yeah :-/
04:17:58 <ctag> If it'd help, I could try and schedule a time to revert the vbas.org domain to the old site temporarily.
04:18:09 <ctag> I believe the files for the old drupal site are still on the server
04:18:48 <ctag> No, I'm remembering now, one of the reasons we switched was the PHP version updated and broke drupal
04:20:24 <pabs> www.vbas.org redirects to vbas.org, what about temporarily pointing the www subdomain to your server, running AB and then reverting?
04:21:37 <ctag> Changing the PHP version back would break the new site, I think
04:21:44 <ctag> Maybe? I'll look into it
04:21:51 <ctag> That's a great idea though
04:24:28 <pabs> alternately, add an obsolete/old/legacy subdomain to vbas.org pointing at your server, save that in AB
04:24:33 <pabs> or just save the rehosted domain
04:58:49 <w> hello, i wanted to download a track from the artist union archive but couldn't get past the login prompt
04:59:10 <w> could i get some help?
05:04:45 <ctag> I'm apparently not smart enough to get the redirect working. Will take a look at it this weekend, thank you again for the advice pabs.
05:10:49 <fireonlive> ctag: shouldn't be a 301 or iframe but something like www.vbas.org points to your IP then your server is configured to serve the old content at that hostname
05:11:28 <fireonlive> (in addition to (or temporarily instead of) vbas-legacy.berocs.com)
06:04:37 <Pedrosso> https://www.replaymod.com/api/download_file?id=
06:04:45 <Pedrosso> For downloading the replaymod files
06:11:15 <Pedrosso> pabs: so that means !ao https://transfer.archivete.am/12p9sk/replaymod_downloads.txt
06:11:16 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/12p9sk/replaymod_downloads.txt
06:39:51 <pabs> !ao < since !ao just downloads that URL
06:40:30 <Pedrosso> Ah right the < for the file contents
06:42:38 <pabs> running, getting some weird errors
06:51:11 <Pedrosso> I see that
07:05:27 <Pedrosso> Any idea of what the error means?
07:17:47 <pabs> none, probably some sort of bug in the script
07:32:31 <Pedrosso> Wow
08:17:05 <Wickerz> Hi guys! Trying to find the channel specific for ArchiveTeam Warrior questions. Any hints on how to get there? :)
08:18:53 <Vokun> #warrior
08:19:09 <Vokun> Thanks for stopping by
08:20:25 <Wickerz> Thank you! :)
10:38:38 <pabs> Pedrosso: to be clear, I mean a bug in the PHP script running on the replaymod server. it's clearly throwing PHP warnings before doing any HTTP header output, which means bugs in the script
10:38:54 <Pedrosso> Ahh
10:39:14 <pabs> possibly the error is triggered by AB (I can't seem to get the errors here), but it's a bug in the script
10:39:27 <Pedrosso> I can't get it to trigger either
10:39:58 <pabs> we can rerun the weird-failure and conn closed ones after it's done and see what happens
13:58:53 <JAA> pabs: `printf '%s\n' https://www.replaymod.com/replay/{1..15567}` :-)
13:59:01 <JAA> printf is much better than echo anyway.
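A loop-based variant of the same URL list, for shells without brace expansion, might look like the sketch below. It uses `printf` because `echo`'s treatment of options and backslashes differs between shells. The function name `gen_urls` is invented for illustration; the base URL and range are the ones quoted in the channel.

```shell
#!/bin/sh
# gen_urls BASE FIRST LAST
# Prints BASE followed by each integer from FIRST to LAST, one URL per line.
gen_urls() {
    base=$1 i=$2 last=$3
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "$base" "$i"
        i=$((i + 1))
    done
}

# Usage, producing the same list as the brace-expansion one-liner:
# gen_urls https://www.replaymod.com/replay/ 1 15567 > www.replaymod.com-replay-1-to-15567.txt
```

Taking the bounds as arguments avoids the fact that `{1..$n}` does not expand with a variable in bash.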
15:41:45 <Ryz> Is it me or did YouTube take out the ability to see YouTube accounts' subscription pages now? I started noticing it around this month or last month and thought it might be a bug
15:41:55 <Ryz> To note, public pages~
15:45:33 <Ryz> Either way, that's more lost metadata I fear~
15:47:59 <thuban> < ctag> Hmm. OK, wget -mpk did its best, but I think I need to do something to account for the site being PHP. Links from the index page (which looks correct!) don't work
15:48:12 <thuban> i think `-E` handles this
15:49:07 <thuban> (sorry, this is why i was hoping someone would double-check my flags!)
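Spelled out with long options, the `wget -mpk` run plus the suggested `-E` could be sketched as follows. Whether `-E` (`--adjust-extension`) actually fixes the broken links on this particular site is untested here. The `mirror_site` wrapper and the `WGET` override variable are hypothetical, added only so the command line can be printed without touching the network.

```shell
#!/bin/sh
# mirror_site URL...
# Long-option equivalent of `wget -mpkE`:
#   --mirror            recursive download with timestamping (-m)
#   --page-requisites   also fetch the images, CSS, etc. each page needs (-p)
#   --convert-links     rewrite links so they work in the local copy (-k)
#   --adjust-extension  save HTML responses with an .html suffix (-E)
# WGET is a hypothetical override; set WGET=echo for a dry run.
mirror_site() {
    "${WGET:-wget}" --mirror --page-requisites --convert-links --adjust-extension "$@"
}

# Usage (downloads the whole site into the current directory):
# mirror_site https://vbas-legacy.berocs.com/
```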
15:55:19 <Ryz> Yeah, it looks like it might be confirmed, can't see such a thing anymore: https://old.reddit.com/r/youtube/comments/17nkkga/did_youtube_just_seriously_remove_the_channels_tab/
15:55:40 <Ryz> For reference, here's an example of a Channels/public subscriptions page: https://web.archive.org/web/20210624044052/https://www.youtube.com/c/vinesauce/channels
16:21:51 <cyrix> hey all, gamebattles is shutting down in a little over two weeks. There's a ton of info on tournaments, players, and matches. I've had a very slow long term scrape going way before they announced this but I fear it won't finish in time. There seems to be a pretty good rate limit in place and when you hit it they basically ban your ip. I might have
16:21:51 <cyrix> to resort to getting one of those proxy services and just churning through those, but thought it could be a good warrior project too. I'm not really on here frequently but am on discord at cyrlx. Here's a tweet about the shutdown: https://twitter.com/GameBattles/status/1724171598117101830
16:21:52 <eggdrop> nitter: https://nitter.net/GameBattles/status/1724171598117101830
16:38:12 <h2ibot> Switchnode edited Deathwatch (+0, /* 2024 */ correct gamebattles date): https://wiki.archiveteam.org/?diff=51422&oldid=51394
16:38:45 <thuban> js hell, this'll be fun.
16:40:25 <thuban> cyrix: any information you can give us on site/api structure (including your existing scrape program, if you'd care to upload it to transfer.archivete.am) would be helpful
17:52:43 <Exorcism> Can eggdrop change its nitter to this one : https://nitter.mint.lgbt/ :3
17:57:28 <VickoSaviour> i agree with cyrix and thuban, that would be a good project to work on.
20:14:05 <ctag> thuban: Thank you, I'll give wget another shot this afternoon.
20:14:23 <ctag> The redirect looks to be working from my end! http://www.vbas.org
20:30:02 <cyrix> @thuban I can clean up my scripts and share them there...their id systems are just incrementing integers so I largely just keep going up on certain endpoints
20:31:15 <thuban> cyrix: sounds great, thank you!
20:34:00 <h2ibot> JustAnotherArchivist edited Deathwatch (+279, motor-talk.de got a reprieve): https://wiki.archiveteam.org/?diff=51423&oldid=51422
20:41:03 <h2ibot> JustAnotherArchivist moved GuteFrage to Gutefrage (Fix capitalisation; although the German phrase…): https://wiki.archiveteam.org/?title=Gutefrage
20:42:02 <h2ibot> JustAnotherArchivist created Gutefrage.net (+23, Former official branding of [[gutefrage]]): https://wiki.archiveteam.org/?title=Gutefrage.net
20:44:02 <h2ibot> JustAnotherArchivist edited Gutefrage (+112, Fix capitalisation; it's been 'gutefrage.net'…): https://wiki.archiveteam.org/?diff=51427&oldid=51424
20:45:02 <h2ibot> JustAnotherArchivist edited Quora (+17, + [[Category:Q&A]]): https://wiki.archiveteam.org/?diff=51428&oldid=49870
20:46:03 <JAA> Exorcism: It could randomly return either that or https://nitter.x86-64-unknown-linux-gnu.zip/ for extra nerdiness. :-)
20:47:23 <cyrix> thuban just uploaded them:
20:58:03 <thuban> cyrix: er, link?
21:02:34 <Exorcism> Oooo :3
21:19:21 <cyrix> thuban
21:19:22 <cyrix> https://transfer.archivete.am/s0Wrg/README.txt https://transfer.archivete.am/OultB/gb_scrape_effort.db https://transfer.archivete.am/JDsge/gb_scrape_effort_2_parallel.py https://transfer.archivete.am/I2myk/gb-2-count.csv https://transfer.archivete.am/162IIC/gb-api-match-count.csv https://transfer.archivete.am/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py
21:19:23 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/s0Wrg/README.txthttps://transfer.archivete.am/inline/OultB/gb_scrape_effort.dbhttps://transfer.archivete.am/inline/JDsge/gb_scrape_effort_2_parallel.pyhttps://transfer.archivete.am/inline/I2myk/gb-2-count.csvhttps://transfer.archivete.am/inline/162IIC/gb-api-match-count.csvhttps://transfer.archivete.am/inline/Ryod7/gp-api.majorleaguegaming.com_match_dl_
21:19:56 <cyrix> er https://transfer.archivete.am/%28/I2myk/gb-2-count.csv,/162IIC/gb-api-match-count.csv,/JDsge/gb_scrape_effort_2_parallel.py,/s0Wrg/README.txt,/OultB/gb_scrape_effort.db,/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py%29.zip
21:19:56 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/%28/I2myk/gb-2-count.csv,/162IIC/gb-api-match-count.csv,/JDsge/gb_scrape_effort_2_parallel.py,/s0Wrg/README.txt,/OultB/gb_scrape_effort.db,/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py%29.zip
21:20:04 <nicolas17> eek
21:20:47 <thuban> cyrix: thanks!
21:44:10 <JAA> You tried, eggdrop. You tried.
21:44:31 <project10> :-)
21:45:03 <JAA> https://transfer.archivete.am/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py
21:45:03 <eggdrop> inline (for browser viewing): https://transfer.archivete.am/inline/Ryod7/gp-api.majorleaguegaming.com_match_dl_parallel.py
21:45:10 <JAA> Er
23:41:38 <h2ibot> Exorcism uploaded File:Talktalk-screenshot.png: https://wiki.archiveteam.org/?title=File%3ATalktalk-screenshot.png
23:42:38 <h2ibot> Exorcism edited TalkTalk (+33): https://wiki.archiveteam.org/?diff=51430&oldid=51315
23:43:38 <h2ibot> Exorcism uploaded File:Xangalogo-large.jpg: https://wiki.archiveteam.org/?title=File%3AXangalogo-large.jpg
23:43:39 <h2ibot> Exorcism edited Xanga (+29): https://wiki.archiveteam.org/?diff=51432&oldid=51313
23:53:39 <h2ibot> Exorcism uploaded File:Argenteam-screenshot.png: https://wiki.archiveteam.org/?title=File%3AArgenteam-screenshot.png
23:54:39 <h2ibot> Exorcism edited ARGENTeaM (+35): https://wiki.archiveteam.org/?diff=51434&oldid=51285