01:19:57 <thetechrobo_> Yay, r/antiwork went private...
01:20:01 <thetechrobo_> They say they'll be back soon tho
01:20:13 <thetechrobo_> ...guess what I'm downloading once it comes back up?
01:38:52 <OrIdow6> That's fine wessel1512
01:46:53 <JAA> TheTechRobo: You're aware that we're continuously archiving all of Reddit, right?
01:47:19 <TheTechRobo> Right...I thought of that, but dismissed it for some reason.
02:12:16 <pabs> JAA: outlinks too? is HN+outlinks also archived?
02:16:43 <JAA> pabs: Yes, Reddit outlinks go into #//. Don't think we're covering HN anywhere at the moment, but it's small enough to probably not need a distributed project.
02:17:03 <JAA> It's been on my list to investigate that for a while.
02:17:13 <JAA> But you know how that goes...
02:17:18 <pabs> :)
02:23:10 <OrIdow6> I do believe I remember seeing an IA non-continuous crawl of HN and its outlinks somewhere
02:31:32 <arkiver> aiming for a start of the pinger.pl project tomorrow
02:35:09 <DopefishJustin> https://twitter.com/MankaiCompanyEN/status/1486258864647548932 https://twitter.com/pinkberryfizzi3/status/1486256047589236740
02:42:32 <OrIdow6> Respectively: "An important in-game announcement regarding future plans for the game has been published. We ask that users login to confirm the contents of the announcement. #a3game"
02:42:33 <OrIdow6> " PLEASE BOOST THIS WE R ARCHIVING A3 EN Guys !! I already have an archives discord up so if u wanna help me archive stuff please do join https://discord.gg/tGEnmadD"
02:43:38 <OrIdow6> "Closing March 9" according to https://a-three.fandom.com/wiki/A3!_Wiki
02:43:43 <OrIdow6> *on
02:43:51 <OrIdow6> Any more info DopefishJustin?
05:28:37 <DopefishJustin> nope
07:24:02 <OrIdow6> Oh
07:24:08 <OrIdow6> Yet another forum
07:28:58 <OrIdow6> Looks like www. and straight domain are slightly different
07:29:13 <OrIdow6> If someone wants me in particular to look at this I will do so tomorrow
07:32:39 <OrIdow6> Same goes for A3, will investigate and add to DW tomorrow if no one else does
08:06:05 <ohyes> OrIdow6, fyi, there is also 207.148.109.12 besides the www. and straight domain
14:46:55 <Hifihedgehog> Hello. I have a question regarding the TechnologyGuide data, which has been posted on Archive.org. Will this also be back-populated into the Wayback Machine?
15:08:43 <arkiver> yes
15:09:17 <Hifihedgehog> Cool. Thanks for confirming and thanks for your work!
15:09:43 <arkiver> JAA: those thanks are for you ^
18:44:29 <JAA> :-)
18:59:14 <JAA> Regarding the TechnologyGuide websites: technologyguide.com is done, notebookreview.com and brighthand.com are running in AB currently, digitalcamerareview.com and tabletpcreview.com have not been covered at all yet.
18:59:29 <JAA> The rate limiting on the sites works differently than on the forums.
20:35:28 <daxxy> hi, is there a list of the technologyguide forum URLs broken by the WAF?
20:36:13 <daxxy> I'm thinking of grabbing the contents over the tapatalk API - no use for WBM, but at least they'd be *somewhere*
20:37:12 <JAA> Hmm, good idea. Maybe it can be put into the WBM in some form even. Do you have more details on that API?
20:37:24 <JAA> There isn't a list of the broken threads currently, but I can get one later.
20:39:05 <daxxy> honestly I'm not even sure you can get posts via GET requests from it, internally it's almost all POSTs with parameters in bodies
20:39:20 <daxxy> used to be xmlrpc, they added JSON later on
20:39:29 <JAA> I see.
20:40:18 <JAA> Well, POST can still be saved to WARC and go into the WBM. Maybe we can add a URL parameter that gets ignored by the API and serves as the topic identifier in the API, like we did for the YouTube dislikes data.
20:40:39 <daxxy> oh, neat
20:40:47 <JAA> Is the API open or does it require auth?
20:41:16 <daxxy> I think everything we'd need is open
20:41:29 <JAA> Nice
20:43:19 <daxxy> the serverside implementation is freely available and all unobfuscated PHP, that's my reference: https://www.tapatalk.com/download_xenforo
20:43:35 <daxxy> (note, xenforo 1)
20:56:42 <daxxy> tl;dr: e.g. /mobiquo/mobiquo.php?method_name=get_forum (for xmlrpc) and /mobiquo/tapatalk.php?method_name=get_forum (for json) are handled by mobiquo/mbqAction/MbqActGetForum.php, etc.
21:00:21 <daxxy> one very neat thing about it is, there's no limit on the results/page parameter, you can request as much as you want at once until you hit httpd/php side timeouts/size limits
21:41:11 <cadence> hi there! I had a question about the 2009 geocities crawl - I thought it could be fun to download some whole sites from the collection so I can quickly navigate through them, offline, using a text mode browser, rather than having to go via the wayback machine interface. however, seems as though download is disabled for the warc files in the collection?
21:42:02 <cadence> if there's an older collection of geocities, I'm totally okay going back to it, since I'm looking to browse rather than to have a complete archive.
21:42:12 <cadence> if this is infeasible and I am being a fool, please let me know :^)
21:46:31 <OrIdow6> cadence: What is this collection you are looking at?
21:46:57 <cadence> https://archive.org/details/geocities
21:48:01 <daxxy> that's IA's crawl, *Archive Team's* crawl is openly available, see https://wiki.archiveteam.org/index.php/GeoCities
21:48:04 <OrIdow6> I believe those are archive.org crawls, not ArchiveTeam crawls
21:48:40 <OrIdow6> Archive.org does not usually release its raw WARCs, and I don't think ArchiveTeam used WARCs for the geocities project
21:48:40 <cadence> my bad! I'll check out the link you posted, thanks
21:48:42 <OrIdow6> yeah
21:57:20 <h2ibot> Adrmcr edited Game Jolt (+137, added tracker and grab links): https://wiki.archiveteam.org/?diff=48222&oldid=48157
21:57:21 <h2ibot> Fidel edited List of websites excluded from the Wayback Machine (+24): https://wiki.archiveteam.org/?diff=48223&oldid=48213
22:02:21 <cadence> I see, thanks so much for this link. so there's the patched torrent, which I can pretty easily figure out how to download. the page also mentions an art project which is apparently "far superior", though this appears to be a series of screenshots rather than a data dump. am I correct?
22:15:09 <cadence> checking blog.geocities.institute/about -- afaict, the patched torrent is the one to get. thanks for the help!
22:44:50 <cadence> awesome, thanks everyone <3