01:19:57 Yay, r/antiwork went private...
01:20:01 They say they'll be back soon tho
01:20:13 ...guess what I'm downloading once it comes back up?
01:38:52 That's fine wessel1512
01:46:53 TheTechRobo: You're aware that we're continuously archiving all of Reddit, right?
01:47:19 Right...I thought of that, but dismissed it for some reason.
02:12:16 JAA: outlinks too? is HN+outlinks also archived?
02:16:43 pabs: Yes, Reddit outlinks go into #//. Don't think we're covering HN anywhere at the moment, but it's small enough to probably not need a distributed project.
02:17:03 It's been on my list to investigate that for a while.
02:17:13 But you know how that goes...
02:17:18 :)
02:23:10 I do believe I remember seeing an IA non-continuous crawl of HN and its outlinks somewhere
02:31:32 aiming for a start of the pinger.pl project tomorrow
02:35:09 https://twitter.com/MankaiCompanyEN/status/1486258864647548932 https://twitter.com/pinkberryfizzi3/status/1486256047589236740
02:42:32 Respectively: "An important in-game announcement regarding future plans for the game has been published. We ask that users login to confirm the contents of the announcement. #a3game"
02:42:33 " PLEASE BOOST THIS WE R ARCHIVING A3 EN Guys !! I already have an archives discord up so if u wanna help me archive stuff please do join https://discord.gg/tGEnmadD"
02:43:38 "Closing March 9" according to https://a-three.fandom.com/wiki/A3!_Wiki
02:43:43 *on
02:43:51 Any more info DopefishJustin?
05:28:37 nope
07:24:02 Oh
07:24:08 Yet another forum
07:28:58 Looks like www. and straight domain are slightly different
07:29:13 If someone wants me in particular to look at this I will do so tomorrow
07:32:39 Same goes for A3, will investigate and add to DW tomorrow if no one else does
08:06:05 OrIdow6, fyi, there is also 207.148.109.12 besides the www. and straight domain
14:46:55 Hello. I have a question regarding the TechnologyGuide data, which has been posted on Archive.org. Will this also be back-populated into the Wayback Machine?
15:08:43 yes
15:09:17 Cool. Thanks for confirming and thanks for your work!
15:09:43 JAA: those thanks are for you ^
18:44:29 :-)
18:59:14 Regarding the TechnologyGuide websites: technologyguide.com is done, notebookreview.com and brighthand.com are running in AB currently, digitalcamerareview.com and tabletpcreview.com have not been covered at all yet.
18:59:29 The rate limiting on the sites works differently than on the forums.
20:35:28 hi, is there a list of the technologyguide forum URLs broken by the WAF?
20:36:13 I'm thinking of grabbing the contents over the tapatalk API - no use for WBM, but at least they'd be *somewhere*
20:37:12 Hmm, good idea. Maybe it can be put into the WBM in some form even. Do you have more details on that API?
20:37:24 There isn't a list of the broken threads currently, but I can get one later.
20:39:05 honestly I'm not even sure you can get posts via GET requests from it, internally it's almost all POSTs with parameters in bodies
20:39:20 used to be xmlrpc, they added JSON later on
20:39:29 I see.
20:40:18 Well, POST can still be saved to WARC and go into the WBM. Maybe we can add a URL parameter that gets ignored by the API and serves as the topic identifier in the API, like we did for the YouTube dislikes data.
20:40:39 oh, neat
20:40:47 Is the API open or does it require auth?
20:41:16 I think everything we'd need is open
20:41:29 Nice
20:43:19 the serverside implementation is freely available and all unobfuscated PHP, that's my reference: https://www.tapatalk.com/download_xenforo
20:43:35 (note, xenforo 1)
20:56:42 tl;dr: e.g. /mobiquo/mobiquo.php?method_name=get_forum (for xmlrpc) and /mobiquo/tapatalk.php?method_name=get_forum (for json) are handled by mobiquo/mbqAction/MbqActGetForum.php, etc.
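(A rough sketch of the idea discussed above: POSTing to the xmlrpc endpoint while tagging the URL with an extra query parameter so each archived request gets a distinct URL. The host `forum.example`, the parameter name `archive_id`, and the helper `build_request` are all made up for illustration. That the server ignores unknown query parameters, and that `get_forum` accepts an empty parameter list, are assumptions that have not been checked against the Tapatalk plugin. Nothing here touches the network.)

```python
import xmlrpc.client
from urllib.parse import urlencode


def build_request(base_url, method_name, params=(), marker=None):
    """Return (url, body) for an XML-RPC POST to the mobiquo endpoint.

    `marker` is a hypothetical identifier appended to the URL only; it is
    not part of the XML-RPC body, and the server is assumed to ignore it.
    """
    query = {"method_name": method_name}
    if marker is not None:
        # Hypothetical parameter name, chosen for this sketch only.
        query["archive_id"] = marker
    url = f"{base_url.rstrip('/')}/mobiquo/mobiquo.php?{urlencode(query)}"
    # xmlrpc.client.dumps serialises the call into an XML-RPC request body.
    body = xmlrpc.client.dumps(tuple(params), methodname=method_name)
    return url, body.encode("utf-8")


url, body = build_request("https://forum.example", "get_forum",
                          marker="forum-index")
print(url)
print(body.decode("utf-8"))
```

(The point of the marker is that otherwise every POST goes to the same URL, so an archive keyed by URL cannot tell the captures apart.)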
21:00:21 one very neat thing about it is, there's no limit on the results/page parameter, you can request as much as you want at once until you hit httpd/php side timeouts/size limits
21:41:11 hi there! I had a question about the 2009 geocities crawl - I thought it could be fun to download some whole sites from the collection so I can quickly navigate through them, offline, using a text mode browser, rather than having to go via the wayback machine interface. however, seems as though download is disabled for the warc files in the collection?
21:42:02 if there's an older collection of geocities, I'm totally okay going back to it, since I'm looking to browse rather than to have a complete archive.
21:42:12 if this is infeasible and I am being a fool, please let me know :^)
21:46:31 cadence: What is this collection you are looking at?
21:46:57 https://archive.org/details/geocities
21:48:01 that's IA's crawl, *Archive Team's* crawl is openly available, see https://wiki.archiveteam.org/index.php/GeoCities
21:48:04 I believe those are archive.org crawls, not ArchiveTeam crawls
21:48:40 Archive.org does not usually release its raw WARCs, and I don't think ArchiveTeam used WARCs for the geocities project
21:48:40 my bad! I'll check out the link you posted, thanks
21:48:42 yeah
21:57:20 Adrmcr edited Game Jolt (+137, added tracker and grab links): https://wiki.archiveteam.org/?diff=48222&oldid=48157
21:57:21 Fidel edited List of websites excluded from the Wayback Machine (+24): https://wiki.archiveteam.org/?diff=48223&oldid=48213
22:02:21 I see, thanks so much for this link. so there's the patched torrent, which I can pretty easily figure out how to download. the page also mentions an art project which is apparently "far superior", though this appears to be a series of screenshots rather than a data dump. am I correct?
22:15:09 checking blog.geocities.institute/about -- afaict, the patched torrent is the one to get. thanks for the help!
22:44:50 awesome, thanks everyone <3