02:03:31 inStarcraft (starcraft2.ingame.de), inWarcraft (warcraft.ingame.de), and inCounterstrike (cs.ingame.de) forums are archived.
02:03:41 Looks like I got to inWarcraft *just* in time. It redirects to ingame.de now.
02:04:26 inDiablo (diablo3.ingame.de) is still running and almost done with the normal thread view.
02:04:52 inStarcraft: 70403 threads, 379812 thread pages, 2247281 posts
02:06:37 inWarcraft: 95061 threads, 387332 thread pages, 2083496 posts
02:07:41 inCounterstrike: 25329 threads, 70128 thread pages, 181511 posts
02:26:04 Chick Corea (jazz composer) recently passed away... have his website and social media profiles been archived? https://chickcorea.com/
02:33:29 Aww :-(
02:33:55 Yeah, looks like jodizzle ran a bunch of stuff through AB.
02:41:08 Thanks
04:45:06 Parler seems up and actually functional now, as opposed to the website just having a landing page
04:46:54 Yes, as mentioned in the Parler channel topic and on our wiki page.
04:52:21 Turns out I'm dumb and I thought it was half up still, I promise I read the channel topics :)
05:00:37 arkiver: So here is my NicoNino script https://github.com/OrIdow6/niconino-grab
05:02:07 Obviously it's not completely ready, but all that should be done is removing informational prints and branding
05:03:27 Also, ZSTD would be useful on this one, since much of it is nearly-identical video pages; but I can't do that
05:03:39 Unless it's acceptable to use a static dictionary
05:12:09 An example item is vid:sm38315094
05:15:57 I've seen projects use a static dictionary
05:43:31 Really, which one?
06:53:08 Is a general Craigslist project a good idea? I know when we worked on Yahoo Groups there was interest in Freecycle stuff as it provided a window into the kinds of things people were trading at various times in the past.
08:50:51 I found a social network aimed at real estate folks. 300k+ members. Has blogs too like https://activerain.com/blogs/mayoung not fully covered by WBM, and content going back at least to 2012 https://activerain.com/blogsview/3562327/range-hoods-we-have-seen-on-fha-203k-projects . Blogs can be read without login.
09:01:41 actually 2009. https://activerain.com/blogs/mayoung/archives/2009/03
09:06:30 appears that WBM has many gaps once you start poking around in the blog archives. Wouldn't surprise me if there was a lot of stuff related to the US financial collapse/housing crisis in there.
09:59:57 atphoenix: Have thrown it in archivebot
10:27:08 thanks dxrt. Thanks for the second opinion. I don't know how large it may be beyond what I have already mentioned. It does appear possible to enumerate posts by using URLs like https://activerain.com/blogsview/3562327/ (change the number, although some numbers result in login prompts). JAA's qwarc might be useful for that.
10:52:13 interesting Twitter observation: http://twitter.com/ActiveRainCorp/status/2764628968 is a live URL that redirects to the current (renamed) Twitter account, however http://twitter.com/ActiveRainCorp returns "This account doesn’t exist"
10:54:58 Interesting fact about Twitter: you can replace the account name in the URL and the tweet will still load
12:17:30 I noticed that when I was scraping videos of the Beirut explosion. they just care about the post id
16:24:24 I tried replacing the account name in the above example and nothing loaded: https://twitter.com/asdfasdfasdfasdfasdfasdfasdf/status/2764628968
17:07:21 atphoenix: That's because that username is invalid (too long). Try a shorter one, and it'll work.
17:07:39 And yeah, that's what happens when someone changes their screen name.
17:42:45 I assumed from the conversation that only the post ID mattered. So post ID is the main thing, but the username must meet some minimum criteria and not just be a random length filler in the URL
20:03:45 OrIdow6: sounds good, will check it in a few hours
23:02:48 inDiablo stats: 95216 threads, 615110 thread pages, 2579603 posts
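
The Twitter URL behaviour discussed above (the post ID selects the tweet, but the account name in the path still has to look like a valid screen name) could be sketched roughly as follows. This is only an illustration, not anything the channel's tooling is known to use: the names `SCREEN_NAME_RE`, `is_plausible_screen_name`, and `status_url` are hypothetical, and the rule of 1 to 15 letters, digits, or underscores is an assumption about Twitter's screen-name format, not something stated in the log. The code only builds strings and makes no network requests.

```python
import re

# Assumption: a screen name is 1-15 characters of letters, digits, or underscores.
# The log only says the 28-character filler name was "invalid (too long)".
SCREEN_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,15}")


def is_plausible_screen_name(name: str) -> bool:
    """Return True if `name` fits the assumed screen-name format."""
    return SCREEN_NAME_RE.fullmatch(name) is not None


def status_url(post_id: int, screen_name: str) -> str:
    """Build a status URL from a post ID and any plausible screen name.

    Per the discussion, the post ID is what identifies the tweet; the
    screen name only needs to pass format validation.
    """
    if post_id <= 0:
        raise ValueError("post_id must be a positive integer")
    if not is_plausible_screen_name(screen_name):
        raise ValueError(f"screen name does not fit the assumed format: {screen_name!r}")
    return f"https://twitter.com/{screen_name}/status/{post_id}"


if __name__ == "__main__":
    # The renamed account's old name from the log still forms a usable URL.
    print(status_url(2764628968, "ActiveRainCorp"))
    # The long filler name from the log fails the assumed length limit.
    print(is_plausible_screen_name("asdfasdfasdfasdfasdfasdfasdf"))
```

Checking the name before building the URL mirrors the explanation given at 17:07:21: an over-long filler name is rejected up front instead of producing a URL that loads nothing.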