-
JAA
inStarcraft (starcraft2.ingame.de), inWarcraft (warcraft.ingame.de), and inCounterstrike (cs.ingame.de) forums are archived.
-
JAA
Looks like I got to inWarcraft *just* in time. It redirects to ingame.de now.
-
JAA
inDiablo (diablo3.ingame.de) is still running and almost done with the normal thread view.
-
JAA
inStarcraft: 70403 threads, 379812 thread pages, 2247281 posts
-
JAA
inWarcraft: 95061 threads, 387332 thread pages, 2083496 posts
-
JAA
inCounterstrike: 25329 threads, 70128 thread pages, 181511 posts
-
tech234a
Chick Corea (jazz composer) recently passed away... have his website and social media profiles been archived?
chickcorea.com
-
JAA
Aww :-(
-
JAA
Yeah, looks like jodizzle ran a bunch of stuff through AB.
-
tech234a
Thanks
-
mgrandi
Parler seems up and actually functional now, as opposed to the website just having a landing page
-
JAA
Yes, as mentioned in the Parler channel topic and on our wiki page.
-
mgrandi
Turns out I'm dumb and I thought it was half up still, I promise I read the channel topics :)
-
OrIdow6
arkiver: So here is my NicoNino script
github.com/OrIdow6/niconino-grab
-
OrIdow6
Obviously it's not completely ready, but all that should be done is removing informational prints and branding
-
OrIdow6
Also, ZSTD would be useful on this one, since much of it is nearly-identical video pages; but I can't do that
-
OrIdow6
Unless it's acceptable to use a static dictionary
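For illustration, a minimal sketch of the static-dictionary idea. It uses the preset dictionary (`zdict`) in Python's standard-library zlib as a stand-in for ZSTD, so it is not the grab script's method. The shared boilerplate and the sample page are made up.

```python
import zlib

# Hypothetical boilerplate shared by many near-identical video pages.
SHARED = b'<html><head><title>Video</title></head><body><div class="player">'


def compress(page: bytes, dictionary: bytes) -> bytes:
    """Compress one page against a fixed, pre-agreed dictionary."""
    c = zlib.compressobj(level=9, zdict=dictionary)
    return c.compress(page) + c.flush()


def decompress(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress; the reader must supply the same dictionary."""
    d = zlib.decompressobj(zdict=dictionary)
    return d.decompress(blob) + d.flush()


# Made-up page: the shared boilerplate plus a small unique part.
page = SHARED + b"sm38315094</div></body></html>"
with_dict = compress(page, SHARED)
without_dict = zlib.compress(page, 9)
```

The catch is the same as with any static dictionary: whoever reads the data later needs exactly the same dictionary bytes.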
-
OrIdow6
An example item is vid:sm38315094
-
mgrandi
I've seen projects use a static dictionary
-
kiska
Really, which one?
-
atphoenix
Is a general Craigslist project a good idea? I know when we worked on Yahoo Groups there was interest in Freecycle stuff as it provided a window into the kinds of things people were trading at various times in the past.
-
atphoenix
I found a social network aimed at real estate folks. 300k+ members. Has blogs too like
activerain.com/blogs/mayoung not fully covered by WBM, and content going back at least to 2012
activerain.com/blogsview/3562327/ra…s-we-have-seen-on-fha-203k-projects . Blogs can be read without login.
-
atphoenix
appears that WBM has many gaps once you start poking around in the blog archives. Wouldn't surprise me if there was a lot of stuff related to the US financial collapse/housing crisis in there.
-
dxrt
atphoenix: Have thrown it in archivebot
-
atphoenix
thanks dxrt. Thanks for the second opinion. I don't know how large it may be beyond what I have already mentioned. It does appear possible to enumerate posts by using URLs like
activerain.com/blogsview/3562327 (change the number, although some numbers result in login prompts). JAA's qwarc might be useful for that.
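A rough sketch of that enumeration, assuming the numeric ID is the only thing that varies. The `https://` scheme and the ID range are assumptions, and this only builds the URL list; it does no fetching and does not handle the login prompts.

```python
from typing import Iterator

# Assumed URL pattern, taken from the example above with an https:// scheme added.
BASE = "https://activerain.com/blogsview/{}"


def candidate_urls(start: int, stop: int) -> Iterator[str]:
    """Yield one candidate post URL per numeric ID in [start, stop)."""
    for post_id in range(start, stop):
        yield BASE.format(post_id)


# Example: the known post and the ID right after it.
urls = list(candidate_urls(3562327, 3562329))
```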
-
atphoenix
interesting Twitter observation:
twitter.com/ActiveRainCorp/status/2764628968 is a live URL that redirects to the current (renamed) Twitter account, however
twitter.com/ActiveRainCorp returns "This account doesn’t exist"
-
Arcorann
Interesting fact about Twitter: you can replace the account name in the URL and the tweet will still load
-
Wayward
I noticed that when I was scraping videos of the Beirut explosion. they just care about the post id
-
atphoenix
I tried replacing the account name in the above example and nothing loaded:
twitter.com/asdfasdfasdfasdfasdfasdfasdf/status/2764628968
-
JAA
atphoenix: That's because that username is invalid (too long). Try a shorter one, and it'll work.
-
JAA
And yeah, that's what happens when someone changes their screen name.
-
atphoenix
I assumed from the conversation that only the post ID mattered. So post ID is the main thing, but the username must meet some minimum criteria and not just be a random length filler in the URL
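As a sketch of that rule: build the status URL with a placeholder username, but check the placeholder first. The limits used here (1 to 15 characters, letters, digits and underscore) are an assumption about Twitter's username rules, and `tweet_url` is a made-up helper.

```python
import re

# Assumed username rule: 1-15 characters from letters, digits, underscore.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,15}")


def tweet_url(status_id: int, username: str = "i") -> str:
    """Build a status URL; the username only needs to look valid."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"username does not look valid: {username!r}")
    return f"https://twitter.com/{username}/status/{status_id}"


# A short filler name passes the check.
ok = tweet_url(2764628968, "asdf")

# The 28-character filler from the example above is rejected.
try:
    tweet_url(2764628968, "asdfasdfasdfasdfasdfasdfasdf")
    rejected = False
except ValueError:
    rejected = True
```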
-
arkiver
OrIdow6: sounds good, will check it in a few hours
-
JAA
inDiablo stats: 95216 threads, 615110 thread pages, 2579603 posts