-
Ryz
After initial resistance, Elon Musk has now agreed to buy Twitter for the originally stated price of 44 billion dollars:
wsj.com/articles/elon-musk-proposes…-deal-on-original-terms-11664901454
-
Ryz
Any potential of archiving Twitter? Not just the latest goods but from the past?
-
Foo_Q3W
Hola all. I'm trying to reach someone who holds a copy of the FilePlanet /ftp2/ contents. I'm building an archive of Q3 levels and am hoping someone can run a search over the stash to list out any .pk3 files it contains.
-
Foo_Q3W
On a related note, if any data hoarders would like a ~60 GB stash of Q3 levels, there's a torrent up right now with everything I've managed to find so far =)
-
TheTechRobo
<Ryz> Any potential of archiving Twitter? Not just the latest goods but from the past?
-
TheTechRobo
I did think of an implementation a while ago, but I'm not sure if it's AT-ready, and it's also resource-intensive.
-
TheTechRobo
Basically, there'd be hashtag: items, search: items, user: items, etc. user: items would use both the search method of scraping and the user profile method (to miss as few tweets as possible). Then they'd queue t: items, which are individual tweets.
-
Jake
Foo_Q3W: I believe this is everything we were allowed to share from ftp2.
archive.org/details/Fileplanet_ftp2_FILES_FROM_PUBLIC_ID_GRAB
-
TheTechRobo
search: and hashtag: and user: items queue t: items, and t: items queue hashtag: and user: and t: items (from the tweet contents, parent tweet, replies, mentions, etc)
-
TheTechRobo
it'd use a lot of memory with redis though
-
TheTechRobo
and the backfeed filter
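
A minimal sketch of the item scheme described above, assuming a tiny in-memory stand-in for Twitter. The `TWEETS` dict and the `discover` and `crawl` functions are hypothetical names for illustration, not part of any existing ArchiveTeam code. The `seen` set plays the role of the backfeed filter: it has to remember every item ever queued, which is where the memory concern comes from.

```python
from collections import deque

# Hypothetical stand-in data; a real project would scrape this instead.
TWEETS = {
    "1": {"user": "alice", "hashtags": ["q3"], "mentions": ["bob"],
          "parent": None, "replies": ["2"]},
    "2": {"user": "bob", "hashtags": [], "mentions": ["alice"],
          "parent": "1", "replies": []},
}


def discover(item):
    """Return the items that `item` would queue."""
    kind, _, value = item.partition(":")
    if kind == "t":
        tweet = TWEETS.get(value)
        if tweet is None:
            return []
        # t: items queue user:, hashtag: and further t: items.
        found = [f"user:{tweet['user']}"]
        found += [f"hashtag:{h}" for h in tweet["hashtags"]]
        found += [f"user:{m}" for m in tweet["mentions"]]
        found += [f"t:{r}" for r in tweet["replies"]]
        if tweet["parent"]:
            found.append(f"t:{tweet['parent']}")
        return found
    if kind == "user":
        # user: items queue t: items for that user's tweets.
        return [f"t:{tid}" for tid, t in TWEETS.items() if t["user"] == value]
    if kind == "hashtag":
        # hashtag: items queue t: items for tweets carrying the tag.
        return [f"t:{tid}" for tid, t in TWEETS.items() if value in t["hashtags"]]
    return []


def crawl(seeds):
    """Process items breadth-first, queueing each discovered item once."""
    seen = set(seeds)      # stand-in for the backfeed dedup filter
    queue = deque(seeds)
    processed = []
    while queue:
        item = queue.popleft()
        processed.append(item)
        for new in discover(item):
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return processed
```

Seeding with a single `hashtag:q3` item reaches both tweets and both users, since each item type feeds the others.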
-
Foo_Q3W
Ah thanks Jake. I have that on download currently to sift through. I was hoping someone with full access would be able to do an offline search for *.PK3, then let me know what's there so I can identify any lost levels. I figure that'd be a privacy-respecting way of handling it, since PK3s were only packaged game levels.
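
A rough sketch of what such an offline search could look like, assuming the stash is available as an ordinary directory tree. The function name `list_pk3` is hypothetical. It matches the extension case-insensitively, so `.pk3` and `.PK3` are both found, and it returns only relative paths, never file contents.

```python
from pathlib import Path


def list_pk3(root):
    """Return sorted relative paths of all .pk3 files under root (any case)."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pk3"
    )
```

Calling `list_pk3` on the top-level directory of the stash would give a plain list of names that can be compared against already-known levels.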
-
Foo_Q3W
However I'm hopeful that the public id grab content is going to turn up some lost stuff, so bonus either way!
-
Jake
I believe that's spirit: ^
-
Foo_Q3W
Oh Spirit is in here, sweet. I pinged him on email a lil while back but no dice, figured must be busy =)
-
Ryz
TheTechRobo, I hope it runs sooner or later, ideally sooner :c
-
jpiter7
Oh, hello there, all who archive for humanity's sake.
-
jpiter7
How's it all going this week?
-
Jake
good!
-
jpiter7
Have you all heard about the recent Fandom/Wikia acquisition?
-
jpiter7
Amongst those gobbled up were GameFAQs and Metacritic, both of which hold quite a lot of important video game history and info, especially for the obscure games.
-
thuban
yep, we've heard. no concrete plans at the moment (we've discussed gamefaqs a couple of times), but we're keeping an eye out for operational changes, so ping us if you see or hear of anything specific
-
h2ibot
OrIdow6 uploaded
File:Yahoo Groups provenance.png (State of information from Yahoo Groups as of…):
wiki.archiveteam.org/?title=File%3AYahoo%20Groups%20provenance.png
-
h2ibot
OrIdow6 edited Yahoo! Groups (+13680, Adding this in):
wiki.archiveteam.org/?diff=49057&oldid=49033
-
spirit
Foo_Q3W: sorry, i am very good at replying to some mails after weeks or months =)
-
spirit
will mail to you tomorrow or so
-
spirit
gist is sadly "no", those pk3 files might be private betas or other files not meant for the public
-
Foo_Q3W
hehe, no worries. Thanks for the update
-
spirit
the only way i can give out files from that part of the archive is from publicly archived proof of public availability, e.g. fileplanet urls archived on some pages in the wayback machine or the live website
-
Foo_Q3W
Am I right in thinking that if I can pull dl.fileplanet links, that can be tied back to files?
-
spirit
yeah!
-
Foo_Q3W
Ah, yep, we're on the same page. Got it.
-
Foo_Q3W
I'll work up that list.
-
Foo_Q3W
ty ty
-
spirit
:))
-
spirit
with online sources please; bare URLs can be forged or brute-forced
-
pabs
some possible links to archive in the HN thread
-
fenugrec
Ok, trying grab-site on this annoying cloudflare-protected forum. Is there a way to make grab-site use cookies from a regular Firefox browsing session? I.e. do the clownflare "check" in a regular browser, then start a crawl with whatever cookies it generated
-
fenugrec
nvm, I had missed the instructions in the Readme
-
fenugrec
aaand it doesn't work
-
arkiver
fenugrec: i am no expert on grab-site, so can't answer that, but I do wonder - is that one shutting down?
-
qwertyasdfuiopghjkl
forum.tek.com/viewtopic.php?f=583&t=143177 says "If you're a frequent visitor of this forum, please be aware that the forum is moving and will be decommissioned before the end of the year."
-
fenugrec
arkiver, yes, it's scheduled to be "decommissioned", as qwerty posted above
-
Retrofan
Hi
-
Retrofan
I am wondering if there is any script made for auto-starting the archivebot service
-
Retrofan
s
-
kaz
no
-
kaz
there is for grab-site though.
-
Retrofan
kaz: looks like grab-site is very user-friendly ;)
-
JAA
It's almost as if that's what you should use... ;-)
-
Retrofan
JAA: The problem is that.. I need the IRC bot and pipeline system.
-
Retrofan
Anyways.. I made a simple bash script to start all of archivebot's services.
-
Jake
I can't remember if I asked this before, but why do you need an IRC bot or a pipeline?
-
Retrofan
Jake: A pipeline will be useful for anyone with a lot of space/bandwidth who wants to help me with the archiving project
-
Retrofan
An IRC bot makes it easier for a group of people to manage the archiving process
-
Jake
I.... guess? but you've also spent like a month trying to get it set up. seems way easier to me to go with grab-site or something similar?
-
Retrofan
Yeah, it took me some time to set it up. But, in my opinion... that is easier than creating a full bot for grab-site and a pipeline-like system.
-
Retrofan
also, I upload the finished WARCs to my project's FTP, so... uploader.py is very useful for me...
-
pabs