#archiveteam-bs

01:23

Ryz

After initially resisting, Elon Musk has now agreed to buy Twitter for the originally stated price of 44 billion dollars: wsj.com/articles/elon-musk-proposes…-deal-on-original-terms-11664901454
01:29

Ryz

Any potential of archiving Twitter? Not just the latest goods but from the past?
01:44

Foo_Q3W

Hola all. I'm trying to reach someone who holds a copy of the FilePlanet /ftp2/ contents. I'm building an archive of Q3 levels and am hoping someone can run a search over the stash to list out any .pk3 files it contains.
01:47

Foo_Q3W

On a related note, if any data hoarders would like a ~60 GB stash of Q3 levels, there's a torrent up right now with everything I've managed to find so far =)
02:06

TheTechRobo

<Ryz> Any potential of archiving Twitter? Not just the latest goods but from the past?
02:07

TheTechRobo

I did think of an implementation a while ago, but I'm not sure if it's AT-ready, and it's also resource-intensive.
02:07

TheTechRobo

Basically, there'd be hashtag: items, search: items, user: items, etc. user: items would use both the search method of scraping and the user profile method (to miss as few tweets as possible). Then they'd queue t: items, which are individual tweets.
02:08

Jake

Foo_Q3W: I believe this is everything we were allowed to share from ftp2. archive.org/details/Fileplanet_ftp2_FILES_FROM_PUBLIC_ID_GRAB
02:08

TheTechRobo

search: and hashtag: and user: items queue t: items, and t: items queue hashtag: and user: and t: items (from the tweet contents, parent tweet, replies, mentions, etc)
02:08

TheTechRobo

it'd use a lot of memory with redis though
02:08

TheTechRobo

and the backfeed filter
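The item scheme TheTechRobo describes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the design's actual implementation: `fetch` is a hypothetical stand-in for the real scraper, the dictionary keys (`tweets`, `hashtags`, `users`) are invented for the example, and the in-memory `seen` set only plays the role that a Redis-backed queue and backfeed filter would play in practice.

```python
from collections import deque

# Item prefixes taken from the chat; everything else here is hypothetical.
LISTING_PREFIXES = ("hashtag:", "search:", "user:")


def discovered_items(item, fetch):
    """Return the items that processing `item` would queue.

    `fetch` is a hypothetical callable mapping an item name to a dict
    describing what the scraper found for that item.
    """
    found = fetch(item)
    if item.startswith(LISTING_PREFIXES):
        # Listing-style items only queue individual tweets.
        return [f"t:{tweet_id}" for tweet_id in found.get("tweets", [])]
    if item.startswith("t:"):
        # A tweet queues the hashtags, users, and other tweets it links to
        # (parent tweet, replies, mentions, and so on).
        out = [f"hashtag:{h}" for h in found.get("hashtags", [])]
        out += [f"user:{u}" for u in found.get("users", [])]
        out += [f"t:{t}" for t in found.get("tweets", [])]
        return out
    raise ValueError(f"unknown item type: {item}")


def crawl(seeds, fetch):
    """Process items breadth-first, queueing each item at most once."""
    seen = set(seeds)  # stand-in for the deduplicating backfeed filter
    queue = deque(seeds)
    order = []
    while queue:
        item = queue.popleft()
        order.append(item)
        for new in discovered_items(item, fetch):
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return order
```

The `seen` set grows with every item ever discovered, which is the memory concern raised in the chat: at Twitter's scale that deduplication state is the expensive part.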
02:09

Foo_Q3W

Ah thanks Jake. I have that on download currently to sift through. I was hoping someone with full access would be able to do an offline search for *.PK3, then let me know what's there so I can identify any lost levels. I figure that'd be a privacy-respecting way of handling it, since PK3s were only packaged game levels.
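The offline search Foo_Q3W asks for could look something like the sketch below, assuming the stash is an ordinary directory tree on disk. The function name `find_pk3` is made up for this example, and the match is case-insensitive so that both `.pk3` and `.PK3` are listed. Only file paths are reported; no file contents are read.

```python
import os


def find_pk3(root):
    """Yield paths, relative to `root`, of files ending in .pk3 (any case)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pk3"):
                yield os.path.relpath(os.path.join(dirpath, name), root)
```

Calling `sorted(find_pk3("/path/to/stash"))` (a placeholder path) would give a stable list that can be compared against an existing collection.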
02:09

Foo_Q3W

However I'm hopeful that the public id grab content is going to turn up some lost stuff, so bonus either way!
02:10

Jake

I believe that's spirit: ^
02:10

Foo_Q3W

Oh Spirit is in here, sweet. I pinged him on email a lil while back but no dice, figured must be busy =)
02:15

Ryz

TheTechRobo, I hope it runs sooner or later, ideally sooner :c
03:23

jpiter7

Oh, hello there, all who archive for humanity's sake.
03:23

jpiter7

How's it all going this week?
03:28

Jake

good!
03:35

jpiter7

Are you all aware of the recent Fandom/Wikia acquisition?
03:36

jpiter7

Amongst those gobbled up were GameFAQs and Metacritic, both of them holding quite some important video game history and info, especially for the obscure ones.
03:49

thuban

yep, we've heard. no concrete plans at the moment (we've discussed gamefaqs a couple of times), but we're keeping an eye out for operational changes, so ping us if you see or hear of anything specific
05:42

h2ibot

OrIdow6 uploaded File:Yahoo Groups provenance.png (State of information from Yahoo Groups as of…): wiki.archiveteam.org/?title=File%3AYahoo%20Groups%20provenance.png
05:53

h2ibot

OrIdow6 edited Yahoo! Groups (+13680, Adding this in): wiki.archiveteam.org/?diff=49057&oldid=49033
06:38

spirit

Foo_Q3W: sorry, i am very good at replying to some mails after weeks or months =)
06:39

spirit

will mail to you tomorrow or so
06:39

spirit

gist is sadly "no", those pk3 files might be private betas or other files not meant for the public
06:39

Foo_Q3W

hehe, no worries. Thanks for the update
06:40

spirit

the only way i can give out files from that part of the archive is from publicly archived proof of public availability, e.g. fileplanet urls archived on some pages in the wayback machine or the live website
06:40

Foo_Q3W

Am I right in thinking that if I can pull dl.fileplanet links, that can be tied back to files?
06:40

spirit

yeah!
06:40

Foo_Q3W

Ah, yep, we're on the same page. Got it.
06:40

Foo_Q3W

I'll work up that list.
06:41

Foo_Q3W

ty ty
06:46

spirit

:))
06:46

spirit

with online sources please, just URLs can be forged and bruteforced
08:31

pabs

blog.archive.org/2022/10/04/interne…of-amateur-radio-and-communications news.ycombinator.com/item?id=33089535
08:31

pabs

some possible links to archive in the HN thread
14:17

fenugrec

Ok, trying grab-site on this annoying cloudflare-protected forum. Is there a way to make grab-site use cookies from a regular firefox browsing session ? I.e. do the clownflare "check" in a regular browser, then start a crawl with whatever cookies it generated
14:17

fenugrec

(forum.tek.com)
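One way to approach the cookie hand-off asked about above is to export the browser's cookies to a Netscape-format cookies.txt file, which many crawlers can load. The sketch below makes several assumptions: it reads a copy of Firefox's `cookies.sqlite` (Firefox locks the live file while running), it relies on the `moz_cookies` column names `host`, `path`, `isSecure`, `expiry`, `name`, and `value`, and it treats `expiry` as already being in the units the consumer expects, which may differ between Firefox versions. How grab-site should be told to load the resulting file is not shown here; the README mentioned in the chat is the place to check. Cloudflare clearance cookies may also be tied to the browser's user agent and IP address, so the cookies alone may not be enough.

```python
import sqlite3


def firefox_cookies_to_netscape(conn, host_suffix):
    """Render cookies whose host ends with `host_suffix` as cookies.txt text.

    `conn` is an open sqlite3 connection to a copy of cookies.sqlite.
    The column names are assumed from Firefox's moz_cookies table.
    """
    rows = conn.execute(
        "SELECT host, path, isSecure, expiry, name, value "
        "FROM moz_cookies WHERE host LIKE ?",
        ("%" + host_suffix,),
    )
    lines = ["# Netscape HTTP Cookie File"]
    for host, path, is_secure, expiry, name, value in rows:
        lines.append("\t".join([
            host,
            # A leading dot means the cookie also applies to subdomains.
            "TRUE" if host.startswith(".") else "FALSE",
            path,
            "TRUE" if is_secure else "FALSE",
            str(expiry),
            name,
            value,
        ]))
    return "\n".join(lines) + "\n"
```

A typical use would be `firefox_cookies_to_netscape(sqlite3.connect("cookies-copy.sqlite"), "tek.com")`, with the filename being a placeholder, writing the returned text to a file.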
14:18

fenugrec

nvm, I had missed the instructions in the Readme
14:25

fenugrec

aaand it doesn't work
14:27

arkiver

fenugrec: i am no expert on grab-site, so can't answer that, but I do wonder - is that one shutting down?
14:28

qwertyasdfuiopghjkl

forum.tek.com/viewtopic.php?f=583&t=143177 says "If you're a frequent visitor of this forum, please be aware that the forum is moving and will be decommissioned before the end of the year."
14:29

fenugrec

arkiver, yes, it's scheduled to be "decommissioned", as qwerty posted above
16:20

Retrofan

Hi
16:21

Retrofan

I am wondering if there is any script made for auto-starting the archivebot service
16:21

Retrofan

s
20:52

kaz

no
20:53

kaz

there is for grab-site though.
21:43

Retrofan

Kaz: looks like grab-site is very user-friendly ;)
21:44

JAA

It's almost as if that's what you should use... ;-)
21:46

Retrofan

JAA: The problem is that.. I need the IRC bot and pipeline system.
21:47

Retrofan

Anyways.. I made a simple bash script to start all of archivebot's services.
21:52

Jake

I can't remember if I asked this before, but why do you need an IRC bot or a pipeline?
21:55

Retrofan

Jake: A pipeline will be useful for anyone with a lot of space/bandwidth who wants to help me with the archiving project
21:56

Retrofan

An IRC bot makes it easier for a group of people to manage the archiving process
22:00

Jake

I.... guess? but you've also spent like a month trying to get it set up. seems way easier to me to go with grab-site or something similar?
22:04

Retrofan

Yeah, it took me some time to set it up. But, in my opinion... that is easier than creating a full bot for grab-site and a pipeline-like system.
22:07

Retrofan

also, I upload the finished WARCs to my project's ftp, so... uploader.py is so useful for me...
22:23

pabs

a tech person arrested in Iran: twitter.com/jadi jadi.net youtube.com/c/JadiMirmirani fosstodon.org/@Mehrad/109117086408879957