-
h2ibot
JustAnotherArchivist edited Deathwatch (+201, /* 2024 */ Add TinyLetter):
wiki.archiveteam.org/?diff=51231&oldid=51222
-
h2ibot
JustAnotherArchivist edited Deathwatch (+59, /* 2024 */ Add DK Find Out!):
wiki.archiveteam.org/?diff=51232&oldid=51231
-
nicolas17
heh I started a mediafire worker 25 hours ago
-
nicolas17
it completed 7 (seven) items
-
JAA
Yes, there isn't much running through that project.
-
nicolas17
I'll start telegram and see how much concurrency I can get away with on a single core
-
JAA
Over the past week or so (although I only tried on three days), I've managed to grab all the Tesla Roadster PDFs. The online manual thing still remains to be done though. I'll try, but I'm not sure that'll get anywhere.
-
h2ibot
FireonLive edited Current Projects (+24, invisibly move imgur to the 'Long-term'…):
wiki.archiveteam.org/?diff=51233&oldid=51173
-
fireonlive
whomever decided that things on mediawiki must start with a capital letter
-
fireonlive
🖕right here
-
project10
WhatsWrongWithCamelCasing?
-
fireonlive
some things, just, like, start with a lower case letter, man
-
pokechu22
There's a way of changing that with displaytitle but I don't know whether that's enabled on the archiveteam wiki
-
fireonlive
(also to all the developers using text-transform on my text i see you, and you're the first into the meat shredders)
-
fireonlive
oh neat
-
pokechu22
wait, no, it definitely is, given what's going on at wiki.archiveteam.org/index.php/YouTube
-
pokechu22
perhaps that one's going a bit too far though :)
-
arkiver
so bluesky would make a public web interface
-
arkiver
that is still not up i guess?
-
fireonlive
-
fireonlive
looks like no
-
arkiver
yeah
-
TheTechRobo
There is the og:description etc there, though.
-
h2ibot
FireonLive edited Current Projects (-436, remove expired recently finished projects):
wiki.archiveteam.org/?diff=51234&oldid=51233
-
fireonlive
arkiver: was issuu going to be turned into a long-term project?
-
fireonlive
ah it's off of the tracker now
-
fireonlive
also oops, wrong channel
-
thuban
speaking of long-term projects, may i ask again about getting googlecrash back up?
-
JAA
fireonlive, pokechu22: Example with display title:
wiki.archiveteam.org/index.php/Codearchiver
-
fireonlive
ah :D
-
arkiver
fireonlive: likely not no
-
JAA
No way to avoid the capital letter in the URL though.
-
fireonlive
doesn't fix the url, but nicer
-
fireonlive
arkiver: ah ok! sounds good
-
arkiver
thuban: do we have examples you would want to run through?
-
thuban
not offhand, but i could get some later today
-
nicolas17
I think lowercase initial letters can be done with a mediawiki configuration change, like Wiktionary, but it's probably not worth it, as it makes /codearchiver and /Codearchiver different pages
-
TheTechRobo
This is a very minor issue, but currently the wording of "Current Running Warrior Project" on the wiki page (wiki.archiveteam.org/index.php/Template:CurrentWarrior) is annoying to me - it implies that the others are paused
-
arkiver
thuban: i'm a little worried about size. it's also not easily downloadable from the Wayback Machine i believe
-
fireonlive
-
» nicolas17 prepares the bonk stick
-
» JAA adds [[Category:Permanently_horny_users]] to that page.
-
fireonlive
xD
-
thuban
arkiver: i'm unclear about wbm playback myself, but see previous discussion in this channel
-
h2ibot
FireonLive edited Current Projects (-220, remove Issuu, finished long ago; move…):
wiki.archiveteam.org/?diff=51236&oldid=51234
-
TheTechRobo
JAA: lmfao
-
thuban
as for size, we could skip dedicated discovery and just do manual queueing? (and/or backfeed from #//, but that might be too big all on its own)
-
TheTechRobo
Yeah manual queuing would be great
-
fireonlive
TheTechRobo: you run the warrior yeah?
-
TheTechRobo
fireonlive: ish, why?
-
fireonlive
can you shoot me a screenshot of the pick a project page
-
TheTechRobo
-
TheTechRobo
er put a bit too much on the bottom
-
fireonlive
thanks
-
h2ibot
FireonLive edited Template:CurrentWarrior (-11, align wording with warrior wording):
wiki.archiveteam.org/?diff=51237&oldid=46438
-
TheTechRobo
fireonlive: thanks
-
fireonlive
=]
-
TheTechRobo
Is it canonically ArchiveTeam or Archive Team?
-
thuban
i've often wondered that myself
-
fireonlive
not sure
-
JAA
So have I.
-
fireonlive
i like the one word version
-
thuban
unfortunately there doesn't seem to be an authoritative answer
-
TheTechRobo
The main page has both. lol
-
JAA
Well, there's no real authority here, so getting an authoritative answer is tricky... :-P
-
fireonlive
if you look at the 2009 'logo'
-
fireonlive
-
joepie91|m
always keep 'em on their toes
-
JAA
I personally use ArchiveTeam.
-
fireonlive
has a space, and the desc. does too
-
joepie91|m
I also use ArchiveTeam
-
fireonlive
and you can see the uploader there
-
fireonlive
ye me too
-
thuban
a lot of the older stuff, news coverage, etc, uses "Archive Team", but i've always preferred "ArchiveTeam" as it's more clearly a proper noun
-
TheTechRobo
yeah "Archive Team" sounds way too generic IMO
-
TheTechRobo
like you're describing it rather than naming it
-
fireonlive
so do our topics in -bs and -dev
-
JAA
Yeah, marginally less risk of confusion with IA.
-
fireonlive
(but not #archiveteam)
-
fireonlive
:D
-
JAA
That's because I set those, I think.
-
fireonlive
ah :)
-
JAA
Whereas the #archiveteam topic is ancient.
-
JAA
-ot didn't exist when I showed up here, and I think -bs had a different topic.
-
TheTechRobo
Right, didn't -bs used to be -ot?
-
fireonlive
ye "Archive Team: We're not archive.org" sounds like something from early days
-
JAA
TheTechRobo: Very originally yes, I've been told. By the time I joined, it was already the separation we have today, more or less.
-
fireonlive
the press seems to favour two word variant
-
fireonlive
-
TheTechRobo
that's probably because that's what the opening paragraph in the main page says
-
fireonlive
i'm sure they went to our frontpage though for that :p
-
fireonlive
ye
-
fireonlive
according to wiki.*'s <title> we're "Archiveteam"
-
fireonlive
:D
-
TheTechRobo
so current contenders:
-
TheTechRobo
- Archive Team
-
TheTechRobo
- ArchiveTeam
-
TheTechRobo
- Archiveteam
-
arkiver
not Archiveteam
-
fireonlive
- dick drawn on ballot/wasted vote
-
arkiver
i guess we'll claim both Archive Team and ArchiveTeam
-
fireonlive
(it seems that always happens)
-
arkiver
it's spread out across a ton of articles already
-
arkiver
both versions
-
arkiver
i usually use Archive Team though
-
fireonlive
we could prefer one in our 'style guide' i suppose
-
TheTechRobo
arkiver: Right, but we should probably pick one for e.g. Warrior UI?
-
TheTechRobo
yeah
-
arkiver
maybe
-
fireonlive
hm
-
arkiver
what triggered this discussion?
-
TheTechRobo
maybe ArchiveTeam could be informal and Archive Team could be press? idk
-
TheTechRobo
arkiver: me asking :P
-
arkiver
ah :P
-
arkiver
well let's not make major changes yet
-
TheTechRobo
yeah
-
fireonlive
but i already have 300 wiki edits in queue
-
fireonlive
:P
-
fireonlive
jkjk
-
JAA
I was going to say <amateur.png>, but it turns out my mass IRC edit two years ago was only ~330 edits:
wiki.archiveteam.org/index.php?titl…mit=350&target=JustAnotherArchivist
-
fireonlive
:P
-
fireonlive
lordy lordy
-
JAA
Yeah, that was fun.
-
JAA
Much of it was automated, plus some manual checking.
-
fireonlive
:)
-
fireonlive
-
fireonlive
i think wikipedia probably has a term for it
-
fireonlive
hmmm. re: that page deletion proposal forever ago; i guess an !ao of a couple links followed by deletion could work; instead of the "namespace of wikipages locked in time that we are also scared to look at"
-
joepie91|m
I assume people here are already aware but Kissinger is dead
-
joepie91|m
don't know if that implies any archival work
-
JAA
I archived the website, and I'm looking into throwing interviews etc. into #down-the-tube. Otherwise, probably not a whole lot.
-
JAA
-
JAA
:-P
-
fireonlive
ooh
-
JAA
There are a few more tweets, but there are plenty more with 'Archive Team'.
-
fireonlive
wikimedia template to $rand between the two every time it's mentioned?
-
fireonlive
:p
-
» JAA slaps fireonlive around a bit with a large trout
-
fireonlive
xD
-
pabs
joepie91|m: AB has a job for www, not sure if anyone did subdomain enumeration
-
» pabs votes ArchiveTeam
-
» fireonlive votes ArchiveTeam
-
pabs
yay nitter.net unbanned me
-
JAA
The henryakissinger.com AB job is incomplete due to PerimeterX 403s. I grabbed a separate copy from a residential IP with grab-site.
-
fireonlive
-
» fireonlive waits for JAA to inform of the JS-disabled status
-
fireonlive
(informal non-binding poll)
-
fireonlive
(does not hold up in AT-core court)
-
joepie91|m
ah yes, governance(tm)
-
pabs
anyone know the status of archiving for Twitter and Facebook? does nitter work? any particular instance?
-
fireonlive
:3
-
fireonlive
pabs: nitter for twitter, can use a special instance; facebook.. not sure
-
JAA
I'm beginning to repeat myself, but...
-
» JAA slaps fireonlive around a bit with a large trout
-
JAA
pabs: Facebook is hell and virtually impossible, sadly, especially since the redesign.
-
JAA
Or well, nothing is impossible, but no tooling exists.
-
fireonlive
😅
-
fireonlive
hmm, did they kill off mbasic
-
JAA
snscrape's Facebook module has been broken for a long time.
-
fireonlive
ah, login wall
-
fireonlive
-
fireonlive
(-l)
-
fireonlive
i was a good boy and set the options choices to random :3
-
fireonlive
so theres no bias ™
-
JAA
m.facebook.com still exists, but you don't get far there.
-
JAA
Sanqui: I'm only about two thirds done (635 of 930), but I'll send you what I got so far since there's not too much time left. The extraction is somewhat incomplete because there's still no decent WARC tooling and mine broke on a few WARCs. In order to ensure I didn't have to download anything again due to incomplete extraction, I went with a much simpler `grep -Fai -e ulozto.cz -e uloz.to`, so it
-
JAA
contains the surrounding HTML, not just the Uloz URLs. After filtering out dupes (mostly from uloz.to pages themselves), here's the 33k results so far:
transfer.archivete.am/mUisN/webzdarma-uloz-partial.zst
-
eggdrop
-
JAA
Bad bot
-
JAA
:-)
-
fireonlive
i guess you can’t inline a .zst file haha
-
that_lurker
well you can, but not with the expected outcome
-
fireonlive
-
JAA
Excluded URLs ending with .zst?
-
Sanqui
JAA: Awesome, thank you, I expect we'll have our hands full with this but who knows
-
project10
Good bot!
-
fireonlive
,, string cat $ATTExcludedExtensions
-
eggdrop
ok: *.zst *.gz *.tar.gz *.tar *.tar.xz -0ms-
-
fireonlive
JAA: indeed :)
-
JAA
:-)
-
supercar99
Noob here. Why isn't frogger/blogger the current active project? Isn't it more "urgent" than telegram?
-
flashfire42|m
<supercar99> "Noob here. Why isn't frogger/..." <- Cause we already at capacity for it I think and telegram needs love too I guess
-
imer
Yeah, not able to ingest data fast enough as is, don't need more workers on it :)
-
flashfire42|m
I think the claims is so high in hopes that when the deadline hits there will still be a bunch of downloaded stuff waiting to upload
-
supercar99
Got it, thank you! Initially set warrior to run on blogger but didn't understand why it kept getting stuck (really stuck, warrior dashboard wouldn't even load). Set to the current project (i'll leave it there) and everything went back to smooth!
-
flashfire42|m
Yeah we appreciate the workers no matter what.
-
flashfire42|m
Telegram does have a giant backlog too so
-
qwertyasdfuiopghjkl
(replying to hackint.logs.kiska.pw/archiveteam-bs/20231130#c392586) fireonlive: I'm guessing you need to include the "User:" part in the displaytitle thing for it to work.
-
fireonlive
ohh maybe!
-
Pedrosso
Star Trek related sites:
-
Pedrosso
-
Pedrosso
-
eggdrop
-
Pedrosso
-
Pedrosso
-
eggdrop
-
Pedrosso
-
Pedrosso
has the memory-alpha, memory-beta, memory-gamma, and memory-delta wikis been saved?
-
Pedrosso
have*
-
Pedrosso
I'll take it to #wikiteam -
-
Pedrosso
-
eggdrop
-
Pedrosso
-
eggdrop
-
nulldata
-
Pedrosso
-
eggdrop
-
Lord_Nightmare
twitter.com/jiromifune/status/1730157521862902037 is a problem, I have no idea how that can be archived except manually
-
eggdrop
-
Pedrosso
can't wait for smell-o-vision and all the smells to archive, haha
-
project10
🤮
-
pabs
Lord_Nightmare: AT has a nitter instance we can AB. I threw it in the queue just now