-
nicolas17
fireonlive: wait what did you archivebot? the release notes page?
-
fireonlive
nicolas17: te
-
fireonlive
te
-
fireonlive
ye
-
nicolas17
probably won't work, it's a JS-infested SPA :P
-
nicolas17
maybe I should SPN it
-
h2ibot
Switchnode edited Template:CTA URL lists (-129, move category link to hat position; tighten up…):
wiki.archiveteam.org/?diff=51461&oldid=51240
-
fireonlive
rip
-
fireonlive
much nicer, re: CTA URL
-
h2ibot
Switchnode edited URLs (-74, replace CTA... carefully!):
wiki.archiveteam.org/?diff=51462&oldid=51456
-
h2ibot
Switchnode edited ARGENTeaM (+0, private donkey kong?):
wiki.archiveteam.org/?diff=51463&oldid=51452
-
tomodachi94
@jacksonchen666:hackint.org: You beat me to the 0w0.is and gender.systems shutdown lol
-
fireonlive
pls add to deathwatch
-
h2ibot
Switchnode edited Deathwatch (+339, /* 2024 */ add 0w0.is and gender.systems):
wiki.archiveteam.org/?diff=51464&oldid=51457
-
tomodachi94
Damnit I just edited that too :(
-
tomodachi94
Is there a reason for an edit moderation queue on the wiki?
-
thuban
tomodachi94: just drive-by spammers
-
thuban
what do we even do with mastodon these days? still totally borked in archivebot, right?
-
fireonlive
yeah it’s fucked in AB since v4.?
-
fireonlive
no known
-
fireonlive
procedure at the moment
-
tomodachi94
That's unfortunate
-
fireonlive
yeah :/
-
fireonlive
thank the devs who removed the no js fallback
-
eggdrop
[tell] Doranwen: [2024-01-03T19:28:09Z] <fireonlive> do you have a wiki account?
-
Doranwen
fireonlive: No, I haven't gotten one yet. I've hardly edited wikis so it takes eternally long looking up the syntax because I don't know it yet. Which means I tend to avoid doing it, lol.
-
fireonlive
ah :)
-
fireonlive
I know that feeling
-
fireonlive
we can always supply a few little fixes after tho
-
Doranwen
Yeah, I need to just get around to it… eventually… there always seems to be something more important, lol.
-
fireonlive
ye :$
-
fireonlive
:)
-
mgrandi
Pedrosso: it seems that my steam workshop downloader works fine with portal 2
-
mgrandi
i have a few bugs i need to work out, such as deleting files as they get downloaded so you aren't duplicating space within the "steamcmd" folder and cleaning up the code but it seems to work
-
mgrandi
example log of it running for 1 page of the workshop (30 items):
gist.github.com/mgrandi/a1f7dfeae765890e91b987debadf3d09
-
Pedrosso
Well, that's great
-
h2ibot
OrIdow6 edited Google Drive (+376, Pubhtml pages):
wiki.archiveteam.org/?diff=51468&oldid=51399
-
h2ibot
OrIdow6 edited Google Drive (+153, /* Notes */ /pub):
wiki.archiveteam.org/?diff=51469&oldid=51468
-
nulldata
More layoffs at 3D Realms and Slipgate. Probably wouldn't be a bad idea to throw them in AB.
-
Barto
nulldata: i took care of the websites
-
nulldata
Thanks!
-
Exorcism
!tell Megame I was able to obtain a list of subdomains for
cgsociety.org (If you want to put them in the archivebot):
chibi.mint.lgbt/s/oBrR9I7jVS2Y
-
eggdrop
[tell] ok, I'll tell Megame when they join next
-
ctag
If someone with AB would grab
vbas.org please, and let me know.
-
ctag
I guess setting that up broke the https redirect, so I'm hoping to put it back to normal soonish
-
Barto
ctag: let me test this :-)
-
Barto
is the ftp still working?
-
ctag
Uhhh, probably not
-
Barto
dayum
-
ctag
I do have a backup of it somewhere around here though
-
ctag
We made a big push to migrate to google drive this past year, to make files more accessible to regular membership
-
ctag
So most of it should be over there too
-
ctag
That FTP server is still running though. It's a 32-bit SuSe Linux machine that we got from government auction around 200(2?)
-
ctag
4x500GB Raid
-
Barto
lol, the trusty old beast
-
ctag
Was hot stuff back in the day, I imagine.
-
ctag
But it's never needed any maintenance. Has just run under a desk for 20 years straight
-
Barto
yeah, cant reach it from here
-
ctag
Hmm. Try ftp.vbas.org?
-
ctag
I don't have an ftp client at this computer
-
ctag
Nope, there's no ftp daemon running anymore, I just checked. We switched to ssh-rsync and filebrowser apparently.
-
ctag
The blog is broken too, Hrm
-
ctag
:D
-
ctag
Blog should be fixed-ish now
-
ctag
Looks like a lot of image resources are missing though
-
ctag
I wish we had a copy of the website pre-2011 :-/ Oh well
-
ctag
! There's versions in wayback machine back to 1999!
-
ctag
I'm gonna go tell our society historian haha. How have I not checked that before
-
Exorcism
Barto: can you archive this website with AB ?
0w0.is
-
ctag
Hrm. The version that I'm trying to save seems well preserved on wayback machine. I'm not sure if it's a good idea to save another copy now.
-
ctag
My original goal was to keep the old site available as an archive on-site for our organization
-
ctag
Is there a way to retrieve warcs from wayback machine?
-
Barto
i let others do Mastodon Exorcism
-
Exorcism
👍️
-
ctag
Barto, I think I'm going to yank that url redirect. I'm less sure if it's a good idea to archive it on IA like this.
-
joepie91|m
<Exorcism> "Barto: can you archive this..." <- note, mastodon content should only be archived with consent
-
joepie91|m
or well, fedi content* I suppose
-
nulldata
Can someone throw this into AB? A well known GTA modder is leaving the scene.
zolika1351.pages.dev
-
Barto
joepie91|m: i dont have the history on this, so i cant make any comment on why we do it this way
-
nulldata
Thanks Barto!
-
Barto
nulldata: not sure when i'll do twitter, but it's in my account list
-
nulldata
He has a Discord too, but invite links are dead - I am still in it though. If I get a moment later I'll look into archiving it.
-
Barto
#discard :-)
-
arkiver
Barto: what do you mean "this way"?
-
Barto
arkiver: the fact that we dont usually throw mastodon instances in AB.
-
fireonlive
can add the twitters to the queue at
pad.notkiska.pw/p/archivebot-twitter too
-
Barto
good idea. will need to do a bit of cleanup and check capitalization of handle firsts
-
Barto
first*
-
fireonlive
:)
-
aprego
after a website is excluded from the wayback machine, is it possible to download snapshots of the site?
-
aprego
is it true you can still find the WARCs in collections?
-
Barto
fireonlive: added, i think you can add case ok to all of them
-
Barto
it's a mix of space stuffs, gabonese handles, owasp drama, hacked company handles, swiss catholic groups, and some other proactive shit.
-
fireonlive
thanks :)
-
TheTechRobo
aprego: Depends on how the site was archived
-
aprego
TheTechRobo: SavePageNow mostly
-
nicolas17
aprego: I think savepagenow WARCs are *always* inaccessible in order to support the *possibility* that they may get excluded from wayback machine in the future
-
aprego
i guess it's over
-
SketchCow
Testflight Crashland Project was taken down off Internet Archive and Wayback
-
Barto
damn
-
Dango360
"Where do all the saved files go? Files are ultimately uploaded to Internet Archive on the archiveteam collection." we kept the warcs, right?
-
nicolas17
Dango360: you mean for testflight?
-
nicolas17
it's possible the warcs are still on IA's servers but they are not publicly accessible
-
audrooku|m
it is my understanding that that is IA's policy yes
-
fireonlive
oh.
-
fireonlive
sad news :(
-
fireonlive
i was just coming here to relay that the discord said to "Under no circumstance, for any reason, upload another copy of the data to the Internet Archive/archive.org."
-
fireonlive
and was confused as to why, but now I know
-
fireonlive
s/said/pinged @here/
-
audrooku|m
which discord?
-
fireonlive
"TestFlight"
-
fireonlive
one sec
-
nicolas17
"kids talking about the testflight leak" discord
-
fireonlive
i'll PM you to keep it out of here
-
fireonlive
yeah
-
fireonlive
this whole testflight 'leak' thing is a big can of shitfuckery that was presented in bad faith from the discovering party and then clickbait media being clickbait media ran foaming at the mouth with it
-
fireonlive
from what i understand anyways
-
fireonlive
(it was never a leak)
-
fireonlive
at least some tried to rebrand it later to something else, but i don't think it stuck
-
JAA
Exorcism's gender.systems subdomain list at
chibi.mint.lgbt/s/PddyAUliT1qk (from #archiveteam earlier) in plain text rather than using the world's worst pastebin: matrix.gender.systems chat.im.gender.systems matrix.im.gender.systems im.gender.systems read.gender.systems
-
fireonlive
"<noscript><strong>We're sorry but chibisafe doesn't work properly without JavaScript enabled. Please enable it to continue.</strong></noscript>"
-
fireonlive
ah
-
JAA
Imagine requiring JS to literally just render a few lines of plain text...
-
fireonlive
the future is here :D
-
Exorcism
fireonlive, JAA: there is also the RAW version 😭 :
chibi.mint.lgbt/api/snippet/oBrR9I7jVS2Y/raw
-
joepie91|m
Barto: I don't know the exact history of how things went with mastodon stuff *in archiveteam* but as a general rule, fedi instances tend to be deliberately ephemeral and not friendly towards scraping
-
joepie91|m
and heavily focused around consent
-
joepie91|m
fedi instances tend to be closer to someone's living room than to a public square
-
OrIdow6
I wasn't there for the incident that prompted that rule but I do think that enough time has passed that it can be written up about on the wiki
-
Gooshka
List of Geonames sources, may be useful for saving government content around the world
geonames.org/datasources
-
nicolas17
given a .warc, do I have any chance of compressing it back to the original .warc.gz? like gzipping each record in the same way wget-at does?
-
JAA
I'm not aware of tooling for that. Bit-identical output would likely be virtually impossible.
-
fireonlive
static.space < interesting/strange
-
JAA
It's one of those things I intend to support in my WIP tooling though.
-
h2ibot
Switchnode edited Template:CTA URL lists (+24, additional fixes):
wiki.archiveteam.org/?diff=51470&oldid=51461
-
Barto
joepie91|m: that is kinda why i said i'd let others do it, hoping they have a better view on the mastodon event. Same with the process to take if we want to crawl it, others may have a better experience than me.