-
arkiver
thank you!
-
arkiver
schwarzkatz|m: tianya.cn?
-
h2ibot
Wickedplayer494 edited Alive... OR ARE THEY (+757, /* Alarm */ Surrender at 20 appears to have…):
wiki.archiveteam.org/?diff=49203&oldid=49158
-
michaelblob
ooolala surrenderat20 looks to be just (more or less) static html pages
-
wickedplayer494
Which is nice and all until you get to the rather big Disqus following it has, which would also be a bit of a shame to lose
-
h2ibot
Wickedplayer494 edited Alive... OR ARE THEY (+42, /* Alarm */ Grammar, clarification):
wiki.archiveteam.org/?diff=49204&oldid=49203
-
wickedplayer494
Should also mention that Spideraxe30 would be another probable successor to keeping the site running if Aznbeat's claim that moobeat has also lost interest checks out, but as with Uli/SkinSpotlights I wouldn't be holding my breath too long, both are basically wildcards in this situation
-
systwi
YouTube video (now dead) with ID violating the original regex pattern: [A-Za-z0-9_-]{10}[AEIMQUYcgkosw048]
-
JAA
systwi: Read the paragraph after the pattern. :-)
-
systwi
I noticed that about a minute after posting that. ~_~
-
systwi
Today must be one of my off-days.
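-
For reference, the pattern systwi quoted can be checked mechanically. The following is a minimal sketch, assuming the pattern is meant to match a whole 11-character ID; the helper name `matches_strict_pattern` and the sample IDs are made up for illustration. As the exchange above suggests, IDs outside the pattern have existed, so a failed match does not prove an ID was never valid.

```python
import re

# Pattern quoted in the log: 10 base64url characters, then a final
# character drawn from a restricted set of 16.
STRICT_ID = re.compile(r"[A-Za-z0-9_-]{10}[AEIMQUYcgkosw048]")


def matches_strict_pattern(video_id: str) -> bool:
    """Return True if video_id fits the strict 11-character pattern."""
    return STRICT_ID.fullmatch(video_id) is not None


# Hypothetical sample IDs, invented for illustration only.
samples = ["AAAAAAAAAAA", "AAAAAAAAAAB", "abc_DEF-123"]
for sample in samples:
    print(sample, matches_strict_pattern(sample))
```

An archiving script would probably want to log non-matching IDs for a manual look rather than drop them.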
-
derenrich
i'm interested in trying to archive
communities.win any pointers to what kinds of tools i should use? thus far i've just been using wget with mirror and warc mode
-
» erenrich swapped to irssi
-
erenrich
is there maybe a tutorial on setting up a warrior project?
-
thuban
erenrich: i think
github.com/ArchiveTeam/grab-site might be more suited to your needs
-
erenrich
thuban: thanks i'll take a look
-
erenrich
thuban: ugh, website won't load without js. grab-site crawl immediately terminates. i think it's cloudflare protection
-
JAA
I don't see any Buttflare interference, but the site is an SPA, so a normal crawler won't work.
-
erenrich
oh maybe they changed. i successfully crawled with wget a few months ago
-
JAA
Actually, not the whole site is an SPA, but the homepage is.
-
erenrich
i was actually trying to archive
scored.co/c/TheDonald
-
erenrich
but yeah that's also an SPA
-
erenrich
maybe i can seed with a different page
-
JAA
It probably won't recurse correctly due to --no-parent.
-
erenrich
yeah ok they made this much harder to crawl since i last looked
-
erenrich
ok i found a hacky way to do it. thanks for the tips
-
Maakuth|m
hmm, it seems that the koti.mbnet.fi warcs we grabbed in early october are not in wbm
-
Sanqui
I discovered 45 thousand more sweb.cz domains, so I'm gonna take up some AB capacity -- sorry
-
Sanqui
!con 5egkv60sdt099ta1byl5w0cx0 5
-
h2ibot
Pokechu22 edited ISP Hosting (+2, elisanet.fi and kolumbus.fi closing per…):
wiki.archiveteam.org/?diff=49205&oldid=48965
-
JAA
Megame raised in #gitgud that GitHub is shuttering Atom (the CPU-based space heater with a text editor function):
github.blog/2022-06-08-sunsetting-atom
-
JAA
Two points in that blog post caught my attention and probably need further digging: 'Atom package management will stop working' and 'Deprecated redirects that supported downloading Electron symbols and headers will no longer work'.
-
JAA
Packages are easy enough, I'll take care of those.
-
JAA
Looks like it should even be possible to install packages from the WBM in the future.
-
JAA
apm has an env var ATOM_API_URL to override the default. If that's set to the WBM and everything's archived, that should in theory work, probably, maybe.
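-
A rough sketch of what JAA describes, under several assumptions: that the original API base is `https://atom.io/api`, that the relevant API responses were captured by the Wayback Machine, and that apm tolerates the archived URL as its base. The timestamp and the helper names are hypothetical; only `ATOM_API_URL` comes from the log. The `id_` suffix asks the WBM for the capture as stored, without its page rewriting.

```python
import os

WBM_PREFIX = "https://web.archive.org/web"


def wbm_api_url(original_api_url: str, timestamp: str) -> str:
    """Build a Wayback Machine URL serving the unmodified capture."""
    return f"{WBM_PREFIX}/{timestamp}id_/{original_api_url}"


def apm_environment(original_api_url, timestamp, base_env=None):
    """Return a copy of the environment with ATOM_API_URL overridden."""
    env = dict(os.environ if base_env is None else base_env)
    env["ATOM_API_URL"] = wbm_api_url(original_api_url, timestamp)
    return env


# Assumed API base and a hypothetical timestamp, for illustration only.
env = apm_environment("https://atom.io/api", "20220608000000", base_env={})
print(env["ATOM_API_URL"])
```

The resulting mapping could then be passed as `env=` to `subprocess.run` when invoking apm. Whether apm actually works against archived responses is untested here.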
-
asie
pdroms.de/miscellaneous/pdroms-is-dead - PDRoms, an important site for the history of homebrew releases (both due to its news section and its files section), has announced in October that it will cease to be maintained in its present form. It's still up indefinitely for the sake of history (albeit pages take like 10 seconds to load each), but might be worth archiving
-
asie
for the sake of said history.
-
asie
As, well, indefinitely usually means "until that one day"
-
Sanqui
oof, the 10 second load time is painful
-
NickS|m
Ugh, glad I never tried out Atom then
-
NickS|m
Seems like a pretty damn short run for a text editor