-
Ryz
Want this stuff to be arrrcccccccchived, website's been around since 1996 damnit
-
Ryz
Old loooooooot
-
Guac
Anyone here know someone who tries to keep archives of prominent alt-right figures' online communications (including Youtube vids where applicable)? Trying to find a copy of a YouTube vid from 2017ish from an account that was suspended by 2019
-
joepie91|m
usually searching for the raw youtube ID (without the rest of the URL) tends to turn up archives for me, or at least something like a title that I can use to find a copy elsewhere
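A minimal sketch of pulling the bare video ID out of a link so it can be pasted into a search engine on its own. The function name and the set of URL shapes handled here are assumptions for illustration, not an exhaustive list.

```python
from urllib.parse import urlparse, parse_qs


def youtube_id(url):
    """Return the video ID from a few common YouTube URL shapes, or None."""
    if "://" not in url:
        url = "https://" + url  # tolerate links pasted without a scheme
    parts = urlparse(url)
    host = parts.netloc.lower()

    # Short links: youtu.be/<id>
    if host == "youtu.be":
        return parts.path.lstrip("/").split("/")[0] or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch links: /watch?v=<id>
        if parts.path == "/watch":
            return parse_qs(parts.query).get("v", [None])[0]
        # Path-based links: /embed/<id>, /v/<id>, /shorts/<id>
        for prefix in ("/embed/", "/v/", "/shorts/"):
            if parts.path.startswith(prefix):
                return parts.path[len(prefix):].split("/")[0] or None
    return None
```

Searching for the returned string in quotes is what tends to surface mirrors, or at least a title to search for elsewhere.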
-
Billy549
Good very early morning! Just wondering if any planned action is coming for GameFAQs? :)
-
OrIdow6
I don't believe so Billy549
-
Billy549
OrIdow6, I see - I know some people were trying to archive it directly through Web Archive, but seems GameFAQs already blocked the IPs
-
OrIdow6
I don't think we've seen an indication of changes to the site yet
-
Billy549
That's a fair point, resources are spread too thin anyway I suppose - this just reminds me I need to set Warrior back up on my machine so gonna do that now. Thanks for the info at least :)
-
OrIdow6
You're welcome
-
OrIdow6
Please do tell us if the site (or any other sites) undergo substantial changes in the future etc.
-
Billy549
Of course :)
-
Guac
Hm. Good bit of advice joepie91|m though it mostly turned up research papers that referenced the youtube link. It did also lead to a forum post that had some bits of description of part of the vid.
-
thuban
fenugrec: that wiki page is rather old (for one thing, we usually suggest github.com/ArchiveTeam/grab-site rather than wget these days). unfortunately, we don't currently have a good mechanism for bypassing that type of cloudflare protection--the best i can suggest is setting a browser user-agent and a generous timeout and hoping that it doesn't trigger.
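For illustration only, here is roughly what "a browser user-agent and a generous timeout" looks like in a plain Python fetch. The user-agent string, the 60-second default and the function names are assumptions, and nothing here bypasses a Cloudflare challenge; it only avoids announcing itself as a script.

```python
import urllib.request

# Assumed example of a desktop-browser user-agent string; any current one would do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that sends a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=60):
    """Fetch a URL with a generous timeout (seconds); returns (status, body bytes)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.status, resp.read()
```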
-
thuban
we have definitely had Discussions™ on the increasing prevalence of this and the practicality of various possible workarounds, but nothing concrete yet. may i ask what forum?
-
thuban
(the message about timestamping is harmless--'-m' is shorthand for several options, including timestamping, but it's the others we care(d) about)
-
fenugrec
thuban, thanks. Will try grabsite later today. I might try through a proxy too although IME that usually triggers clownflare even more. The forum is forum.tek.com
-
fenugrec
they already bodged previous migrations / updates, there are many broken links already, so I'm fairly certain they will not even try to migrate it again to a new platform
-
thuban
shutdown date is "before the end of the year", for reference
forum.tek.com/viewtopic.php?f=583&p=291164
-
thuban
(Jake: want to try that go crawler of yours?)
-
h2ibot
Switchnode edited Deathwatch (+183, /* 2022 */ add tektronix forums):
wiki.archiveteam.org/?diff=49051&oldid=49049
-
betamax
Do we have a way to archive specific sub-forums of a vBulletin forum?
-
betamax
The "Vintage Radio Forums" have a section for members to sell or give away equipment to each other
-
betamax
but it seems it will now be subject to a 90-day deletion rule:
vintage-radio.net/forum/showthread.php?t=194887
-
betamax
Any way that the specific sub-forum in question (vintage-radio.net/forum/forumdisplay.php?f=27) can be archived?
-
betamax
This is going to happen imminently, if it hasn't already happened.
-
thuban
betamax: i don't think there's a _good_ way, but i've done it on xenforo by (iirc) spidering just the thread list pages for that subforum, ignoring everything else, and then extracting the threads from the list of ignored urls and queueing each individually with no-parent
-
thuban
vbulletin has equivalent url structure
-
thuban
hmmmm, no it doesn't actually. (sorry, the forum i checked that i thought was vbulletin is actually also xenforo now.) its threads are 'showthread.php?t=threadid&page=pageid', so --no-parent won't prevent wpull from recursing into other threads or indeed most of the forum ui (you can manage the latter with aggressive ignores but not the former).
-
thuban
write a qwarc script, i guess? (if you're not too concerned about page requisites)
-
JAA
betamax: Two-step process: run a crawl with tight ignores that only permits forumdisplay.php with f=27. Then extract thread links from that and retrieve those, either by building a list with all pages (I think vB always links the last page, so you can generate the missing page URLs) or in theory by further ignores that only permit the relevant thread IDs.
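A rough sketch of the two checks that process needs: deciding whether a URL is a listing page for the wanted sub-forum, and pulling thread IDs out of a listing page's HTML. The function and class names are hypothetical, and it assumes thread links look like showthread.php?t=<id>, as they do in the URLs quoted in this log.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, parse_qs


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_subforum_listing(url, forum_id="27"):
    """True only for forumdisplay.php URLs whose f parameter is the wanted sub-forum."""
    parts = urlparse(url)
    return (parts.path.endswith("/forumdisplay.php")
            and parse_qs(parts.query).get("f") == [forum_id])


def thread_ids(html, base_url):
    """Return the sorted, de-duplicated numeric thread IDs linked from a listing page."""
    collector = LinkCollector()
    collector.feed(html)
    ids = set()
    for href in collector.hrefs:
        parts = urlparse(urljoin(base_url, href))
        if parts.path.endswith("/showthread.php"):
            ids.update(t for t in parse_qs(parts.query).get("t", []) if t.isdigit())
    return sorted(ids, key=int)
```

The first function corresponds to the "tight ignores" step, the second to extracting thread links from what that crawl retrieved.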
-
JAA
Also, looks like the deletion already happened, oldest thread I'm shown is from July.
-
thuban
yeah, that's probably a better idea actually. still some scripting involved, but probably less, plus you get page requisites
-
thuban
(to be clear, when i said you 'can't' manage thread selection with ignores, i meant _a priori_)
-
thuban
i would just generate the missing pages, since that way you can use --1 and not have to worry about ignoring every other damn thing
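Generating the missing pages could look something like the sketch below. It assumes the 'showthread.php?t=threadid&page=pageid' structure described earlier and that the highest page number seen for a thread is its last page; the function names and the base URL argument are made up for the example.

```python
from urllib.parse import urlparse, parse_qs


def last_pages(urls):
    """Map thread ID -> highest page number seen among the given showthread URLs."""
    pages = {}
    for url in urls:
        parts = urlparse(url)
        if not parts.path.endswith("showthread.php"):
            continue
        query = parse_qs(parts.query)
        tid = query.get("t", [None])[0]
        page = query.get("page", ["1"])[0]  # no page parameter means page 1
        if tid is None or not page.isdigit():
            continue
        pages[tid] = max(pages.get(tid, 1), int(page))
    return pages


def all_page_urls(base, pages):
    """Expand each thread into one URL per page, from page 1 up to its last page."""
    out = []
    for tid, last in pages.items():
        for n in range(1, last + 1):
            suffix = "" if n == 1 else f"&page={n}"
            out.append(f"{base}showthread.php?t={tid}{suffix}")
    return out
```

The resulting list, written one URL per line, is the kind of input a no-recursion grab can work from.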
-
JAA
Yep, agreed. The scripting can be a relatively simple grep + awk or similar. In case there's Transfer-Encoding, `warc-tiny dump-responses` (from my little-things repo) handles that.
-
thuban
(using gs-dump-urls, for convenience)
-
JAA
Oh right, the URLs are in the DB anyway, yeah, that's even easier. :-)
-
tzt
bulletin.com is shutting down 'early 2023'
-
arkiver
tzt: please add it to the deathwatch page!
-
arkiver
or someone else ^
-
JAA
Oh wait, '*by* early 2023', not 'in'. So we should move quickly on that one.
-
h2ibot
JustAnotherArchivist edited Deathwatch (+172, /* 2023 */ Add Bulletin):
wiki.archiveteam.org/?diff=49052&oldid=49051
-
h2ibot
JustAnotherArchivist edited Deathwatch (+3, /* Pining for the Fjords (Dying) */ Move…):
wiki.archiveteam.org/?diff=49053&oldid=49052
-
h2ibot
IDKhowToEdit edited Twitter (-72, Removes ChromeBot):
wiki.archiveteam.org/?diff=49054&oldid=48258
-
h2ibot
Usernam edited List of websites excluded from the Wayback Machine/Partial exclusions (+39):
wiki.archiveteam.org/?diff=49055&oldid=49016
-
Jake
thuban: yes. Will give it a shot tonight.
-
Ono
Does this work
-
JAA
oh no