-
h2ibot
Flashfire42 edited URLTeam/Warrior (+37, /* Warrior projects */):
wiki.archiveteam.org/?diff=50269&oldid=50268
-
h2ibot
Flashfire42 edited URLTeam/Warrior (+179, /* Warrior projects */):
wiki.archiveteam.org/?diff=50270&oldid=50269
-
that_lurker
Could someone grab mitnicksecurity.com now that Kevin Mitnick has died?
-
pokechu22
that_lurker: done
-
h2ibot
Reece2oo9 edited Internet Archive (+348, /* Mirrors */):
wiki.archiveteam.org/?diff=50271&oldid=49710
-
h2ibot
Usernam edited List of websites excluded from the Wayback Machine (+23):
wiki.archiveteam.org/?diff=50272&oldid=50267
-
OrIdow6
madpro|m: What happened to the Twitter Developers Forum? Is it safe to assume it's back around?
-
ShakespeareFan00
Afternoon
-
ShakespeareFan00
Proposing that at some point legislation.gov.uk/browse/eu and related pages be placed on a list for archiving.
-
ShakespeareFan00
It's not an immediate priority however.
-
pokechu22
With how archivebot works, it'd probably require doing all of legislation.gov.uk instead - which might be kinda big?
-
ShakespeareFan00
But worth it
-
ShakespeareFan00
Not an immediate priority though.
-
ShakespeareFan00
My other suggestions for archiving are potentially NSFW sites, and I wasn't sure whether you archive those.
-
pokechu22
ugh, and incomplete coverage: legislation.gov.uk/eudn/2020/1809/contents from legislation.gov.uk/eu-origin?page=15 hasn't been saved yet - definitely worth doing then
-
pokechu22
If it's an NSFW site with unique user-generated content (e.g. a booru) that's closing, or a site with a lot of content and only some is NSFW, it can be saved
-
pokechu22
legislation.gov.uk/sitemap-ukcm.xml - <lastmod>1920-12-23T00:00:00</lastmod> - I guess they're not exactly wrong about that, but still funny
-
fireonlive
hahaha
-
ShakespeareFan00
pokechu22: Having legislation.gov.uk as a dump might also be feasible...
-
ShakespeareFan00
I'm not sure if there's a way to FOIA an entire UK gov website though.
-
pokechu22
I'm currently running it via archivebot, and so far it seems OK, but there is still a lot of law
-
fireonlive
was foiathedead.org ever archivebotted?
-
fireonlive
unfortunately, foiathedead is also dead
-
fireonlive
in terms of updates anyways
-
joepie91|m
does that mean that a FOIA request should be made to the FBI about their documents on FOIA The Dead? :p
-
fireonlive
ooh yes :p
-
ShakespeareFan00
If you are interested in UK laws: statutes.org.uk/site/collections has links to a LOT of items on Hathi/Google that should probably also be on Archive.org
-
ShakespeareFan00
And if you are discussing FOIA (I was referencing the UK one): whatdotheyknow.com has an archive of responses to UK FOIA requests...
-
ShakespeareFan00
The UI can't find stuff going back indefinitely, but a clever crawler might be able to find earlier requests linked from later ones.
-
ShakespeareFan00
I can't run a crawler bot locally, due to bandwidth/port restrictions on the PC I use.
-
ShakespeareFan00
Another non-immediate priority: archiving Activision before the merger...
-
ShakespeareFan00
I assume archiveteam people know about Wikisource?
-
ShakespeareFan00
Archival of old UK laws assists that project because pulling down a DjVu/PDF from IA is far easier than from some other sites.
-
ShakespeareFan00
Oh and I'll note something about certain Google Books URLs...
-
fireonlive
the great thing about archivebot/DPoS is it's available in the 'wayback machine' in addition to collections
-
fireonlive
so easier for some to find
-
ShakespeareFan00
Certain University of California originated scan links have sequential IDs
-
ShakespeareFan00
books.google.co.uk/books?vid=UCAL:B4958531 for example can be iterated to find related publications.
-
ShakespeareFan00
It would need a clever programmer to develop a special crawler, but there isn't anything unfeasible about writing a crawler to go through all potential IDs and grab the PDF/metadata to put on IA.
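(A minimal sketch of the ID iteration idea, not a working crawler. It assumes the `vid` value is a fixed prefix followed by a number that simply counts upward, and that an `https://` scheme goes in front of the link as quoted; the function names are made up for illustration. It only builds candidate URLs and does no fetching, and many candidates may not correspond to a real scan.)

```python
import re

# Assumed base URL: the link as quoted in the chat, with an https:// scheme added.
BASE_URL = "https://books.google.co.uk/books?vid="


def neighbouring_vids(vid, count):
    """Yield the `count` identifiers that follow `vid` numerically.

    Hypothetical helper: splits the identifier into a prefix and a
    trailing run of digits, then increments the digits.
    """
    match = re.fullmatch(r"(.*?)(\d+)", vid)
    if match is None:
        raise ValueError(f"no numeric suffix in {vid!r}")
    prefix, digits = match.groups()
    start = int(digits)
    for offset in range(1, count + 1):
        # Preserve the original digit width in case of leading zeros.
        yield f"{prefix}{start + offset:0{len(digits)}d}"


def candidate_urls(vid, count):
    """Return candidate Google Books URLs for the IDs after `vid`."""
    return [BASE_URL + v for v in neighbouring_vids(vid, count)]


if __name__ == "__main__":
    for url in candidate_urls("UCAL:B4958531", 3):
        print(url)
```

(Each candidate would still need to be checked for whether it resolves to an actual volume before anything is pulled or uploaded.)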
-
ShakespeareFan00
(Even better if there is a way to just give it a Hathi ID and the bot does the rest, pulling from Google Books if needed... :)
-
ShakespeareFan00
Not my area of expertise, but thought I'd mention it.
-
ShakespeareFan00
fireonlive: DPoS?
-
fireonlive
distributed preservation of service / aka the warrior projects
-
ShakespeareFan00
The two NSFW sites I had in mind for archiving were fictionmania.tv and bigclosetr.us/topshelf. They should not be an immediate priority as the sites are not under threat right now.
-
ShakespeareFan00
I'd also suggest someone looks into archiving UK NSFW sites at some point, before new rules might cause some of them to close down.
-
ShakespeareFan00
(There is an ongoing debate about forcing sites to age gate... and bear the cost of doing so.)
-
ShakespeareFan00
Anyway thanks for listening :)