-
fireonlive
Rite Aid files for bankruptcy amid opioid-related lawsuits and falling sales: cbsnews.com/news/rite-aid-bankruptcy-opioids-lawsuits
-
project10
fireonlive: yep I posted that one last night and pabs did the AB 🪄
-
» project10 misses fuckedcompany.com
-
fireonlive
ah :D
-
audrooku|m
How should I submit ~250 blog post urls to be archived for the WBM? should I use a script to submit them to the spn2 api? or is there a way to queue them for archiveteam?
-
pokechu22
They can be run via archivebot - archivebot doesn't evaluate javascript but that should be fine for blog posts
-
pokechu22
if you upload a list of URLs to transfer.archivete.am I can run it
-
audrooku|m
will do in a few hours, thanks
-
pokechu22
(or if they're all on the same site, I can just do a recursive archivebot job over the whole site)
-
pabs
audrooku|m: re SPN2, I find the email endpoint is better than the web one for sending lots of links, since the web API has relatively low limits.
-
pabs
but archivebot indeed is the best option
-
audrooku|m
I'm not familiar with this email endpoint
-
pabs
mail HTML or plain text to savepagenow⊙ao and it goes through SPN2, you will get one or more mails in response, batches of 100 URLs IIRC
-
audrooku|m
interesting
-
pabs
and if some of the URLs tempfail you can copy those into a new mail
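A minimal sketch of preparing mail bodies for the email route described above. It is not an official client: the one-URL-per-line body layout and the choice to split the outgoing list into chunks of 100 are assumptions (the "batches of 100" remark may only describe the replies), and `batch_urls` is a hypothetical helper name. Sending the mail is left out on purpose.

```python
from typing import Iterable, List

# Assumption: 100 comes from the "batches of 100 URLs IIRC" remark above.
BATCH_SIZE = 100


def batch_urls(urls: Iterable[str], size: int = BATCH_SIZE) -> List[str]:
    """Return plain-text mail bodies, one URL per line, at most `size` URLs each."""
    cleaned = []
    seen = set()
    for url in urls:
        url = url.strip()
        # Skip blank lines and duplicates, keep the original order.
        if url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return ["\n".join(cleaned[i:i + size]) for i in range(0, len(cleaned), size)]
```

URLs that tempfail can be passed through the same function again to build the follow-up mail.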
-
pabs
I'd only use SPN2 for things that really need JS or otherwise don't work in archivebot tho
-
pabs
linkedin for eg does not like AB
-
pabs
but sometimes works in SPN2
-
audrooku|m
noted
-
Ryz
Oof, apparently Epic Games (former owner) laid off 50% of the staff of Bandcamp before being sold off: twitter.com/ethangach/status/1713970488257413600
-
audrooku|m
6mo severance tho
-
pabs
hmm, meant for -ot channel, oops
-
pabs
nature.com/articles/d41586-023-03191-3 - Argentina to shut down their national science org
-
pabs
er, one candidate wants to
-
h2ibot
Megame edited Deathwatch (+145, /* 2023 */ tensorboard.dev): wiki.archiveteam.org/?diff=50997&oldid=50991
-
h2ibot
JustAnotherArchivist changed the user rights of User:Megame
-
DogsRNice
a bunch of source engine map archives
-
DogsRNice
some of them are on archive.org but one person said they had issues with uploading them
-
pokechu22
DogsRNice: I've started an archivebot job on ar.mevl2.duckdns.org which I believe is the main archive (maps.mevl2.duckdns.org links to it)
-
Barto
yeah, bandcamp is still one of those golden places of the internet
-
Barto
question, i know that !ao < transfer.archivete.am... exists, does !a < transfer.archivete.am... exist too? What are the quirks? Could it be used to save a main domain + all its subdomains at once?
-
JAA
It exists, but you have to be very careful about what you throw into there because it can break recursion in entertaining ways.
-
JAA
(That's also why it isn't documented.)
-
JAA
It might recurse over things you don't want, or it might miss things you'd expect it to grab.
-
JAA
That depends on the contents of the initial list as well as timing.
-
pokechu22
It exists, but each URL tracks what URL it came from, which means that the notion of something being on site or offsite gets messy (especially when sites link to other subdomains; if a page on a subdomain is found by a different domain first, it'll be treated as offsite and things from that page won't be recursed over). Things get worse when you do multiple URLs on the same
-
pokechu22
site as the no-parent rule means that example.com/a/ linking to example.com/b/subpage will entirely skip example.com/b/subpage, even if that same link is discovered via example.com/b/ later
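A simplified illustration of the no-parent idea described above, assuming a link only counts as in scope when it is on the same host as the URL it was discovered from and sits at or below that URL's directory. This is a toy model, not ArchiveBot's actual logic; `parent_dir` and `in_scope` are hypothetical names.

```python
from urllib.parse import urlsplit


def parent_dir(url: str) -> str:
    """Path of `url` up to and including its last slash."""
    path = urlsplit(url).path or "/"
    return path[: path.rfind("/") + 1]


def in_scope(seed: str, link: str) -> bool:
    """Toy no-parent check: same host, and the link is under the seed's directory."""
    s, l = urlsplit(seed), urlsplit(link)
    if s.hostname != l.hostname:
        return False  # other hosts (including subdomains) count as offsite here
    return (l.path or "/").startswith(parent_dir(seed))


# The case from the messages above: found via /a/, the /b/ page is out of scope.
print(in_scope("https://example.com/a/", "https://example.com/b/subpage"))
print(in_scope("https://example.com/b/", "https://example.com/b/subpage"))
```

In this model the first call is rejected and the second accepted, so whether a page gets recursed over depends on which seed reaches it first.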
-
JAA
Basically, the only *safe* way of using it is to have a list of URLs that are all on the same host and which are all identical up to the last slash.
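That rule can be checked before uploading a list. A minimal sketch, assuming "identical up to the last slash" means same scheme, same host, and same path up to the final slash of the path (query strings are ignored); `is_safe_list` is a hypothetical helper, not part of ArchiveBot.

```python
from typing import Iterable, Tuple
from urllib.parse import urlsplit


def prefix_key(url: str) -> Tuple[str, str, str]:
    """Scheme, lowercased host[:port], and path up to and including the last slash."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return (parts.scheme, parts.netloc.lower(), path[: path.rfind("/") + 1])


def is_safe_list(urls: Iterable[str]) -> bool:
    """True only if every URL shares one host and one directory prefix."""
    keys = {prefix_key(u) for u in urls if u.strip()}
    return len(keys) == 1
```

For example, `is_safe_list(["https://example.com/blog/a", "https://example.com/blog/b"])` passes, while mixing `/a/x` with `/b/y`, or mixing a domain with one of its subdomains, does not.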
-
Barto
ok, as i understand my example is not the recommended way to do it. Indeed there's some weird behavior with it
-
Barto
thanks for the explanation
-
pokechu22
Yeah. We've still done it in the past for things where interlinking is unlikely to happen (mainly for ISP hosting with thousands of users) but it's not a good idea in most cases
-
Barto
i was mostly seeing it as a way to "group" archiving jobs, so no way it's a good idea :D
-
h2ibot
Flashfire42 edited List of websites excluded from the Wayback Machine (+33): wiki.archiveteam.org/?diff=50998&oldid=50994
-
h2ibot
JAABot edited List of websites excluded from the Wayback Machine (+0): wiki.archiveteam.org/?diff=50999&oldid=50998