-
nulldata
nvsgames.com - Nuverse, the game publishing arm of ByteDance, is being shut down on Monday.
bloomberg.com/news/articles/2023-11…g-arm-in-business-retreat#xj4y7vzkg
-
pabs
what is a good way to archive to the WBM a URL that responds eventually but too slowly for AB/SPN timeouts?
debtags.debian.org/reports/recent
-
JAA
Wow, that's slow...
-
JAA
pabs: I've grabbed it with grab-site. It'll be in the WBM whenever I upload my backlog.
-
pabs
thanks!
-
nicolas17
JAA: did you upload a functional grab-site yet? :p
-
nicolas17
before our president-elect changes his website
-
JAA
nicolas17: Right, no, I got distracted while waiting for information on how to push.
-
JAA
Ok, I think it's working.
-
JAA
(Fucking Docker requiring storing credentials on disk for a single push...)
-
JAA
nicolas17: atdr.meo.ws/justanotherarchivist/grab-site-docker:20220509-g398726f7
-
JAA
(But fixing the build is still on my todo list.)
-
nicolas17
I assume the warcs and cdx are enough, but I tar'd up the whole output directory
-
nicolas17
there... isn't actually any content in there, it's just images and empty slogans :P
-
JAA
<surprised_pikachu.png>
-
JAA
Thanks, and yeah, I usually also keep everything and put all but the WARCs into a tar.
-
h2ibot
Tech234a edited Google Plus Comments on Blogspot (+333, Add note about submission of discoveries to…):
wiki.archiveteam.org/?diff=51219&oldid=47913
-
h2ibot
Tech234a edited Google Plus Comments on Blogspot (+84, Search tool broke, add a few more details):
wiki.archiveteam.org/?diff=51220&oldid=51219
-
tech234a
^ IA doesn't seem to serve HTML in items with the correct Content-Type headers any more, probably anti-spam? It broke the search tool for the Google Plus Comments on Blogspot project though
-
h2ibot
Tech234a edited Google Plus Comments on Blogspot (+0, Correction for rescued items):
wiki.archiveteam.org/?diff=51221&oldid=51220
-
JAA
Cc arkiver ^
-
Mannie
I have an idea for the ArchiveBot project. One of the goals of the project is to archive bankrupt companies. I have submitted a lot of those cases by looking them up in the court's insolvency register. I go to the register, copy the trade names, google them, and post them to be archived. Now I had the idea to automate it. This would be done by looking once a day at the companies of that day, 'googling' the
-
Mannie
companies, and archiving the websites.
-
pabs
Mannie: Google is hard to automate; you get a lot of captchas. Also, subdomain archiving and related-resource archiving are quite manual right now
-
Mannie
pabs: that is the problem I also struggled with. Where do we get the website from (automated)? They are not in the court register or in the trade register (most of the time).
-
Mannie
Is there not a way, like whois, to get the domains that are owned by a particular company?
-
Mannie
We have the name from the court files so we only need to have the domain that they own.
-
pabs
in Australia the business registry doesn't record website domains
-
murb
I've noticed from mapping small businesses that many of them let their domains expire and just use a Facebook page.
-
murb
(without hiring new sign writers.)
-
Mannie
Not in the Netherlands or Luxembourg either. That is the problem we have: we have all the details but not the domain
-
pabs
there is also social media to think about, at least YouTube can be archived. Twitter used to be archivable too
-
pabs
a related problem is archiving sites around a natural disaster (fire/flood etc). that's even harder: 1) getting the affected area, 2) finding websites related to things in that area
-
pabs
for 2 I have been manually perusing Google Maps sometimes, but that is very manual and tedious
-
murb
pabs: or datacentres going up in smoke...
-
pabs
yeah, that is likely an impossible problem, since you can't even get a list of IPs in a datacentre
-
Mannie
pabs: problem 2 can be partially automated by looking up all businesses in that area in the trade register and via the OpenStreetMap/Google Maps APIs. With that list you can start searching. If the company is something like a theatre or café, you can use a script to check review sites like Tripadvisor and Yelp for a website. Whatever is left over needs to be searched manually
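(A rough sketch of the OpenStreetMap half of that idea, not something anyone in the channel actually ran. It assumes the Overpass API is used to query OSM, and that businesses carry a `website` or `contact:website` tag. The function names and the bounding-box interface are hypothetical. The query string would be sent to an Overpass endpoint; only the query building and response parsing are shown here.)

```python
def build_overpass_query(south, west, north, east, timeout=60):
    """Build an Overpass QL query for OSM objects in a bounding box
    that have a website tag. Bounding-box order is south,west,north,east."""
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];"
        "("
        f'nwr["website"]({bbox});'
        f'nwr["contact:website"]({bbox});'
        ");"
        "out tags;"
    )


def extract_websites(overpass_json):
    """Map each website URL found in a parsed Overpass JSON response
    to the name of the first OSM object that listed it."""
    sites = {}
    for element in overpass_json.get("elements", []):
        tags = element.get("tags", {})
        url = tags.get("website") or tags.get("contact:website")
        if url:
            sites.setdefault(url, tags.get("name", "(unnamed)"))
    return sites
```

(The resulting URL list could then be reviewed and queued by hand; businesses with no website tag would still need the manual search described above.)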
-
Mannie
There is still the matter of domains like blogs that are part of a hobby.
-
Mannie
pabs: I have found this free company-to-domain site: autocomplete.clearbit.com
-
Mannie
this is the link for tello.com: autocomplete.clearbit.com
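(A minimal sketch of how such a name-to-domain lookup might be scripted. The chat only gives the hostname; the `/v1/companies/suggest` path, the `query` parameter, and the `name`/`domain` response fields are assumptions recalled from Clearbit's Autocomplete API and should be verified before use. The function names are hypothetical, and no request is made here: only URL building and response parsing are shown.)

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; only the hostname appears in the chat.
SUGGEST_URL = "https://autocomplete.clearbit.com/v1/companies/suggest"


def build_suggest_url(company_name):
    """Return the lookup URL for a trade name taken from the court files."""
    return f"{SUGGEST_URL}?{urlencode({'query': company_name})}"


def pick_domains(response_text, company_name):
    """Parse a JSON list of suggestions and return candidate domains,
    exact (case-insensitive) name matches first, then the rest."""
    wanted = company_name.casefold()
    results = json.loads(response_text)
    exact, loose = [], []
    for entry in results:
        domain = entry.get("domain")
        if not domain or domain in exact or domain in loose:
            continue
        if entry.get("name", "").casefold() == wanted:
            exact.append(domain)
        else:
            loose.append(domain)
    return exact + loose
```

(Suggestions are only guesses by name, so the candidates would still need a human check before being submitted for archiving.)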
-
h2ibot
Switchnode edited Deathwatch (+189, /* 2024 */ add hardware.info):
wiki.archiveteam.org/?diff=51222&oldid=51175
-
arkiver
tech234a: correct, anti-spam
-
arkiver
but if i remember correctly, we may be able to move it to some collection
-
arkiver
the item that is
-
fireonlive
“ I am reaching out to inform you that over the next few months we will be migrating our email addresses from @zoom.us to @zoom.com. You may receive emails from both domains, and I want to assure you that emails from these two domains are legitimate.”
-
fireonlive
“In the coming months, you’ll also notice our website and other collateral change to zoom.com.”
-
fireonlive
¯\_(ツ)_/¯ probably just a 1:1 shift of literally everything but who knows. thought i’d shart it out here for y’all