#archiveteam-bs

04:45

tapos

Nope, I can't code
04:45

deadorbit

what
04:45

tapos

I'd have to get the domains manually if I was to do it on my own
04:46

tapos

I was responding to thuban
04:46

deadorbit

oh i cant see the message lol
04:46

tapos

Yeah, you joined after he sent it
10:05

c3manu

murb: i just realized i conflated the cases against the journalist and the one against linksunten (sometimes i just need a day or two for that :D). hence the paragraph mixup
10:32

cprecioso

Hi, I think guiasnintendo.com might be in danger of getting turned off. It is an official Nintendo Spain website with detailed game guides for almost all first-party (and some third-party) Nintendo games since the Game Boy. However, it has not been updated since 2022 (as can be seen in the footer), and no new guides have been added since.
10:32

cprecioso

I am running grab-site on it right now, as I don't think it is an overly large website; but I am unsure of how to proceed if I do want to send this to the internet archive, or share the archiving load with the rest of the Archive Team. How should I proceed?
11:28

thuban

cprecioso: i have started an archivebot job for www.guiasnintendo.com which you can monitor at archivebot.com; the results will be uploaded to the internet archive and indexed by the wayback machine a few days after the job completes. sound good?
11:30

cprecioso

thuban amazing, thanks!
11:34

thuban

you're welcome; thanks for the tip!
11:37

katia

a/G aita
11:37

katia

ops
11:39

thuban

* archivebot.com (mea culpa...)
11:42

c3manu

:D
11:43

c3manu

they'll probably come back when it doesn't work :)
11:56

Cheesy

Any list where I can dump government sites so they eventually get crawled?
11:57

kiska

Perhaps #//
12:18

Cheesy

Just gonna do a massive dump there
12:18

Cheesy

Hopefully not against any rule
12:41

c3manu

Cheesy: like a list of urls?
12:41

Cheesy

Yeah
12:42

c3manu

you could also list them in a text file and upload it to transfer.archivete.am
12:42

c3manu

and then just post the link here :)
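
[editor's sketch] One way to prepare the text file c3manu describes is shown below. It only builds a de-duplicated, one-URL-per-line file; the upload itself is not shown. The function name `write_url_list` and the file name in the usage note are hypothetical, not something mentioned in the channel.

```python
from pathlib import Path


def write_url_list(urls, path):
    """Write URLs to a text file, one per line, skipping blanks and duplicates.

    Returns the number of URLs written. Order of first appearance is kept.
    """
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    # One URL per line, with a trailing newline after the last entry.
    Path(path).write_text("".join(u + "\n" for u in cleaned), encoding="utf-8")
    return len(cleaned)
```

For example, `write_url_list(my_urls, "gov-sites.txt")` would produce a file that can then be uploaded and linked in the channel.
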
12:46

c3manu

on another note: did anyone here get tripadvisor-urls to work with AB, or has another reliable method of archiving them? if so, i'd be intrigued :)
22:21

tapos

Is anyone here willing to write a scraper that extracts Google Sites and Blogspot links from Kemono? Google seems to be about to do an NSFW purge on at least Google Sites, and this would probably be the best way of backing up as many NSFW artist sites as possible
22:22

tapos

You could probably use a lot of code from github.com/SatyamSSJ10/Kemono-youtube-fetch
22:22

tapos

kemono.su/posts?q=sites.google.com
22:23

tapos

kemono.su/posts?q=blogspot.com
22:23

tapos

^ The webpages that need to be scraped
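
[editor's sketch] The link-extraction half of the scraper tapos asks for could look roughly like the code below. It is a minimal sketch, not a full scraper: it assumes the post text or HTML has already been fetched by some other means, and it only pulls out URLs whose host is `sites.google.com` or ends in `blogspot.com`. The names `extract_target_links`, `TARGET_SUFFIXES` and `URL_RE` are made up for illustration, and nothing here reflects how Kemono's pages or the linked GitHub project are actually structured.

```python
import re
from urllib.parse import urlsplit

# Hosts of interest, taken from the request above (an assumption: other
# Blogspot domains, if any are wanted, are not covered by this list).
TARGET_SUFFIXES = ("sites.google.com", "blogspot.com")

# Rough URL matcher: stops at whitespace, quotes, angle brackets and
# closing brackets so it works on both plain text and raw HTML.
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+", re.IGNORECASE)


def extract_target_links(text):
    """Return unique Google Sites / Blogspot URLs found in text, in order."""
    found = []
    seen = set()
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        try:
            host = (urlsplit(url).hostname or "").lower()
        except ValueError:
            continue  # malformed URL, skip it
        # Match the host exactly or as a subdomain, never as a path fragment.
        if any(host == s or host.endswith("." + s) for s in TARGET_SUFFIXES):
            if url not in seen:
                seen.add(url)
                found.append(url)
    return found
```

Checking the hostname rather than searching the whole URL string avoids false positives such as a `sites.google.com` fragment appearing in the path of an unrelated site.
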
