-
nicolas17
#archiveteam topic may need updating (pointless to still mention taringa, dunno about the other two)
-
» arkiver is back from a few days of lower availability
-
» fireonlive waves to arkiver
-
fireonlive
welcome back!
-
arkiver
thanks :)
-
fireonlive
:)
-
c3manu
fyi: a german journalist apparently has a case against him for linking to the linksunten-indymedia archive. that's after offices were raided and electronic devices confiscated. i only have a german-language link for now:
rote-hilfe.de/meldungen/unbequeme-b…g-prozess-gegen-linken-journalisten
-
murb
how is (2)
gesetze-im-internet.de/stgb/__85.html actually interpreted by the courts?
-
c3manu
murb: not sure, the case hasn't been decided yet. it seems to be based on §129 though. and the whole matter is really fishy. to make the website illegal, they declared linksunten to be a german "Verein", which is not at all what it was
-
c3manu
i don't remember what the verdict on that was though, i would have to read up on that
-
JAA
I'm getting rid of a bunch of old project channels today. You won't notice anything as they've been inaccessible since late 2022 already anyway. They're also marked accordingly on the wiki since then.
-
» fireonlive pours several out
-
deadorbit
has anyone thought of archiving help.openstreetmap.org
-
JAA
Yes, it was fully archived with ArchiveBot last month.
-
nicolas17
and coordinated with the openstreetmap admins
-
deadorbit
nice
-
tapos
thuban: For some reason I've decided to torture myself by manually getting every Google Site and Blogspot link from E-Hentai
-
» myself screams in anguish
-
tapos
Done with Google Sites, I'll do Blogspot as my mental state allows
-
thuban
o7
-
tapos
I'd assume most freely hosted hentai scanlation sites are on there
-
tapos
I also just got a good idea
-
tapos
Kemono has a ton of NSFW Google Sites and Blogspot links
-
tapos
It's basically a Patreon/etc. archiver
-
tapos
I wonder if one of these tools saves the text that contains the links:
github.com/search?q=Kemono&type=repositories&s=stars&o=desc
-
tapos
Still, a bunch of separate txt files is a pain in the ass to deal with
-
tapos
So I guess a custom scrape would be best
-
tapos
The site uses DDoS-Guard though
-
tapos
This could maybe be rewritten for Google Sites and Blogspot:
github.com/SatyamSSJ10/Kemono-youtube-fetch
-
tapos
I skipped the sites that were behind a Google login
-
tapos
Which was most of them
-
tapos
And for one group I included their other links in there while I was at it
-
tapos
Seems to mostly be artists using Google Sites
-
nyany
i'm really curious as to why I was highlighted for that
-
thuban
i have no idea how we handle google sites, actually--we had a project but i think it was just for the 'classic' sites. no idea whether it would work on current sites
-
tapos
So vanilla ArchiveBot wouldn't cut it?
-
pokechu22
Archivebot does work with google sites to my understanding
-
tapos
Nice
-
pokechu22
but you do have to start one archivebot job per site, which makes it not super useful for large quantities of sites that need to be saved quickly
-
tapos
thuban do you think you can scrape Kemono for Google Sites and Blogspot links?
-
thuban
^ right, just not sure whether something else would be more apt
-
tapos
Ok
-
thuban
sorry, i'm rather busy at present
-
tapos
Well, it's just 16 Google Sites from E-Hentai
-
tapos
So it could probably be fed site by site
-
tapos
Ok, no worries
-
nyany
Hey, it's inporntant. We'll figure it out :D
-
tapos
Yeah
-
tapos
I'm not doing Kemono manually though lol
-
tapos
823 posts (17 pages) of Google Sites links
-
tapos
9612 posts (193 pages) of Blogspot links
-
thuban
blogspot we can just dump in #frogger, so that's fine
-
thuban
google sites we could _maybe_ do through ab with queueh2ibot, but it would make sense to find out whether #nearlylostmygoogles does/can apply first
-
JAA
16 sites is few enough to just do it manually.
-
thuban
yeah, but 823...
-
JAA
Oh, two different sources, right.
-
tapos
thuban it's 823 posts, not 823 sites
-
tapos
Most likely it's like 30 sites with a few of them being linked to in hundreds of posts each
-
tapos
Since some artists put their site link in every post
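
[editor's sketch] The point above (hundreds of posts collapsing to a few dozen sites) could be checked with a small script once the post texts are available. This is only a rough sketch under stated assumptions: `post_texts` is a hypothetical list of already-downloaded post bodies (fetching them from Kemono is not shown), only `sites.google.com/view/<name>`, `sites.google.com/site/<name>` and `*.blogspot.com` URLs are recognised, and the function names are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Loose matcher for http(s) URLs in free text; hosts are filtered afterwards.
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+", re.IGNORECASE)


def site_key(url):
    """Return one canonical root per site, or None for unrelated URLs."""
    parts = urlsplit(url.rstrip(".,;"))
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]
    if host == "sites.google.com":
        # Assumed layouts: /view/<name> (current) and /site/<name> (classic).
        if len(segments) >= 2 and segments[0] in ("view", "site"):
            return f"https://sites.google.com/{segments[0]}/{segments[1]}"
        return None
    if host.endswith(".blogspot.com"):
        # Each Blogspot blog lives on its own subdomain.
        return f"https://{host}/"
    return None


def count_sites(post_texts):
    """Count how many link occurrences point at each distinct site."""
    counts = Counter()
    for text in post_texts:
        for url in URL_RE.findall(text):
            key = site_key(url)
            if key is not None:
                counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data standing in for scraped post bodies.
    sample = [
        "my site: https://sites.google.com/view/artist-a/home",
        "https://sites.google.com/view/artist-a and http://foo.blogspot.com/2020/01/x.html",
    ]
    for site, n in count_sites(sample).most_common():
        print(n, site)
```

The resulting list of distinct roots is what would be fed onward, one entry per site rather than one per post. Country-specific Blogspot domains and other Google Sites URL layouts are deliberately left out and would need extra cases.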
-
tapos
There's no way of seeing which artist made which post without opening the post though
-
tapos
Otherwise I could just speedrun through the search pages manually
-
tapos
Now if I want to do it manually I'd have to open every single post
-
tapos
Even if I could speedrun it the Blogspot ones are too much
-
thuban
oic, thought you were using that scraper you linked
-
fireonlive
-+rss- Show HN: A self-published art book about Google's first 25 years: This took me 3 years to finish. (It is 100% self-published, not endorsed by Google.) So… I wrote a book. It’s a different book with a unique approach. It’s not a novel or a technical book. It’s a biography, a company’s biography. My hope is that it serves two
-
fireonlive
purposes: to inspire founders and to captivate interior designers. It all [...]
news.ycombinator.com/item?id=40067484
-
fireonlive
i hope this gets preserved somehow..