-
AK
Not sure if anyone's seen this, think there's a case to be made that css-tricks is at risk
-
razul
Sure looks like it.
-
OrIdow6
Anyone already looked at Wysp (August 1)? Thinking of taking something far away, so I stop with these last-minute projects
-
arkiver
OrIdow6: not yet here! go ahead :)
-
arkiver
more than a month left!
-
yts98
Has anyone archived Game Atsumaru? If not, I suggest archiving with ArchiveBot first, and I'll inspect how to find game assets in a few days.
-
arkiver
yts98: it's actually not that difficult. the actual 'game' seems to be some HTML with references to the assets
-
arkiver
i see some games though that heavily use js to generate asset URLs
-
yts98
there's an in-game API
atsumaru.github.io/api-references related to comments and score boards,
-
yts98
parsing 29754 games and search for the string "RPGAtsumaru" would be easy without the warrior
-
joepie91|m
AK: fwiw, anything DigitalOcean touches content-wise is always at risk. they have a long history of shoddy content marketing practices, pretty much from the first moment they started doing their 'guides' thing
-
joepie91|m
they talk a big game, but it's been obvious from the poor quality of their content (and the refusal to issue corrections) that they don't actually care about it in any way other than to draw in more customers by looking hip and helpful
-
fireonlive
i think it’s one of those like seo boosting things, get 30 articles out about roughly the same topic changing things slightly so they appear more in search results and are more in people’s minds / call to action to register for digitalocean. iirc they allow anyone to write for them and pay out a few bucks per article
-
fireonlive
oh lordy imagine a site stuffed with chatgpt instructionals
-
JAA
I believe we already archived css-tricks.com and some other things a few months ago when DO was doing layoffs.
-
masterX244
re the announcement in archiveteam: clearly too large for AB
-
nicolas17
...is this frequency of shutdowns normal?
-
JAA
They have a URL shortener at sk.mu, but the codes are far too long to be bruteforcable. E.g.
sk.mu/a90soh3wEbti for one of the most recent posts.
-
fireonlive
here's deathwatch, some years look busier than others:
wiki.archiveteam.org/index.php/Deathwatch
-
fireonlive
but i imagine there's stuff that didn't make it cc nicolas17
-
JAA
Yeah, stuff lands on Deathwatch if someone adds it, and I think some of the smaller things used to not be added as often.
-
JAA
Blog IDs go to around 124 million, so that's fun enough.
-
albertlarsan68
Skyblog (
skyrock.com/blog) announced that they will shut down on the 21st of August.
-
albertlarsan68
It was a pre-Facebook social network, especially popular in France.
-
albertlarsan68
Has it been archived? Would it be archivable?
-
albertlarsan68
Can it be added to the Deathwatch page please?
-
JAA
See above, we're already discussing it.
-
JAA
And please do add it to Deathwatch. It's a wiki. :-)
-
JAA
Server seems very stable, I'm already getting timeouts after just clicking around a bit.
-
albertlarsan68
It has been said that anonymized data will be saved to the INA and BNF (French institutions that mainly archive TV and radio broadcasts, and books, respectively)
-
albertlarsan68
They offer ways to save blogs, but only via third-party tools (Cyotek WebCopy, A1 Website Download, and HTTrack are the methods listed on the official page)
-
pokechu22
Exorcism|m: re
wysp.ws, it seems like archivebot won't work well with it due to javascript. The "new" tab on the front page uses
wysp.ws/timeline/load/?tlid=wysp-ma…ntichronological&term_string=newest (and that progresses onwards). I also tried
-
pokechu22
wysp.ws/timeline/load/?tlid=wysp-ma…er=chronological&term_string=oldest and that seems like the oldest post it gives is
wysp.ws/post/866261001 which isn't the oldest (the "hall of fame" tab gives
wysp.ws/post/8492023 from 2013, while that post is 2017). IDs don't seem to be incremental so I'm not sure how to
-
pokechu22
go about saving everything.
-
thuban
do we have a source on the august 21 date for skyrock? i don't see it in the linked announcement
-
thuban
it's given in the news article, but no mention of where they got that information
-
thuban
LeGoupil, albertlarsan68: ^ any info?
-
LeGoupil
thuban: on
skyrock.com/blog it's in small on the banner next to ICI T LIBRE
-
thuban
LeGoupil: so it is, thank you
-
h2ibot
Switchnode edited Deathwatch (+235, /* 2023 */ add skyrock):
wiki.archiveteam.org/?diff=49997&oldid=49993
-
h2ibot
Switchnode edited Deathwatch (+8, /* 2023 */):
wiki.archiveteam.org/?diff=49998&oldid=49997
-
albertlarsan68
Official english statement:
the-skyrock-team.skyrock.com
-
h2ibot
Switchnode edited Deathwatch (+92, /* 2023 */ add english-language skyrock…):
wiki.archiveteam.org/?diff=49999&oldid=49998
-
albertlarsan68
What should the IRC stream be named?
-
albertlarsan68
I can propose #downblog, #thunderblog
-
albertlarsan68
I quite like the second one, being a reference to French history (kinda): it is known (in France) that the Gaulois (IDK the English for that) were supposedly afraid of the sky falling on their head, this being the thunder.
-
fireonlive
with the new emoji policy arkiver +1'd i recommend #⛈️📝
-
fireonlive
:P
-
fireonlive
s/the new/my proposed/
-
albertlarsan68
It seems like each blog (subdomain in "<blog-id>.skyrock.com") has a sitemap.xml, so maybe this could be a warrior project?
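[Editor's sketch] The per-blog sitemap idea could look roughly like this in Python. It assumes the sitemap follows the standard sitemaps.org schema, which has not been checked against skyrock.com here. The helper name `extract_sitemap_urls`, the blog name `example-blog`, and the sample XML are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol (assumed, not verified for Skyrock).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text):
    """Return every <loc> URL found in a sitemap document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sitemap body for a hypothetical blog.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example-blog.skyrock.com/</loc></url>
  <url><loc>https://example-blog.skyrock.com/123-some-post.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_sitemap_urls(SAMPLE):
        print(url)
```

In a tracker-based project, the URLs returned here would be the items reported back or queued, rather than printed.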
-
fireonlive
ooh that's useful
-
masterX244
depending on average size of a blog it might blow up on VM warriors
-
fireonlive
hmmm. sitemap links could be reported back to the trackers I guess? or pre-scraped?
-
albertlarsan68
It seems like the "canonical" URL is built around an ID. Example URL:
lequipe-skyrock.skyrock.com/3356709…2-Comment-sauvegarder-ton-blog.html, that seems like it is sequential and unique across the network. However, I'm not sure of its usefulness.
-
albertlarsan68
The ID is the only part that identifies a post within a blog, and is the only part needed to find the post. If the name is incorrect or absent, it will redirect to the correct URL.
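[Editor's sketch] Splitting a post URL into blog name and numeric ID could be done as below. The path shape `/<digits>-<slug>.html` is an assumption drawn from the example URL in this log, and the slug-less form `/<digits>.html` is a guess at what "absent name" means. `parse_post_url` and `example-blog` are hypothetical names.

```python
import re
from urllib.parse import urlsplit

# Assumed path shape: "/<numeric id>-<slug>.html", with the slug optional.
POST_PATH = re.compile(r"^/(\d+)(?:-[^/]*)?\.html$")
BLOG_SUFFIX = ".skyrock.com"


def parse_post_url(url):
    """Return (blog_name, post_id) for a post URL, or None if it does not look like one."""
    if "://" not in url:
        url = "https://" + url  # URLs in the log are written without a scheme
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(BLOG_SUFFIX):
        return None
    match = POST_PATH.match(parts.path)
    if match is None:
        return None
    return host[: -len(BLOG_SUFFIX)], int(match.group(1))


if __name__ == "__main__":
    print(parse_post_url("example-blog.skyrock.com/1234567890-Some-title.html"))
```

Keeping the blog name alongside the ID matters because the ID alone only resolves within a known blog subdomain.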
-
masterX244
gah, we still need to know which blog it is on, can't buzz out the blog where the article lives on via a redirect
-
JAA
#bowlofpetunias
-
albertlarsan68
Seems like it
-
albertlarsan68
JAA Why?
-
JAA
Just my channel name suggestion.
-
masterX244
since you didnt write "proposed" it looked like a channel announcement
-
albertlarsan68
And what would be the pun, as required by the rules?
-
fireonlive
#25732e07-b4e7-42c0-ad8a-a9bad8716b9b
-
fireonlive
:3
-
JAA
albertlarsan68: Looks like someone needs to listen to/read/watch The Hitchhiker's Guide To The Galaxy.
-
JAA
:-)
-
albertlarsan68
Something that is not contained in my French culture...
-
fireonlive
or; if you prefer; UUIDv7: #06495f63-2f1c-74f3-8000-0efab13ab36e
-
fireonlive
:D
-
fireonlive
but nah lmao
-
fireonlive
imagine trying to manage a channel list full of those
-
albertlarsan68
It seems like the API can't help much with basic post retrieval/discovery; the sitemap.xml file contains what we need.
-
masterX244
probably a 2-stage project needed, buzzing out the posts/articles via sitemap hunting and then the core retrieval
-
albertlarsan68
Maybe get a first pool, then grow it via backfeed?
-
albertlarsan68
There is also an atom feed: example
lequipe-skyrock.skyrock.com/atom.xml
-
albertlarsan68
or having someone create an account and maybe subscribe to everyone discovered, then maybe (not verified) there can be a feed/api callback (webhook type) to stay up to date on new posts?
-
albertlarsan68
Also, I have tried to create a wiki page for Skyblog (my user account name is roughly the same as on IRC).
-
arkiver
let's see
-
arkiver
what is this
-
masterX244
french blogging site, really old. see on #archiveteam
-
arkiver
yep
-
arkiver
we have a channel! #bowlofpetunias thanks JAA
-
albertlarsan68
What would be the strategy?
-
imer
JAA: is the ia repo not receptive to fixes then? seems like there's a lot of issues with it
-
imer
or just too much work fixing it?
-
imer
seeing as you implemented an entirely new uploader
-
JAA
imer: Jake is receptive to fixes, and I've fixed a bunch of things myself.
-
JAA
I implemented ia-upload-stream separately because adding multi-part uploads and parallelism to ia would've been a pain.
-
Jake
(just to be extra clear, different Jake :)
-
JAA
Oh right :-)
-
fireonlive
aw darn was just about to PM you with my list of wants :3
-
JAA
After I implemented multi-part uploads, I realised they were ... not exactly great on IA's side of things. There's a bunch of copying parts around, which adds significant overhead.
-
JAA
So I ended up adding the single-part uploading as well.
-
imer
alright, makes sense. might look at the ia stuff then, if its easy enough to sort out I might
-
imer
yup yup
-
fireonlive
:)
-
fireonlive
oh there's a whole how to contribute page!
-
imer
just asking since the stuff I ran into seemed to have been known for a while, never know if a project is on life support or something
-
fireonlive
i didn't have to like s/download/details/ and poke around lol
-
fireonlive
🤦
-
fireonlive
i do like that official releases are on the archive itself =]
-
arkiver
a lot is happening in Russia, and most of it is happening through Telegram. we archive Telegram. if you haven't done so, please join and run the telegram project #telegrab
-
OrIdow6
pokechu22: See above, I'll probably write a warrior project or similar for wysp.ws
-
myself
Am running a telegrab container, but the channel was low signal-to-noise. Perhaps it's time to rejoin..
-
nicolas17
arkiver: do telegram and imgur use the same targets? I could slow down the bruteforce queueing if we need target capacity
-
arkiver
nicolas17: if needed i'll slow down the imgur project
-
nicolas17
okay
-
nicolas17
I'm now enqueueing bruteforce lists when todo reaches a threshold, rather than every N minutes
-
nicolas17
so if you reduce the rate limit, it will adapt to that
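[Editor's sketch] Threshold-based enqueueing, as opposed to a fixed timer, might look something like this. It is not nicolas17's actual script: `refill_if_low`, its parameters, and the batch contents are hypothetical, and how the todo count is read from the tracker is left out entirely.

```python
from collections import deque


def refill_if_low(todo_count, pending_batches, threshold, enqueue):
    """Enqueue the next batch only when the todo count has dropped below the threshold.

    Returns True if a batch was enqueued, False otherwise.
    """
    if todo_count >= threshold or not pending_batches:
        return False
    enqueue(pending_batches.popleft())
    return True


if __name__ == "__main__":
    queued = []
    batches = deque([["aaaaa", "aaaab"], ["aaaac"]])  # made-up bruteforce lists
    for todo in (500, 50, 400, 20):  # made-up todo counts over time
        refill_if_low(todo, batches, threshold=100, enqueue=queued.append)
    print(queued)
```

Because the trigger is the backlog size rather than elapsed time, a lower rate limit drains todo more slowly and so slows the refills by itself.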
-
fireonlive
i love the jank → fancier route things always take over time
-
fireonlive
even if there is still jank
-
fireonlive
is that what having children is like?
-
fireonlive
lol
-
arkiver
any youtube videos or channels related to what is happening in Russia now can be archived at #down-the-tube