-
Notrealname1234
JAA: you active right now? I want to ask a question.
-
thuban
Notrealname1234:
dontasktoask.com
-
Notrealname1234
thuban: only knew about
nohello.net
-
thuban
same principle
-
Notrealname1234
JAA: literally, are ISIS websites allowed to be scraped by ArchiveBot?
-
JAA
Notrealname1234: I replied to this earlier in #archivebot, and I don't want to repeat it in a publicly logged channel.
-
Notrealname1234
Dang, i disconnected too early
-
Notrealname1234
"Pokechu22" already quoted it to me
-
JAA
Xe: Your HLS playlists are perfectly fine. The 'only' problem here is that ArchiveBot (which is really the primary tool we use for small to medium sites) currently doesn't even attempt to support HLS in any way. So there's not much you could do, unless you had a single video file on the server side and used Range requests in the playlist and also referenced the video file directly elsewhere, e.g. in the <video> tag. But yeah, this is purely a lack of support on our side.
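
For illustration, the "single video file plus Range requests" layout JAA describes corresponds to HLS byte-range addressing, where each playlist entry carries an `#EXT-X-BYTERANGE:<length>@<offset>` tag pointing into one media file. The sketch below is a minimal example and not anything from Xe's site: the function name `byterange_playlist`, the file name `video.ts`, and the segment sizes are all hypothetical, and it assumes a single MPEG-TS file whose segments are contiguous.

```python
def byterange_playlist(media_uri, segments, target_duration):
    """Build a VOD playlist whose segments are byte ranges of one file.

    segments is a list of (duration_seconds, length_bytes) tuples, assumed
    to be laid out back to back in the media file (hypothetical input).
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:4",  # EXT-X-BYTERANGE needs protocol version 4 or later
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    offset = 0
    for duration, length in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        # length@offset: the player fetches this slice with an HTTP Range request
        lines.append(f"#EXT-X-BYTERANGE:{length}@{offset}")
        lines.append(media_uri)
        offset += length
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

Because every entry references the same URI, a crawler that only follows plain URLs would still end up with the whole video once `video.ts` is linked directly somewhere, which is the point JAA makes above.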
-
JAA
My tooling simply collects the segment URLs, and then I queue those directly for archival.
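
A rough sketch of what "collecting the segment URLs" can look like, for readers unfamiliar with HLS. This is not JAA's actual tooling: the function name `segment_urls` and the example URLs are hypothetical. It assumes a plain media playlist, in which every non-empty line that does not start with `#` is a segment URI, possibly relative to the playlist URL. URIs that appear inside tags (for example `#EXT-X-MAP` or `#EXT-X-KEY`) are deliberately ignored here.

```python
from urllib.parse import urljoin


def segment_urls(playlist_url, playlist_text):
    """Return absolute URLs for every URI line in an HLS media playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, tags, and comments
        # Relative URIs are resolved against the playlist's own URL
        urls.append(urljoin(playlist_url, line))
    return urls


example = (
    "#EXTM3U\n"
    "#EXTINF:6.0,\n"
    "seg0.ts\n"
    "#EXTINF:6.0,\n"
    "https://cdn.example/seg1.ts\n"
    "#EXT-X-ENDLIST\n"
)
print(segment_urls("https://example.com/v/index.m3u8", example))
```

The resulting list could then be queued for archival like any other set of URLs.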
-
Xe
JAA: would it help if i somehow mechanically recreated the origin video as part of my upload process?
-
Xe
like muxing it from HLS to a mkv
-
Xe
i'm more than happy to change how I do uploads
-
pokechu22
I don't think it's worth worrying about changing how your site is structured for archivebot if archivebot's only going to run into it a few times a year at most
-
JAA
Agreed, though a simple link to download the whole video as a single file would likely also be useful to some people.
-
JAA
Looks like we hit the site at least once every other month or so since 2022.
-
pokechu22
True, but probably we wouldn't be downloading videos in those cases, right?
-
JAA
Yeah, probably not.
-
JAA
Unless they were directly in a <video> src.
-
JAA
I suppose WebM would be preferable for that.
-
Xe
pokechu22: people have asked for it before
-
Xe
isn't webm a mkv file with extra spice?
-
JAA
Yeah, they're closely related. Regular mkv files don't work across browsers though, I believe.
-
Xe
would mp4 video and aac audio in a webm container be overly fucked from a compat standpoint?
-
JAA
I don't believe WebM supports either of those codecs.
-
JAA
VP8, VP9, Vorbis, etc.
-
JAA
Ah, AV1 and Opus are the remaining ones.
-
JAA
Royalty-free codecs and all that.
-
Xe
ah, i chose mp4 and aac for my video stuff because it's cross platform universal
-
JAA
Right
-
Xe
(and iOS likes it)
-
Xe
most of my mobile viewers are iOS
-
JAA
Even iOS can play WebM these days, I think.
-
JAA
Yeah, added in iOS 15 in 2021.
-
Xe
i'll throw a script together that scans my S3 bucket for index.m3u8 files, fabricates the right URL, muxes it into an MKV container, and then uploads that next to the target as `foldername.mkv`
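
The planning half of the script Xe describes might look roughly like the sketch below. It is an assumption-laden illustration, not Xe's code: `plan_remux`, the base URL, and the example keys are hypothetical, the bucket listing is passed in as a plain list of keys rather than fetched from S3, and "next to the target" is read as "in the same folder as `index.m3u8`". The `ffmpeg -i <playlist> -c copy <output>` invocation remuxes without re-encoding; keys are not URL-encoded here, so names with special characters would need extra handling.

```python
import posixpath


def plan_remux(keys, base_url):
    """Return (playlist_url, target_key, ffmpeg_command) for each index.m3u8 key."""
    jobs = []
    for key in keys:
        if posixpath.basename(key) != "index.m3u8":
            continue
        folder = posixpath.dirname(key)
        if not folder:
            continue  # a playlist at the bucket root has no folder name to reuse
        name = posixpath.basename(folder)
        playlist_url = f"{base_url.rstrip('/')}/{key}"
        target_key = f"{folder}/{name}.mkv"  # assumed placement beside the playlist
        # Stream copy: change the container to MKV, keep the codecs as they are
        command = ["ffmpeg", "-i", playlist_url, "-c", "copy", f"{name}.mkv"]
        jobs.append((playlist_url, target_key, command))
    return jobs


for job in plan_remux(
    ["videos/talk/index.m3u8", "videos/talk/seg0.ts"],
    "https://cdn.example.com",
):
    print(job)
```

Each command could be run with `subprocess.run(command, check=True)`, followed by an upload of the local `.mkv` file to `target_key`.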
-
JAA
:-)
-
h2ibot
Flashfire42 edited URLTeam/Warrior (+75, /* Warrior projects */):
wiki.archiveteam.org/?diff=52109&oldid=51651
-
arkiver
pokechu22: thanks for covering womenwhocode.com and .dev, i was about to put it in
-
michaelblob_
why is pulling from atdr.meo.ws so slow? always takes more than three minutes to pull a 100MB chunk
-
wickerz
Could someone AB/crawl
samvirke.dk (non-urgent currently)? Their owner was recently sold and the new owner will try to fix their financial situation. Might be good to proactively archive these articles etc
-
c3manu
wickerz: sure, i'm gonna run it in #archivebot
-
katia
could
wiki.znc.in get AB'd? low coverage, TLS cert expired on ipv6
-
wickerz
Ty c3manu
-
pokechu22
-
katia
pokechu22, thanks
-
h2ibot
Wickedplayer494 edited Bazaar.tf (+2853, Bazaar is unfortunately dead for real this time):
wiki.archiveteam.org/?diff=52110&oldid=31538
-
kiwiirc
-
thuban
kiwiirc: any particular news?
-
kiwiirc
thuban the news is already in the article. It's from late last year. Tapatalk hosts many old forums from 15+ years ago that will be lost otherwise
-
that_lurker
Could someone throw
met.refeds.org to AB. Could use a complete grab. Mainly for the offsite links that it contains
-
that_lurker
Though with a little work adding all the exports to #// could also be a good idea maybe
-
thuban
that_lurker: running
-
that_lurker
thanks <3
-
JAA
michaelblob_: Sounds like you might have bad routing to Hetzner (Germany, FSN1)? I haven't seen anything anywhere near that slow, even when pulling from the other end of the world (literally, NZ).
-
nulldata
arkiver - just a reminder about Post.News. You asked for channel suggestions but I don't think a decision was made. Will there be a project? The shutdown post said within the next few weeks and this coming week would be the second week. The same post does say for users to export their data before May 31st so maybe there's still time?
-
michaelblob_
JAA: hm strange, i'm on the US east coast so i was expecting better speeds
-
that_lurker
Could someone also throw
people.nwtime.org to AB. Commoncrawl got a lot of the links, but the files are not grabbed.