-
webdownload95
Hello
-
webdownload95
I don't know if this is really a good idea, but I am attempting to archive videos from Vimeo manually (via Wget). I may need some help on this endeavor, regarding the downloading of videos.
-
webdownload95
Any suggestions?
-
thuban
webdownload: what exactly do you need help with?
-
thuban
(i generally use youtube-dl to save videos for myself and tubeup if i want to upload to the internet archive; both have vimeo support.)
-
webdownload
I'm not sure if I have the videos "saved".
-
thuban
?
-
webdownload
But according to what I have seen, the descriptions have been archived.
-
webdownload
My concern is that I have used insufficient commands to do so.
-
thuban
can you describe exactly what you did and exactly what you think the problem is?
-
webdownload
Alright
-
JAA
I don't think Vimeo videos can be downloaded easily with wget.
-
JAA
There's JavaScript involved etc.
-
webdownload
What I did was that I used wget "vimeo.com/watch" --warc-file="at"
-
webdownload
I think the problem is that the videos aren't downloaded.
-
JAA
wpull should download the video if you give it the player embed URL (under player.vimeo.com). Not sure what happens if you just throw in the vimeo.com video page.
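
(A minimal sketch of the mapping JAA describes, from a video page URL to its player embed URL. The helper name `player_embed_url` is hypothetical, and the `vimeo.com/<numeric id>` and `player.vimeo.com/video/<numeric id>` URL shapes are assumptions about how Vimeo lays out its URLs; whether wpull then fetches the video is as reported above, not verified here.)

```python
import re
from urllib.parse import urlparse


def player_embed_url(page_url: str) -> str:
    """Turn a Vimeo video page URL into the assumed player embed URL.

    Hypothetical helper: assumes video pages look like vimeo.com/<numeric id>.
    """
    # Accept scheme-less input such as "vimeo.com/123456".
    if "://" not in page_url:
        page_url = "https://" + page_url
    parsed = urlparse(page_url)

    if parsed.netloc.lower() not in ("vimeo.com", "www.vimeo.com"):
        raise ValueError(f"not a vimeo.com URL: {page_url}")

    # Only purely numeric paths are treated as video pages;
    # browse pages such as /watch carry no video ID.
    match = re.fullmatch(r"/(\d+)/?", parsed.path)
    if match is None:
        raise ValueError(f"no numeric video ID in: {page_url}")

    return f"https://player.vimeo.com/video/{match.group(1)}"
```

(Under these assumptions a browse page like "vimeo.com/watch" is rejected with a ValueError, since it has no video ID to map.)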
-
thuban
ok, well, first of all, "vimeo.com/watch" is just a browse page; there are no videos on it. are you trying to download a specific video?
-
webdownload
I wanted to download the entire catalog of videos from the website.
-
thuban
unless you have specialized infrastructure, that is probably a bad idea; there's a _lot_ of data there.
-
JAA
Just to give an idea of the scale: you're looking at ~550 million video IDs. Not all of those exist or are publicly accessible of course, but enough of them are that you're looking at petabytes of videos.
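
(A back-of-the-envelope sketch of how an ID count turns into petabytes. Only the ~550 million ID figure comes from the discussion; the 20% availability and 50 MB average size are made-up placeholders chosen purely to illustrate the arithmetic, and `estimate_petabytes` is a hypothetical helper.)

```python
def estimate_petabytes(id_count: int, available_percent: int, avg_bytes: int) -> float:
    """Rough total size in petabytes (1 PB = 10**15 bytes).

    Integer arithmetic is used for the byte count to avoid float rounding.
    """
    available_videos = id_count * available_percent // 100
    total_bytes = available_videos * avg_bytes
    return total_bytes / 10**15


# ~550 million IDs is from the chat; the other two values are assumptions.
VIDEO_IDS = 550_000_000
ASSUMED_AVAILABLE_PERCENT = 20      # hypothetical share of IDs that resolve
ASSUMED_AVG_BYTES = 50_000_000      # hypothetical 50 MB per video

print(estimate_petabytes(VIDEO_IDS, ASSUMED_AVAILABLE_PERCENT, ASSUMED_AVG_BYTES))
# prints 5.5
```

(Even with these conservative placeholder values the total lands in the petabyte range, which is the point being made.)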
-
webdownload
So it would definitely need to be a big project to be suitably archived?
-
JAA
Probably the biggest (by data size) we've ever done...
-
webdownload
I always assumed that Vimeo was a lot smaller than that.
-
JAA
I mean, it *is* a lot smaller than YouTube, but it's still one of the largest video hosting sites out there.
-
Craigle
Didn't see it posted elsewhere, but apparently the spiffyhacks.com forum will be shutting down at the end of the year. Per spiffyhacks.com/thread-1676.html
-
Craigle
Doesn't look like there is much activity over there recently
-
thuban
"Our members have made a total of 5,872 posts in 780 threads." sounds like a job for archivebot
-
JAA
We archived it in late 2019, but I just threw it into AB again. :-)
-
Craigle
Agreed. I just get nervous running forums myself since I know they can explode/break completely in AB
-
Craigle
Thanks JAA!
-
JAA
Yeah, this one looks well-behaved. :-)
-
Wayward
videos on vimeo also tend to be a higher bitrate than youtube
-
Wayward
so ever so slightly crisper for the same vid
-
Inhonion
What's the difference between the 2 URL projects?
-
JAA
URLTeam = URL shorteners, URLs = random URLs from various places, e.g. external links on Reddit
-
anonymous
Is this the team behind Archiverse?
-
Jake
I believe the data was compiled by us, and the interface is made by another user. wiki.archiveteam.org/index.php/Miiverse
-
JAA
Archiverse is not an official AT site, but it uses the data from our Miiverse archival project and is run by someone who used to be around here at the time.
-
anonymous
Who do I contact with a request to remove my posts from the site?
-
hilda
-
anonymous
I read that; the only contact given is through Twitter, which I lack.
-
anonymous
Other than the contact of "Archive Team", hence my presence here.
-
JAA
They haven't been active on our IRC channels since 2018 as far as I can tell.
-
anonymous
There really should be a DMCA page on that site.
-
anonymous
I did not consent to an unknown third party archiving my posts.
-
EggplantN
(Wait till he finds out we archive reddit in realtime)
-
anonymous
I don't have reddit.
-
anonymous
I don't really have any social media for that matter except email.
-
anonymous
But Miiverse was one I used to have.
-
JAA
Well, nothing we can do about it. As I said, Archiverse isn't an AT site.
-
JAA
Sucks that their only contact point is Twitter, yes.
-
anonymous
Are you saying that my only hope is to contact "Drastic Actions"?
-
JAA
Yes. They're the one running the site and likely the only one with any control over its contents.
-
anonymous
In some ways the internet is too archived, in other ways it is not archived enough.
-
anonymous
It's nice to see old 4chan posts from 2003, but then again, I wouldn't want my own posts to be archived, which is why I exclusively lurk imageboards.
-
anonymous
Speaking of archives, where can I read the supposed "public log" for this channel?
-
anonymous
ChanServ says "this channel is publicly logged."
-
anonymous
Where?
-
JAA
The web interface is currently broken.
-
anonymous
Does that mean I can't view the log?
-
Jake
(This one seems to be working a little bit. hackint.logs.kiska.pw/archiveteam-bs/20210507)
-
JAA
At the moment yes.
-
anonymous
Does that mean it's not archiving my messages?
-
JAA
No, everything's still logged.
-
anonymous
Why?
-
Krownest
It is the ArchiveTeam after all, I'd be shocked if they didn't log/archive every IRC channel.
-
JAA
Because it's useful to refer to previous discussion and IRC doesn't have native logging (though most clients have or can be configured to create local logs of course).
-
anonymous
How far do the logs go back?
-
JAA
Don't remember exactly. 2013 or earlier.
-
anonymous
Why would you possibly need to refer to previous discussion from eight years ago?
-
Ajay
why would we possibly need a book written more than 8 years ago?
-
JAA
Why would news articles from 8+ years ago ever be relevant?
-
anonymous
A book is made to be permanent.
-
anonymous
News articles are referenced for historical purpose.
-
JAA
So is previous discussion on archival projects, design of our software, etc.
-
anonymous
Message rooms are just another form of people having a conversation, and I think it would be disturbing if every conversation I ever had was recorded.
-
JAA
No one forces you to chat in here.
-
JAA
There are other channels that don't have public logs.
-
anonymous
If Archiverse's FAQ was more comprehensive, I wouldn't have had to come here. Of course, that is no fault of yours.
-
anonymous
Do those channels have private logs?
-
JAA
Always assume that every other user in a channel is keeping personal logs.
-
anonymous
Why?
-
JAA
Sigh...
-
anonymous
Your lack of explanation indicates ignorance.
-
anonymous
Of course, I'm supposedly ignorant here.
-
JAA
22:06:00 <@JAA> Because it's useful to refer to previous discussion and IRC doesn't have native logging.
-
JAA
^ Also an example of referring to previous discussion, by the way.
-
Ajay
same reason people don't delete their email
-
anonymous
Why would you possibly have to keep the discussions with me in them?
-
anonymous
>same reason people don't delete their email
-
JAA
You asked about Archiverse, now I have the FAQ link in my logs and the fact that they're only contactable via Twitter. Next time someone asks about it, I can search my logs to find that again because I definitely won't remember.
-
Ajay
-
» SCSi gets popcorn
-
» Wayward likes popcorn and wonders what crazy drama he just missed
-
anonymous
I don't like popcorn.
-
SCSi
oh, it's interesting
-
anonymous
It has had a significant negative impact on the film industry.
-
SCSi
would a DMCA even work in this case
-
anonymous
I don't know, it was just an idea.
-
JAA
Please move this discussion elsewhere.
-
anonymous
Where?
-
anonymous
I thought this was the best channel for it.
-
JAA
#archiveteam-ot I suppose.
-
anonymous
It's publicly logged.
-
JAA
Yes
-
SCSi
99.99% of irc is logged by somebody somewhere
-
SCSi
get used to it
-
anonymous
Get used to perpetual surveillance?
-
anonymous
Comply with injustice instead of standing up to it?
-
hook54321
#archiveteam-ot
-
anonymous
What do you mean by this?
-
anonymous
How can a conversation simply move to another channel?
-
JAA
This discussion in here ends now.
-
anonymous
Then continue it there.
-
EggplantN
jeez