02:20:16 Hello
02:22:50 I don't know if this is really a good idea, but I am attempting to archive videos from Vimeo manually (via Wget). I may need some help on this endeavor, regarding the downloading of videos.
02:23:44 Any suggestions?
02:31:19 webdownload: what exactly do you need help with?
02:32:04 (i generally use youtube-dl to save videos for myself and tubeup if i want to upload to the internet archive; both have vimeo support.)
02:32:42 I'm not sure if I have the videos "saved".
02:33:09 ?
02:33:11 But according to what I have seen, the descriptions have been archived.
02:34:23 My concern is that I have used insufficient commands to do so.
02:34:32 can you describe exactly what you did and exactly what you think the problem is?
02:34:49 Alright
02:35:03 I don't think Vimeo videos can be downloaded easily with wget.
02:35:11 There's JavaScript involved etc.
02:35:54 What I did was run wget "https://vimeo.com/watch" --warc-file="at"
02:36:21 I think the problem is that the videos aren't downloaded.
02:36:37 wpull should download the video if you give it the player embed URL (under player.vimeo.com). Not sure what happens if you just throw in the vimeo.com video page.
02:37:01 ok, well, first of all, "https://vimeo.com/watch" is just a browse page; there are no videos on it. are you trying to download a specific video?
02:38:46 I wanted to download the entire catalog of videos from the website.
02:40:22 unless you have specialized infrastructure, that is probably a bad idea; there's a _lot_ of data there.
02:44:50 Just to give an idea of the scale: you're looking at ~550 million video IDs. Not all of those exist or are publicly accessible of course, but enough of them are that you're looking at petabytes of videos.
03:05:53 So it would definitely need to be a big project to be suitably archived?
03:07:30 Probably the biggest (by data size) we've ever done...
03:09:30 I always assumed that Vimeo was a lot smaller than that.
03:11:43 I mean, it *is* a lot smaller than YouTube, but it's still one of the largest video hosting sites out there.
03:12:00 Didn't see it posted elsewhere, but apparently the https://spiffyhacks.com/ forum will be shutting down at the end of the year. Per https://spiffyhacks.com/thread-1676.html
03:12:16 Doesn't look like there is much activity over there recently
03:13:44 "Our members have made a total of 5,872 posts in 780 threads." sounds like a job for archivebot
03:14:00 We archived it in late 2019, but I just threw it into AB again. :-)
03:14:43 Agreed. I just get nervous running forums myself since I know they can explode/break completely in AB
03:14:48 Thanks JAA!
03:15:11 Yeah, this one looks well-behaved. :-)
04:51:33 videos on vimeo also tend to be a higher bitrate than youtube
04:51:57 so ever so slightly crisper for the same vid
06:06:25 What's the difference between the 2 URL projects?
06:08:55 URLTeam = URL shorteners, URLs = random URLs from various places, e.g. external links on Reddit
21:51:35 Is this the team behind Archiverse?
21:54:44 I believe the data was compiled by us, and the interface is made by another user. https://wiki.archiveteam.org/index.php/Miiverse
21:55:07 Archiverse is not an official AT site, but it uses the data from our Miiverse archival project and is run by someone who used to be around here at the time.
21:55:23 Who do I contact with a request to remove my posts from the site?
21:55:34 https://archiverse.guide/faq
21:56:09 I read that; the only contact given is through Twitter, which I lack.
21:56:33 Other than the contact of "Archive Team", hence my presence here.
21:57:19 They haven't been active on our IRC channels since 2018 as far as I can tell.
21:57:30 There really should be a DMCA page on that site.
21:57:52 I did not consent to an unknown third party archiving my posts.
21:58:22 (Wait till he finds out we archive reddit in realtime)
21:58:41 I don't have reddit.
21:59:07 I don't really have any social media for that matter except email.
21:59:26 But Miiverse was one I used to have.
22:00:10 Well, nothing we can do about it. As I said, Archiverse isn't an AT site.
22:00:41 Sucks that their only contact point is Twitter, yes.
22:00:47 Are you saying that my only hope is to contact "Drastic Actions"?
22:01:11 Yes. They're the one running the site and likely the only one with any control over its contents.
22:01:49 In some ways the internet is too archived, in other ways it is not archived enough.
22:02:48 It's nice to see old 4chan posts from 2003, but then again, I wouldn't want my own posts to be archived, which is why I exclusively lurk imageboards.
22:03:22 Speaking of archives, where can I read the supposed "public log" for this channel?
22:03:37 ChanServ says "this channel is publicly logged."
22:03:49 Where?
22:03:51 The web interface is currently broken.
22:04:04 Does that mean I can't view the log?
22:04:14 (This one seems to be working a little bit. https://hackint.logs.kiska.pw/archiveteam-bs/20210507 )
22:04:17 At the moment yes.
22:04:27 Does that mean it's not archiving my messages?
22:04:43 No, everything's still logged.
22:04:47 Why?
22:05:16 It is the ArchiveTeam after all, I'd be shocked if they didn't log/archive every IRC channel.
22:06:00 Because it's useful to refer to previous discussion and IRC doesn't have native logging (though most clients have or can be configured to create local logs of course).
22:06:13 How far do the logs go back?
22:06:51 Don't remember exactly. 2013 or earlier.
22:07:11 Why would you possibly need to refer to previous discussion from eight years ago?
22:08:02 why would we possibly need a book written more than 8 years ago?
22:08:23 Why would news articles from 8+ years ago ever be relevant?
22:08:43 A book is made to be permanent.
22:08:58 News articles are referenced for historical purpose.
22:09:22 So is previous discussion on archival projects, design of our software, etc.
22:09:29 Message rooms are just another form of people having a conversation, and I think it would be disturbing if every conversation I ever had was recorded.
22:09:43 No one forces you to chat in here.
22:10:04 There are other channels that don't have public logs.
22:10:24 If Archiverse's FAQ was more comprehensive, I wouldn't have had to come here. Of course, that is no fault of yours.
22:10:36 Do those channels have private logs?
22:11:25 Always assume that every other user in a channel is keeping personal logs.
22:11:39 Why?
22:11:43 Sigh...
22:12:03 Your lack of explanation indicates ignorance.
22:12:16 Of course, I'm supposedly ignorant here.
22:12:18 22:06:00 <@JAA> Because it's useful to refer to previous discussion and IRC doesn't have native logging.
22:12:57 ^ Also an example of referring to previous discussion, by the way.
22:12:58 same reason people don't delete their email
22:13:20 Why would you possibly have to keep the discussions with me in them?
22:13:39 >same reason people don't delete their email
22:14:52 You asked about Archiverse, now I have the FAQ link in my logs and the fact that they're only contactable via Twitter. Next time someone asks about it, I can search my logs to find that again because I definitely won't remember.
22:20:50 anonymous: https://github.com/drasticactions/Archiverse/issues/6
22:34:50 * SCSi gets popcorn
22:37:37 * Wayward likes popcorn and wonders what crazy drama he just missed
22:37:45 I don't like popcorn.
22:37:55 oh, it's interesting
22:38:09 It has had a significant negative impact on the film industry.
22:38:19 would a DMCA even work in this case
22:38:27 I don't know, it was just an idea.
22:38:44 Please move this discussion elsewhere.
22:38:50 Where?
22:38:56 I thought this was the best channel for it.
22:39:06 #archiveteam-ot I suppose.
22:39:41 It's publicly logged.
22:40:20 Yes
22:40:22 99.99% of irc is logged by somebody somewhere
22:40:24 get used to it
22:40:38 Get used to perpetual surveillance?
22:40:54 Comply with injustice instead of standing up to it?
22:41:14 #archiveteam-ot
22:41:52 What do you mean by this?
22:42:07 How can a conversation simply move to another channel?
22:42:14 This discussion in here ends now.
22:42:23 Then continue it there.
22:43:33 jeez