01:21:01 kiska: #wuciyuan just started
02:02:59 Yts98 edited Current Projects (+12, Move Banciyuan to current. Move ЯRUS to…): https://wiki.archiveteam.org/?diff=50114&oldid=50106
02:03:50 thanks yts98 :3
02:04:03 i had it in a background tab but never hit save lol
02:04:44 fireonlive: lol
02:04:59 FireonLive edited Banciyuan (+2, let's goooooooooo): https://wiki.archiveteam.org/?diff=50115&oldid=49979
02:53:08 JustAnotherArchivist moved Banciyuan to 半次元 (Official name is in Chinese): https://wiki.archiveteam.org/?title=%E5%8D%8A%E6%AC%A1%E5%85%83
02:53:09 JustAnotherArchivist edited 半次元 (+8): https://wiki.archiveteam.org/?diff=50118&oldid=50116
03:02:10 Yts98 edited Current Projects (-12, Adjust the link for 半次元): https://wiki.archiveteam.org/?diff=50119&oldid=50114
03:06:13 ooh a move :)
04:37:26 hahahahahahahahahaha
04:37:27 oh
04:37:40 https://twitter.com/TwitterSupport/status/1675990712297443330
04:40:28 https://transfer.archivete.am/inline/3sais/1688445603.png < don't say i never do anything for you all 😘
04:42:04 on meta theads: their apple app store privacy 'nutrition card': https://pbs.twimg.com/media/F0JrcJJaMAEzzfw?format=jpg&name=orig
05:09:50 Types of Data Collected: Yes
05:15:54 * Exorcism|m uploaded an image: (166KiB) < https://matrix.hackint.org/_matrix/media/v3/download/matrix.org/BLNnhhRcckVcLBUjQeHPCsAG/Screenshot_2023-07-04-07-15-16-39_0b2fce7a16bf2b728d6ffa28c8d60efb.jpg >
05:21:38 at least onylfans has a competent paywall
05:25:15 you da real mvp fireonlive
05:26:42 :D
06:24:07 11:18:28 PM <+rss> Instagram will shut down its companion app Threads by year end (2021): https://techcrunch.com/2021/11/17/instagram-will-shut-down-its-companion-app-threads-by-year-end/ → https://news.ycombinator.com/item?id=36582571
06:24:16 i got very confused; but it's a *different* threads
06:24:23 i guess they're reusing the branding
06:32:22 just what I'd expect SV ghouls to do
08:55:16 Yts98 edited Skyblog (+169, Add another short URL example):
https://wiki.archiveteam.org/?diff=50120&oldid=50111
13:02:16 Starting CheckIP for Item Failed CheckIP for Item Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/seesaw/task.py", line 88, in enqueue self.process(item) File "", line 122, in process AssertionError: Your time 1688470941.1319983 is more than 180 seconds off of 1688475727.383. Waiting 10 seconds...
13:02:28 I just synced my pc clock but this is still showing up
14:06:09 Is your time zone correct?
14:25:43 if you're running a VM, do you have the VM host set to "hardware clock in UTC time"?
14:31:05 if you're running something unixish then yeah.
15:01:54 Hello everyone, I’m new to IRC so please do forgive me if this is the wrong place or if my etiquette isn’t perfect. I’m in need of some help - www.world.kano.me has an estimated 1 million+ user-generated artworks but no plan around the idea of archiving it. Full transparency, I work at Kano and we are looking to sunset the site and our apps
15:01:54 soon. How would you go about this mammoth task? Any and all advice is much appreciated. Thanks - holographicleah
15:05:04 Whoops, *https://world.kano.me/
15:07:28 Hi holographicleah thanks for getting in touch! Please stick around as it may take a moment for someone to get back to you. JAA arkiver ^^
15:29:12 holographicleah: thank you very much for letting us know!
15:29:25 holographicleah: when is the deadline for this?
15:29:56 not sure how well kano will playback in the Wayback Machine
15:33:15 nice i see it's all relatively easy to archive with the API as well, but again not sure about playback yet
15:33:26 holographicleah: do you perhaps have a list of all creations on the site?
15:39:28 I wish I had a concrete date. Kano World was actually spun off into a 'sister company' and has its own AWS S3 bucket with the creations in.
we might transfer them over to our other AWS account somehow which would keep them around for longer, possibly
15:39:52 To put it mildly there are some unpaid bills.
15:40:18 holographicleah how large is the bucket?
15:46:22 About 690GB apparently in the bucket threedeeitguy
16:02:27 I think it's a case of - we're probably not going to have the site around for much longer (a few months at tops) and we have all of the creations, along with their code, but we want to find a way to make it accessible after the site closes.
16:22:56 holographicleah: are you planning to stick around on IRC? if not, feel free to contact me at arkiver⊙pc
16:26:32 arkiver: I don't know too much about using IRC so I may end up emailing you, thanks!! I'm seriously such a noob, i'm just here thru the web interface haha. I'm also wary of spamming the chat with too much info!!
16:34:03 Ah, I was too slow! arkiver feels like there may be two parallel approaches here? 1. Standard scrape to go over to IA for the wayback machine 2. dump raw resources (and maybe source code?) so it leaves the door open for this to live on in some more interactive form. I guess 2 depends on how much access they are comfortable handing out/what IP they
16:34:03 cannot give away.
16:45:21 holographicleah: Don't worry about that, #archiveteam-bs was made for spamming walls of text
16:56:58 I'm going to pop back in here in a couple days, hopefully with some more info, after I have a chance to chat more with engineers on the team (and hopefully the CEO) about the open-source future of user-generated content of world.kano.me. What I can say is that it's safe to put it on deathwatch, we just haven't set an official date.
16:59:55 JustAnotherArchivist edited Deathwatch (+60, /* 2023 */ Add Kano World Studio): https://wiki.archiveteam.org/?diff=50121&oldid=50107
17:00:01 holographicleah: Thanks for reaching out! I added it.
17:01:28 JAA: thanks so much!
18:12:48 Hey so I downloaded a bunch of CDX files generated by the archive team, and one of the columns appear to be a non-standard CDX column. "S" it's not in spec https://archive.org/web/researcher/cdx_file_format.php so I am unsure what it's suppose to be. I wonder if the IA backend is throwing an error about it too
18:13:08 what item?
18:26:58 Skylion: This is the current CDX spec: http://iipc.github.io/warc-specifications/specifications/cdx-format/cdx-2015/
18:27:12 and both are crap
18:27:28 what is a "canonized URL"?
18:36:23 Yeah, neither is detailed, but at least that one lists all the fields actually in use, unlike IA's.
18:36:47 link rel=canonical?
18:36:54 No
18:37:02 oh
18:37:18 It's a mangled URL after it was run through surt.
18:37:32 that’s good its not the rel lol
18:37:36 Stripping protocol, auth, leading www, and port, lowercasing everything, etc.
18:37:36 ahh
18:37:50 interesting
18:38:02 That is one way to put it, yes...
18:38:15 >_> yeeeaah….
18:38:26 see: imgur and lots of other stuff
18:38:27 lol
18:38:31 Especially the case collapsing causes issues all the time.
18:38:48 yeah :/
18:41:10 Everyone should start using Unicode homoglyphs since those don't get collapsed!!1!
18:43:58 😁
18:56:38 Oh sorry missed it
18:56:59 Some of the 2017 flickr snapshots have this issue
18:57:59 Ah nvm, I see. Thanks!
19:06:20 Exorcism edited 半次元 (+41): https://wiki.archiveteam.org/?diff=50122&oldid=50118
19:57:16 nstrom|m myself Yes, I have both checked the timezone as well as enabled hardware clock in UTC time by default
20:12:46 https://matrix.org/blog/2023/07/deportalling-libera-chat/ interesting
20:15:58 you can still connect via regular IRC
20:16:21 Yeah, very unsurprising, I've seen lots of complains about how the Matrix bridge operates over there.
20:16:50 I don't know if there's anything to archive in this case
20:16:56 There isn't.
20:17:13 Anything on the Matrix side is behind a login wall anyway, and IRC is IRC.
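[Editor's sketch] The URL mangling described in the CDX discussion above (stripping protocol, auth, leading www, and port, lowercasing everything) can be roughly illustrated as follows. This is not the real surt library and does not reproduce its full rule set; `simplified_surt` is a hypothetical helper, and the reversed, comma-separated host form is included on the assumption that the keys follow the usual SURT layout. It is only meant to show why URLs that differ by case alone end up under one key, which is the "case collapsing" problem mentioned for sites like imgur.

```python
from urllib.parse import urlsplit


def simplified_surt(url: str) -> str:
    """Rough approximation of a SURT-style key (hypothetical helper, not the surt library)."""
    parts = urlsplit(url)
    # .hostname drops userinfo ("auth") and the port, and lowercases the host.
    host = parts.hostname or ""
    # Strip a leading "www." label.
    if host.startswith("www."):
        host = host[4:]
    # Reverse the host labels so keys sort by domain: example.com -> com,example
    reversed_host = ",".join(reversed(host.split(".")))
    # Lowercase path and query as well; this is the lossy "case collapsing" step.
    path = (parts.path or "/").lower()
    query = f"?{parts.query.lower()}" if parts.query else ""
    # The scheme is dropped entirely.
    return f"{reversed_host}){path}{query}"


if __name__ == "__main__":
    # Two URLs that differ in scheme, auth, www, port, and case yield the same key.
    print(simplified_surt("https://user:pw@www.Example.com:8080/AbCdE"))
    print(simplified_surt("http://example.com/abcde"))
```

Because the path is lowercased, two distinct resources on a case-sensitive site would map to the same key in this sketch, so a lookup by key alone cannot tell them apart.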
20:49:33 What's a good matrix alternative? Unless someone here wants to change careers from IT to plumbing
21:03:05 To be clear, this is about another IRC network (Libera) and does not directly affect us here.
21:04:01 Oh great
21:09:26 What's a good matrix alternative? Unless someone here wants to change careers from IT to plumbing
21:09:36 Which part of Matrix?
21:09:59 The federated part, the group chat part, or the end-to-end encrypted part?
21:09:59 i'm thinking like the irc part
21:10:05 bridge
21:10:33 if that's the case 'the lounge' has treated me pretty ok
21:10:55 If you want federation, I think the only other chat option than IRC and Matrix is XMPP / Jabber
21:11:36 If you just want self-hosted Discord and federation does not matter there's Rocket.Chat, Revolt, and Fosscord
21:11:39 https://www.hackint.org/transport/xmpp indeed
21:12:06 There's also Mattermost, but that's a Slack clone
21:12:37 mattermost was interesting; but i had problems getting push notifications to work :/
21:12:42 maybe that was just something on my end
21:12:53 (w/ the iOS app)
21:13:40 element's E2EE seems kinda buggy and pisses me off a lot lately?
21:14:07 i should really get around to trying fluffychat
21:14:55 should this be in #archiveteam-ot ?
21:15:41 Probably
21:16:03 ah yes
21:18:06 I apologize for asking again, but may you please archive this MEGA folder (34 GB): https://mega.nz/folder/sol2UZoK#oMACjgVHPcAv1hPGLX_PoA
21:20:04 mega is very hard to archive
21:20:29 The data in the folder is irreplaceable. The original creator of the folder quit the community due to his choice to spend time with his new girlfriend, and may stop paying for MEGA bills. In no ways this is intended as a form of harassment.
21:20:55 This is intended to be a manual download of the MEGA folder via MegaBastard
21:21:20 looks like a lot of apple stuff... nicolas17 ?
21:21:27 are you 'in' with the mega?
21:22:00 No I am not affiliated at all internally with the folder
21:22:01 I have an online friend that has Mega Pro
21:22:27 Let me know if Mega bandwidth cucks you and you can't bypass it
21:23:03 The folder contains many rare apple apps, and are most likely of interest to many Apple historians
21:23:13 MegaBasterd has a proxy switcher feature
21:23:43 So if you find a list of usable proxies online that might work
21:25:53 Website for AB: https://unknowntags.netlify.app
21:27:41 Additionally MEGA and other cloud services, especially OneDrive will remove AppleInternal data if they find out
21:27:58 I hope Showbuzzdaily https://showbuzzdaily.com/articles/some-unfortunate-news.html is being archived (already listed on Deathwatch). A lot of historical TV ratings data there. Probably already covered by Web Archive, but would probably be good to do a full archive nonetheless.
21:28:28 newsjunkie: https://archive.fart.website/archivebot/viewer/job/d5ioy
21:30:26 I’ve heard https://unknowntags.netlify.app/internalui also contains working mega links to apps not in the big mega folder
21:31:06 Thanks pokechu2
21:31:27 Thanks pokechu22 (sorry don't know how to mention people.
21:31:58 if you mention the nick(name) it's more than enough :)
21:32:12 e.g. fireonlive abcd
21:32:19 or fireonlive: you're a fucking twat
21:32:20 both work
21:35:27 upintheairsheep: grabbing the mega folder data unless someone else has already
21:41:23 got the ones from https://unknowntags.netlify.app/internalui as well (folder download is still going)
21:54:46 upintheairsheep: is this folder from unknowntags.netlify.app too?
just thinking of what metadata to put so people can find it if they're looking for it
21:58:47 I tried saving the files in my own MEGA account to preserve them, and download them later
21:59:27 but something is broken and even trying to save a single 100KB folder is saying "can't complete this action because it would put you above your storage quota"
21:59:49 are you out of space :p
22:00:18 I'm using 28GB of 50GB, and a 100KB folder "doesn't fit"
22:00:29 nicolas17: if you want I can hand over the files to you once it's downloaded it, probably do a better job of labeling it properly when uploading and such
22:04:02 OK. Finally found the proof: https://unknownarchive.netlify.app/page2.html
22:05:02 okay I figured it out
22:05:57 I also have some other AppleInternal links that do not belong to Unknown_Tags but are at risk of Apple taking them down (happened many times before)
22:06:12 https://www.dropbox.com/scl/fo/7z9usnvxjja7jk8e3nlqq/h?dl=0&rlkey=i34t7clf1zhy0ai0wewcakeki
22:07:03 I copied as much as I could fit in my acct... so I'm missing the 14 biggest folders
22:07:15 upintheairsheep: are there any associated youtube channels for this person as well?
22:07:26 https://mega.nz/file/JPtHwRZB#PD362DDwcuRXcEloW83z-WmDH9LK9EiGOkCOfcvkXeA
22:07:41 https://mega.nz/folder/1W1BAZSa#evb9c2yO573COSM2OjH6wA
22:08:01 nicolas17 is very apple happy so the more the merrier i’m sure :)
22:16:20 imer: I have a volunteer with a Mega Pro account
22:16:25 Do you want him to rip it?
22:20:14 icedice: I should be good on the free download quota part (enough ips to cycle), speed isn't amazing though (9-20mib/s) so going to take me a few hours
22:21:09 can do though, certainly won't hurt to have two copies :)
22:21:31 I'm downloading 28GB of personal data that I had in MEGA and I wanted to move elsewhere but I kept putting it off :P I had to do it anyway, and it will free up space in my acct
22:28:06 brb
22:28:56 imer: He's downloading it
22:29:12 Send me a screencap of his Mega download speeds before I sent him the links
22:29:26 He almost hit 78 MiB/s
22:29:38 * sent
22:29:58 So should be done pretty quickly
22:31:13 He's not willing to post copyrighted content publicly though, but he's willing to hand it over privately to us which can then be uploaded to Internet Archive or elsewhere
22:32:20 icedice: afaik the "demo apps" in the last mega link here are "copyrighted content"
22:32:46 the 37GB folder is worse, they're Apple-internal employee-only apps
22:34:22 You're telling me I'm having my friend download leaked material on an account that can almost certainly be traced back to him via the payment method?
22:34:46 ask upintheairsheep :p
22:34:59 jfc
22:35:22 otoh I believe this folder has been up for more than a year
22:43:25 ooh apple internal apps
22:47:14 In case Internet Archive has to yeet it at some point I do know about a certain Russian VPS provider that openly ignores copyright and accepts Monero
22:48:33 that's interesting
22:51:24 https://vdsina.ru/
22:53:10 As for domains, .st and .li from Njalla are pretty bulletproof when it comes to copyright
22:53:18 getting 4.3MB/s from mega
22:53:25 I'm surprised I didn't hit a daily cap or anything yet
22:53:31 There are some other TLDs as well, but they're harder to register
22:54:06 And would have to be registered via other registrars that don't fill in their own WHOIS info for the user
22:57:40 "51% done
22:57:42 Got all other links as well downloaded besides the 37 GB one"
22:57:53 ^ My friend sent me this 9 minutes ago
23:49:37 i've got all the mega stuff downloaded I believe