-
pabs
JAA: sounds about right. TBH this is the first time I heard of Freed-ora
-
monoxane
yo so how hard would it be to get some more targets online if we had the storage + network to provide for it
-
monoxane
if you've seen pixiv over the last 2 days you may have seen that me and a few friends have thrown some 100g boxes at it and are currently bottlenecked by the 2 online targets
-
monoxane
we know the targets need to offload to IA at an appropriate speed, but have quite a bit of available storage to buffer ourselves with
-
monoxane
at a point we were hitting 7.5gbps from the source but are now limited by the targets' disks filling up and stopping connections 😔
-
monoxane
its less of a thing for this in particular but we're trying to work out how we can provide some infra for the next "oh shit its going down in 24 hours" site scrape
-
monoxane
500gbit of bandwidth, a /24, and 100tbit of local storage will help some of those a fair bit 😉
-
JAA
Pinging some relevant people: rewby HCross arkiver ^
-
monoxane
we're also working on rewriting an api compatible warrior that will scale much higher
-
monoxane
for reference last night we had 3328 warrior threads running across 13 nodes for shits n gigs, and were nowhere near capacity
-
monoxane
also considering rolling a new version of the megawarc factory with some improvements, the real question is how does it get from the targets to IA and what do we need to do to facilitate that
-
monoxane
and yes, aware that IA only has ~20gbps S3 capacity, we'd be egress shaping down to about 5gbps, hence the fuck off massive target cache to hold it for a bit
-
monika
monoxane could you clarify on the "api compatible" warrior? are you modifying the existing warrior or writing one from scratch
-
monika
i believe modifying warrior code is a big no no
-
monoxane
new one that does the same thing with the same apis just less jank and some more options to allow us to vertically scale easier and with an updated docker image
-
monika
JAA what's your opinion ^
-
nepeat
i'd be interested in learning more and supporting this warrior improvement
-
nepeat
personally, i'd love to add on prom metrics and getting the logging to fit the structlog format to work with my systems
-
monoxane
im not the guy doing that so i might be wrong on whats actually happening, but we've found that one of the main limiting factors of the warrior is its concurrency settings and the inability to disable things like the web ui
-
monoxane
and also the fact that some of the python libs used in it are effectively vaporware that haven't been updated since 2017
-
monika
if you run the bare project containers the UI is already disabled
-
monika
atdr.meo.ws/archiveteam/<PROJECT>-grab
-
monika
allows for 20 concurrency too
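A minimal sketch of running one of these bare project containers directly, using the image path mentioned above; the `--concurrent` flag and trailing downloader-nickname argument follow the usual ArchiveTeam container invocation, and both the project name and the nick here are placeholders:

```shell
# Run a project grab container directly, no warrior wrapper.
# "example-grab" and "yournick" are placeholders, not real values.
docker run -d --restart=unless-stopped \
  atdr.meo.ws/archiveteam/example-grab \
  --concurrent 20 yournick
```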
-
monoxane
ooo we did not know that
-
monika
go crazy
-
monoxane
that is going to make a massive difference
-
monoxane
aight the warrior isnt being changed anymore :)
-
monoxane
but we are gonna write our own cluster agent and c2 implementation :P
-
nepeat
ditching k8s already?
-
monoxane
no, still using k8s, just writing a controller that handles the deployment and configuration of those bare images instead of the warrior
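A rough sketch of what such a controller might stamp out per project: a plain Deployment wrapping the bare grab image, with concurrency and downloader name passed as container args. The image name, args, and replica count below are assumptions for illustration, not anyone's actual manifests:

```yaml
# Hypothetical per-project Deployment a controller could render.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-grab
spec:
  replicas: 50
  selector:
    matchLabels: {app: example-grab}
  template:
    metadata:
      labels: {app: example-grab}
    spec:
      containers:
        - name: grab
          image: atdr.meo.ws/archiveteam/example-grab
          args: ["--concurrent", "20", "yournick"]
```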
-
monoxane
we are already working on that but via the warrior, knowing about the bare images is a massive game changer
-
monoxane
hm these dont seem to actually contain anything though 😔
-
monika
huh?
-
monoxane
at least the pixiv-2-grab one's dockerfile literally only has a from line in it
-
nepeat
this is the dockerfile to refer to
-
monoxane
ah okay, thats some fucky shit i haven't seen before :P
-
monoxane
will play around with it after i finish my actual job for the day lol
-
OrIdow6
arkiver: See above, they have dropped their plan to modify "the warrior"
-
monoxane
yes now we're just gonna bypass it 😆
-
monoxane
we dont wanna do anything that will screw anyone else here but there are definitely challenges with scaling warrior to 3000+ instances over 10+ nodes and actually managing it
-
nepeat
i like nomad, it's simple and has scaled up with my 100-300 instances well
-
monoxane
yea the other thing is the nodes we're using already have k3s and are running some other workloads, so we cant just jump to nomad
-
nepeat
ah, preexisting prod
-
monoxane
yes, if you knew what these nodes usually do you'd be absolutely shocked that we can run AT workloads on them, and also absolutely not surprised at all that we can pin 500gbps
-
monoxane
but dont worry its all approved by the owners :)
-
OrIdow6
I haven't been following this conversation enough to know the meaning of "bypass it", but basically, the hard rules are:
- don't modify wget-lua/wget-at, including messing with the build process to get it to accept wider ranges of library versions
- don't modify Seesaw or the other libraries it uses
- don't modify the project scripts
- keep a clean, vanilla connection from wget and the project scripts to the Internet
-
monoxane
understood we’ll definitely be sticking to that
-
monoxane
i mean we’ll be running the project containers directly not managed through warrior
-
nepeat
that's what most of us hardcore users do
-
nepeat
you're definitely on the right path to hauling top rates
-
monoxane
we dont care about the leaderboards lmao, even considered randomising the DOWNLOADER ids so other people dont get discouraged by 1 name munching 10tb a day
-
monoxane
its more, if we can help in an "oh fuck" situation where theres 24 hours to get an entire site archived, we'll put in everything we've got
-
monoxane
because i've been part of some of those where even with all the capacity we've had, some content was still lost, and in a couple of cases it was a fair bit of content
-
schwarzkatz|m
Appreciate the work you guys do, monoxane!
-
Jake
(also related to earlier conversation, it's easier if you use a known downloader name so that you can be contacted)
-
monoxane
yea we're gonna use some sort of team name when its all up and running
-
monoxane
instead of just my nick lol
-
nepeat
kinda wondering, how up to date are all of the archive team repos?
-
neggles
"don't modify Seesaw or the other libraries it uses" aww
-
neggles
I believe the current plan was to use MagnusInstitute or possibly MagnusArchivist as downloader name, TBC though
-
neggles
OrIdow6: would it be OK to rework the warrior docker image somewhat so it's a bit more... modern, for lack of a better way to put it? I was digging through repos and whatnot last night piecing together how it all works and... oof.
-
OrIdow6
neggles: I don't know what that implies exactly
-
OrIdow6
The core that you shouldn't modify is in the READMEs under "Distribution-specific setup"
-
OrIdow6
And to my understanding the warrior, Docker images, etc. are basically just wrappers around a preconfigured version of this
-
OrIdow6
But I don't know the details of those, and if you want specifics you should wait around for someone who does
-
neggles
OK, no problem
-
neggles
don't want to step on anyone's toes; I have a local 3/4-ish-complete copy of what I'm talking about, it's mostly a slightly cleaner build process (same steps, same sources, similar end result) just with bullseye underneath, theoretically arm64 support, and a few more things configured through environment variables (webui port, UID/GID)
-
rewby
So the thing is: don't run custom builds of wget-at. It causes issues
-
rewby
Mostly around compression and or file integrity
-
rewby
And upgrading the base distro changes lib versions, which then causes the above
-
rewby
As for targets, we don't generally accept them from just anyone who shows up randomly. Once data is on a target it is really hard to figure out what needs to be redone if that target disappears.
-
rewby
Notably, we only accept targets in the form of bare metal or vms. We have provisioning playbooks for them
-
rewby
Also, they destroy ssds
-
rewby
And HDDs are not gonna keep up
-
rewby
Also, 100T isn't much
-
rewby
I have targets with that much sitting around as well
-
rewby
I can look into reshuffling a few things
-
rewby
Also, monoxane, do *NOT* use team names. That is forbidden. We will ban you if we discover this.
-
monoxane
oop okay
-
monoxane
will not
-
rewby
We have had many issues with this before
-
rewby
In the past, people have used team names and then one member's infra fucks up and we need them to stop. Inevitably that person is unreachable and the other members can't get to that specific bit of infra. We end up banning the whole thing because that's the most granular tool we have.
-
rewby
This has happened multiple times.
-
rewby
So we prohibit team names in general now
-
monoxane
yea that makes heaps of sense i dont know why i didnt think about it
-
rewby
Yeah, each person's infra needs a unique uploader name
-
rewby
If you wanna do TeamBlah-monoxane then by all means go for it
-
neggles
does it qualify as one person's infra if all the workers are being managed from a central point, and go idle if they can't talk to it?
-
rewby
Uh. Unsure.
-
monoxane
we'll take that as a no then
-
monoxane
dont wanna antagonise
-
rewby
Basically, each uploader name should be associated with the person who can run sudo poweroff
-
monoxane
may have come in a bit too hot with the ideas
-
monoxane
copy
-
neggles
the whole "we need any one of us to be able to hit the kill switch" thing did occur to us
-
rewby
Even if you don't have the skills to fix the issue, you can at least shut the thing down, you feel?
-
rewby
Yeah so, that "any one of us" idea has been tried before
-
rewby
It never works out in practice
-
rewby
So we want to be able to tracker-ban one control domain
-
BPCZ
rewby: would a target with 5PiB of flash and 100PiB of hdd be of much use?
-
monoxane
BPCZ isnt that just the IA :P
-
BPCZ
Though if a target going missing is an issue that might be an issue since that system is my testing ground :/
-
rewby
BPCZ: Depends on the networking, how much abuse against cpu and flash you're willing to take and how long it's available for
-
rewby
Yeah, no testing grounds
-
rewby
Targets going missing without >24h notice is a Big Problem
-
monoxane
someone buy a VAST cluster already
-
BPCZ
VAST is dog shit
-
rewby
Just give me bare metal tbh
-
rewby
That usually works best
-
BPCZ
Understandable
-
rewby
I have a whole Ansible system to provision and manage metal
-
rewby
Not OSS because, like a lot of AT code, it was all written with -2 hours of notice/planning
-
BPCZ
:( but we could clean it up
-
rewby
Specifically I just hardcoded a ton of secrets in it because I had a deadline of a few hours
-
BPCZ
lol
-
rewby
It's the secrets bit that's the issue
-
BPCZ
Wish I could contribute hardware but that’s a big nono, I can chuck ungodly amounts of compute and ephemeral storage around but most OSS projects get annoyed when you show up, do 5x the work they’ve done in 3 years, then disappear
-
rewby
Our problem is ephemeral is a big no for targets
-
rewby
Workers, sure
-
rewby
And i can scale targets up if need be.
-
rewby
I've just been sick for the last 3 weeks and haven't been able to babysit them like I usually do
-
nepeat
oooooo ansible scripts
-
nepeat
i've been trying to research the backend infra and a lot of the stuff seems stale for that
-
BPCZ
Paperclips and chewing gum
-
rewby
Targets aren't that complicated tbh, it's mostly OSS except for my provisioning code
-
rewby
Tracker...
-
rewby
Talk to Kaz. He's been on a journey to RE that thing
-
nepeat
heh
-
nepeat
is the current tracker code open sourced?
-
rewby
Only F.usl really knows how that thing works.
-
rewby
You assume all of it even has a source code repo
-
rewby
Bold
-
nepeat
HAHA OH GOD
-
nepeat
my inner sre cries a little
-
BPCZ
>ruby
-
rewby
Same
-
BPCZ
Off to a terrible start
-
nepeat
ruby is cool!
-
rewby
Oh trackerproxy isn't ruby
-
rewby
It's all redis and nix+lua
-
rewby
*nginx
-
rewby
Damn autocorrect
-
BPCZ
I wish there was better docs on the infrastructure, seems neat
-
nepeat
+1
-
rewby
Same here
-
nepeat
i'd love to make some changes that would improve my quality of life with my infra
-
monoxane
+1
-
monoxane
i’ll just make my own with blackjack and hookers and an ia s3 key /s
-
nepeat
hell yeah prom exporters and structlogs
-
monoxane
too much work
-
rewby
Using your own S3 key wouldn't work btw
-
monoxane
yea i know
-
nepeat
spicy
-
rewby
You don't have access to the magical collections where we drop things.
-
BPCZ
IA is using S3
-
monoxane
it only lets you upload via the site doesn’t it
-
BPCZ
Now?
-
BPCZ
Sadage
-
nepeat
s3 compatible, not actual s3
-
monoxane
the web ui upload from ia is an s3 thing
-
rewby
It's an S3 "compatible" endpoint
-
rewby
We call it s3
-
monoxane
and yea not s3 from amazon, just the protocol
-
BPCZ
Thank god ok
-
nepeat
everyone implements s3 compatible apis
-
neggles
S3 =/= AWS S3
-
rewby
It's cursed
-
BPCZ
I don’t even know if IA has multiple tape libraries yet
-
rewby
It's all hdds
-
rewby
Afaik
-
nepeat
i've heard they're running ceph these days?
-
monoxane
yea i think it’s hdd with a little bit of flash in front for web stuff
-
BPCZ
Probably too much effort to keep a library alive, those bastards always have issues
-
monoxane
there’s a page on the site talking about petabox
-
monoxane
somewhere else talks about s3 on top of it too
-
monoxane
which is where i got the idea to just ask for a key from :P
-
rewby
Also, re SRE cries. You really don't wanna know the tracker. Some of it is Debian wheezy
-
monoxane
they’d absolutely say no though
-
BPCZ
If it’s Ceph then S3 is just gratis
-
monoxane
tell me to piss right off and never come back
-
rewby
You can get keys piss easy
-
rewby
Make an account on the IA and go to your profile
-
monoxane
not long lasting ones though
-
rewby
It'll give them
-
monoxane
oh interesting
-
rewby
They're just account creds iirc
-
rewby
The thing is, we have collections with special flags that make the wbm index them
-
monoxane
yea and they'd probably revoke them if i uploaded at 10gbps
-
monoxane
yeap
-
rewby
Randos cant just upload warcs and have them show up in the wbm
-
nepeat
reliability and automation would be great things to look at
-
neggles
most of what struck me as I was digging through code piecing together how this stuff works was, idk, disappointment? but the existential kind
-
nepeat
not pure brute force...
-
rewby
But our collections are special
-
rewby
And have restricted uploader access
-
rewby
But all of the IA side is managed by ark.iver
-
rewby
I get a set of S3 creds and a collection to shove stuff into
-
rewby
If you see us discuss vars, that's our slang for the info I need from him to interface with IA
-
rewby
Oh trust me, I wanna replace so much of it
-
nepeat
kinda curious, has something like vault been looked at for keeping the secrets outside of env files?
-
rewby
But there's only so many hours in a day and I'm overworked as is
-
neggles
IA is important, AT is important, but it seems like there's... can't find the right way to say it but "oh come on, companies spend tens of millions on <next stupid internet fad> but *none* of them feel like giving any real resources to something that actually does some good?"
-
rewby
Looked at? Sure. But time is limited for most of us.
-
rewby
Note that we have 0 budget
-
rewby
We fund this ourselves
-
neggles
yeah, absolutely not having a go at anyone here
-
rewby
Target costs are split between me and like 4-5 other people who all pay for the hardware they donate
-
rewby
But importantly, I have names, phone numbers, addresses etc
-
rewby
We know where to send goons if someone fucks off
-
neggles
I guess i'm just kinda surprised none of the tech giants have decided to get themselves some positive press by throwing a (for them) miniscule amount of funding and resources at this
-
rewby
We don't have an org
-
neggles
surprised isn't the right word, disappointed
-
rewby
Which makes that hard
-
nepeat
some of us work for the tech giants ;)
-
BPCZ
Some of us would prefer dirty money not get involved
-
nepeat
i wouldn't say the money's dirty
-
rewby
Money would be nice to finance proper target hw.
-
rewby
Or at least pay hosting bills
-
nepeat
it's what makes it possible for people like me to spin up a lot of instances for the warrior IPs
-
neggles
all money is dirty depending on how you look at it, but that's a whole other question, and if it doesn't come with any strings attached other than "tell people we did this" that's fine
-
rewby
From archiveteam.org: Archive Team is a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage.
-
rewby
This makes money hard
-
neggles
(that sounded wrong, s/that's fine/i wouldn't have a problem with it at least/)
-
neggles
nepeat: the org whose resources we are making use of do have a /22 or so available
-
BPCZ
I’m kind of surprised IA can’t provide a reasonable set of targets
-
nepeat
BPCZ: this isn't the IA
-
rewby
We're not the IA
-
rewby
They graciously deal with the storage and retrieval parts of web archiving for us
-
neggles
(and they don't get nearly enough funding either, hence the relatively low amount of ingress they can handle)
-
rewby
Which is more than we could ask for anyways
-
neggles
yeah
-
nepeat
wondering, how can i help with some of the infra and client code?
-
nepeat
me putting out my thoughts is one thing for the overburdened team but i like to get my hands dirty and implement said thoughts
-
neggles
well, to say what we probably should've opened wit- heh nepeat that's p much what I was about to say
-
neggles
monoxane builds k8s-based application orchestration stacks for a living
-
rewby
I have a decently interesting design for new target software. But not had the time to implement it.
-
rewby
Also, F.usl has been working on a new tracker for years, might need help
-
monoxane
yea i’m kubelord, 80% of my job is building kube applications to orchestrate hundreds of gbit of traffic and the orchestration for the orchestrators to make it all manageable from a unified web interface
-
rewby
I personally don't trust kube for targets
-
rewby
This data is very persistent and not redundant
-
monoxane
replacing the warrior with a kubernetes controller that runs the direct job containers is gonna be a 3 day job at most, will look at it over christmas
-
monoxane
oh yea for targets it’s absolutely not the right tool
-
» rewby is the target person
-
nepeat
containerized targets would be very fucky, storage would have to be separated to force that to work...
-
nepeat
pretty much creating target2.0 if you are doing that
-
neggles
that's not particularly difficult if you're running on baremetal
-
neggles
but it's probably not worth the effort
-
rewby
I have plans for new target software
-
nepeat
agreed, given targets aren't disposable
-
monoxane
but for collection at scale? kube, a 100gbe host, and a /24 will give up to 4000 concurrent downloads across an entire public ip range in seconds
-
rewby
To: not destroy ssds as much and go faster
-
monoxane
ramdisk time :P
-
rewby
NO
-
rewby
Data loss
-
nepeat
:openeyescryinglaughing:
-
rewby
Again, if we lose uploaded data, it's gone
-
monoxane
yea ik
-
rewby
And we have no good way of figuring out what was lost
-
monoxane
1pb of zeusrams when
-
monoxane
actually a bluefield2 and some nvmeof would make a wonderful target
-
neggles
"if we lose data after it hits the target we can't tell what we lost" seems like a problem worth solving
-
rewby
Also, one of my servers is under 1.5 years old. Its ssds have 3.5PiB written
-
rewby
neggles: again, i have plans
-
BPCZ
Hahah I happen to know of a project trying to do multi tbps persisted storage via kube
-
rewby
I just need to write it down
-
BPCZ
It’s going poorly
-
monoxane
also that, maybe it’s worth adding another step to the tracker for “egresses to ia”
-
neggles
oh yeah no i'm not suggesting it's easy
-
neggles
cause doing what mono just suggested doubles tracker load (and it sounds like the tracker is a bit of a black box at the moment?)
-
schwarzkatz|m
are there even any good news regarding that site lately
-
schwarzkatz|m
why is it so awfully quiet here currently, where is everybody :c
-
rewby
schwarzkatz|m: It's not quiet?
-
joepie91|m
that way you optimize for scraping the high-result-count ones first
-
joepie91|m
I believe that this is part of Google's n-gram dataset somewhere
-
joepie91|m
hm, I thought there was a letter dataset also
-
joepie91|m
(which afaik is used in google's language detection thingem)
-
madpro|m
<rewby> "Also, F.usl has been working..." <- 🥲
-
BPCZ
monoxane: how does one become a kubelord
-
monoxane
a lot of "wtf how the fuck does that work" and reading golang code
-
neggles
if my own attempts are anything to go by, the first step involves creating & recreating your cluster 27 times in 3 different configurations before you find one that doesn't have a showstopping problem that rears its head after you're 3/4 done
-
neggles
(assuming you don't want to pay <cloud provider> half a kidney)
-
monoxane
lmao also that
-
rewby
That tracks with my experience
-
monoxane
it took me 8 tries to make a kube cluster, now i can do it in 10 min from bare os
-
neggles
oh the other option is to pay red hat $texas for openshift
-
nepeat
boring
-
neggles
or go dig up all the OSS components of openshift and do it yourself
-
BPCZ
Oh ok so I’m most of the way there then. I write go for work and write kube oci providers and modify core kube crap to pass in hardware that’s not supposed to be passed in just yet
-
BPCZ
Just need to get to the standing up a cluster part … most of the time I barely figure out a process once and just have an ansible playbook for next time
-
neggles
having spent the better part of this year attempting to stand up a cluster that doesn't have some incredibly stupid limitation that makes me throw my hands up in defeat and forget about it for a month
-
neggles
good luck >.>
-
nepeat
this is overcomplicating the overcomplicated setup
-
BPCZ
neggles: I mean all clusters have limitations. I work in distributed systems and clusters professionally. Kube just isn’t used heavily for the big stuff
-
monoxane
BPCZ if you really wanna get standing up clusters down, do kubernetes-the-hard-way, like 4 times over, and you will know everything about the internals and why things are like they are
-
BPCZ
Thanks!
-
neggles
the problem with k8s related stuff, from where i'm standing anyway, is it's all focused on "too big" or "too small"
-
nepeat
keep it simple. for my AT stuff, i got nomad (containers) + vault (mtls certs) + loki (logs!)
-
monoxane
(doesnt have to be gcloud, its just what they use as demo env)
-
neggles
there are a lot of ways to spin it up on single hosts that work quite well, are very straightforward, and behave
-
neggles
and a lot of ways to spin it up on <cloud provider> that work very well, are easy to manage, and cost an unpredictably-large fortune
-
monoxane
and yea, 1 node: easy, 2 to 6: incredibly painful, 6 to 1000: easy af
-
BPCZ
Did kube ever grow network topology knowledge? I recall that being a sticking point a while back
-
neggles
still a big problem.
-
BPCZ
Figures
-
neggles
there are several potential solutions, no clear winner
-
neggles
the frontrunner seems to be cilium
-
monoxane
its a problem but its got a whole lot better now, you can do l3 super easy with stuff like cilium or kube-router that dont rely on internal tunnels between nodes
-
monoxane
big thing about cilium is it does ebpf offloading so all the inter-pod stuff is done in the kernel and offloaded to the nic, instead of in userspace like the older CNIs
-
neggles
and you can handle rerouting traffic to the 'correct' node without overwriting the source address
-
monoxane
and also yknow just use bird to advertise everything between the nodes over bgp instead of cry when the vxlan is broken for no reason
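For reference, the bird-over-BGP approach described here can be as small as a direct + bgp protocol pair per peer; this is a hedged sketch in bird2 syntax, and the addresses and the private ASN are made up:

```
# bird2 sketch: advertise this node's connected routes to a peer
# over iBGP. 10.0.0.x and AS 64512 are illustrative values.
protocol direct {
  ipv4;
}
protocol bgp node2 {
  local 10.0.0.1 as 64512;
  neighbor 10.0.0.2 as 64512;
  ipv4 {
    import all;
    export all;
  };
}
```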
-
monoxane
looking at you flannel
-
nepeat
+1 to using bgp lol
-
neggles
tl;dr it's getting a lot better, rapidly, but it's still not there yet
-
neggles
bit of an xkcd competing standards problem
-
nepeat
i just have wireguard tunnels and bgp to route my throwaway networks when i spin it up
-
BPCZ
Yeah I don’t trust kube for the workloads, and neither does google. Iirc they use nomad for some stuff, but the devs I’ve talked to over there say everything falls over when you get into the high hundred thousand messages a second range with nodes
-
nepeat
been looking at netmaker and got it rolled out for this iteration of my cluster
-
monoxane
i am currently working on standing up a cluster with 3 nodes in 3 locations connected via ipsec tunnels + bgp + kube-router for shits n gigs
-
neggles
google use borg, which is not k8s, but is not not k8s
-
BPCZ
Specific workload, they don’t use borg for it
-
nepeat
kinda curious, any of you got dashboards for the AT stuff yet?
-
monoxane
the best one to look at for implementaion and scale imo is spotify
-
BPCZ
They use a few thousand node nomad cluster
-
schwarzkatz|m
rewby: what do you mean, quiet?
-
monoxane
they run 98% of workloads in 13 globally distributed clusters with the capability to hard failover any cluster's traffic to any other site in under 5 seconds, they manage it all with an internal tool they're making open source called Backstage
-
BPCZ
Sounds cool
-
monoxane
my work's clusters are a fair bit smaller and in a completely different ballpark, we just have 6 nodes running ~110 pods total but the application stack is designed to be entirely fault tolerant internally so any service or any node goes down and we're still good
-
monoxane
most of the clusters are completely offline most of their life too
-
neggles
there was a big sportsball event you might've heard about recently; i will not elaborate further
-
monoxane
yea and another, and another :P
-
nepeat
i can't say anything about what i do but it's reinforced some good ideas for my personal setups, this included
-
nepeat
oh god not the world cup
-
monoxane
kube runs the video routing for the superbowl and 80% of global live sports tv
-
BPCZ
I wish companies would actually rework applications they chuck into kube. I had to de-kube something recently because the company wrapped a stateful system into kube and washed their hands like that would be fine and kube would recover things better than other options
-
monoxane
oh yea ours is kube from the ground up, you cannot forklift existing workloads into kube and expect it to go well
-
BPCZ
nepeat: you can just say you professionally scan everyone’s butthole while they sleep it’s ok we get it. Companies just really like to know what our bowels are doing
-
monoxane
lmao
-
nepeat
lmao
-
monoxane
but which type of scan
-
monoxane
optical or something more exotic like ground penetrating radar through the roof?
-
nepeat
i just work for a place that inspires creativity and brings joy...
-
monoxane
narrator: it does not
-
BPCZ
All the scans, WiFi, roomba radar, brain wave from your sexual partners. If it could detect butthole the kube workload nepeat works on tries to collect it
-
BPCZ
Mousewitz?
-
rewby
schwarzkatz|m: In response to: 11:54 <schwarzkatz|m> why is it so awfully quiet here currently, where is everybody :c
-
BPCZ
This whole conversation reminds me I need to be planning my next job and figuring out where to live next.
-
BPCZ
SF or Seattle seem to be the two big options
-
neggles
nepeat: so bytedance :v
-
rewby
Anyone happen to know the deadline for pixiv?
-
schwarzkatz|m
What is happening with this dumb matrix thing, sorry for posting duplicate messages
-
schwarzkatz|m
Deadline was 2022-12-15, that’s when their TOS changed
-
rewby
schwarzkatz|m: Yeah your matrix stuff is bork. I tried checking my matrix alt and it's delayed like mad.
-
rewby
*Ah*
-
rewby
Right okay
-
rewby
I'm gonna move some stuff around
-
schwarzkatz|m
I think it’s only happening in the mobile app though, I have the same problem with discord sometimes
-
rewby
monoxane: You were complaining about target limits? Right?
-
nepeat
oh neat
-
neggles
rewby: wound some more capacity in?
-
neggles
lets see how it looks this side...
-
rewby
It's provisioning
-
rewby
Just hit it as hard as you can
-
rewby
I'll scale it up to meet
-
rewby
I've hit the *deploy hetzner cloud* buttons
-
neggles
"just hit it as hard as you can" <- you may live to regret that
-
rewby
Trust me, I've seen worse
-
monoxane
rewby we have 1.4tbps online right now lol
-
rewby
And I have backpressure
-
rewby
The system doesn't accept more data than it can take
-
schwarzkatz|m
Argh I hate mobile apps
-
rewby
If you hit a target too hard, it'll just shut off inbound and process what it has on disk
-
rewby
I can easily scale this into 16-20 gbps
-
neggles
more the "scale up to meet" heh
-
rewby
Yea I guess
-
rewby
I can scale as high as IA has inbound on s3
-
neggles
looks like the source is now the limiter
-
rewby
Here's your reminder: There's several projects with deadlines in the next 10 days
-
rewby
pixiv, uploadir, vlive and buzzvideo are the main ones I know of
-
rewby
So throw spare capacity at those
-
BPCZ
Can’t believe I went home for Christmas and can’t even do this during the holiday
-
neggles
sounds like we should spin up some more workers pointed at those other projects then
-
neggles
~3000 of them seems to be about all pixiv can handle
-
nepeat
oh man, i see the pixiv spike
-
rewby
I'm not done scaling everything
-
monoxane
ill have you know we're currently doing 20gbps
-
rewby
I'm well aware yes
-
neggles
pixiv has definitely run out of outbound, can't pull more than 300mbit or so from it on top of what we're hitting
-
rewby
I have metrics
-
neggles
so
-
neggles
time to point some at the others?
-
rewby
Yes
-
rewby
I have three separate ansibles going on trying to move stuff around
-
neggles
workin' on it
-
monoxane
uploadir has 0 tasks available, so id say we should focus on the others
-
rewby
211k out
-
rewby
Hm
-
rewby
Lemme flush that
-
rewby
If you refresh you'll see uploadir tasks
-
monoxane
cool got em
-
neggles
just provisioning some more VMs :)
-
rewby
Hm. I think I'm hitting an IA bottleneck
-
rewby
Lemme investigate
-
monoxane
its likely its saturated s3 ingress though
-
rewby
There's 2 lbs
-
rewby
I think 10g each?
-
monoxane
yeap
-
rewby
I need permission to override and use the other one though
-
rewby
I have asked, but can't do much until I hear back
-
monoxane
valid
-
monoxane
bit of fun eh? :P
-
rewby
Just let the targets fill up and workers back off, when I hear back I can get the throughput up
-
rewby
My inbound on targets was like 16gbps
-
rewby
Right up until disks started filling up
-
monoxane
we were doing 20.02gbps at peak
-
monoxane
from the core network
-
rewby
On the upside, pixiv and vlive are looking to be done in ~24 hours according to my numbers
-
monoxane
sweet
-
rewby
vlive in <6 hours
-
neggles
vlive seems to have the most source capacity
-
rewby
We also don't have too many items there
-
rewby
Either way, when arkiver wakes up he's gonna have a field day finding more items and things to archive
-
neggles
there are six more spare boxes - much smaller, "only" 8 core, but with 10G links - i've just been handed keys to
-
neggles
used to be minecraft servers
-
rewby
I hit a spicy 18.6gbps inbound just a few ago
-
neggles
is uploadir stalled/already down?
-
neggles
seeing basically zero action out of those
-
rewby
Shouldn't be. But IIRC there's a speed limit on that
-
neggles
ah
-
monoxane
IA just cracked 20gbps inbound
-
Doomaholic
Holy crap
-
rewby
Over half of that is us
-
rewby
tbh, we've done better
-
neggles
um, maybe silly question but where do the -grab containers store the files between pull/push?
-
neggles
internally
-
rewby
In /grab
-
neggles
not in a subdir?
-
rewby
I don't remember
-
neggles
fairo
-
neggles
i guess i opened one that hasn't started yet
-
rewby
Data storage is synchronous
-
rewby
So it always does download -> upload -> download -> upload
-
rewby
So if it's waiting for work, it won't have any data stored
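The behaviour rewby describes can be sketched as a one-item-at-a-time loop; the function names here are stand-ins, not the real seesaw pipeline API:

```python
import time

def worker_loop(request_item, download, upload, max_items=None, poll_interval=30):
    """Synchronous grab worker: data only sits on disk between
    the download and upload steps of a single item, so an idle
    worker (no item assigned) holds no data at all."""
    done = 0
    while max_items is None or done < max_items:
        item = request_item()          # ask the tracker for work
        if item is None:
            time.sleep(poll_interval)  # waiting for work: /grab stays empty
            continue
        path = download(item)          # data lands under /grab only now
        upload(path)                   # ...and leaves again once pushed out
        done += 1
```

This is why an inspected container that hasn't started an item yet shows nothing under `/grab`.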
-
neggles
ah
-
neggles
some of these video files are big enough for kube to go "hey, you didn't ask for disk" and boot the pods
-
neggles
ah all under /grab/data excellent
-
monoxane
yea im currently looking at a cluster that's half evicted pods because they used 15gb of ephemeral storage 😆
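For reference, Kubernetes evicts a pod that exceeds its ephemeral-storage limit, and under node disk pressure it evicts pods that exceed their requests first; declaring both makes the behaviour predictable and lets the scheduler reserve room for `/grab`. A minimal manifest sketch (pod and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pixiv-grab            # illustrative name
spec:
  containers:
    - name: grab
      image: atdr.meo.ws/archiveteam/pixiv-grab   # illustrative image ref
      resources:
        requests:
          ephemeral-storage: "20Gi"   # scheduler reserves this much node disk
        limits:
          ephemeral-storage: "40Gi"   # exceeding this gets the pod evicted
```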
-
rewby
Yeah, the video files are big
-
neggles
ok it's happy now
-
rewby
For people who were asking to donate target hw: This is what we do to disks: Data Units Written: 6,793,004,499 [3.47 PB]
-
rewby
That's in 1.5 years
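Assuming that counter is the NVMe SMART "Data Units Written" field (counted in units of 1000 × 512-byte sectors), the quoted figure checks out and works out to a sustained write rate of roughly 73 MB/s over those 1.5 years:

```python
# Sanity-check rewby's SMART figure. NVMe "Data Units Written" are
# counted in units of 1000 * 512 bytes = 512,000 bytes each.
data_units = 6_793_004_499
total_bytes = data_units * 512_000
petabytes = total_bytes / 1e15            # ~3.48 PB, matching the quote

seconds = 1.5 * 365 * 86_400              # 1.5 years
mb_per_sec = total_bytes / seconds / 1e6  # sustained average write rate
```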
-
madpro|m
I mean, Archive Team cannot be the only people making software for this nowadays. Can it?
-
madpro|m
There are tons of companies that do crawling for a business, surely they have open-sourced some more robust trackers by now?
-
madpro|m
Not that I know, as I have been searching for myself as well for the past 2 years or so.
-
neggles
anyone with a functional wide-scale web crawler / ripper is not going to hand that out for free
-
neggles
that's a surefire way to stop it working
-
madpro|m
I cannot say I'm nearly as skeptical, seeing other projects like Hadoop in distributed computing
-
neggles
hadoop is not the expensive/proprietary/"magic" part of a hadoop-based workflow though
-
neggles
it's worthless without the rules and flows and transforms etc
-
neggles
while the value of a crawler (on a commercial level anyway) comes from being able to skip around things trying to block crawlers
-
rewby
The thing with these kinds of trackers: they are tied closely to your workflow.
-
rewby
They are very specific
-
neggles
same goes for hadoop setup
-
rewby
If you tried to make an end-all-be-all tracker you'd end up with something as complex as kube
-
neggles
or SAP
-
rewby
With a much smaller market
-
madpro|m
Well there you go
-
rewby
So instead people make trackers that are good enough for their workflow
-
neggles
ERP systems are the perfect example; they do everything for everyone, but they do it by having 27,000 different modules that can be wired together in practically infinite ways
-
rewby
But then you end up being very very tied to your company
-
monoxane
i think pixiv might need a purge too, there's 1m out but it never goes below 99.95k and surely there's not a million jobs being processed rn lmao
-
rewby
I'll give it a look in a sec
-
madpro|m
Better close this tangent, before discussion shifts back to pixiv.
-
rewby
I'm actually disappointed in my upload rate
-
rewby
I've done 25G to them before
-
rewby
I'm not too worried about pixiv
-
rewby
Tldr: It doesn't recycle jobs from the out-list until todo is empty
-
rewby
And there's 8M in todo
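A toy model of that recycling policy (not the real tracker code, just the behaviour described: the out list is only drawn from once todo is empty):

```python
from collections import deque

def claim(todo: deque, out: list):
    """Hand out one item: todo first, and only recycle from the
    out list (oldest claim first) when todo has run dry."""
    if todo:
        item = todo.popleft()
    elif out:
        item = out.pop(0)      # recycle a stale claim only when todo is empty
    else:
        return None
    out.append(item)           # every claimed item sits on the out list
    return item
```

With millions of items in todo, the out count therefore stays high even though most of those claims are long stale.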
-
monoxane
im not either its just a bit of a high number
-
monoxane
ah okay i didnt know that
-
rewby
Also, monoxane, the bare -grab containers will do concurrency up to 20
-
monoxane
yes we know
-
rewby
kk
-
madpro|m
For now, in terms of tracker development we should look to making do with what we have. The IA wiki and GitHub have a long way to go in terms of documentation.
-
madpro|m
Exploiting our own resources and all that.
-
monoxane
every single one of the 3000 containers currently running across 20 nodes are on max concurrency
-
rewby
Ah okay
-
monoxane
we are entirely restricted by IA's ingest right now
-
neggles
"haha kubernetes go brrr"
-
rewby
I've redirected vlive to a pile of spinning rust
-
rewby
To sink your data into
-
Doomaholic
Bless
-
monoxane
excellent
-
neggles
ooooh one of these is a 5900X
-
Doomaholic
Delicious
-
monoxane
i think the next bottleneck might actually be rsync connections on the targets
-
monoxane
like 70% of my pods are sitting here idle waiting to retry dumping
-
monoxane
it's error 400 not -1 so it's not the disk-full cutoff
-
arkiver
thanks for the ping OrIdow6
-
arkiver
still reading some backlog
-
arkiver
monoxane: are there several people running under a single 'team name'?
-
arkiver
neggles: feel free to make a PR on the warrior docker image
-
arkiver
monoxane: if you have a ton of IPs available - telegram could definitely benefit from that, we got quite some backlog to work through
-
arkiver
on uploadir - roughly half of the items were 404
-
MrSolid
hi guys
-
MrSolid
can you please help me archive the website ac-web.org
-
arkiver
MrSolid: what is the reason?
-
arkiver
ac-web.org is not loading for me
-
MrSolid
site's been down for months since being sold to new owners; trying to migrate to a new community so information isn't lost
-
arkiver
well if the site is down, we can't archive it
-
MrSolid
its up for me thats odd
-
MrSolid
maybe someones crawling it right now haha
-
arkiver
in any case, sounds like a site we should archive yes
-
MrSolid
thank you arkiver
-
arkiver
loading very slowly now
-
MrSolid
i just hope the new owner doesnt shut the site down again before its archived
-
monoxane
arkiver not any more, for a little bit yesterday there were a couple people using one name but after we got a harsh no we split and each person controlling a set of nodes is using a different name
-
monoxane
will switch some of them to telegram in the morning
-
arkiver
monoxane: sounds good - separate names are definitely better for keeping track of who's doing what
-
arkiver
and yeah as rewby said, feel free to prepend something to the names to show people as being part of the same group
-
monoxane
yea will probably do that at some point, the team name thing was more because some people don’t want to be identified so those people are just using team name suffixed with country, identifiable enough for someone to tell ‘em to stop if it’s broken but not to be worked out from the leaderboard
-
arkiver
right yeah
-
arkiver
so what is this group of people?
-
monoxane
friends, some of which work at a tier 1 global isp and have some resources at their disposal
-
arkiver
pretty awesome
-
monoxane
the 20gbps we were pulling today didn’t even make a single pixel increase in their usage charts (outside of the routes going to targets and the sources)
-
arkiver
watch out that if you were to run a project like the URLs project (outlinks from various sources), it may contain any URL you can find online
-
monoxane
it was approved because it’s just a fun little load test on their links 😆
-
arkiver
though I'd say it is one of our most valuable projects
-
arkiver
telegram is likely very safe to run
-
arkiver
hah :) sounds good
-
monoxane
we will likely only run at full tilt when there’s an “oh fuck” event where we have 24 hours to pull an entire site
-
arkiver
alright
-
monoxane
and just leave a couple nodes running on a range of projects
-
arkiver
yeah we have some end of year shutdown going on at the moment
-
monoxane
>just a couple nodes
-
monoxane
i say this as if they’re not 100gbe directly attached to an isp core
-
arkiver
for the current short term projects, bandwidth is the bottleneck somewhere along the way
-
arkiver
but for the long term projects, IPs are the bottleneck
-
monoxane
yea, we have some potential solutions to the ip bottleneck
-
monoxane
one of which involves giving a single node an entire /24 😅
-
arkiver
"couple nodes" with each a different /24?
-
arkiver
that'd be pretty awesome :)
-
monoxane
the only problem with that is burning /24s is less justifiable than burning 20gbps out of a 20+tbit network
-
arkiver
yeah, which is likely also why our long term projects have IPs as the bottleneck rather than bandwidth
-
rewby
I think I'm currently still burning two /24s on telegram
-
rewby
Or rather, I'm burning someone else's /24s
-
arkiver
rewby: and it is really making a difference!
-
arkiver
we're slowly working through the huge telegram backlog
-
arkiver
note though that we currently cannot keep up with newly discovered group posts (we can only keep up with newly discovered channel posts)
-
arkiver
i'm stashing the group posts at another project at the moment,
tracker.archiveteam.org/telegram-groups-temp , which now has 4 billion items
-
arkiver
so we'll just feed that in slowly whenever there is room
-
arkiver
it's already very good we can keep up with channel posts however, we're discovering and archiving many of them
-
Jake
(I missed quite the night here!)
-
mgrandi
@arkiver: how are you guys doing telegram? The web view of groups ?
-
mgrandi
Also, update on the FA forums, I'm pretty sure that's not what the GDPR means , and also lol, like that's going to stop anyone
forums.furaffinity.net/threads/foru…rd-coming-soon.1682702/post-7381985
-
arkiver
mgrandi: yes
-
arkiver
on telegram
-
ivan
"I dont know how that works or if it can take as many messages or forum pages this site has." haha
-
mgrandi
@arkiver: that is the easiest way yeah, I have a lot of experience with tdlib but it's daunting how many things to support so the web view probably is the easiest way for now!
-
schwarzkatz|m
what is their concern with GDPR on an archived website
-
schwarzkatz|m
I don't really get it
-
arkiver
mgrandi: yeah, and the web view can go into the Wayback Machine
-
schwarzkatz|m
according to deathwatch,
zhihu.com/club/explore will stop working on 12-26. it's probably a good idea to archive/grab all links from these pages beforehand
-
arkiver
schwarzkatz|m: as in,
zhihu.com is shutting down?
-
arkiver
hmm
-
arkiver
i missed that on deathwatch
-
schwarzkatz|m
it says only /explore
-
arkiver
hmm yeah but I see the entire thing (zhihu.com) is shutting down next year?
-
schwarzkatz|m
looks like it
-
arkiver
fun project
-
arkiver
any idea where 'Circles' comes from in Zhihu Circles?
-
arkiver
JAA: do you know if anything was done for the furaffinity forums?
-
schwarzkatz|m
looks like a bunch of api calls to get, I'll try to grab them from /explore
-
arkiver
the deathwatch page says "Zhihu Circles" is removing that public access, is Zhihu Circles all of zhihu.com ?
-
schwarzkatz|m
a circle seems to be a /club/[0-9]+
-
schwarzkatz|m
so all items on /explore are circles
-
arkiver
i see. thank you
-
mgrandi
schwarzkatz|m: the guy probably doesn't know what he is talking about or is thinking that we would be taking the legit forum database
-
rewby
Well then, today was an *experience*
-
rewby
We're aware pixiv, vlive, etc are having target issues
-
rewby
It's actually intentional
-
rewby
HCross and I have paused high activity projects at the moment. We have too much backlogged data to process and we need to be careful with the IA.
-
rewby
Announcements in project specific channels in a few.
-
neggles
rewby: ah, sounds like we might've sent it a bit too hard
-
JAA
arkiver: I thought Fur Affinity was thrown into AB, but apparently not. I'll take a look later. Might also qwarc it.
-
h2ibot
Arcorann edited Deathwatch (+292, /* 2023 */):
wiki.archiveteam.org/?diff=49262&oldid=49255
-
schwarzkatz|m
JAA: would collecting all thread & subforum urls be helpful?
-
datechnoman
Worst case throw Fur Affinity in #Y project to be trawled through
-
JAA
schwarzkatz|m: Not needed, with qwarc, I'd probably just bruteforce thread IDs anyway.
-
schwarzkatz|m
with pagination? :O
-
JAA
Of course.
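The bruteforce-with-pagination approach can be sketched like this. The domain and URL pattern are illustrative: real XenForo thread URLs carry a slug (`threads/some-title.123/`), though bare-ID URLs typically redirect to the canonical form, and a real crawl would stop paginating at the first 404 rather than enumerate a fixed count:

```python
def thread_urls(base, max_thread_id, pages_per_thread):
    """Yield candidate XenForo thread/page URLs for sequential thread IDs.

    base: site root, e.g. "https://example.com" (illustrative).
    Every thread gets its first page plus page-2..page-N candidates.
    """
    for tid in range(1, max_thread_id + 1):
        yield f"{base}/threads/{tid}/"
        # A real qwarc run would stop on the first missing/empty page;
        # here we simply enumerate a fixed number of candidates.
        for page in range(2, pages_per_thread + 1):
            yield f"{base}/threads/{tid}/page-{page}"
```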
-
JAA
I've archived XenForo forums before with qwarc, so just need to adjust domains and am probably good to go.
-
schwarzkatz|m
okay then, let me know if I could help otherwise :)
-
JAA
If they can take the load, I could grab it all in hours. Chances are they can't though. :-)
-
datechnoman
Fur Affinity appears to be Cloudflare-backed, including their images, so they'll be able to process high throughput i'd say
-
arkiver
JAA: alright, sounds good
-
arkiver
and with bruteforcing thread IDs, you could still somehow get the pages?
-
arkiver
outlink can of course go into #// :)
-
JAA
datechnoman: That doesn't really mean much as it just depends on what the backend server is. Could be a RPi in someone's closet for all we know.
-
JAA
arkiver: Yes, I will get thread pagination. No images etc., but those can be extracted later and fed to #// along with the outlinks, yeah.
-
datechnoman
Fair call JAA. I guess the CDN helps with throughput and load but the backend processing of the requests is a different story. You can tell my main focus is the #// which is everything all over the place
-
JAA
Yeah, it certainly helps with things cached on the CDN. When bruteforcing threads, most won't be in the cache.
-
arkiver
sounds good
-
JAA
schwarzkatz|m: So forum.lacartoonerie.com is NXDOMAIN now. It had been down since the end of November anyway, but I guess that means it definitely won't be coming back.
-
schwarzkatz|m
good that we got it then :)
-
JAA
Your grab is on IA?
-
JAA
The ArchiveBot job didn't get far.
-
schwarzkatz|m
I thought you got it all :/
-
JAA
No, it got errors pretty soon after I started it. That's why I asked about whether you had also seen timeouts in your crawl.
-
JAA
I don't think it managed to retrieve much more after that.
-
JAA
-
JAA
Missing the -meta.warc.gz though, do you still have that?
-
schwarzkatz|m
that's unfortunate then
-
schwarzkatz|m
my grab is partially in WBM since I at first used SPN exclusively
-
JAA
Looks like someone else also did something in September, but it's in WARCZone:
archive.org/details/warc_forum_lacartoonerie_com_20220927
-
schwarzkatz|m
I have deleted all files after I uploaded that, looks like I didn't see that one
-
JAA
Oof
-
schwarzkatz|m
what's in there?
-
JAA
Log
-
JAA
Less important than the data I suppose, but yeah, please upload it on future grabs.
-
schwarzkatz|m
will do
-
JAA
Do we know of any list of projects on SourceHut that will be removed? If not, can someone try to compile one?
sourcehut.org/blog/2022-10-31-tos-update-cryptocurrency
-
schwarzkatz|m
searching for related words turns up maybe less than 20 public repos in total. maybe it's a good idea to get these and then archive all 1058 repos?
-
JAA
Sounds reasonable.
-
JAA
Not sure about archiving all repos actually, but sounds like it shouldn't be too big. Unless there are a dozen copies of Linux and Chromium on it. :-|
-
arkiver
"how about we just get everything?" "sounds reasonable" :P
-
arkiver
hahaha yeah!
-
JAA
I will grab all of sr.ht eventually anyway (when that bot is ready), I'm just not entirely certain it's worth doing that now.
-
JAA
Yeah, as expected, there are at least a couple copies of the Linux repo. Those would be duplicated.
-
schwarzkatz|m
contains also non cryptocurrency stuff, didn't sort that out
-
JAA
Is it 1058 repos or 1058 projects? Projects can have multiple repos, I think.
-
schwarzkatz|m
projects then :D
-
JAA
Thanks for the list, will do the magic later.
-
schwarzkatz|m
great
-
JAA
And I might just throw
sr.ht into AB and add aggressive ignores to get a general record of what's on there.
-
JAA
The project pages should have some records of the (short) commit IDs, too, which could be used to verify mirrors, for example.
-
JAA
arkiver: Heard anything from GeoLog?
-
JAA
Ah, the repos are on a separate domain anyway, right. So it'd grab those and not recurse further, which is even better.
-
JAA
SourceHut does also support unlisted repos, which would be tricky to find.
-
arkiver
JAA: no, nothing
-
arkiver
ACTUALLY
-
arkiver
got a reply literally few hours ago
-
arkiver
:)
-
JAA
:-)
-
Ryz
Ooo, reply? O:
-
Ryz
arkiver?
-
pabs
JAA: #swh folks pointed me at this rejection of an API to list all SourceHut repos:
lists.sr.ht/~sircmpwn/sr.ht-dev/patches/4859
-
pabs
JAA: btw, could you pastebin a link of the sr.ht repos you archive into #swh (libera) so they can grab them too?