-
jojomcbean
!list
-
OrIdow6^2
Who has the ability to compile and run wget-at? Not in a docker container or anything fancy, just on your computer
-
OrIdow6^2
If they just ran text files and coordinated on the wiki back in the old days we can still do that for MBNet
-
h2ibot
Ryz edited URLTeam (+385, /* Alive */ Added wu.to):
wiki.archiveteam.org/?diff=49076&oldid=49075
-
rewby
OrIdow6^2: Anyone can just pull the wget-at binary out of the docker image and run it. No need to compile.
-
h2ibot
Ryz edited URLTeam (+429, /* Alive */ Added wn.nr):
wiki.archiveteam.org/?diff=49077&oldid=49076
-
h2ibot
Ryz edited URLTeam (+274, /* Alive */ Added okt.to):
wiki.archiveteam.org/?diff=49078&oldid=49077
-
Maakuth|m
I can do it
-
h2ibot
Ryz edited URLTeam/Dead (+47, /* Dead or Broken */ Added hoot.to):
wiki.archiveteam.org/?diff=49079&oldid=48563
-
h2ibot
Ryz edited URLTeam (+200, /* Alive */ Added ptix.at):
wiki.archiveteam.org/?diff=49080&oldid=49078
-
Maakuth|m
I suppose it's no problem running it in docker, seems to be the easiest way to have it built
-
Jake
I've had wget-at compiled for a while. Happy to help with anything.
-
h2ibot
Ryz edited URLTeam (+413, /* Alive */ Added dibb.me):
wiki.archiveteam.org/?diff=49081&oldid=49080
-
OrIdow6^2
Great Jake and Maakuth|m, basically my idea (not actually my idea, this is how AT did things before the automated tracker) as of present is that there will be an Etherpad with "items" that represent sets of subdomains; a person "claims" an item by editing the pad to say so, puts it into a bash script that calls wget, uploads it, then puts the link to the upload into the pad
-
OrIdow6^2
Will try to get something set up w/i the next hour
-
OrIdow6^2
And I'll warn you there's a ~60% chance this will in the end be unnecessary because #Y will become operational
-
OrIdow6^2
I guess this should move to #webroasting
-
Maakuth|m
ok, I'll join there
-
Jake
Just let me know. I'm going to bed somewhat soon, but just ping me and I'm happy to help.
-
OrIdow6^2
Ok
-
Maakuth|m
any ideas about 502 errors on upload on transfer.archivete.am?
paste.debian.net/1256536
-
Maakuth|m
what is BunnyCDN anyway
-
Maakuth|m
too big item perhaps?
-
JAA
Maakuth|m: Yeah, BunnyCDN errors out if you try to upload a file that's too large.
-
HCross
JAA: if the offload to S3 takes too long, that may cause the issues
-
TheTechRobo
I made an IRC bot that archives the metadata and chat of Twitch vods. It actually kinda works. Channel name suggestions?
-
JAA
HCross: Well yeah, if anything from the user to the transfer origin takes too long, I think. Depending on how the upload is done, the file gets buffered on the server before offloading to S3, by the way.
-
spirit
twirch?
-
JAA
burnthetwitch (which mgrandi is sitting on since two months ago, lol)
-
JAA
Oh right, that was actually the channel name on EFnet.
-
TheTechRobo
I'm up for whatever. :-)
-
Jake
Can't remember who asked, but forum.tek.com is _very_ hard to crawl. Got a few additional things to try.
-
Bluebanana
Hello! I wonder what the difference is between "--level=" and "--page-requisites-level=" when using grab-site? Thanks in advance!
-
TheTechRobo
Bluebanana: --level is for links, --page-requisites-level is for page requisites (like stylesheets, images, iframes, etc) iirc
-
Bluebanana
TheTechRobo Thanks for the help!
-
TheTechRobo
Bluebanana: Any time!
-
TheTechRobo
Remember that offsite links ignore --level.
-
h2ibot
LegitSi edited Alive... OR ARE THEY (-464, i updated the identify your breyer entry based…):
wiki.archiveteam.org/?diff=49084&oldid=49062
-
Bluebanana
Is it normal to get this error message in the beginning of the crawl?
-
Bluebanana
/home/username/gs-venv/lib/python3.8/site-packages/wpull/protocol/http/client.py:185: UserWarning: HTTP session did not complete. warnings.warn(_('HTTP session did not complete.'))
-
Bluebanana
TheTechRobo
-
TheTechRobo
Bluebanana: Yes
-
TheTechRobo
I get that a lot
-
Bluebanana
TheTechRobo OK, thanks!
-
TheTechRobo
Don't worry, if anything fails it'll get retried
-
TheTechRobo
i think the default is 4 tries
-
Bluebanana
The weird thing is that it always in the start of the crawl, but not later.
-
Bluebanana
*it's always
-
Bluebanana
No matter the domain. But I guess I can ignore it then :)
-
h2ibot
TheTechRobo edited Elwha River Restoration Project (+30):
wiki.archiveteam.org/?diff=49086&oldid=48906
-
TheTechRobo
I guess I'll do #burnthetwitch since a few people are there already