-
Ryz
Not sure if this is the right place, but what other internet search engines do you folks recommend for normal and specialized purposes in relation to internet archiving? I normally use DuckDuckGo, then Bing, and Google as a last resort; then again, search results overall have been deteriorating more and more... :C
-
Ryz
I also try using Yandex, I believe the Russian version, which gives different results than the English version oO;
-
pokechu22
IIRC DuckDuckGo gets its results from bing - I generally only use it and bing
-
pokechu22
Yandex is probably good for Russian stuff, and search.yahoo.co.jp is good for Japanese stuff, but I don't use them as general things
-
pokechu22
That said, google (and ddg) treat site:sinapism.net differently from site:www.sinapism.net, and you'll get fewer results. ddg/bing is also more willing to treat binary files as if they were text.
-
pokechu22
Google also supports site:*.sinapism.net -site:www.sinapism.net which is useful for subdomains (though not useful here). ddg does not support the * one (it does still support -)
-
Harzilein
hi
-
Harzilein
i naively thought wget-lua --save-headers --max-redirect=0 --content-on-error tiny.cc/PirateChangelog would save the redirect headers, but alas it doesn't...
-
Harzilein
i understand you don't want to change the command line interface, but maybe some kind of pre-flight lua script could adjust things when someone extended the logic to support this case?
-
OrIdow6
Harzilein: We use wget-lua with WARC output rather than writing the body to a file
-
OrIdow6
(Which is kind of a standardized version of --save-headers)
-
Harzilein
hmm... yeah, that makes sense
-
Harzilein
wonder if there is an importer for warc-formatted responses into polipo format...
-
OrIdow6
Looks like vanilla wget doesn't save anything either
-
Harzilein
yeah, that's why i thought adding additional command line options over the lua ones would probably not be accepted into the wget-lua code
-
Harzilein
but a buffed out lua api for such cases might
-
Harzilein
but maybe the warc workaround isn't too bad
-
OrIdow6
What changes are you thinking of?
-
Harzilein
OrIdow6: modifying the user agent object (if such a thing exists) to also save headers when there is _no_ content
-
Harzilein
(i.e. a dictionary entry)
-
OrIdow6
Ah
-
Harzilein
s/to also/to allow to also/
-
Harzilein
i.e. with no lua loaded to explicitly enable it, it'd still behave like wget
-
OrIdow6
I don't think adding that as a command-line option would cause us much trouble? It'd make the --help a line longer, but aside from that sounds like it'd only be a few little differences
-
OrIdow6
A change like that might benefit more people if it were added to vanilla wget (which has not taken stuff from wget-lua in, I think, close to a decade now)
-
Harzilein
aren't the wget2 people at the help at wget now?
-
Harzilein
helm*
-
Harzilein
i.e. they'd consider it in maintenance mode
-
OrIdow6
Oh, yeah
-
OrIdow6
Haven't really been keeping track of that
-
OrIdow6
Well, if you want to PR it, feel free; I'm not in control of the repo so it's not my decision but I don't think the people who do would mind
-
Harzilein
:)
-
h2ibot
Bzc6p edited Indafotó (+76, /* Archiving */ Some details): wiki.archiveteam.org/?diff=50468&oldid=49975
-
arkiver
looks like we have backfeed issues
-
kaz
nobody tell immibis
-
nicolas17
what's going on with the tracker?
-
nicolas17
there's almost no completed items
-
nicolas17
I'm getting "Failed to submit discovered URLs.wantreadnil" on imgur
-
nicolas17
but todo keeps going down, so workers are getting items, and apparently failing them? I guess claims is going up fast
-
nicolas17
it may make sense to pause the projects so we stop giving out more items?
-
nicolas17
arkiver: ^
-
kaz
Being worked on
-
kaz
issue has been located
-
Exorcism
nicolas17: arkiver replied to you in #imgone
-
arkiver
no worries about this, the issues will be resolved very soon
-
nicolas17
dw take your time to resolve it, I'm just concerned about the item-trashing *until* it's resolved
-
flashfire42
Issue is still ongoing by the look of it; I will pause the warrior for the time being
-
nicolas17
yeah, should pause all projects on the tracker side imo :/
-
flashfire42
cc arkiver
-
rewby
I think there's some redis related fuckery going on
-
rewby
It's being worked on as far as I know
-
arkiver
pausing projects
-
arkiver
should soon be back up
-
nicolas17
imgur todo is still going down 🤔
-
arkiver
uh
-
arkiver
there we go
-
nicolas17
skyblog also dequeueing 2000/min?
-
nicolas17
ah no old numbers, sorry
-
nicolas17
it's fine