-
pabs
this person regularly deletes older blog posts, would it be appropriate to use this project to save their new posts?
blog.sesse.net
-
pabs
currently I'm just doing AB !ao, but it would be nice to automate it
-
AK
arkiver will be able to correct me, but I guess a monthly queue of the homepage with a depth of 1 should mean we grab each new article
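
(A rough sketch of what "homepage with a depth of 1" means, not the project's actual code: the homepage itself plus every page it links to. The helper name `depth_one_urls` and the sample post path are made up for illustration, and whether the real queue also follows off-site links is an assumption here.)

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute link targets from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the homepage URL.
                self.links.append(urljoin(self.base_url, value))


def depth_one_urls(homepage_url, homepage_html):
    """Return the homepage (depth 0) plus each distinct page it links to (depth 1)."""
    collector = LinkCollector(homepage_url)
    collector.feed(homepage_html)
    urls = [homepage_url]
    for link in collector.links:
        if link not in urls:
            urls.append(link)
    return urls


# Hypothetical homepage snippet; the post path is invented for the example.
sample_html = '<a href="/2023/11/new-post.html">New post</a>'
print(depth_one_urls("https://blog.sesse.net/", sample_html))
```

(Run monthly, a new article only gets grabbed if it is still linked from the homepage at that point.)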
-
DLoader
Got abuse from theanarchistlibrary.org
-
DLoader
We suspect this may be from a botnet, as we have been hit by a number of requests from the IP above as well as several others. It is a resource exhaustion attack against the webapp running at theanarchistlibrary.org resolvable to 216.252.162.140.
-
AK
Seeing quite a few status code 0 from a couple of domains:
share.aktheknight.co.uk/riJe0/NApuloJa91.png/raw
-
DLoader
itunese... doesn't exist anymore, but it's still seen quite a lot. digital-forum.it seems to have banned Hetzner, esug.org is just some 302 redirect always
-
DLoader
*itunesu
-
DLoader
arkiver maybe something to exclude ^
-
nstrom|m
digital-forum.it seems fine on all my non hetz workers
-
nstrom|m
I feel like we're looping on silky-europe.com due to session ids in url query strings but can't find any actual evidence of that. just seeing a lot of it in my logs
-
nstrom|m
-
nstrom|m
-
arkiver
yeah
-
arkiver
getting my brush
-
arkiver
putting the trash in our little /dev/null corner
-
arkiver
pabs: AK is correct. you can make a PR on urls-sources
-
arkiver
or else i'll add it later
-
arkiver
so this seems to be a bunch of spam sites coming from China aimed at Thailand
-
arkiver
we have indeed been getting quite a bit of anarchistlibrary
-
arkiver
partially filtering it
-
arkiver
nstrom|m: indeed SID was not one we checked for, only sid
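
(To illustrate the sid/SID point: a minimal sketch of dropping session-ID query parameters with a case-insensitive key match. This is not the project's filter; the key list `SESSION_KEYS`, the function name and the sample URL are all assumptions for the example.)

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of session-ID parameter names, stored in lowercase.
SESSION_KEYS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url):
    """Remove session-ID query parameters, matching keys case-insensitively."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_KEYS  # lower() makes SID and sid equivalent
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Invented example URL: the uppercase SID is dropped, id=7 is kept.
print(strip_session_params("https://silky-europe.com/page?SID=abc123&id=7"))
```

(Comparing on the lowercased key is what catches SID as well as sid; a check against the literal string "sid" would let the uppercase variant through and the same page would keep getting requeued under fresh session IDs.)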
-
arkiver
fixing the forum URLs too
-
arkiver
cleaning done
-
arkiver
forcing new version
-
arkiver
new minimum version is set
-
arkiver
moving todo:backfeed to todo:secondary
-
nstrom|m
danke
-
AK
Thanks
-
imer
ooh, we're actually making progress now. many thanks :)
-
arkiver
still seem to be having a problem
-
arkiver
different one though
-
arkiver
but where do those bad URLs come from...
-
imer
arkiver: which ones specifically? can dig through my logs if you want
-
arkiver
these odd .de sites (without ending /)
-
arkiver
trailing /, i should say
-
nstrom|m
one of them actually loaded once for me but otherwise nearly all seem to be broken
-
arkiver
yeah
-
arkiver
the problem is where they come from
-
arkiver
i don't see them being queued
-
arkiver
if you see them being queued somewhere, please give me a bit of log (including that URL, and the parent URL mentioned a little earlier)
-
nstrom|m
roger that
-
nstrom|m
2023-11-08T20:35:12.825451117Z Queuing for parent URL
bpgqt.petrography.de/app-ads.txt.
-
nstrom|m
2023-11-08T20:35:12.825602179Z Queuing URL
6pi8.stoutly.de.
-
arkiver
nice nice
-
arkiver
very nice thanks nstrom|m
-
nstrom|m
2023-11-08T20:37:24.035748601Z Queuing for parent URL
cbxrt.sociably.de/app-ads.txt.
-
nstrom|m
2023-11-08T20:37:24.036022264Z Queuing URL
78k7y.selflimited.de.
-
nstrom|m
2023-11-08T20:37:24.036036848Z Queuing for parent URL
68q3z.theorize.de/ads.txt.
-
nstrom|m
2023-11-08T20:37:24.036042207Z Queuing URL
9udw2.peripatetic.de.
-
nstrom|m
hope that helps
-
arkiver
greatly!
-
arkiver
i see the problem now
-
arkiver
project paused a bit
-
arkiver
interesting way of doing spam
-
arkiver
i'll disable this queuing for now
-
arkiver
they could do something similar with trust.txt but leaving logic for that in for now
-
arkiver
an update is in
-
arkiver
restarted
-
arkiver
looking pretty good
-
arkiver
thanks nstrom|m :)
-
arkiver
it's not blowing up :)
-
arkiver
or maybe it is
-
DLoader
-
DLoader
saw these
-
DLoader
maybe it's all or most of the special urls
-
arkiver
yeah but not sure if it's coming through those
-
JAA
-
JAA
Queuing URL custom:random=202311&url=https%3a%2f%2f4i792%2emeagre%2ede%2f%2ewell%2dknown%2fsecurity%2etxt.
-
JAA
And then quite a list of them.
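
(For readers squinting at the percent-encoding: a small sketch that decodes a `custom:` item like the one quoted above using only the standard library. The function name `decode_custom_item` is made up, and treating the part after `custom:` as an ordinary query string is an assumption based on how the logged line looks. The trailing "." in the log line is the end of the log message, not part of the item.)

```python
from urllib.parse import parse_qs


def decode_custom_item(item):
    """Split a 'custom:key=value&...' item into a dict of decoded fields."""
    prefix = "custom:"
    if not item.startswith(prefix):
        raise ValueError("not a custom: item")
    # parse_qs percent-decodes values and returns a list per key.
    fields = parse_qs(item[len(prefix):])
    return {key: values[0] for key, values in fields.items()}


item = (
    "custom:random=202311&url=https%3a%2f%2f4i792%2emeagre%2ede"
    "%2f%2ewell%2dknown%2fsecurity%2etxt"
)
print(decode_custom_item(item)["url"])
# prints https://4i792.meagre.de/.well-known/security.txt
```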
-
arkiver
blegh :/
-
JAA
AK's logs are helpful again. :-)
-
arkiver
actually those are 301s
-
arkiver
right
-
arkiver
well that's an annoying loop
-
DLoader
# curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0"
5otqf.reprieve.de
-
DLoader
FUCK YOU!
-
DLoader
:D
-
arkiver
updates again
-
arkiver
DLoader: that real?
-
» arkiver check
-
arkiver
no :P
-
DLoader
I still get that, but not on my Hetzner servers
-
DLoader
so dunno what they doing
-
DLoader
-
DLoader
Queuing URL custom:random=202311&url=https%3a%2f%2f1mgu%2epainstaking%2ede%2fsitemap%2exml.
-
DLoader
still seeing this arkiver
-
arkiver
hmm
-
arkiver
DLoader: are you sure that second line belonged with that first line?
-
JAA
arkiver: Still seeing similar things in AK's logs.
-
JAA
And I think those are set to 1 concurrency to avoid log mixing etc.
-
arkiver
yeah i see them too now
-
arkiver
(couldn't check earlier)
-
DLoader
-
arkiver
i see the problem
-
arkiver
another fix pushed
-
arkiver
this should really fix it now
-
arkiver
will leave the queue as is, should go down fast
-
arkiver
alright not looking bad
-
arkiver
will move :backfeed to :secondary
-
arkiver
looks good!
-
pabs
-
pabs
arkiver: and I have one other PR open:
ArchiveTeam/urls-sources #22