-
arkiver
i've seen this 888 spam before
-
arkiver
thanks for the ping datechnoman, and for pausing, JAA. that *.top spam stuff needs some special filtering
-
datechnoman
All good mate, anytime :)
-
datechnoman
Will need some special regex to filter that rubbish spam out :o
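
(A minimal sketch of the kind of regex filter being discussed here. The actual spam pattern is not shown in the log, so the rule below is an assumption made for illustration: drop any URL whose hostname is under the .top TLD and has "888" in its last label before the TLD. `SPAM_HOST`, `is_spam` and `filter_urls` are hypothetical names, not taken from urls-grab.)

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: a hostname ending in ".top" whose final label before
# the TLD contains "888". The real filter may look quite different.
SPAM_HOST = re.compile(r"(?:^|\.)[^.]*888[^.]*\.top$", re.IGNORECASE)


def is_spam(url):
    """Return True if the URL's hostname matches the assumed spam pattern."""
    host = urlsplit(url).hostname or ""
    return bool(SPAM_HOST.search(host))


def filter_urls(urls):
    """Keep only the URLs that do not match the assumed spam pattern."""
    return [u for u in urls if not is_spam(u)]
```

(Matching on the parsed hostname rather than the whole URL avoids false positives on paths that merely contain "888.top".)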
-
thuban
arkiver: i mean where in the repo
-
arkiver
huh
-
arkiver
did i forget to `git add` the file? hmm
-
arkiver
thuban: it's included now
-
arkiver
900_bbc_mediaguide.txt
-
thuban
arkiver: cool, ty!
-
thuban
are all of those really duplicates? some of them look spurious. also, do you want the script i wrote to generate the list? (i see there's some similar stuff in /other)
-
arkiver
thuban: yeah, feel free to put the script in /other !
-
arkiver
thuban: yeah, those should be duplicates. they're deduplicated from other lists, using the deduplicate_lists.py script
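
(A rough sketch of what deduplicating a new list against existing lists could look like. This is not the actual deduplicate_lists.py, whose contents are not shown in the log; `find_duplicates` and the dict-of-lists input are assumptions made for illustration.)

```python
def find_duplicates(new_lines, existing_lists):
    """Split new_lines into unique entries and entries already listed elsewhere.

    existing_lists maps a list name (e.g. a filename) to its lines.
    Returns (unique, duplicates), where duplicates maps each repeated
    entry to the name of the first list it was found in.
    """
    seen = {}
    for name, lines in existing_lists.items():
        for line in lines:
            key = line.strip()
            if key:
                # Remember only the first list that contains this entry.
                seen.setdefault(key, name)

    unique = []
    duplicates = {}
    for line in new_lines:
        key = line.strip()
        if not key:
            continue  # skip blank lines
        if key in seen:
            duplicates[key] = seen[key]
        else:
            unique.append(key)
    return unique, duplicates
```

(Reporting which list each duplicate came from makes it easier to check by hand whether a removal is a real duplicate or a spurious one.)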
-
thuban
ah, so they are. sorry, i checked with github's code search but github's code search is trash
-
arkiver
no need for sorry :)
-
arkiver
but yeah
-
that_lurker
datechnoman++
-
eggdrop
[karma] 'datechnoman' now has 8 karma!
-
that_lurker
JAA++
-
eggdrop
[karma] 'JAA' now has 33 karma!
-
michaelblob
is there negative karma
-
that_lurker
discord--
-
eggdrop
[karma] 'discord' now has -8 karma!
-
that_lurker
yes :P
-
arkiver
arkiver--
-
eggdrop
[karma] self karma is a selfish pursuit.
-
michaelblob
LOL
-
arkiver
so selfish of me!
-
that_lurker
arkiver++
-
eggdrop
[karma] 'arkiver' now has 21 karma!
-
michaelblob
arkiver++
-
eggdrop
[karma] 'arkiver' now has 22 karma!
-
thuban
arkiver: the list generation is a bit awkward because afaict the bbc doesn't have an index of these anywhere--i manually assembled a list of pages from search results (that's actually the bbc_mediaguides.tsv i accidentally uploaded earlier) and fed them to the scraper
-
thuban
can clean up for format etc later
-
datechnoman
Just about to get some rest, arkiver. Assuming we will be on hold for a while longer? No biggie if so. Just checking if I should wait around for a moment or not
-
nyany
datechnoman: probably
-
nyany
can ping you when we resume if you'd like
-
datechnoman
Ack cheers mate. I shall head off then. Night!
-
nyany
cheers mate
-
arkiver
the user agents list is updated
-
arkiver
the spam loop should be fixed
-
arkiver
resuming the project
-
arkiver
the fake nasa one is now simply filtered out
-
imer
nice
-
imer
whats being moved to todo?
-
arkiver
the nasa stuff
-
arkiver
so it is filtered out fast
-
arkiver
because if the filtering is done while the filtered out items are in :todo:backfeed, items will also be taken out of :todo:secondary, which will cause more items to be queued back to :todo:backfeed
-
imer
makes sense
-
arkiver
and then :todo:backfeed goes down slower, which decreases our ability to spot loops or reasons why it may not go down
-
arkiver
however i'll now also move :todo:backfeed to :todo:secondary to start with a fresh empty :todo:backfeed
-
arkiver
that is happening now
-
arkiver
20240417.03 is now the minimum version
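
(A small sketch of how a minimum-version check on strings like 20240417.03 might work, assuming the format is a date followed by a two-digit revision. How the tracker really compares versions is not shown in the log; `parse_version` and `meets_minimum` are hypothetical helpers.)

```python
def parse_version(version):
    """Parse an assumed 'YYYYMMDD.NN' version string into (date, revision)."""
    date_part, _, rev_part = version.partition(".")
    return int(date_part), int(rev_part or 0)


def meets_minimum(client_version, minimum="20240417.03"):
    """Return True if client_version is at least the minimum version."""
    return parse_version(client_version) >= parse_version(minimum)
```

(Comparing parsed integer tuples instead of raw strings keeps the check correct even if a revision number is ever written without zero padding.)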
-
arkiver
so the spam loop was a new 'version' of an old one
ArchiveTeam/urls-grab e926de3
-
arkiver
paused for a bit as items are being moved around
-
nyany
datechnoman: fyi for when you spin back up ^ (project is still paused)
-
nstrom
looks unpaused now, we're back in business
-
nyany
datechnoman: ^
-
datechnoman
Thanks for the ping. Will get my fleet up and running shortly
-
fireonlive
brrrrrrrrrrr
-
datechnoman
Slap the tracker and target around again xD
-
fireonlive
:D
-
» nyany slaps the tracker around a bit with a large elver