-
<fireonlive> lol
-
<h2ibot> Queuing bot shutting down.
-
<h2ibot> Queuing bot started.
-
<h2ibot> datechnoman: Restarting unfinished job isAJoDlg for '!a transfer.archivete.am/gsq2b/unique_pdfs_output.txt'.
-
<h2ibot> datechnoman: Restarting unfinished job cmKPUONH for '!a transfer.archivete.am/HTbgw/filtered_.pdf_output.txt'.
-
<h2ibot> datechnoman: Restarting unfinished job cYpn43AW for '!a transfer.archivete.am/nIY9O/filtered_pdf_files_unique.txt'.
-
<h2ibot> datechnoman: Restarting unfinished job YExqnYZo for '!a transfer.archivete.am/y01za/filtered_pdf_files_unique.txt'.
-
<h2ibot> datechnoman: Skipped 50 invalid URLs: transfer.archivete.am/JAMvl/filtere…d_pdf_files_unique.txt.bad-urls.txt (YExqnYZo)
-
<h2ibot> datechnoman: Deduplicating and queuing 494563 items. (YExqnYZo)
-
<h2ibot> datechnoman: Skipped 199 invalid URLs: transfer.archivete.am/nXRHj/filtere…d_pdf_files_unique.txt.bad-urls.txt (cYpn43AW)
-
<h2ibot> datechnoman: Skipped 32 very long URLs: transfer.archivete.am/i2SGt/filtered_pdf_files_unique.txt.skipped.txt (cYpn43AW)
-
<h2ibot> datechnoman: Deduplicating and queuing 5516002 items. (cYpn43AW)
-
<h2ibot> datechnoman: Skipped 4203 invalid URLs: transfer.archivete.am/RfldX/filtered_.pdf_output.txt.bad-urls.txt (cmKPUONH)
-
<h2ibot> datechnoman: Fixed 1 unprintable URLs: transfer.archivete.am/ENqon/filtere…d_.pdf_output.txt.not-printable.txt (cmKPUONH)
-
<h2ibot> datechnoman: Skipped 1 very long URLs: transfer.archivete.am/V2og1/filtered_.pdf_output.txt.skipped.txt (cmKPUONH)
-
<h2ibot> datechnoman: Deduplicating and queuing 9401326 items. (cmKPUONH)
-
<h2ibot> datechnoman: Skipped 4203 invalid URLs: transfer.archivete.am/10oMpU/unique_pdfs_output.txt.bad-urls.txt (isAJoDlg)
-
<h2ibot> datechnoman: Fixed 1 unprintable URLs: transfer.archivete.am/C3jhU/unique_pdfs_output.txt.not-printable.txt (isAJoDlg)
-
<h2ibot> datechnoman: Skipped 1 very long URLs: transfer.archivete.am/MxkEz/unique_pdfs_output.txt.skipped.txt (isAJoDlg)
-
<h2ibot> datechnoman: Deduplicating and queuing 9401326 items. (isAJoDlg)
-
<arkiver> JAA: yeah something like this one is fine to filter, it's likely a loop very specific to this site alone
-
<arkiver> filtered out and restarted!
-
<thuban> arkiver: while we're talking about sources, did you ever put in those pubmed feeds? i know you were thinking about setting up a recrawl system for the delayed-open-access journals, but i see no reason that can't be added later
-
<arkiver> thuban: no, those are not in at the moment
-
<arkiver> we should first get through the current backlog
-
<thuban> ah, k
-
<datechnoman> Shouldn't take too long and we will be back to up-to-date :)
-
<imer> we've been on like a 3-7 day eta for the past month it feels like haha
-
<imer> arkiver: this one is still around transfer.archivete.am/Jgl23/kep.adatbazisokonline.hu.log although low-ish volume
-
<eggdrop> inline (for browser viewing): transfer.archivete.am/inline/Jgl23/kep.adatbazisokonline.hu.log
-
<imer> still like 50/s extrapolated to the whole project
-
<imer> up to 180/s now
-
<nyany> I added more capacity from my side to assist with the destruction of that queue
-
<arkiver> imer: it seems like legit images
-
<arkiver> i will try to see where they come from
-
<arkiver> downloading a log from AK to check
-
<nyany> under 1 day to clear todo at the current rate
-
<nyany> keep it up folks
-
<nyany> never mind
-
<imer> it's just endless redirects though, not seeing a single 200 unless i'm missing something
-
<xkey> ui, my runner passed 1.04TB in data scraped
-
<xkey> btw, I asked the hoster; and so far no legal trouble with that machine
-
<xkey> so sad