00:10:06 Arkiver edited YouTube (+86, Clarification on scope and newsworthy channels.): https://wiki.archiveteam.org/?diff=51140&oldid=51133
00:11:06 0KepOnline edited Spore (+1422, Add…): https://wiki.archiveteam.org/?diff=51141&oldid=51135
00:27:01 let
00:27:07 let's use #frogger for blogger
01:31:22 JustAnotherArchivist edited Blogger (-45, EFnet #frogger is dead, long live hackint #frogger): https://wiki.archiveteam.org/?diff=51142&oldid=51128
01:38:36 Are there any plans on restarting the furaffinity archival project? Due to its large amount of new content after the 2015 archive
01:38:57 TIL there was a 2015 project.
01:39:18 * Pedrosso looks up if it was 2012
01:39:37 No, I mean, I had no idea we ever did anything with it.
01:39:39 Ahhh
01:39:45 It was 2015 btw
01:40:43 I'm gonna assume that the answer to my question then is a hard no, haha
01:41:54 There was definitely no project since 2017. So probably correct.
01:42:42 I'm not sure it'll happen soon (with the Google stuff and lots of end-of-year shutdowns still to be announced as usual), but I'm all for it.
01:43:17 I see, the enumeration is quite nice.
02:13:12 Also; so that's why things are getting so busy now
03:00:39 JAABot edited CurrentWarriorProject (+4): https://wiki.archiveteam.org/?diff=51143&oldid=51125
03:10:41 PaulWise edited Mailman2 (+3016, more lists done and to do): https://wiki.archiveteam.org/?diff=51144&oldid=50963
03:16:05 -rss/#hackernews- RIP Google Groups Dejanews.com Archive: http://dejanews.com/ https://news.ycombinator.com/item?id=38238796
04:18:55 PaulWise edited Mailman2 (-12, distorted done, claws mail in progress): https://wiki.archiveteam.org/?diff=51145&oldid=51144
04:49:02 JustAnotherArchivist edited YouTube (+627, Datetimeify): https://wiki.archiveteam.org/?diff=51146&oldid=51140
04:52:51 <3
04:59:14 :-)
06:50:36 Composer of a major Japanese indie unit (https://ja.wikipedia.org/wiki/ツユ) unilaterally declares that they are extremely done with everything: https://nitter.net/Pusu_kun. Are Twitter accounts still archivable? (if yes, then someone please feed it to the archiver; I won't wait around to see the answer.)
06:51:52 (god are all those join/leave messages spamming the channel? im so sorry, pidgin won't open a chat window)
09:45:13 Hi, i got an 'Abmahnung' (written warning from a lawyer) for one of the IPs i use for my ArchiveTeam Warriors (on auto-mode). They accuse me of sharing copyrighted porn via BitTorrent. Is there any way that one of the ATWarrior projects could have used BitTorrent?
09:45:55 (The IPs i use are geolocated in Germany)
09:48:22 99.9% sure we do nothing with bittorrent; but feel free to stick around for an official answer
10:45:42 cc arkiver
13:24:11 99,9% is probably enough for me : ) I did figure that there would most probably be no p2p in ATWarrior projects
15:40:59 Correct, there is not.
16:29:58 How does AB handle PHP links with GET parameters? will it just consider them as normal URLs?
16:30:14 (want to put this website in: http://www.telford-electronics.co.uk/index.php - the owner sadly passed away earlier this year)
16:30:30 but the site is basically all done via PHP GET parameters
16:31:33 e.g. http://www.telford-electronics.co.uk/stock.php?type=man leads to a list of links like http://www.telford-electronics.co.uk/stock.php?type=man&alpha=t which are themselves links to pages like http://www.telford-electronics.co.uk/stock.php?type=man&man=30
16:31:58 and then there may be links to e.g. http://www.telford-electronics.co.uk/more_info.php?prod_id=205
16:33:47 betamax: Query strings are kept as is for the most part. Some session ID parameters get removed to avoid looping.
16:34:14 As long as they're links, not
, it should work fine.
16:35:12 great, thanks
17:15:43 Hello. I found a German site that might shut down due to legal issues. Is that something someone here can help me with?
17:21:46 the site is relatively small so if I understood it correctly I should ask in the ArchiveBot chat?
17:22:25 sadly I do not know how close a site needs to be to shutdown before it becomes an archiveteam thing
17:25:10 https://openjur.de/
17:26:12 they provide rulings from German courts for their users
17:30:37 meta: I tried to archive openJur with ArchiveBot a few weeks ago. It got banned fairly quickly. Also, it's quite large, actually.
17:31:25 is there another way to save it?
17:37:36 Just checked, the IP is still banned.
17:37:48

Unfortunately, you cannot access openJur from this IP address. This is the result of abusive access, and the resulting violations of our terms of use, from the IP range you are using or from within your internet provider's network. If you have questions about the block, please contact abuse⊙od

17:39:11 still banned
17:39:19 Their terms are pretty strict about not allowing anything other than personal usage, search indexing, and metadata collection, possibly for legal reasons. So asking them for an exception might not be useful.
17:39:43 so they dug themselves in
17:40:25 but archival is actually aligned with their purpose
17:40:59 so asking might actually get something useful
17:41:33 Perhaps
17:41:41 That ban is about a month old, by the way.
17:42:07 I think the strict restrictions are more about avoiding commercial use.
17:42:48 Also if your address is fixed they might just think you wanted to DOS them.
17:44:00 The request rate was nowhere near what would be needed for a DoS though.
17:45:42 And the crawl had an unambiguous user agent identifying it as 'ArchiveTeam ArchiveBot/...'.
18:02:37 So, they just check number of requests against the IP?
18:03:12 I do not think a human made that decision.
18:14:50 Well, if you already tried, there is nothing more that can be done. Thank you for your time and effort.
18:15:20 I must go now. Thanks for speaking with me.
18:15:30 I'll try again. It's a very important resource.
18:15:36 Just not sure how yet. :-)
18:17:32 I wish you good luck. Perhaps I will one day lend my warrior to the effort. I must really go now, so once again, thank you.
18:18:21 I'm not sure whether the ban was automated or manual. If it was automated, the limit seemed quite arbitrary. But the time (17:12 UTC, so 19:12 CEST) makes a manual ban somewhat unlikely, too.
18:38:58 ^ Also up to lend a warrior
19:40:41 I talked about this earlier; now that the https://sporepedia2.foroactivo.com/ archive is done I'm wondering about how to get the failed imgur saves from the logs to send to #imgone . But I don't know how to get the logs (assuming the logs are saved)
19:48:07 Pedrosso: The log gets stored in the job's *-meta.warc.gz file.
19:49:05 Ahhh, good. Is it directly under the zip or do I need to interpret some .warc shenanigans?
19:49:53 It's a WARC file. But you can just `zstdgrep` it or similar.
19:50:29 (The zstd binaries support reading gzipped data, and they're much faster in my experience, likely due to different buffering.)
19:51:20 ah, alright.
19:51:59 Also, gzip has nothing to do with ZIP.
19:52:24 Yea no I got that, just don't have any other terminology for it
19:55:13 The important thing is that it's just a plain record. There's no HTTP nonsense like chunked transfer encoding on top.
20:11:12 Thank you
22:53:45 Grabbing He-Man.org with qwarc, thread pages should be done in 5-ish hours. Hopefully, it doesn't go down at midnight UTC.
22:56:13 that already went through archivebot, right? was there a problem with that job or is this just belt-and-suspenders?
23:00:30 It's running in AB, but it won't finish anytime soon.
23:39:27 Looks like the AB job actually got more or less as much as it could. It discovered around 94k threads. Homepage says almost 132k. Some threads require an account, possibly one with access permission, too.
23:40:29 ah. well, thanks for calling in the cavalry
23:40:47 :-)
23:43:10 https://www.youtube.com/watch?v=QItBdql_8FI <- The Completionist and his charity and livestream event are accused of not donating funds received. Might be good to back up everything. Looks like someone already started an AB job for the Open Hand site specifically.
23:43:56 https://www.indieland.org/ https://www.youtube.com/channel/UC-twB-Z1n73QMyAZ12d941w https://theopenhand.org/
23:47:51 videos already added in #down-the-tube
23:55:47 Oh yeah, forum attachments on He-Man.org are all behind a login wall, and registration is closed, too.
23:56:23 https://bugmenot.com/view/he-man.org no luck here
23:56:37 (I didn't try it, but 9 years old + 14% success rate = probably banned)
23:58:49 Surprise, surprise, it does not.
23:59:01 (work)
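
Regarding the 19:48 to 19:55 exchange about reading a job's log out of the *-meta.warc.gz file: a minimal Python sketch of the same idea as `zstdgrep`, for cases where the zstd tools are not at hand. It treats the file as a plain gzipped byte stream and collects imgur URLs from lines containing a marker. The function name `imgur_urls`, the `marker` parameter, and the URL regex are all hypothetical; the actual log line format is not shown in this log, so it is worth checking a few lines by hand before trusting any filter.

```python
import gzip
import re

# Assumed pattern for imgur links (with or without a subdomain such as i.imgur.com).
IMGUR_URL = re.compile(rb'https?://(?:[a-z0-9-]+\.)?imgur\.com/[^\s"<>]+')


def imgur_urls(path, marker=b""):
    """Return sorted unique imgur URLs from lines that contain `marker`.

    gzip.open transparently reads concatenated gzip members, so the whole
    file is seen as one continuous stream of lines.
    An empty marker matches every line.
    """
    found = set()
    with gzip.open(path, "rb") as stream:
        for line in stream:
            if marker in line:
                for match in IMGUR_URL.findall(line):
                    found.add(match.decode("ascii", "replace"))
    return sorted(found)
```

Something like `imgur_urls("job-meta.warc.gz", marker=b"ERROR")` would then list candidate URLs; both the file name and the `ERROR` marker are placeholders, not the real log format.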