-
pokechu22
looks like random.fora.pl itself isn't listed on there, but that's reasonable since it's not "supposed to" be an actual forum presumably
-
flashfire42
icewarriors.fora.pl well the random forum isnt very good
-
pokechu22
it looks like the catalog in the sidebar orders them by the number of posts, which is also good for getting an idea of how big some of them are.
-
pokechu22
This is probably a situation where !a < list would be nice if it weren't completely broken :|
-
mikolaj|m
the list of the forums should also be obtainable by scraping the "katalog"
-
yts98
I'm looking for an operator with ArchiveBot access to save the list of games I provide for Game Atsumaru.
-
yts98
The lists I want to archive are data/public.txt, data/public_payment.txt, data/key_valid.txt in github.com/yts98/game-atsumaru-discovery.
-
yts98
This site will close in 47 hours.
-
flashfire42
yts98: see job 3tt38940863qg137qhybq2xbe on the dashboard. It is already running
-
flashfire42
and yts98 #gitgud for Github stuff
-
yts98
Got it, thanks for your help!
-
flashfire42
I don't know how to throw in those lists you provided or if I need ops to do it
-
flashfire42
Granted, it's at concurrency 1 because I can't actively babysit it, but it should run through a lot
-
yts98
Also, I'm sweeping its API to discover dynamically loaded danmaku comments and scoreboards. I will have an API URL list later.
-
flashfire42
I am gonna try doing the lists you gave us too
-
yts98
data/key_valid.txt lists unlisted games found on Google.
-
yts98
And also, I'm doing static analysis for the game to discover lazily loaded resources,
-
yts98
-
yts98
so when I have the resource lists, they cannot be digested by ArchiveBot or WBM SPN directly.
-
yts98
so jobs e83a218c9a0dff1eac9a5c1f641eee49 9301dfd206dc1b81e97b8593564d63e9 will be all 404
-
vokunal|m
Ever since I posted like 100 links in the telegrab chat this happened
imgur.com/a/ePYCcur
-
vokunal|m
I completely broke the chat. It's on matrix too, so leaving the room didn't fix it
-
fireonlive
oh no
-
fireonlive
there's a 'clear cache and reload' button in the advanced section of settings (bottom) but i'm not entirely sure what it does other than what it sounds like (clear the entire cache/room states for all rooms on all servers)
-
fireonlive
this never happened, but a comment on vector-im/element-web #5800 says "initial sync (same thing the clear cache & reload does)" so i mean it could take a long time depending how you use matrix (and depending on hackint retention you could lose older messages) so perhaps look for a solution from elsewhere before touching
-
fireonlive
the big red button
-
thuban
yts98: you may want to use grab-site (github.com/ArchiveTeam/grab-site) or just wget to handle the resources gated by the session cookie--both can accept cookies and output warcs
-
thuban
said warcs won't be whitelisted and can't go in the wbm, but you can still upload them to the internet archive for safekeeping
-
thuban
(the wbm can't play back post requests, so the games wouldn't work through it anyway)
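
A minimal sketch of the wget route thuban describes, assuming a Netscape-format cookie file and a plain-text URL list. The file names (urls.txt, cookies.txt) and the WARC prefix are hypothetical placeholders, and this is not the script in yts98's repository; it only shows how stock wget options combine cookies with WARC output.

```python
import shlex


def build_wget_command(url_list, cookie_file, warc_prefix):
    """Return a wget argument list that fetches every URL in url_list
    with the saved session cookie and records the traffic in a WARC."""
    return [
        "wget",
        "--load-cookies", cookie_file,   # send the session cookie with each request
        "--input-file", url_list,        # one URL per line
        "--warc-file", warc_prefix,      # wget writes <prefix>.warc.gz by default
        "--warc-cdx",                    # also write a CDX index next to the WARC
        "--delete-after",                # keep only the WARC, not the loose files
        "--tries", "3",
        "--wait", "1",                   # pause between requests to be gentle
    ]


# Hypothetical file names, for illustration only.
cmd = build_wget_command("urls.txt", "cookies.txt", "atsumaru_example")
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would start the download; printing it first makes it easy to review the command before running anything.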
-
yts98
thuban: I'm currently using my scripts to let wget produce warcs, but my bandwidth is relatively limited :/
-
thuban
i have to go afk for now, but if (it's necessary and) you upload the details of your wget process to your github repo, i will help save stuff on my fat pipe tomorrow
-
fireonlive
:3
-
vokunal|m
I'll look into it. hackint.org works fine, and I never really browsed telegrab
-
fireonlive
vokunal|m: kk
-
fireonlive
might even be able to have fun with sqlite ™ or something (but ya know be sure to backup)
-
vokunal|m
tartarus needs to put his stuff on downthetube, not mediafire. I've been silently crying since I saw them join it, knowing they'd eventually cement a spot out of the top list forever, and now they've finally passed me by .01TiB
-
h2ibot
Yts98 edited Niconico (+708, Saving Game Atsumaru):
wiki.archiveteam.org/?diff=50014&oldid=50006
-
h2ibot
Switchnode edited Niconico (+13, /* Game Atsumaru */ correct archiving type):
wiki.archiveteam.org/?diff=50015&oldid=50014
-
thuban
yts98: back, ping me if you want me to run something
-
yts98
thuban: you can run step6-MV_iterate_.py on github.com/yts98/game-atsumaru-discovery first.
-
yts98
you can specify the gameId range, and it will find game resources for 61% of the games.
-
thuban
yts98: is there an id range i should choose to avoid duplicating your work, or have you not run this step yet?
-
yts98
I have only run a little (3~190), so you could archive all the RPGMaker MV games.
-
yts98
I'm going to analyze RPGMakerMZ and EasyRPG, and I'm still looking for many volunteers to analyze other game frameworks, or find scraping tools off the shelf
-
arkiver
off the shelf scraping tools for this type of stuff can be difficult
-
arkiver
especially when it comes to archiving work where one really needs to make sure all required URLs are preserved
-
yts98
Agree. other games use Akashic Engine, TyranoBuilder, Unity and others. I guess there may be some tools for Unity?
-
thuban
yts98: ok, running
-
thuban
yts98: i'm seeing a lot of 404s on some games (eg 191); is this correct or is there a problem with the script?
-
yts98
it's correct because some creators removed resources but did not remove the references to them from the script
-
thuban
ok, good
-
yts98
the script you're running is also doing plugin statistics across the games, in order to find out rarely used plugins that may point to more resources.
-
thuban
yts98: i am working on ids 191-6500, but my provider is having some storage issues i need to work around, so if you or anyone else wants to grab another id range, that would be helpful
-
thuban
(unfortunately i think the bottleneck here is not client-side bandwidth; 12x parallelized i've gotten ~230 games down so far)
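
For anyone picking up another id range, here is a rough sketch of how a contiguous gameId range could be divided into near-equal chunks for parallel workers or volunteers. The helper name split_id_range is hypothetical and is not part of the game-atsumaru-discovery scripts; the 191-6500 range and the factor of 12 are taken from the messages above.

```python
def split_id_range(first, last, workers):
    """Split the inclusive range first..last into at most `workers`
    contiguous, non-overlapping (start, end) chunks of near-equal size."""
    if workers < 1 or last < first:
        raise ValueError("need workers >= 1 and last >= first")
    total = last - first + 1
    base, extra = divmod(total, workers)
    chunks = []
    start = first
    for i in range(workers):
        # The first `extra` chunks take one additional id each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        chunks.append((start, start + size - 1))
        start += size
    return chunks


chunks = split_id_range(191, 6500, 12)
for start, end in chunks:
    print(f"gameId {start}-{end}")
```

Each (start, end) pair can then be handed to one worker, so no two workers fetch the same game.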
-
arkiver
in other words - site is too slow?
-
thuban
not sure, suspect that python/wget could also be more efficient
-
thuban
-
thuban
ditto 4053
-
thuban
and 5047
-
thuban
5533 has a different error due to a bad start byte:
transfer.archivete.am/DclPO/atsumaru_5533.log
-
thuban
ditto 1695
-
thuban
(just dumping the ids here so i know what i've skipped)
-
thuban
4078 has the dict error
-
none
hello is anyone on
-
tzt
tiki.video is shutting down tomorrow
techcrunch.com/2023/06/11/tiki-india
-
arkiver
what
-
arkiver
how many hours do we have?
-
arkiver
eef this article was posted June 11 and we didn't know
-
JAA
Oof
-
arkiver
rewby: i hope you are still around
-
arkiver
we have a deadline for tomorrow
-
arkiver
can someone please figure out how many hours we have left given timezones?
-
tzt
21 hours
-
arkiver
> We regret to inform you that Tiki will be shutting down its operations. As of 11.59 PM India time, June 27, 2023, all Tiki functions and services will cease
-
Barto
oh crap
-
arkiver
alright
-
arkiver
rewby: we need an urgent target
-
arkiver
archiveteam_tiki_
-
arkiver
tiki_
-
arkiver
Archive Team Tiki:
-
arkiver
let's make a channel
-
arkiver
i have no ideas
-
arkiver
we'll get at least metadata
-
arkiver
tzt: how did you find out about this? i wonder how we can better monitor for this in the future
-
tzt
arkiver: shutting down OR "closing down" AND site OR service OR server after:2023-05-30
-
tzt
tiki'd off ?
-
JAA
Ah, India Time, one of those lovely half-hour-offset-because-fuck-you-that's-why time zones.
-
JAA
2023-06-27 18:29 UTC
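
The conversion can be checked with a short sketch, assuming India Standard Time is a fixed UTC+05:30 offset (it has no daylight saving). The "now" value in the example is a hypothetical moment chosen to match the 21 hours quoted above, not a timestamp from the log.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time: fixed offset of +05:30 from UTC.
IST = timezone(timedelta(hours=5, minutes=30), "IST")

# Shutdown time from the announcement: 11.59 PM India time, June 27, 2023.
shutdown_ist = datetime(2023, 6, 27, 23, 59, tzinfo=IST)
shutdown_utc = shutdown_ist.astimezone(timezone.utc)


def hours_left(now_utc):
    """Hours remaining until the shutdown, given a timezone-aware datetime."""
    return (shutdown_utc - now_utc).total_seconds() / 3600


print(shutdown_utc.strftime("%Y-%m-%d %H:%M UTC"))

# Hypothetical "now", for illustration only.
example_now = datetime(2023, 6, 26, 21, 29, tzinfo=timezone.utc)
print(hours_left(example_now))
```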
-
Barto
oh boy
-
Barto
tikingbomb
-
» arkiver is rushing something together
-
arkiver
blegh much of this is POST requests
-
Barto
i've thrown some stuffs into ab
-
Barto
it aint much, but it's honest work
-
JAA
I have a German feces and football pun, but that's not a good idea. :-P
-
JAA
I like tikingbomb.
-
arkiver
JAA: please do tell
-
JAA
tiki-kacka
-
fireonlive
google says lei'dback
-
fireonlive
lol
-
arkiver
JAA: i'm for tiki-kacka given this SHITTY timing
-
JAA
lol
-
arkiver
ticking shit bomb
-
Barto
arkiver: interesting thing that ab spotted: a load of videos can be found via sitemap
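
As a rough illustration of pulling page URLs out of a sitemap, the sketch below parses a standard sitemaps.org urlset with the Python standard library. The sample XML and its example.com URLs are made up; the real Tiki sitemap layout is not shown in this log.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample data, for illustration only.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/video/1</loc></url>
  <url><loc>https://example.com/video/2</loc></url>
</urlset>
"""

urls = sitemap_locations(SAMPLE)
print("\n".join(urls))
```

The same function also works on a sitemap index file, since those list their child sitemaps in <loc> elements under the same namespace.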
-
arkiver
Barto: very nice!
-
JAA
I mean, I'm fine with that. :-P
-
manu|m
wait what’s the football part in this?
-
arkiver
#tiki-kacka
-
JAA
manu|m: Tiki Taka is a football play strategy thingy. The Spanish national team is well-known for it.
-
Barto
also titicaca lake is a thing
-
Barto
almost sounds the same
-
manu|m
JAA: oh okay i just don’t know enough about football then, i can live with that. thought for a moment I was stupid or sth ;)
-
fireonlive
🏈 this kind, right? ;)
-
fireonlive
:D
-
JAA
No, football, not handegg. :-)
-
fireonlive
:3
-
fireonlive
⚽ there ya go
-
arkiver
datechnoman: we might need you at #tiki-kacka
-
datechnoman
arkiver - Joining channel now
-
yts98
thuban: my new commit b2cf0ff fixes games 262 1116 5533. going to deal with 4078.
-
thuban
yts98: thanks! 4078 is the same error, the other one was 5533, 1695, and now 2755
-
thuban
(did you mean 5047?)
-
yts98
thuban: 1695 5047 also work now
-
h2ibot
JustAnotherArchivist created Tiki (+471, Created page with "{{Infobox project | URL =…):
wiki.archiveteam.org/?title=Tiki
-
h2ibot
JustAnotherArchivist edited Deathwatch (+20, /* 2023 */ Add Tiki):
wiki.archiveteam.org/?diff=50017&oldid=50005
-
thuban
yts98: i assume i should keep my local copy of data/MV_plugins.json, but what about data/iterate/urls_gm*.txt ?
-
thuban
nvm, appear to be identical
-
yts98
I thought the url list data/iterate/urls_gm*.txt could be sent to operators trusted by IA, but you can freely delete them because urls can be derived form warcs
-
thuban
do those urls work without the session cookie?
-
yts98
s/form/from
-
yts98
no, they must be fetched with session cookie
-
yts98
and I wonder whether the session cookie is bound to the IP
-
yts98
thuban: commit 2e4fc6c fixes 703.
-
yts98
there is a new step6-ER_iterate_.py that can be run in parallel for 299 RM2000/RM2003 games.
-
yts98
oh no, data/tmp conflicts. I'll fix it.
-
yts98
wget url list file conflicts fixed (for Atsumaru).
-
thuban
i'm elbow deep in the storage thing, so you probably want to run ER yourself
-
yts98
thuban: got it. start running ER and start analyzing RMMZ.
-
h2ibot
Ryz edited Deathwatch (+104, /* 2023 */ Add Microsoft Language Portal):
wiki.archiveteam.org/?diff=50018&oldid=50017