-
V2
Hi, somebody please help me: how do I get access to download this file,
archive.org/download/archiveteam_urls_20240717143359_16efb7e3?
-
JAA
arkiver: ^
-
JAA
V2: It might help to state why you need access to the file.
-
V2
got it, I am building a URL shortening service, so I really need a database of goo.gl/* shortlinks. It's 14 GB, and it's listed here
archive.org/download/archiveteam_urls_20240717143359_16efb7e3, and I see its Status as restricted
-
JAA
Uh
-
JAA
So first, please don't build another URL shortener that will die within a few years. And second, where did you find that that file contains a goo.gl database?
-
V2
JAA: a lot of users face a challenge finding the target links, which may contain useful information, as the shortlinks are all getting shut down, so that database will help find them
-
V2
no, it's not a new URL shortener, but a small tool to help find the target links when existing links break.
-
JAA
I see. We intend to archive all goo.gl URLs into the Wayback Machine (project channel #urlteamwasright) before they die, so that tool will exist automatically. :-)
-
V2
I was browsing a lot of links, but ended up there (
archive.org/download/archiveteam_urls_20240717143359_16efb7e3) as it contains a list of all goo.gl links. If not, please advise me where it is available
-
JAA
I doubt that file contains any significant number of goo.gl URLs. Certainly not all of them. Our project for that hasn't started yet, and it wouldn't happen here either.
-
JAA
That's why I'm wondering why you ended up at that file to begin with.
-
V2
oh ok, then should I ask in #urlteamwasright? I have been searching for weeks, pls help me get direct access to download it. Any number of links is also fine, as it would definitely help some users of the community
-
JAA
We do not have such a database yet.
-
V2
all I need is just goo.gl's shortlink (or hash), and its target links, just two fields is enough
-
JAA
#urlteamwasright is where that project will be coordinated in the next months or so.
-
JAA
Yes, we would like to have that, too.
-
V2
Here (
tracker.archiveteam.org:1338/status), I see the goo-gl status with almost 7B links found, so it should have been stored somewhere. Pls help me find it or give me some contacts so that I can follow up
-
JAA
And now we've come full circle and are back at URLTeam. Yes, that data exists, and it's incomplete.
-
JAA
-
V2
JAA: Thanks, but its publication date is 2013-07, which is quite old. Is there a newer version, at least an archive from a few months ago?
-
V2
JAA, also its topic says "urlteam, 4url.cc, arseh.at, ff.im, kl.am, links.sharedby.co, litturl.com, surl.ws, tny.im, tr.im, t.co, ur1.ca,", does it definitely contain the
-
V2
"goo.gl" shortlinks also?
-
V2
JAA: please clarify my two items above
-
V2
JAA: that torrent address "torrent:urn:sha1:4cf5896b507f3ca6f50819a2788e99dfa5bcb58b" didn't work. I tried with several clients, and nothing downloaded
-
h2ibot
datechnoman: Deduplicating and queuing 19270666 items. (GuB0k6IH)
-
h2ibot
datechnoman: Deduplicated and queued 19270906 items. (GuB0k6IH)
-
h2ibot
datechnoman: Deduplicating and queuing 24995346 items. (tfzhZxEB)
-
h2ibot
datechnoman: Deduplicated and queued 24995621 items. (tfzhZxEB)
-
datechnoman
!status
-
h2ibot
datechnoman: Jobs running: 2, jobs waiting for a slot: 0.
-
datechnoman
!status
-
h2ibot
datechnoman: Jobs running: 3, jobs waiting for a slot: 0.
-
h2ibot
datechnoman: Deduplicating and queuing 24994220 items. (H3eIJevB)
-
h2ibot
datechnoman: Deduplicated and queued 24994516 items. (H3eIJevB)
-
h2ibot
datechnoman: Deduplicating and queuing 21083865 items. (YWdsTEyO)
-
h2ibot
datechnoman: Deduplicating and queuing 24995098 items. (Qff4OMGk)
-
h2ibot
datechnoman: Deduplicated and queued 21084163 items. (YWdsTEyO)
-
h2ibot
datechnoman: Deduplicated and queued 24995353 items. (Qff4OMGk)
-
knecht4
arkiver: any chance you could look into this PR? going on holiday so would be great to have this merged in case of restart in here!
ArchiveTeam/urls-sources #36
-
knecht4
let me know if it needs any changes
-
JAA
datechnoman: '#// is sad, so let me slap another couple dozen million URLs on top!'
-
datechnoman
hey, this is how I make myself feel better while it isn't running :(
-
JAA
:-P
-
JAA
datechnoman++
-
eggdrop
[karma] 'datechnoman' now has 22 karma!
-
datechnoman
Alcoholics drink, I queue URLs
-
datechnoman
:P
-
datechnoman
I have about 1 billions in a stash and it keeps growing while we standby lol
-
datechnoman
1 billion*
-
JAA
Oof
-
datechnoman
My 16 workers are keen to attack the queue...
-
Vokun
Throw them at telegram
-
JAA
By the way, we averaged about 149 million snapshots and 20 TB per day before things ground to a halt.
-
datechnoman
We sure did :( those were the days