2151 posts
Location
Auckland, New Zealand
Posted 02 October 2014 - 07:40 AM
The OneOS installer has been somewhat plagued by an issue which only appears for some users.
I'm not 100% sure what's causing it, but I think it might have something to do with the fact that I use the parallel API to download all the files (maybe about 100, not too sure) at once. This makes it significantly faster than doing one file at a time, which takes a few minutes rather than a few seconds. As I said, I'm not certain that this is the issue, but it seems the most likely reason.
Essentially, what's happening is that http.get is returning nil at random points. Here is a heavily stripped-down version of the code; if you want to see the whole thing, look at the Pastebin file.
local latestReleaseTag = 'v1.2.6' --it gets this dynamically

function downloadBlob(v)
  if v.type == 'tree' then
    fs.makeDir('/'..v.path)
  else
    local f = http.get(('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
    if not f then
      error('Downloading failed, try again. '..('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
    end
    local h = fs.open('/'..v.path, 'w')
    h.write(f.readAll())
    h.close()
  end
end

local downloads = {}
for i, v in ipairs(tree) do
  table.insert(downloads, function() downloadBlob(v) end)
end
parallel.waitForAll(unpack(downloads))
Any thoughts would be appreciated. It might be of use to take a look at this GitHub issue or the OneOS topic (my signature).
Edited on 02 October 2014 - 05:42 AM
808 posts
Posted 02 October 2014 - 08:14 AM
I think this might be a bug. When I tried CC 1.64, I was completely unable to download anything off GitHub, despite my whitelist being set to *. In fact, I couldn't use the http library for anything except Pastebin; it wouldn't even work with computercraft.info. I dismissed it as probably my own error and lazily went back to 1.63.
7083 posts
Location
Tasmania (AU)
Posted 02 October 2014 - 08:24 AM
EJ has a good point; there've been a few people posting http issues with 1.64, and no suggested workarounds/fixes thus far.
That said, I still think opening up ~100 connections at once is a really bad idea, and it certainly wouldn't hurt to implement a basic retry mechanism, e.g.:
local connectionLimit = 10
local latestReleaseTag = 'v1.2.6' --it gets this dynamically

function downloadBlob()
  while #tree > 0 do
    local v = table.remove(tree,1)
    if v.type == 'tree' then
      fs.makeDir('/'..v.path)
    else
      local tries, f = 0
      repeat
        f = http.get(('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
        if not f then sleep(5) end
        tries = tries + 1
      until tries > 5 or f
      if not f then
        error('Downloading failed, try again. '..('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
      end
      local h = fs.open('/'..v.path, 'w')
      h.write(f.readAll())
      h.close()
    end
  end
end

local downloads = {}
for i=1,connectionLimit do
  table.insert(downloads, downloadBlob)
end
parallel.waitForAll(unpack(downloads))
2151 posts
Location
Auckland, New Zealand
Posted 02 October 2014 - 08:29 AM
EJ has a good point; there've been a few people posting http issues with 1.64, and no suggested workarounds/fixes thus far.
That said, I still think opening up ~100 connections at once is a really bad idea, and it certainly wouldn't hurt to implement a basic retry mechanism, e.g.:
local connectionLimit = 10
local latestReleaseTag = 'v1.2.6' --it gets this dynamically

function downloadBlob()
  while #tree > 0 do
    local v = table.remove(tree,1)
    if v.type == 'tree' then
      fs.makeDir('/'..v.path)
    else
      local tries, f = 0
      repeat
        f = http.get(('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
        if not f then sleep(5) end
        tries = tries + 1
      until tries > 5 or f
      if not f then
        error('Downloading failed, try again. '..('https://raw.github.com/oeed/OneOS/'..latestReleaseTag..v.path):gsub(' ','%%20'))
      end
      local h = fs.open('/'..v.path, 'w')
      h.write(f.readAll())
      h.close()
    end
  end
end

local downloads = {}
for i=1,connectionLimit do
  table.insert(downloads, downloadBlob)
end
parallel.waitForAll(unpack(downloads))
Yea, it may be an issue with 1.64, although by the time it gets to the point where it's downloading files, it has already made numerous successful HTTP calls to get the current version, file list, etc. It's been happening for quite a while too.
I'll put a retry mechanism in, although your code won't completely work; I'm pretty sure it will only download the first 10 files and ignore the rest. What I really need to do is maybe loop over all the files and make a table of tables, each with a max of ten items. Not sure about that sleep(5) either, although I suppose it's better than failing.
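Something like this, perhaps (a rough, untested sketch; tree and the original downloadBlob(v) function are from my first post, and the batch size of ten is just the number mentioned above):
-- Rough sketch (untested): split the file list into batches of ten,
-- then download each batch in parallel before moving on to the next.
local batchSize = 10
local batches = {}
for i, v in ipairs(tree) do
  local batchIndex = math.ceil(i / batchSize)
  batches[batchIndex] = batches[batchIndex] or {}
  table.insert(batches[batchIndex], v)
end

for _, batch in ipairs(batches) do
  local downloads = {}
  for _, v in ipairs(batch) do
    table.insert(downloads, function() downloadBlob(v) end)
  end
  parallel.waitForAll(unpack(downloads))
end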
Edited on 02 October 2014 - 06:30 AM
7083 posts
Location
Tasmania (AU)
Posted 02 October 2014 - 09:05 AM
I'm pretty sure it will only download the first 10 files and ignore the rest.
Eh?
7508 posts
Location
Australia
Posted 02 October 2014 - 09:35 AM
Not sure about that sleep(5) either, although I suppose it's better than failing.
Well, the purpose of it is that the request has already failed; this could be due to server load, so you just give it some time and then try again.
2151 posts
Location
Auckland, New Zealand
Posted 02 October 2014 - 09:39 AM
Not sure about that sleep(5) either, although I suppose it's better than failing.
well the purpose of it would be that it has failed, this could be due to server load, so you just give it some time and then try again.
Yea, it's just that 5 seems a little high, especially if you're trying 5 times.
7083 posts
Location
Tasmania (AU)
Posted 02 October 2014 - 10:24 AM
It serves as a throttle. If your connections are failing because either the sending or receiving server doesn't like you flooding, then the answer is to reduce the number of active connections at a time. Frankly I'd be tempted to increase the timer; bear in mind it only kicks in if there are "problems", and specifically serves to work around them.
You're more than capable of finding a balance that suits you, I'm sure. ;)
7508 posts
Location
Australia
Posted 02 October 2014 - 10:25 AM
Yea, it's just 5 seems a little high. Especially if you're trying 5 times.
It just allows things to potentially calm down.
8543 posts
Posted 02 October 2014 - 03:56 PM
Yeah, that many concurrent connections all kicking off at essentially the same time is a recipe for problems. I'd cut down the number of worker coroutines to 4-6 and implement a queue, whereby when a worker coroutine has finished downloading a file, the manager gives it a new file to download. You still gain any possible benefits from parallelizing the download process, without looking like a flooder or suddenly spiking the server's download bandwidth usage.