mirror of https://github.com/mozilla/gecko-dev.git
Bug 1486071 - Retry docker-image and packages tasks that fail during apt-get. r=dustin
When apt-get fails, it exits with a distinctive status code (100). Most of the time, apt-get fails because of a network error, or possibly a problem unpacking archives. When that happens, retrying the task usually "fixes" the issue.

One of the (currently) most common causes of problems is snapshot.debian.org not being available to some of the EC2 instances. It would be possible to retry only when we detect that situation (by checking the instance's public IP against the known list of problematic IPs), but that would probably require wrapping apt-get, or something along those lines, which is not entirely trivial to do for the packages tasks, because they don't rely on docker images.

However, since there aren't many apt-get failures other than these, and since there have historically been some intermittent apt-get failures of a different nature that were solved by re-running the tasks, it seems fair to just retry whenever apt-get fails.

One downside of this approach is that if a change to a Dockerfile ends up mentioning a package that doesn't exist, that too will result in multiple retries, which might be inconvenient, but that's not something that's going to happen often.

Differential Revision: https://phabricator.services.mozilla.com/D11420
--HG--
extra : moz-landing-system : lando
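For illustration, the retry-on-exit-status idea described in the message above can be sketched in Python. This is a minimal sketch under stated assumptions, not how the worker actually implements `retry-exit-status`: `run_with_retries`, `retry_exit_status`, and `max_attempts` are hypothetical names introduced here, and the only detail taken from the commit is that apt-get signals failure with status 100.

```python
import subprocess
import sys


def run_with_retries(cmd, retry_exit_status=(100,), max_attempts=3):
    """Run cmd until it exits with a status that is not in retry_exit_status.

    Hypothetical helper: gives up after max_attempts runs and returns the
    exit status of the last run.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status = subprocess.call(cmd)
        if status not in retry_exit_status:
            # Success, or a failure we do not consider transient.
            return status
        if attempt < max_attempts:
            sys.stderr.write(
                'attempt {} exited with {}; retrying\n'.format(attempt, status))
    return status
```

Note that, as the message points out, a permanent failure (such as a package name that does not exist) also exits with 100, so in this sketch it would be re-run `max_attempts` times before the status is reported.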
Parent: dc6ff756f2
Commit: 951d78513a
@@ -179,6 +179,8 @@ def fill_template(config, tasks):
                 'docker-in-docker': True,
                 'taskcluster-proxy': True,
                 'max-run-time': 7200,
+                # Retry on apt-get errors.
+                'retry-exit-status': [100],
             },
         }
         # Retry for 'funsize-update-generator' if exit status code is -1
@@ -82,6 +82,8 @@ def docker_worker_debian_package(config, job, taskdesc):
             repo=docker_repo,
             dist=run['dist'],
             date=run['snapshot'][:8])
+    # Retry on apt-get errors.
+    worker['retry-exit-status'] = [100]

     add_artifacts(config, job, taskdesc, path='/tmp/artifacts')

@@ -104,7 +104,15 @@ def post_to_docker(tar, api_path, **kwargs):
             elif 'stream' in data:
                 sys.stderr.write(data['stream'])
             elif 'error' in data:
-                raise Exception(data['error'])
+                sys.stderr.write('{}\n'.format(data['error']))
+                # Sadly, docker doesn't give more than a plain string for errors,
+                # so the best we can do to propagate the error code from the command
+                # that failed is to parse the error message...
+                errcode = 1
+                m = re.search(r'returned a non-zero code: (\d+)', data['error'])
+                if m:
+                    errcode = int(m.group(1))
+                sys.exit(errcode)
             else:
                 raise NotImplementedError(repr(data))
             sys.stderr.flush()
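The error-code parsing added to `post_to_docker` can be exercised on its own. The following is a small sketch that lifts the same regular expression into a standalone function; the function name `exit_code_from_docker_error` and the sample error strings are assumptions made for illustration and are not part of the commit.

```python
import re


def exit_code_from_docker_error(message, default=1):
    """Extract the failing command's exit code from a docker error string.

    Falls back to `default` when the message does not carry a code,
    mirroring the `errcode = 1` fallback in the change above.
    """
    m = re.search(r'returned a non-zero code: (\d+)', message)
    if m:
        return int(m.group(1))
    return default


# Assumed example of the kind of message docker reports for a failed RUN step.
sample = "The command '/bin/sh -c apt-get update' returned a non-zero code: 100"
print(exit_code_from_docker_error(sample))
print(exit_code_from_docker_error('some other failure'))
```

With the assumed sample message, the first call yields 100, which is the status the `retry-exit-status` setting is keyed on; any message without a code falls back to 1 and is not retried.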