[Container-tools] Delayed Release && Broken Images in Docker - Post Mortem

Dusty Mabe dusty at dustymabe.com
Thu Oct 22 13:16:34 UTC 2015


Hi all,

We set out to do a release yesterday, but we hit some issues along the way. I am 
documenting these issues here so we can better understand the problems and perhaps
avoid them in the future.

While testing for the release yesterday, we realized that Atomic App container
images built locally did not work. This is the error we received:

```
[root@f23 atomicapp]$ docker run -it --rm projectatomic/atomicapp:latest run foo
Traceback (most recent call last):
  File "/usr/bin/atomicapp", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 3007, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 728, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 626, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: pip
```


That was odd. So we decided to pull 0.1.11 from Docker Hub, and we saw the same
problem; very odd! We narrowed down the problem to a few factors:

- A new version of python-lockfile (one of our dependencies) was uploaded to PyPI
  on 2015-10-12 [1]. This new version has some new dependencies, and the dependency
  chain includes pip itself.
- The old version of python-lockfile on PyPI [2] did not have these dependencies.
- We remove the python-pip rpm in the Dockerfile build process.
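
To make the interaction of those three factors concrete, here is a toy model in
Python. It is only a sketch: the dependency tables are hand-written for
illustration, and "intermediate-dep" is a placeholder name, not the real package
that sits between python-lockfile and pip.

```python
# Toy model of the failure: package names mapped to their direct requirements.
# "intermediate-dep" is a placeholder, not the real package in the chain.
OLD_REQUIRES = {
    "atomicapp": ["lockfile"],
    "lockfile": [],  # stands in for 0.10.2: no extra dependencies
}
NEW_REQUIRES = {
    "atomicapp": ["lockfile"],
    "lockfile": ["intermediate-dep"],  # stands in for 0.11.0: new dependencies
    "intermediate-dep": ["pip"],
    "pip": [],
}


def find_missing(root, requires, installed):
    """Walk the dependency chain from root and return names that are not installed."""
    missing, seen, stack = [], set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name not in installed:
            missing.append(name)
            continue
        stack.extend(requires.get(name, []))
    return sorted(missing)


# After the Dockerfile removes python-pip, pip is no longer installed.
installed_old = {"atomicapp", "lockfile"}
installed_new = {"atomicapp", "lockfile", "intermediate-dep"}

print(find_missing("atomicapp", OLD_REQUIRES, installed_old))  # []
print(find_missing("atomicapp", NEW_REQUIRES, installed_new))  # ['pip']
```

With the old python-lockfile nothing is missing once pip is removed; with the new
one the walk ends at pip, which matches the DistributionNotFound error above.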

This meant that building container images, even from our old tags, would result in
a broken container that wouldn't run Atomic App. But why did the images that we
had already uploaded to Docker Hub start breaking? Well, while Charlie was preparing
for the release yesterday he clicked the big orange button in the Docker Hub web
interface that triggers a rebuild of all available tags. This caused most of our
tagged builds to get rebuilt, so the images in the hub now exhibit this problem.

We have a few solutions for this:

- Short term, we'll require the older python-lockfile and do a new release
  today. People can use the new release. We could try to "fix" the tags if needed,
  but at this point, rather than doing trickery, I'd rather just move forward.
- Long term, let's start using rpms (if possible) rather than pip to pull in
  software. If we were using rpms, then the yum command to remove python-pip would
  have failed and the missing dependency would have been noticed during the build.
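
Independent of the rpm question, one possible way to notice this kind of
breakage at build time is a small smoke test that runs after the step that
removes python-pip. The sketch below is an assumption on my part, not something
we have in the Dockerfile today; the helper name and the command passed to it
are hypothetical.

```python
import subprocess
import sys


def smoke_test(command):
    """Run command and return True only if it starts and exits with status 0."""
    try:
        result = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except OSError as exc:
        # The executable is missing or cannot be started at all.
        sys.stderr.write("could not run %r: %s\n" % (command, exc))
        return False
    if result.returncode != 0:
        # Surface the traceback (or other error output) in the build log.
        sys.stderr.write(result.stderr.decode(errors="replace"))
    return result.returncode == 0


# Hypothetical use in a build step (the exact command is an assumption):
#     sys.exit(0 if smoke_test(["atomicapp", "--help"]) else 1)
```

Since the traceback above is raised while the entry point script is still
importing pkg_resources, any invocation of the installed command should trip the
check, and a non-zero exit from a build step would stop the image from being
published.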


Thoughts?

- Dusty

[1] - https://pypi.python.org/pypi/lockfile/0.11.0
[2] - https://pypi.python.org/pypi/lockfile/0.10.2



