This makes sure fsdiff doesn't try to unmount things that shouldn't be.
**Note**: This is intended as a temporary solution to have as minor a
change as possible for 1.11.1. A bigger change will be required in order
to support container re-attach.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
People have reported following issue with overlay
$ docker run -ti --name=foo -v /dev/:/dev fedora bash
$ docker cp foo:/bin/bash /tmp
$ exit container
Upon container exit, /dev/pts gets unmounted too. This happens because
docker cp volume mounts get propagated to /run/docker/libcontainer/....
and when container exits, it must be tearing down mount point under
/run/docker/libcontainerd/... and as these are "shared" mounts it
propagates events to /dev/pts and it gets unmounted too.
One way to solve this problem is to make sure "docker cp" volume mounts
don't become visible under /run/docker/libcontainerd/..
Here are more details of what is actually happening.
Make overlay home directory (/var/lib/docker/overlay) private mount when
docker starts and unmount it when docker stops. Following is the reason
to do it.
In fedora and some other distributions / is "shared". That means when
docker creates a container and mounts it root in /var/lib/docker/overlay/...
that mount point is "shared".
Looks like after that containerd/runc bind mounts that rootfs into
/runc/docker/libcontainerd/container-id/rootfs. And this puts both source
and destination mounts points in shared group and they both are setup
to propagate mount events to each other.
Later when "docker cp" is run it sets up container volumes under
/var/lib/dokcer/overlay/container-id/... And all these mounts propagate
to /runc/docker/libcontainerd/... Now mountVolumes() makes these new
mount points private but by that time propagation already has happened
and private only takes affect when unmount happens.
So to stop this propagation of volumes by docker cp, make
/var/lib/docker/overlay a private mount point. That means when a container
rootfs is created, that mount point will be private too (it will inherit
property from parent). And that means when bind mount happens in /runc/
dir, overlay mount point will not propagate mounts to /runc/.
Other graphdrivers like devicemapper are already doing it and they don't
face this issue.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
In TP5, Hyper-V containers need all image files ACLed so that the virtual
machine process can access them. This was fixed post-TP5 in Windows, but
for TP5 we need to explicitly add these ACLs.
Signed-off-by: John Starks <jostarks@microsoft.com>
If aufs is already modprobe'd but we are in a user namespace, the
aufs driver will happily load but then get eperm when it actually tries
to do something. So detect that condition.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This adds support to the Windows graph driver for ApplyDiff on a base
layer. It also adds support for hard links, which are needed because the
Windows base layers double in size without hard link support.
Signed-off-by: John Starks <jostarks@microsoft.com>
Fixes an issue that prevents nano server images from loading properly. Also updates logic for custom image loading to avoid preventing daemon start because an image failed to load.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
Overlay tests were failing when /var/tmp was an overlay mount with a misleading message.
Now overlay tests will be skipped when attempting to be run on overlay.
Tests will now use the TMPDIR environment variable instead of only /var/tmp
Fixes#21686
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
On aufs, auplink is run before the Unmount. Irrespective of the
result, we proceed to issue a Unmount syscall. In which case,
demote erros on auplink to warning.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Since the layer store was introduced, the level above the graphdriver
now differentiates between read/write and read-only layers. This
distinction is useful for graphdrivers that need to take special steps
when creating a layer based on whether it is read-only or not.
Adding this parameter allows the graphdrivers to differentiate, which
in the case of the Windows graphdriver, removes our dependence on parsing
the id of the parent for "-init" in order to infer this information.
This will also set the stage for unblocking some of the layer store
unit tests in the next preview build of Windows.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
These fields are needed to specify the exact version of Windows that an
image can run on. They may be useful for other platforms in the future.
This also changes image.store.Create to validate that the loaded image is
supported on the current machine. This change affects Linux as well, since
it now validates the architecture and OS fields.
Signed-off-by: John Starks <jostarks@microsoft.com>
btrfs-progs-4.5 introduces device delete by devid
for this reason btrfs_ioctl_vol_args_v2's name was encapsulated
in a union
this patch is for setting btrfs_ioctl_vol_args_v2's name
using a C function in order to preserve compatibility
with all btrfs-progs versions
Signed-off-by: Julio Montes <imc.coder@gmail.com>
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Now what we provide dynamic binaries for all plaforms,
we shouldn't try to run docker without udev sync support.
This change changes the previous warning to an Error,
unless the user explicitly overrides the warning, in
which case they're at their own risk.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Once thin pool gets full, bad things can happen. Especially in case of xfs
it is possible that xfs keeps on retrying IO infinitely (for certain kind
of IO) and container hangs.
One way to mitigate the problem is that once thin pool is about to get full,
start failing some of the docker operations like pulling new images or
creation of new containers. That way user will get warning ahead of time
and can try to rectify it by creating more free space in thin pool. This
can be done either by deleting existing images/containers or by adding more
free space to thin pool.
This patch adds a new option dm.min_free_space to devicemapper graph
driver. Say one specifies dm.min_free_space=10%. This means atleast
10% of data and metadata blocks should be free in pool before new device
creation is allowed, otherwise operation will fail.
By default min_free_space is 10%. User can change it by specifying
dm.min_free_space=X% on command line. A value of 0% will disable the
check.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Check whether or not the file system type of a mountpoint is aufs
by calling statfs() instead of parsing mountinfo. This assumes
that aufs graph driver does not allow aufs as a backing file
system.
Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>
Previously, Windows layer diffs were written using a Windows-internal
format based on the BackupRead/BackupWrite Win32 APIs. This caused
problems with tar-split and tarsum and led to performance problems
in implementing methods such as DiffPath. It also was just an
unnecessary differentiation point between Windows and Linux.
With this change, Windows layer diffs look much more like their
Linux counterparts. They use AUFS-style whiteout files for files
that have been removed, and they encode all metadata directly in
the tar file.
This change only affects Windows post-TP4, since changes to the Windows
container storage APIs were necessary to make this possible.
Signed-off-by: John Starks <jostarks@microsoft.com>
This allows a graph driver to provide a custom FileGetter for tar-split
to use. Windows will use this to provide a more efficient implementation
in a follow-up change.
Signed-off-by: John Starks <jostarks@microsoft.com>