First things first. This post is about a problem that drove me crazy for the last 2 weeks. I am using lxc containers with lvm, and it went from "it’s ok and it’s working" (with ansible) to "nothing is working anymore".

I even made a meme of the situation.

crashing containers by using lvm

The beginning

At LILIK we are trying to achieve a working infrastructure using containers and a configuration management tool. The good part is that when it’s working it’s flawless, integrated and mostly boring. The bad part is that you have to really dive into complicated tools when things are not working.

LXC

We would like to use containers because we want to host many services but only have 2 servers (asus vivo pc). It could be done another way, but with the software separation granted by containers we enhance security and keep configuration simpler.

LVM

To provision logical partitions for LXC containers we use LVM, a logical volume manager that handles I/O on our host. I don’t see advantages over using directories to store the container root filesystem, but I trust my peers' advice.

Creating lxc containers on lvm

We are using Ansible to provision containers and configure them, but it could be done manually from the terminal:

lxc-create -n wiki \
           --bdev lvm \
           --vgname biffvg \
           --lvname vm_wiki \
           --fstype xfs \
           --fssize 5G \
           -t debian \
           -- --packages=ssh,python --release=stretch

A really brief overview: this creates an LXC container called wiki, uses lvm as storage in the biffvg volume group, names the volume vm_wiki, formats it as xfs with a size of 5G, and uses the debian template with the options --packages=ssh,python --release=stretch.

Recently I upgraded our container host and installed version 2.0.5 of lxc, and this has led to a problem with Ansible's lxc_container module.

Ansible can’t really ask for user input during a task execution, but during container creation lvm needed an interactive confirmation:

lxc-create -n wiki \
           --bdev lvm \
	   --vgname biffvg \
	   --lvname vm_wiki \
	   --fssize 5G \
	   --fstype xfs \
	   -t debian -- --release=stretch --packages=ssh,python
File descriptor 3 (/var/lib/lxc/wiki/partial) leaked on lvcreate invocation. Parent PID 10761: lxc-create
  Using default stripesize 64.00 KiB.
  WARNING: xfs signature detected on /dev/biffvg/vm_wiki at offset 1080. Wipe it? [y/n]: 

This interactive question would break our playbook with a timeout error: we couldn’t give an answer, and Ansible interpreted the timeout as a failure. This is odd, as we had already addressed this problem with a destroy_container playbook that does exactly that, a filesystem wipe. Evidently lvm finds the old signature anyway; my guess is that it is stored somewhere in metadata, not in the volume.

This behaviour can be overcome by setting an lvm configuration variable in lvm.conf:

         # Configuration option allocation/wipe_signatures_when_zeroing_new_lvs.
         # Look for and erase any signatures while zeroing a new LV.
         # The --wipesignatures option overrides this setting.
         # Zeroing is controlled by the -Z/--zero option, and if not specified,
         # zeroing is used by default if possible. Zeroing simply overwrites the
         # first 4KiB of a new LV with zeroes and does no signature detection or
         # wiping. Signature wiping goes beyond zeroing and detects exact types
         # and positions of signatures within the whole LV. It provides a
         # cleaner LV after creation as all known signatures are wiped. The LV
         # is not claimed incorrectly by other tools because of old signatures
         # from previous use. The number of signatures that LVM can detect
         # depends on the detection code that is selected (see
         # use_blkid_wiping.) Wiping each detected signature must be confirmed.
         # When this setting is disabled, signatures on new LVs are not detected
         # or erased unless the --wipesignatures option is used directly.
         wipe_signatures_when_zeroing_new_lvs = 0
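If you would rather script the change than edit the file by hand, something like the following could work. It is only a sketch: set_wipe_signatures is a made-up helper name, it relies on GNU sed (which is what a Debian host ships), and it assumes the option is already present as an uncommented line. If your lvm.conf only has it commented out, this will not touch it.

```shell
#!/bin/sh
# Sketch: switch the signature-wiping option in an lvm.conf-style file.
# set_wipe_signatures is a hypothetical helper, not something lvm ships.
set_wipe_signatures() {
    conf_file=$1   # path of the config file to edit
    value=$2       # 0 disables the wipe prompt, 1 enables it
    # Rewrite only the uncommented assignment and keep its indentation.
    # Comment lines start with '#', so the pattern does not match them.
    sed -i -E \
        "s/^([[:space:]]*)wipe_signatures_when_zeroing_new_lvs[[:space:]]*=.*/\1wipe_signatures_when_zeroing_new_lvs = ${value}/" \
        "$conf_file"
}

# Example call (needs root on a real host):
#   set_wipe_signatures /etc/lvm/lvm.conf 0
```

Trying it on a copy of the file first and diffing the result is probably a good idea before pointing it at the real one.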

With this setting lvm stops asking about wiping filesystem signatures when a new logical volume is created, but it also means you have to clean up a container’s filesystem yourself on deletion.
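A cleanup on deletion could look roughly like this. This is not our actual destroy_container playbook, just a sketch of the idea from the terminal: destroy_container_clean, run and DRY_RUN are names I made up, and the device path assumes the usual /dev/<vg>/<lv> layout seen in the lvm warning earlier.

```shell
#!/bin/sh
# Sketch: wipe filesystem signatures before destroying a container.
# destroy_container_clean, run and DRY_RUN are hypothetical names.

run() {
    # Print the command when DRY_RUN is set, otherwise execute it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

destroy_container_clean() {
    name=$1   # container name, e.g. wiki
    vg=$2     # volume group, e.g. biffvg
    lv=$3     # logical volume, e.g. vm_wiki
    # Stopping fails if the container is not running; that is fine here.
    run lxc-stop -n "$name" || true
    # Erase every signature wipefs can find on the volume, then destroy.
    run wipefs --all "/dev/$vg/$lv" &&
    run lxc-destroy -n "$name"
}

# Dry run, only prints what would be executed:
#   DRY_RUN=1 destroy_container_clean wiki biffvg vm_wiki
```

The dry-run switch is there so you can check the three commands before letting them loose on a real volume.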

Another problem is that lvm, when creating a logical volume, expects only 3 file descriptors to be open: STDIN, STDOUT and STDERR. Any others will be closed, and you will see a line like this in the command output (generally your terminal).

On container creation

lxc-create ...
File descriptor 3 (/var/lib/lxc/wiki/partial) leaked on lvcreate invocation. Parent PID 12651: lxc-create
Using default stripesize 64.00 KiB.
...

and on container destruction

lxc-destroy -n wiki 
File descriptor 3 (/run/lxc/lock/var/lib/lxc/.wiki) leaked on lvremove invocation. Parent PID 15673: lxc-destroy

Evidently lxc uses file descriptors to keep track of what it is doing on which container. This is also a problem, as Ansible picks up this message as an error.

To get rid of these messages there is an environment variable to set:

export LVM_SUPPRESS_FD_WARNINGS=1

Now lvm won’t report that file descriptors have been closed. This is not a good move, as it hides a log message that could come in handy when debugging.
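One way to limit that cost is to set the variable only for the commands that need it instead of exporting it for the whole shell. A minimal sketch, where quiet_lvm is a made-up wrapper name:

```shell
#!/bin/sh
# Sketch: silence the fd warnings for a single command only.
# quiet_lvm is a hypothetical wrapper name.
quiet_lvm() {
    # env sets the variable for the child process (and whatever it
    # spawns, such as lvcreate or lvremove) and leaves this shell alone.
    env LVM_SUPPRESS_FD_WARNINGS=1 "$@"
}

# Example calls:
#   quiet_lvm lxc-destroy -n wiki
#   quiet_lvm lxc-create -n wiki --bdev lvm --vgname biffvg ...
```

Every other lvm invocation on the host keeps printing the warning, so the message is still there when you need it for debugging.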