Tuesday, October 22, 2024

Grub repair with LVM+Multipath

I recently encountered an issue after running a grub update in an off-prem cloudy environment (Power Virtual Server), where I couldn't use a USB key for grub rescue.

I *really* wanted that VM back, so I lived through the pain of a browser console repair operation. (Tip: you can select and middle-mouse-click to paste from within the console, even if you can't copy-paste into the console.)

I used https://www.linuxfoundation.org/blog/blog/classic-sysadmin-how-to-rescue-a-non-booting-grub-2-on-linux, which is accurate and very helpful, except that it doesn't mention LVM/multipath disks. If you're seeing disk labels like (ieee1275//vdevice/vfc-client@30000007/disk@5005076810149062,msdos2) when you run the "ls -l" command, here's what I modified in the instructions to get my VM booted with a basic grub config.

1. My set root looked like this:

set root=(ieee1275//vdevice/vfc-client@30000007/disk@5005076810149062,msdos2) 

2. My linux line looked like this:

linux /boot/vmlinuz<tab-complete> root=UUID=85399057-074b-4c82-85fb-1f82770b2646

3. My initrd was an initramfs file, so my initrd line looked like this:

initrd /boot/initramfs<tab-complete>.img


The UUID for the disk I needed was included in the disk info that ls -l provided. You can also find it (once you've identified the root partition you need) in /boot/grub/grub.cfg.
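Putting it all together, the full sequence at the grub prompt looked roughly like this (the device path and UUID are from my system, and the kernel/initramfs names were tab-completed, so treat it as a sketch rather than something to paste verbatim):

set root=(ieee1275//vdevice/vfc-client@30000007/disk@5005076810149062,msdos2)
linux /boot/vmlinuz<tab-complete> root=UUID=85399057-074b-4c82-85fb-1f82770b2646
initrd /boot/initramfs<tab-complete>.img
boot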

Wednesday, March 25, 2020

Some arch-querying examples for building

Most likely when building, be it container images or standalone binaries, you don't want a separate set of scripts or Makefiles or Dockerfiles (or whatever you're using) for each architecture you want to target (e.g. amd64 or ppc64le). I was asked by some new team members to provide examples of ways to get the architecture of the system your scripts are running on. It was a short little list of things I've used and seen over the years, so I thought it might be nice to share publicly.

-------------------------

The first example uses shell and uname in a Makefile. Under the lint target, you can see that linting is only done for amd64.

.PHONY: lint
lint:
ifeq ($(ARCH), amd64)
@git diff-tree --check $(shell git hash-object -t tree /dev/null) HEAD $(shell ls -d * | grep -v cfc-files)
@yamllint .
@docker run --rm -v $(shell pwd):/data -w /data $(ANSIBLE_IMAGE) ansible-playbook -e @cluster/config.yaml playbook/install.yaml --syntax-check
endif

But where did `ARCH` come from? In this use case, which I've lifted from an internal project, there's an included file called Configfile that gets the arch using `uname`. You could just as easily put that at the top of your Makefile.
ARCH ?= $(shell uname -m | sed 's/x86_64/amd64/g')
ifeq ($(ARCH), amd64)
DOCKER_FLAG ?= Dockerfile
else
DOCKER_FLAG ?= Dockerfile.$(ARCH)
endif
You can see in this example that this project *does* have a separate Dockerfile for each target arch, which is fine. (Though you can maintain just one if you use multi-arch base images and/or build-args.)

You can also see that there's a substitution done for x86_64: `uname -m` says x86_64, while Docker-land generally says amd64, and most of the time using those interchangeably is fine. There are similar cases for the ARM variants (aarch64 vs. arm64, for example), so this is a good thing to keep in mind; it'll keep your case statements cleaner later on.
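For example, a general-purpose normalizer might look something like this (a sketch; the mappings here are just the common uname-to-Docker-style names, so adjust to whatever naming your tooling expects):

# normalize uname -m output to Docker-style arch names
arch="$(uname -m)"
case "$arch" in
    x86_64)            arch=amd64 ;;
    aarch64 | arm64)   arch=arm64 ;;
    armv7l)            arch=arm ;;
    ppc64el | ppc64le) arch=ppc64le ;;
    *) echo "unsupported architecture: $arch" >&2; exit 1 ;;
esac
echo "$arch"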

---------------------------

My next example is a trick I stole from the nvidia-docker maintainers that lets you get an if statement into a Dockerfile. Docker intentionally excludes conditional logic from Dockerfiles so that your images are the same from build to build, so use this example with great caution, and keep your images consistent.


RUN set -eux; \
    arch="$(uname -m)"; \
    case "${arch##*-}" in \
        x86_64 | amd64) ARCH='amd64' ;; \
        ppc64el | ppc64le) ARCH='ppc64le' ;; \
        *) echo "unsupported architecture"; exit 1 ;; \
    esac; \
    wget -nv -O - "https://storage.googleapis.com/golang/go${GO_VERSION}.linux-${ARCH}.tar.gz" \
        | tar -C /usr/local -xz

I have this in a Dockerfile of my own that I use to set up a build environment. As you can see, it's a way to use a download URL that has a hard-coded architecture in it.
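(A newer-Docker aside: if you're building with BuildKit/buildx, the predefined TARGETARCH build arg already contains the Docker-style arch name for the platform you're building for, which can replace the whole case statement. A minimal sketch, with an illustrative base image and Go version:)

FROM ubuntu:20.04
# TARGETARCH is filled in automatically by BuildKit (e.g. amd64, ppc64le)
ARG TARGETARCH
ARG GO_VERSION=1.14.1
RUN apt-get update && apt-get install -y wget ca-certificates && \
    wget -nv -O - "https://storage.googleapis.com/golang/go${GO_VERSION}.linux-${TARGETARCH}.tar.gz" \
    | tar -C /usr/local -xz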


-----------------------------

So that's it! If you have any of your own fun tricks for your projects, please share them!

Monday, August 28, 2017

Create a private docker registry with a self-signed cert

Disclaimer: Don't do this in production. :D

I have found myself needing to have my own registry so that I could see its internal debug log. It's easy enough to spin up a registry locally and use it with loopback and the --insecure-registry flag -- but I want to use TLS. And I want to use a self-signed cert. And I want to use an IP address instead of a hostname. And I want a pony (j/k, a kitten. j/k, 3 kittens).

The doc that I found didn't tell me about the /etc/docker/certs.d/<host>:<port>/ part of all this. And it was a pain for me to get the IP SAN working (not going to explain the embarrassing mistake I made there).

So here is my dirty dirty cheat sheet for next time:

create a cert with an IP SAN (Subject Alternative Name, not Storage Area Network):

$ cp /etc/pki/tls/openssl.cnf .   # location may vary
$ vi openssl.cnf

# uncomment
req_extensions = v3_req

# Modify the v3_req section as follows:
[ v3_req ]
subjectAltName = @alt_names
# Extensions to add to a certificate request
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
[alt_names]
IP.1 = 192.168.1.2
IP.2 = 10.53.10.1

--

For just one IP, you can simply use
subjectAltName = IP:10.53.10.1
(removing the alt_names section)

--

# then run:
$ openssl req -x509 -nodes -days 730 -newkey rsa:4096  -keyout certs-dir/domain.key -out certs-dir/domain.crt -config openssl.cnf  -sha256

(If that doesn't work, add `-extensions v3_req`)
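(Another aside: with openssl 1.1.1 or newer, you can skip the openssl.cnf editing entirely and pass the SAN on the command line with -addext; a sketch for the single-IP case:)

$ openssl req -x509 -nodes -days 730 -newkey rsa:4096 \
    -keyout certs-dir/domain.key -out certs-dir/domain.crt \
    -addext "subjectAltName = IP:10.53.10.1" -sha256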

--

# [optional sanity-check] confirm IPs in cert:
openssl x509 -text -in certs-dir/domain.crt -noout | grep "IP Address"

# run docker and bind-mount in the cert + key:

docker run -dit -p 5000:5000 --name registry -v `pwd`/certs-dir/:/certs -e "REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt" -e "REGISTRY_HTTP_TLS_KEY=/certs/domain.key" registry:2

export DOMAIN_NAME=192.168.1.2

# load the cert system-wide:
openssl s_client -connect $DOMAIN_NAME:5000 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM | sudo tee /etc/pki/ca-trust/source/anchors/$DOMAIN_NAME.crt

$ sudo update-ca-trust
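# (if you're on a Debian/Ubuntu flavor instead of RHEL/Fedora, the equivalent should be:)
$ openssl s_client -connect $DOMAIN_NAME:5000 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM | sudo tee /usr/local/share/ca-certificates/$DOMAIN_NAME.crt
$ sudo update-ca-certificates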

# load the cert into docker's config:
$ sudo mkdir -p /etc/docker/certs.d/$DOMAIN_NAME:5000
$ openssl s_client -connect $DOMAIN_NAME:5000 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM | sudo tee /etc/docker/certs.d/$DOMAIN_NAME:5000/ca.crt

$ sudo /bin/systemctl restart docker.service

# verify again
openssl x509 -text -in /etc/pki/ca-trust/source/anchors/$DOMAIN_NAME.crt -noout | grep "IP Addr"

$ docker start registry

$ docker pull hello-world
$ docker tag hello-world $DOMAIN_NAME:5000/hello-world
$ docker push $DOMAIN_NAME:5000/hello-world
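# [optional sanity-check] hit the registry's v2 API over TLS; after the push above I'd expect something like:
$ curl https://$DOMAIN_NAME:5000/v2/_catalog
{"repositories":["hello-world"]}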

--

Saturday, April 15, 2017

DockerCon 2017: Multi-Arch Resources

A huge shout-out to everyone who came to our DockerCon talk! Here is a short list of resources if you'd like to get started on a multi-arch journey.
Thanks,

- Christy & Chris

Wednesday, August 31, 2016

Using the host timezone in a docker container

I was recently asked if there was a docker option available to set the timezone in a container. There isn't one, and I started looking into whether it would be a good feature to add. I found several GitHub issues discussing how best to set it: there were recommendations of bind-mounting /etc/localtime, or of setting an environment variable (though no one went into much detail on that one). A little googling this morning turned up an environment variable that was new to me: TZ.

TZ is available on all POSIX-compliant systems. You can pass in a timezone using the Olson format, e.g. America/Chicago. Read more about TZ here: https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html.

To find the host's timezone: Ubuntu and RHEL-based systems both seem to symlink /etc/localtime to a timezone file. If it's been overwritten with a plain copy on your system, you'll have to adjust accordingly.

On my laptop:
> readlink /etc/localtime
../usr/share/zoneinfo/America/Chicago

Docker has a flag, -e, to pass in environment variables via the CLI.

-----------------

Using the two together, all you really need is:

`docker run -it --rm -e "TZ=$(readlink /etc/localtime | cut -d '/' -f5,6 )" centos:7`
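(If the symlink on your system has a different path depth, a variant that just strips everything up to zoneinfo/ is a bit more robust:)

`docker run -it --rm -e "TZ=$(readlink /etc/localtime | sed 's|.*/zoneinfo/||')" centos:7`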

You can run it once without specifying the timezone:
> docker run -it --rm  centos:7 date +%Z
UTC

And again with it to check:
> docker run -it --rm -e "TZ=$(readlink /etc/localtime | cut -d '/' -f5,6)" centos:7 date +%Z
CDT

You can also choose a specific timezone:
> docker run -it --rm -e "TZ=Asia/Tehran" centos:7 date +%Z
IRDT

Wednesday, June 15, 2016

running x86 containers on your ppc64le system

My last post was about running other architectures' containers on your laptop. This one's about running x86_64/amd64 containers on your ppc64le system!

If you didn't read the previous post, and want to know how this works, here it is: http://christycodes.blogspot.com/2016/06/running-cross-arch-container-images-on.html

Here's how you can do this:
~> uname -m
ppc64le


1. Get the qemu emulator binaries:
~> apt-get download qemu-user-static
~> ls qemu-user-static_1%3a2.5+dfsg-5ubuntu10.1_ppc64el.deb
qemu-user-static_1%3a2.5+dfsg-5ubuntu10.1_ppc64el.deb
~> sudo dpkg --force-all -i qemu-user-static_1%3a2.5+dfsg-5ubuntu10.1_ppc64el.deb
Note: I intentionally didn't use apt-get install for qemu-user-static because I didn't want to pull in the binfmt-support package.

2. Get/run the container that registers the binfmt hooks:
~> mkdir multiarch && cd multiarch && git clone https://github.com/clnperez/qemu-user-static.git && cd qemu-user-static/register
~> docker build -t multiarch/qemu-user-static:register .
~> ls /proc/sys/fs/binfmt_misc/
register  status

~> docker run --rm --privileged multiarch/qemu-user-static:register
~> ls /proc/sys/fs/binfmt_misc/
aarch64  alpha  arm  armeb  i386  i486  m68k  mips  mips64  mips64el  mipsel  mipsn32  mipsn32el  register  s390x  sh4  sh4eb  sparc  status


3. Run your x86 image:
> docker run --rm -v /usr/bin/qemu-x86_64-static:/usr/bin/qemu-x86_64-static busybox uname -a
Linux 77ce603ac0f1 4.4.0-22-generic #40-Ubuntu SMP Thu May 12 22:03:35 UTC 2016 x86_64 GNU/Linux
warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]


Note: That TCG warning means there's a missing CPU feature, but I'm not entirely sure that TCG doesn't support vmx, so I'm going to ask around.



Running cross-arch container images on your linux laptop

With the introduction of Docker for Mac, I ran across an exciting blog post: http://blog.hypriot.com/post/first-touch-down-with-docker-for-mac. I don't use a Mac for development, but what made that blog post interesting to me was the "Easter Egg" bit. It was titled, "There is another big ARM surprise," which is pretty sweet (so hopefully you've read that by now).

But what about other architectures? And what about not just doing this in a Mac? Well, get your ribboned baskets ready, because that Easter Egg has led me to a giant Easter Egg minefield of awesome. There are some folks over at Scaleway working on a multiarch project, and they've put together two key things:

  1. All the Easter Eggs: [scroll to Downloads] https://github.com/multiarch/qemu-user-static/releases
  2. The prep your Easter basket needs to use them: https://hub.docker.com/r/multiarch/qemu-user-static
So let's back up a little and talk about what is going on here. Some of this was mentioned in the blog post I referenced at the beginning of this post, but I think a bit more exploration is fun. (If you don't, skip down to the teal deer).

In Linux, there's a kernel feature that lets you run ELF binaries that weren't compiled for the architecture you're running on. It's called binfmt_misc, and you can read more about it here: https://www.kernel.org/doc/Documentation/binfmt_misc.txt. binfmt_misc doesn't actually run the program. It just provides the mechanism to make sure the right interpreter does, based on some bits embedded in the program itself. The kernel checks those magic bits, then cross-references with what it finds registered under /proc/sys/fs/binfmt_misc/ for what to do.

That's where #1 comes in. binfmt_misc comes shipped with most Linux distros, but it can't do the job alone. It needs an interpreter. And what better interpreter than qemu? The multiarch project includes over a dozen compiled static qemu binaries! There's more than just the ARM qemu binary in there, so whatever architecture you want to run, I bet it's in their list.

But you can't just plop the emulator onto your system and start running ARM or POWER containers. You've got to let binfmt_misc know which binaries should do what. You've got to set up those magic numbers, and also have them point to the right place. That's where #2 is fantastic. Not only are all the strings that binfmt_misc needs already assembled, the Scaleway folks created a docker container that will add them all to your host! If they hadn't, you'd have to put together strings like

:ppc64le:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x15\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\x00:'${QEMU_BIN_DIR}'/qemu-ppc64le-static:

and then get them in the right place in your fs. Instead, you just run:
$ docker run --rm --privileged multiarch/qemu-user-static:register

and you're set up! 

You can check that these were added by:
$ ls /proc/sys/fs/binfmt_misc/
aarch64  alpha  arm  armeb  kshcomp  m68k  mips  mips64  mips64el  mipsel  mipsn32  mipsn32el  ppc  ppc64  ppc64le  register  s390x  sh4  sh4eb  sparc  status

(Note: You do also have to have binfmt_misc mounted on your system, but I'm leaving that step out because on my F23 workstation it was mounted by default.)
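To see what one of those entries actually contains, you can cat it. Here's what I'd expect the ppc64le entry to hold, given the registration string above (the interpreter path depends on QEMU_BIN_DIR):

$ cat /proc/sys/fs/binfmt_misc/ppc64le
enabled
interpreter /usr/bin/qemu-ppc64le-static
flags:
offset 0
magic 7f454c4602010100000000000000000002001500
mask ffffffffffffff00fffffffffffffffffeffff00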

All that's left is running the container. But the container needs access to the emulator, so you can just bind-mount it at runtime (e.g. with docker's -v).

So now, for those of you who stayed with me, thanks. For everyone else, it's deer time.

Oh hai! Let's go:
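(A sketch of the whole thing end to end, with ppc64le as the target; this assumes you've downloaded qemu-ppc64le-static from the releases page in #1, and the image name is just illustrative:)

$ docker run --rm --privileged multiarch/qemu-user-static:register
$ docker run --rm -v $(pwd)/qemu-ppc64le-static:/usr/bin/qemu-ppc64le-static ppc64le/busybox uname -m
ppc64le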
Tada!

My example was with ppc64le, but you can download one of the other qemu binaries in the first step, depending on the intended arch of the container you want to run.