Wednesday, June 1, 2016

Creating docker images suitable for OpenShift v3 (ssh-git image HowTo)


This is not going to be a detailed guide for creating docker images. I'll present an example ssh-git image and highlight the more important concerns for running such an image on OpenShift v3. Most of this is in the documentation, but I hope to get you started quickly. (update: wow, I thought it was going to be a few lines but it turned into a monster)

tl;dr; skip to the OpenShift section

Plain Docker image

Starting with little docker experience and no knowledge about OpenShift requirements, I just went ahead and created a standard SSH server image. Thanks to a nice git feature, one can just create a local `bare` repo and serve it over SSH (to whoever has a matching key in ~/.ssh/authorized_keys).

I looked around but found only a few Ubuntu examples. My favorite distro is Fedora (I'm affiliated, but still), so I thought that was a shame and went ahead to create a Fedora based Dockerfile. In fact it was pretty much pain-free. Here's my initial version running OpenSSH as root:

The interesting points are:
  • `FROM fedora:latest`
  • `RUN ...` - pretty much standard commands to install ssh and configure a user; I usually also do `restorecon -R ~/.ssh` but inside docker selinux is nil, thus that's skipped.
  • `EXPOSE 22` - so that docker knows which ports are needed
  • `CMD ssh-keygen -A && exec /usr/sbin/sshd -D` - the interesting part here is generating the host keys, as OpenSSH can't work properly without them

Building, running, tagging, pushing


# docker build -t myaccount/imagename:latest PATH

Where `latest` can also be another version.


# docker tag <image> myaccount/imagename:latest

As `<image>` you can use a tag or an image hash. Doesn't matter.


Launch container with ports exposed and giving container a name.
# docker run -d -P --name ssh-git-server myaccount/imagename:latest
btw you can try the built image from `aosqe/ssh-git-server:root-20150525`

Get exposed port number so you can use it later.
# docker port ssh-git-server 22

Put your ssh public key in.
# docker exec ssh-git-server bash -c 'echo "ssh-rsa ..." > /home/git/.ssh/authorized_keys'

Clone the sample repo:
$ git clone ssh://git@localhost:32769/repos/sample.git


# docker rm ssh-git-server
# docker rmi <image tag> # to get rid of the image locally

Sharing with others (pushing)

Then you can push these images to dockerhub:
# docker login
# docker push myaccount/imagename:latest

Image where SSHd runs as a regular user

I knew OpenShift doesn't let you run images as root, so the next step was to create an image where OpenSSH runs as the `git` user. In fact it does allow that, but you have to grant your user extra privileges, and there is really no good reason to do that for an ssh-git server. Also, a future OpenShift Online service would not allow such extra privileges for security reasons. At some point secure root pods will likely be allowed by using user namespaces, with some performance penalty.

That was even less painful thanks to an old post on the cygwin list. Basically, privilege separation needs to be turned off as it can only work as root, and some locations in sshd_config need to be adjusted using `sed`. Finally, a few little `chown/chmod` adjustments. And before I forget, the port cannot be 22 (a regular user cannot bind ports below 1024), so I selected 2022.

So the new things are a few more `RUN` commands and the `USER git` directive, so that the final CMD is run as that user instead of root. Here's the result:
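
As a rough illustration of the `sed` adjustments mentioned above, here is a minimal sketch. It is not the Dockerfile from the repo: the function name `unprivileged_sshd_config` and the PidFile location are assumptions, and it only rewrites lines that already exist in the file (commented out or not). `UsePrivilegeSeparation` is an option of OpenSSH releases of that time; newer releases have deprecated it.

```shell
#!/bin/sh
# Sketch: adapt an existing sshd_config so sshd can be started by a non-root user.
# The function name and the PidFile argument are illustrative assumptions.
unprivileged_sshd_config() {
    conf=$1
    pidfile=$2
    tmp=$(mktemp)
    # Rewrite the existing (possibly commented-out) directives:
    #   Port                   -> 2022, since ports below 1024 need root
    #   UsePrivilegeSeparation -> no, since it only works when sshd runs as root
    #   PidFile                -> a location the regular user can write to
    sed -e 's|^#*Port .*|Port 2022|' \
        -e 's|^#*UsePrivilegeSeparation .*|UsePrivilegeSeparation no|' \
        -e "s|^#*PidFile .*|PidFile $pidfile|" \
        "$conf" > "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Possible use in a RUN step (paths are assumptions):
#   unprivileged_sshd_config /etc/ssh/sshd_config /home/git/sshd.pid
```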

You can try:
# docker run -d -P --name ssh-git-server aosqe/ssh-git-server:git-20150525

But testing this on OpenShift I've got the strange error message:

No user exists for uid 1000530000

I was stuck here for a little while until I figured out that the error is not produced by OpenShift but by the ssh server itself.

OpenShift ready Image

What I found out (see the official guidelines in the References section) is that regardless of the `USER` directive in your Dockerfile, unless you give extra privileges to the user or service account that launches the pod, the pod will run as some random UID. The group will be static though - root.

Because that random UID will not be part of the passwd file, some programs will fail to start with an error message like the one I saw above. Another issue is that pre-setup of SSH becomes impossible, as some files need to have permissions 700 for ssh to accept them. Obviously, as a random UID we cannot repair that once the pod starts.

Here's how I approached it:
  1. move most setup to the container start CMD
  2. make a couple directories writable to the root group so that step #1 can create necessary new files (this time with proper owner and permissions)
  3. make `passwd` root group writable so that we can fix our UID (official guideline suggests using nss wrapper but I thought it's easier to just fix in-place)
End result is otherwise basically the same thing, just moving around the commands:

btw I still have to change the multi-line CMD to a shell script and add it to the image. That would be easier to customize.
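
To illustrate the in-place `passwd` fix from step 3, here is a minimal sketch of a helper such a startup script might contain. It is not the CMD from the repo: the function name `fix_passwd_uid` is hypothetical, and it assumes the `git` user already has an entry in a root-group-writable passwd file.

```shell
#!/bin/sh
# Sketch only: function name, user name and file layout are assumptions.

# Point USER's entry in PASSWD_FILE at the UID the container really runs as.
fix_passwd_uid() {
    passwd_file=$1
    user=$2
    uid=$3
    tmp=$(mktemp)
    # passwd fields: name:password:UID:GID:gecos:home:shell; only field 3 changes.
    awk -F: -v OFS=: -v u="$user" -v id="$uid" \
        '$1 == u { $3 = id } { print }' "$passwd_file" > "$tmp"
    # Write through the existing file instead of renaming over it: the random
    # UID may write to the group-writable file but not create files next to it.
    cat "$tmp" > "$passwd_file"
    rm -f "$tmp"
}

# At container start this might be used as (hypothetical):
#   fix_passwd_uid /etc/passwd git "$(id -u)"
#   ... create ~/.ssh with mode 700, generate host keys ...
#   exec /usr/sbin/sshd -D
```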

Doing it the OpenShift way

Since I've got an OpenShift Enterprise environment running I thought to use it directly instead of using plain docker commands (which would also work fine):
$ oc new-build <git-repo-url> --name=git-server --context-dir=ssh-git-openshift

FYI you can append `#branch` to the repo URL if you want to build off a non-default branch. The good thing about this approach is that the image can be rebuilt automatically when the base image (fedora:latest) changes and when your code changes. You may need to configure hooks though. See the triggers doc.

To monitor build:
$ oc logs -f bc/git-server

In the log, you will see something like (can be useful later):
The push refers to a repository []
Now run the image by:
$ oc new-app git-server:latest --name=git-server

You would end up with a deployment config called git-server that creates a replication controller `git-server-1` that keeps one pod called `git-server-...` running from the `git-server` image stream created by the `new-build` command. Also a service called `git-server` is created that will provide you with a stable IP to access the pod + its name can be used as a hostname of the git server in any pod or build that happens within the same project.

One last detail is to make service listen on port 22 for nicer git URLs:
$ oc edit svc git-server # change `port` to 22 from 2022

Note that services can only be accessed from pods running in the same project or in project 'default'. To access the service from the internet, you need to create a nodePort service. Because this is not HTTP based, we can't use regular routes. I hope to get to that later.

To see your pod name and use it, you can do:
$ oc get pod # see pod name

$ oc rsh git-server-... # configure ssh keys there, create repos, etc.

Now once you have your public key in the pod, you can access this server from other pods. To try it out, you can even do so from the server pod itself, provided you have the matching private key there. While inside `rsh` do:
$ git clone git@git-server:sample.git

To push your image to dockerhub, see how to set build config output. Or you can manually ssh to the OpenShift node and do as root:
# docker tag
# docker login 
# docker push
If you want to run your image off dockerhub, you can do:
$ oc run git-server --image=aosqe/ssh-git-server-openshift
$ oc expose dc git-server --port=22 --target-port=2022
$ oc set probe dc/git-server --readiness --open-tcp=2022

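Setting the readiness probe lets OpenShift know when the pod is actually accepting connections, so the service only sends traffic to it once it is ready. (A liveness probe, `--liveness`, is what gets a dead container restarted.)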

Some words about persistent volumes
The way the images I refer to above are built, any changes to public keys and repo data would be lost upon pod restart. To avoid that, persistent volumes need to be used.
Persistent volumes will be chowned to the current UID of the pod at attach time. Given that the OpenShift ready image does its setup at launch time, that should be easy to support, e.g. by mounting the volume at /home/git/.

But a few changes will still need to be done:
  • creation of sample git repo needs to be conditional, when it doesn't exist
  • `sshd_config` and ssh-keygen should create host keys somewhere in `git` user home dir to keep host keys between pod restarts
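
The two changes above could look roughly like this in a startup script. This is a sketch under assumptions, not code from the repo: the helper names and all paths are hypothetical, and the host key directory must already exist on the volume.

```shell
#!/bin/sh
# Sketch with hypothetical names and paths; adjust to wherever the volume is mounted.

# Create the bare repo only if it is not already on the volume.
ensure_bare_repo() {
    repo=$1
    # A bare repository keeps its HEAD file at the top level.
    [ -f "$repo/HEAD" ] || git init --bare "$repo"
}

# Generate an sshd host key only on first start, so that clients keep seeing
# the same host key after a pod restart.
ensure_host_key() {
    key=$1
    [ -f "$key" ] || ssh-keygen -q -t rsa -N '' -f "$key"
}

# Possible use at container start (paths are assumptions):
#   ensure_bare_repo /home/git/repos/sample.git
#   ensure_host_key  /home/git/hostkeys/ssh_host_rsa_key
# sshd_config would then need a matching `HostKey` line.
```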

Future work

  • make the OpenShift ready image runnable off a persistent volume  
  • add info about making repo accessible from the Internet
  • convert multi-line CMD to a startup script
Current post is based on the initial commit of the docker files in the repo.


    Friday, May 13, 2016

    quick debugging KVM VM issues

    See a hang or infinite loop, or perf issue with VM on KVM? Here's how to get a trace of it so a bugzilla report can be meaningful:

    First attach VM configuration XML. That is obtained by:

    >  sudo virsh dumpxml [vm_name] > some_file

    Cole Robinson wrote on 09/23/2014 04:24 PM:
    > sudo debuginfo-install qemu-system-x86
    > Then on the next hang, grab the pid of the busted VM from ps axwww, and do:
    > sudo pstack $pid
    > The dump that output in a bug report, along with
    > /var/log/libvirt/qemu/$vmname.log. File it against qemu

    Also interesting might be system log from Host and guest. On Fedora you can obtain it by a command similar to:

    > sudo journalctl --system --since today

    Tuesday, May 10, 2016

    replicating HTTP Server replies using ncat and socat

    I was looking at an issue that rest-client ruby gem raised an error on `#cookies_jar` on one particular server while it worked fine on a couple of public servers I tried [1].

    I was just going to write a simple script to serve as a HTTP server to return me same response as the offending HTTP server but hey, I thought, there must be an easier way.

    So I just obtained raw response from original server, put it into a file and asked netcat to listen and give it back on request.

    $ cat > response.raw << "EOF"
    HTTP/1.1 200 OK
    Accept-Ranges: bytes
    Content-Length: 36
    Content-Type: text/html; charset=utf-8
    Last-Modified: Mon, 11 Apr 2016 05:39:53 GMT
    Server: Caddy
    Date: Tue, 10 May 2016 08:10:17 GMT
    Set-Cookie: OPENSHIFT_x7xn3_service-unsecure_SERVERID=c72192d7fe9c33d8dec083448dd4f40f; path=/; HttpOnly
    Cache-control: private

    Hello-OpenShift-Path-Test http-8080
    EOF
    $ nc -l 8080 < response.raw
    ## on another console
    $ curl -v localhost:8080 

    That's the simplest I could get. It will return the same thing regardless of the path and query string you put in your client URL, e.g. this will work the same:

    $ curl -v 'localhost:8080/path?asd=5'

    Now if you want your server to return something multiple times, then you can try

    $ nc -kl 8080 -c 'cat response.raw'

    Another option if your system lacks netcat is the `socat` utility.

    $ socat TCP-LISTEN:8080,fork EXEC:"cat response.raw" 

    If you remove `fork` from the options, it will exit after first connection served. But we can also listen over HTTPS:

    $ socat OPENSSL-LISTEN:8080,cert=/path/cert.pem,verify=0 EXEC:"cat response.raw"

    Again, add the `fork` option to keep listening. The command above will ignore client certificates. In fact you can create a proper client cert and configure SSL verification, but that's beyond today's topic. FYI, use `socat` version, otherwise you'd be hit with weak DH key used [2]. As a workaround you could generate a DH key in a file and provide it with the `dhparams` option to socat.


    Friday, April 15, 2016

    ruby calling methods without parentheses can be misleading

    This one stayed in my drafts for a year maybe. Thought to add a short explanation and publish it.
    [1] pry(main)> class Test
    [1] pry(main)*   def gah
    [1] pry(main)*     puts "gah"
    [1] pry(main)*   end 
    [1] pry(main)*   def fah
    [1] pry(main)*     gah
    [1] pry(main)*     puts gah
    [1] pry(main)*     gah="now local variable"
    [1] pry(main)*     puts gah
    [1] pry(main)*     gah()
    [1] pry(main)*   end 
    [1] pry(main)* end 

    [2] pry(main)> Test.new.fah
    gah
    gah

    now local variable
    gah
    => nil
    Basically, if for some reason a variable is defined in the current context with the same name as a method, then calling that method later may result in using the variable instead of the method.

    Calling `gah` in the beginning  results in calling the instance method `#gah`. But after we do `gah="now local variable"` then calling `gah` results in obtaining the local variable `gah` value. Finally calling `gah()` always results in calling the instance method.

    Simple thing but can be confusing. Solution would be to always call methods using parentheses or make your methods short to easily spot any such mistakes.

    My initial reaction when I first learned you can call methods without parentheses in ruby was that it is a bad idea. Then I became lazy and stopped using them. How can we be so lazy? Only 2 characters? Oddly enough I don't feel like starting to write parentheses again.

    Monday, February 15, 2016

    rsync to/from OpenShift v3 pods

    Update: Some pointed out that the `oc rsync` command already exists. Shame on me, I missed that. My only solace is that it does not support all rsync options (yet). Read below only if the standard command does not work for you (or if you're curious to know how it works).

    I was thinking about easily copying files to/from OpenShift pods and thought it would be awesome if I could make `rsync` use `oc exec` instead of ssh to do that. This should not be a common use case as pods should generally be stateless, but one may want to back up data from a persistent volume, for example when the environment is not under the pod owner's control.

    You may already know that ssh access is not available to OpenShift/Kubernetes managed pods, but at least in OpenShift one can use the client tool to access them in an ssh-like fashion. That's done using the `oc rsh` and `oc exec` sub-commands.

    In fact `oc rsh` only wraps `oc exec` by adding its options `-i` to pass stdin to the remote process as well as `-t` for a terminal. For rsync we only need `-i` though. Here's the magic incantation:

    $ rsync -av -e 'oc exec -n fs1d4 -i myapp-1-vytqm' -- /tmp/ec2-user/ --:haha/
    sending incremental file list
    created directory haha
    sent 142 bytes  received 38 bytes  72.00 bytes/sec
    total size is 4  speedup is 0.02

    Let me dissect that for you. First we use the `-a` option of rsync because I want a recursive sync keeping all file properties, as well as `-v` to see what actually happened. You can add and mix any other `rsync` options here like `--delete`, `--exclude`, etc.

    Then we specify the rsh command and that is 'oc exec -n fs1d4 -i myapp-1-vytqm' where `fs1d4` is your project name and `myapp-1-vytqm` is the desired pod name. You can use any other `oc` option here like `--config`, `--container`, etc. The important points are:

    • keep the `-i` option so that rsync can talk to remote process
    • do not include the `--` option terminator so that rsync can later add it

    After that we use the `--` option terminator to tell rsync to treat any further command parameters as path location specifiers and not options. This is important because our hack forces us to use `--` as a hostname in the SRC or DST location specifier. More on that later.

    Our SRC location specifier is `/tmp/ec2-user/` and that is a local test directory. The DST location specifier is `--:haha/` which means relative path `haha/` on host `--`. The reason is that if we only specify "haha/", then rsync will consider that location local and will not invoke the remote shell. So we need to specify some hostname. But whatever we specify, it will break our `oc exec` command. So I figured, I can specify `--` for the hostname and rsync will just append to the remote shell command. And in fact we need `--` appended, otherwise the remote call to `rsync` will fail.

    You may already know that local `rsync` calls `rsync` on the remote end and then the two processes communicate over stdin/stdout. To keep `oc exec` from interpreting the options of the remote `rsync` invocation, we need that `--` as a hostname. And to keep `--:haha/` from being interpreted as an option to the local `rsync` invocation, we need the first `--` above.
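
If you use this often, the incantation can be wrapped in a small shell function. This is only a sketch: `pod_rsync` is a hypothetical name (not an `oc` or `rsync` command), and it assumes you pass the pod-side path with the `--:` prefix yourself, in either SRC or DST position.

```shell
#!/bin/sh
# Hypothetical wrapper around the incantation above.
# Usage: pod_rsync PROJECT POD SRC DST   (prefix the pod-side path with "--:")
pod_rsync() {
    project=$1
    pod=$2
    shift 2
    # -i keeps stdin open so the two rsync processes can talk to each other.
    # There is no trailing "--" in the remote shell command because rsync
    # appends the fake hostname "--" itself.
    # The "--" before the paths stops local rsync from parsing "--:..." as an option.
    rsync -av -e "oc exec -n $project -i $pod" -- "$@"
}

# Example (same values as in the post):
#   pod_rsync fs1d4 myapp-1-vytqm /tmp/ec2-user/ --:haha/
```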

    I doubt my ability to explain but hopefully gave you some pointers how to make it work. Here are a few other important points:

    • make sure your pod has `rsync` already installed and pointed to in the pod PATH variable. Or use `--rsync-path` option.
    • make sure you have write access to destination dir
    • make sure you're already logged into OpenShift (oc login)
    • make sure to write correct commands as mistakes usually produce hardly informative output (`strace -f` helps to debug)

    Hope that helps somebody.