Monday, February 15, 2016

rsync to/from OpenShift v3 pods

Update: Some people pointed out that an `oc rsync` command already exists. Shame on me, I missed that. My only solace is that it does not support all rsync options (yet). Read on only if the standard command does not work for you (or if you are curious how this works).

I was thinking about an easy way to copy files to and from OpenShift pods and thought it would be awesome if I could make `rsync` use `oc exec` instead of ssh as its transport. This should not be a common use case, as pods should generally be stateless, but one may want to back up data from a persistent volume, for example, when the environment is not under the pod owner's control.

You may already know that ssh access is not available to pods managed by OpenShift/Kubernetes, but at least in OpenShift one can use the client tool to access them in an ssh-like fashion. That's done using the `oc rsh` and `oc exec` sub-commands.

In fact `oc rsh` only wraps `oc exec`, adding its options `-i` to pass stdin to the remote process and `-t` to allocate a terminal. For rsync we only need `-i`, though. Here's the magic incantation:

$ rsync -av -e 'oc exec -n fs1d4 -i myapp-1-vytqm' -- /tmp/ec2-user/ --:haha/
sending incremental file list
created directory haha

sent 142 bytes  received 38 bytes  72.00 bytes/sec
total size is 4  speedup is 0.02

Let me dissect that for you. First we use the `-a` option of rsync because I want a recursive sync that keeps all file properties, as well as `-v` to see what actually happened. You can add and mix any other `rsync` options here, like `--delete`, `--exclude`, etc.

Then we specify the rsh command with `-e`, and that is 'oc exec -n fs1d4 -i myapp-1-vytqm', where `fs1d4` is your project name and `myapp-1-vytqm` is the desired pod name. You can use any other `oc` option here, like `--config`, `--container`, etc. The important points are:

  • keep the `-i` option so that rsync can talk to remote process
  • do not include the `--` option terminator so that rsync can later add it

After that we use the `--` option terminator to tell rsync to treat any further command parameters as path location specifiers and not as options. This is important because our hack forces us to use `--` as the hostname in the SRC or DST location specifier. More on that later.

Our SRC location specifier is `/tmp/ec2-user/`, which is a local test directory. The DST location specifier is `--:haha/`, which means the relative path `haha/` on host `--`. The reason is that if we only specify "haha/", rsync will consider that location local and will not invoke the remote shell. So we need to specify some hostname. But rsync appends the hostname to the remote shell command, so an ordinary hostname would break our `oc exec` command. So I figured I could specify `--` as the hostname and rsync would just append it to the remote shell command. And in fact we need `--` appended, otherwise the remote call to `rsync` will fail.

You may already know that the local `rsync` calls `rsync` on the remote end and then the two processes communicate over stdin/stdout. To keep `oc exec` from interpreting the options of the remote `rsync` invocation, we need that `--` as the hostname. And to keep `--:haha/` from being interpreted as an option to the local `rsync` invocation, we need the first `--` above.
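To make the double role of `--` more tangible, here is a rough sketch of how the remote command line gets put together: rsh command first, then the "hostname", then the remote `rsync` invocation. The `compose_remote_cmd` function is a made-up helper for illustration only (it is not part of rsync or oc), and the server-side flags are just an example; the real ones depend on your rsync version and the options you pass locally.

```shell
# Hypothetical helper that mimics how rsync assembles the remote command:
#   <rsh command> <hostname> <remote rsync invocation>
compose_remote_cmd() {
    rsh_cmd=$1      # the value given to rsync's -e option
    host=$2         # the "hostname" part of the host:path specifier
    shift 2         # whatever remains is the remote rsync invocation
    printf '%s %s %s\n' "$rsh_cmd" "$host" "$*"
}

# With "--" as the hostname, the option terminator lands exactly where
# `oc exec` needs it. The --server flags here are illustrative only.
compose_remote_cmd 'oc exec -n fs1d4 -i myapp-1-vytqm' '--' \
    rsync --server -vlogDtpre.iLsfx . haha/
```

Everything after the `--` in the printed line is what ends up running inside the pod, which is why `oc exec` must not try to parse it.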

I doubt my ability to explain, but hopefully this gave you some pointers on how to make it work. Here are a few other important points:

  • make sure your pod has `rsync` installed and reachable through the pod's PATH variable, or point to it with the `--rsync-path` option
  • make sure you have write access to the destination dir
  • make sure you're already logged into OpenShift (`oc login`)
  • make sure to write correct commands, as mistakes usually produce rather uninformative output (`strace -f` helps to debug)
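
If you use this often, the incantation can be wrapped in a small shell function. This is only a sketch: `pod_rsync`, the `DRY_RUN` variable and the `/tmp/restore/` directory are names I made up for illustration, the options are fixed to `-av`, and the pull direction assumes the `--` hostname trick works the same way when the pod is the source. With `DRY_RUN=1` it only prints the command, which is handy for checking the quoting before touching a real pod.

```shell
# Hypothetical wrapper: pod_rsync PROJECT POD SRC DST
# Write the path inside the pod as --:path, as explained above.
pod_rsync() {
    if [ "$#" -ne 4 ]; then
        echo "usage: pod_rsync PROJECT POD SRC DST" >&2
        return 2
    fi
    # No trailing "--" here: rsync adds it for us as the "hostname".
    rsh_cmd="oc exec -n $1 -i $2"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only show what would be executed.
        printf "rsync -av -e '%s' -- %s %s\n" "$rsh_cmd" "$3" "$4"
    else
        rsync -av -e "$rsh_cmd" -- "$3" "$4"
    fi
}

DRY_RUN=1
pod_rsync fs1d4 myapp-1-vytqm /tmp/ec2-user/ --:haha/   # local dir to pod
pod_rsync fs1d4 myapp-1-vytqm --:haha/ /tmp/restore/    # pod to local dir
```

Unset `DRY_RUN` to actually run the transfer.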

Hope that helps somebody.