Stephan Fabel
on 10 December 2018
Using GPGPUs with Kubernetes
This post walks through the use of GPGPUs with Kubernetes and DevicePlugins. We’ll use MicroK8s for a developer workstation example and charmed K8s for a cluster since that’s a consistent multi-cloud Kubernetes approach. The various cloud CAAS offerings like GKE are also enabling GPGPU facilities so you may want to try those too.
We’ll use Ubuntu as the OS because the underlying enablement for GPGPUs ‘Just Works’ in all the clouds and with all the local hardware, and building Docker images on Ubuntu ensures that the CUDA libraries line up with the drivers properly.
In order for this all to work, the correct (and matching) driver needs to be installed on the worker node to make the device accessible from the OS, and typically it also requires some userland libraries. With NVIDIA GPUs this enablement further depends on using the right Docker runtime (nvidia-docker2), which requires additional host-level configuration and post-deployment installation.
All of that is automated on Ubuntu with MicroK8s and the charmed Kubernetes charms, across all the public clouds where GPUs are available. It’s also currently activated in GKE; other cloud CAAS offerings will follow.
Workstation GPGPU containers with MicroK8s
MicroK8s is a snap of upstream Kubernetes that is designed for development purposes. It’s not a cluster but it gives you a small zero-ops Kubernetes environment that is compatible with all the major multi-cloud K8s offerings. For our purposes the important thing is that it includes GPGPU enablement in the box.
To install MicroK8s:
$ snap install microk8s --classic
This will give you the latest stable version of MicroK8s which tracks upstream releases closely.
You can select a particular version using snap channels; see ‘snap info microk8s’ for the available tracks. By selecting a particular track you can lock yourself to a particular version of Kubernetes. By default you will be on the ‘latest’ track, and get upgrades when upstream Kubernetes releases a new stable version. Select a particular track with --channel=track/stability from the available channels. A bare ‘stable’ maps to ‘latest/stable’.
$ snap info microk8s
[...]
channels:
  stable:          v1.13.0   (340)  204MB  classic
  candidate:       v1.13.0   (340)  204MB  classic
  beta:            v1.13.0   (340)  204MB  classic
  edge:            v1.13.0   (340)  204MB  classic
  1.13/stable:     v1.13.0   (340)  204MB  classic
  1.13/candidate:  v1.13.0   (340)  204MB  classic
  1.13/beta:       v1.13.0   (340)  204MB  classic
  1.13/edge:       v1.13.0   (341)  204MB  classic
  1.12/stable:     v1.12.3   (336)  226MB  classic
  1.12/candidate:  v1.12.3   (336)  226MB  classic
  1.12/beta:       v1.12.3   (336)  226MB  classic
  1.12/edge:       v1.12.3   (336)  226MB  classic
  1.11/stable:     v1.11.5   (322)  219MB  classic
  1.11/candidate:  v1.11.5   (322)  219MB  classic
  1.11/beta:       v1.11.5   (322)  219MB  classic
  1.11/edge:       v1.11.5   (322)  219MB  classic
  1.10/stable:     v1.10.11  (321)  175MB  classic
  1.10/candidate:  v1.10.11  (321)  175MB  classic
  1.10/beta:       v1.10.11  (321)  175MB  classic
  1.10/edge:       v1.10.11  (321)  175MB  classic
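If you provision workstations from a script, the track and risk level can be assembled into the install command rather than typed by hand. The following is a minimal sketch: the helper name `microk8s_install_command` is hypothetical, and it only builds the command from the `--classic` and `--channel` flags shown above without running it.

```python
"""Build the snap command line for a given MicroK8s track and risk level."""

# Risk levels as they appear in the 'snap info microk8s' channel list.
RISKS = ("stable", "candidate", "beta", "edge")


def microk8s_install_command(track="latest", risk="stable"):
    """Return the argv list for installing MicroK8s from track/risk.

    Hypothetical helper for illustration; it assembles the command only.
    """
    if risk not in RISKS:
        raise ValueError(f"unknown risk level: {risk!r}")
    if not track or "/" in track:
        raise ValueError(f"invalid track: {track!r}")
    return ["snap", "install", "microk8s", "--classic",
            f"--channel={track}/{risk}"]


# Pin the workstation to the Kubernetes 1.13 series.
print(" ".join(microk8s_install_command("1.13", "stable")))
```

Calling the helper without arguments yields `--channel=latest/stable`, which matches the default described above.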
Assuming you have an NVIDIA GPU with a current driver installed, you can activate Kubernetes support for it with the “enable” subcommand:
$ microk8s.enable gpu
You can confirm that the GPU is available to MicroK8s with this command:
$ microk8s.status
microk8s is running
addons:
gpu: enabled
storage: disabled
registry: disabled
ingress: disabled
dns: disabled
metrics-server: disabled
istio: disabled
dashboard: disabled
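In a setup script you may want to check the addon state automatically instead of reading it off the terminal. Here is a rough sketch that parses the text printed by `microk8s.status`. It assumes the output format shown above (other MicroK8s releases may print it differently), and the function name `parse_addons` is our own.

```python
"""Parse 'microk8s.status' text output into a dict of addon states."""


def parse_addons(status_text):
    """Map addon names to True (enabled) or False (anything else).

    Assumes the 'name: state' lines follow an 'addons:' header, as in the
    output shown in this post.
    """
    addons = {}
    in_addons = False
    for raw_line in status_text.splitlines():
        line = raw_line.strip()
        if line == "addons:":
            in_addons = True
            continue
        if in_addons and ": " in line:
            name, state = line.split(": ", 1)
            addons[name] = state == "enabled"
    return addons


# Sample copied from the status output above.
SAMPLE = """\
microk8s is running
addons:
gpu: enabled
storage: disabled
registry: disabled
ingress: disabled
dns: disabled
metrics-server: disabled
istio: disabled
dashboard: disabled
"""

addons = parse_addons(SAMPLE)
print("GPU addon enabled:", addons.get("gpu", False))
```

On a real workstation you would feed the function the captured output of `microk8s.status` instead of the sample string.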
Running GPGPU-accelerated containers on Kubernetes
Now that you have GPGPU capacity available to Kubernetes you can deploy containers there that get access to the special hardware they need.
Your container needs to have the right userspace pieces, so again we suggest that you build the OCI images on Ubuntu with the CUDA libraries provided; those will be most portable across all the different cloud CAAS offerings as well as offerings from Canonical, VMware, Pivotal, Cisco and others that also use Ubuntu for K8s.
Your workloads can now request a GPU through a resource limit, and the scheduler will place them on an appropriate worker node (example taken from the upstream Kubernetes documentation on scheduling GPUs):
Listing 1: nvidia-pod-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
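If you generate manifests from code, the same pod can be built as a plain data structure and printed as JSON, which kubectl accepts as well as YAML. This is a minimal sketch using the fields from Listing 1; the function name `gpu_pod` is hypothetical.

```python
"""Build the Listing 1 pod as a dict and print it as a JSON manifest."""
import json


def gpu_pod(name, image, gpus=1):
    """Return a Pod manifest that requests `gpus` NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("a GPU pod needs at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested in the limits section, as in Listing 1.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


manifest = gpu_pod("cuda-vector-add", "k8s.gcr.io/cuda-vector-add:v0.1")
print(json.dumps(manifest, indent=2))
```

Saved as, say, `gpu_pod.py` (a name chosen for this example), the output can be piped straight into the cluster with `python3 gpu_pod.py | microk8s.kubectl apply -f -`.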
Kubernetes cluster deployment with GPGPUs
A compelling feature of the Charmed Distribution of Kubernetes (CDK) is that it will automatically enable GPGPU resources which are present on the worker node for use by K8s pods.
GPU resources are enabled through the use of Device Plugins which are deployed as DaemonSets. This ensures that each GPU-enabled worker node is allowed access to the GPU and sets the right paths to the driver plugins on the host.
With the DaemonSet deployed, each GPU-enabled worker node advertises the nvidia.com/gpu resource, so the Kubernetes scheduler only considers nodes with enough free GPUs for pods that request it. A nodeSelector on node labels can narrow the candidates further.
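As a toy illustration of that filtering step, the sketch below keeps only the nodes that still have enough unclaimed GPUs for a pod. It is a simplified model and not the actual scheduler code; the node names and the `feasible_nodes` helper are made up for this example.

```python
"""Toy model of GPU-aware node filtering (not the real scheduler)."""


def feasible_nodes(nodes, requested_gpus):
    """Return the names of nodes with at least `requested_gpus` free GPUs.

    `nodes` maps a node name to a dict with the number of GPUs the node
    advertises ("allocatable") and the number already claimed ("in_use").
    """
    return sorted(
        name
        for name, gpus in nodes.items()
        if gpus["allocatable"] - gpus["in_use"] >= requested_gpus
    )


# Hypothetical cluster: one CPU-only worker and two single-GPU workers.
nodes = {
    "worker-cpu": {"allocatable": 0, "in_use": 0},
    "worker-gpu-a": {"allocatable": 1, "in_use": 1},
    "worker-gpu-b": {"allocatable": 1, "in_use": 0},
}

print(feasible_nodes(nodes, 1))  # ['worker-gpu-b']
```

A pod asking for two GPUs would find no feasible node in this cluster and would stay pending until capacity is added.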
Charms fully automate the deployment of Kubernetes in a way that is model-driven and thus flexible for use on different kinds of cloud or cluster. We use charms successfully for HPC deployments of Kubernetes, for example, making the deployment of AI/ML pipelines on top of Kubernetes easier. GPU enablement is important for those sorts of workloads.
However, before deploying Kubeflow or similar frameworks, the Kubernetes layer needs to be fully automated and GPUs activated.
The Kubernetes charms do all the work. As worker nodes get commissioned into the model, the charms auto-detect the presence of NVIDIA hardware, install the right driver and host libraries, replace the container runtime with the NVIDIA-supported one, deploy the DaemonSet for the DevicePlugin and label the nodes automatically.
The K8s cluster is best deployed with conjure-up, which will walk you through the entire process. You can use conjure-up on a public cloud with GPU-enabled instance types, or on MAAS for bare metal clusters with servers that contain GPUs. In both cases, the deployment process is exactly the same.
For example, you can use p2.xlarge instances on AWS. To make that happen, we pass a constraint to conjure-up in a bundle overlay, which forces the use of GPU-enabled instance types when deploying workers.
Listing 2: cdk-gpu-worker.yaml
services:
  "kubernetes-worker":
    charm: "cs:~containers/kubernetes-worker"
    num_units: 1
    options:
      channel: 1.13/stable
    expose: true
    constraints: "instance-type=p2.xlarge root-disk=32768"
Pass this to conjure-up:
$ conjure-up canonical-kubernetes --bundle-add cdk-gpu-worker.yaml
This will launch the conjure-up wizard interface and allow you to select additional add-ons to be deployed; Kubeflow, for example, can be selected here. On the controller selection screen, you can either deploy a dedicated Juju controller (one more VM) or you can take advantage of JAAS, which provides Juju-as-a-service on the major public clouds.
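If you deploy to several instance types, the overlay from Listing 2 can be generated rather than edited by hand. The sketch below renders the same keys from a template using only the standard library. The function name `worker_overlay` is hypothetical, and the indentation follows the usual Juju bundle layout.

```python
"""Render a kubernetes-worker bundle overlay for a chosen instance type."""


def worker_overlay(instance_type, root_disk_mb=32768,
                   channel="1.13/stable", units=1):
    """Return the overlay YAML as text, mirroring Listing 2."""
    lines = [
        "services:",
        '  "kubernetes-worker":',
        '    charm: "cs:~containers/kubernetes-worker"',
        f"    num_units: {units}",
        "    options:",
        f"      channel: {channel}",
        "    expose: true",
        f'    constraints: "instance-type={instance_type} '
        f'root-disk={root_disk_mb}"',
    ]
    return "\n".join(lines) + "\n"


# Same values as Listing 2; write the result to cdk-gpu-worker.yaml to use it.
print(worker_overlay("p2.xlarge"), end="")
```

Changing the first argument, for instance to `p2.8xlarge`, is then enough to target a larger GPU instance without touching the rest of the overlay.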
Once the installation is kicked off, conjure-up shows a status screen with the progress of the deployment.
The status can also be shown using the Juju command directly. If you use JAAS, locate the model name using:
$ juju models -c jaas
Controller: jaas

Model                         Cloud/Region   Status     Machines  Cores  Access  Last connection
conjure-canonical-kubern-9dc  aws/us-east-1  available  0         0      admin   never connected
Then, inspect the status of the model:
$ juju status -m jaas:conjure-canonical-kubern-9dc
Model                         Controller  Cloud/Region   Version  SLA          Timestamp
conjure-canonical-kubern-9dc  jaas        aws/us-east-1  2.4.5    unsupported  16:13:37-08:00

App                    Version  Status       Scale  Charm                  Store       Rev  OS      Notes
aws-integrator         1.15.71  active       1      aws-integrator         jujucharms  7    ubuntu
easyrsa                3.0.1    maintenance  1      easyrsa                jujucharms  117  ubuntu
etcd                            maintenance  3      etcd                   jujucharms  209  ubuntu
flannel                         waiting      0      flannel                jujucharms  146  ubuntu
kubeapi-load-balancer           maintenance  1      kubeapi-load-balancer  jujucharms  162  ubuntu  exposed
kubernetes-master               maintenance  2      kubernetes-master      jujucharms  219  ubuntu
kubernetes-worker               waiting      0/1    kubernetes-worker      jujucharms  239  ubuntu  exposed

Unit                      Workload     Agent       Machine  Public address  Ports  Message
aws-integrator/0*         active       idle        0        54.165.35.94           ready
easyrsa/0*                maintenance  executing   1        34.234.207.232         (install) installing charm
etcd/0*                   maintenance  executing   2        54.208.163.252         (install) installing charm
etcd/1                    maintenance  executing   3        34.201.210.154         (install) installing charm
etcd/2                    maintenance  executing   4        54.235.228.45          (install) installing charm
kubeapi-load-balancer/0*  maintenance  executing   5        34.228.169.37          (install) installing charm
kubernetes-master/0       maintenance  executing   6        18.207.179.122         (install) installing charm
kubernetes-master/1*      maintenance  executing   7        18.212.150.203         (install) installing charm
kubernetes-worker/0       waiting      allocating  8        35.175.104.2           waiting for machine

Machine  State    DNS             Inst id              Series  AZ          Message
0        started  54.165.35.94    i-083ce279733998d59  bionic  us-east-1a  running
1        started  34.234.207.232  i-04828688ddfdb0c6c  bionic  us-east-1b  running
2        started  54.208.163.252  i-03d910e892e7c09f6  bionic  us-east-1a  running
3        started  34.201.210.154  i-00adeecd668174ee0  bionic  us-east-1b  running
4        started  54.235.228.45   i-032875fd24a1c1e78  bionic  us-east-1c  running
5        started  34.228.169.37   i-0008405049b9bed6d  bionic  us-east-1d  running
6        started  18.207.179.122  i-003abf7f3612a2f18  bionic  us-east-1b  running
7        started  18.212.150.203  i-0abe01060e8179618  bionic  us-east-1a  running
8        pending  35.175.104.2    i-0d493a35776b9217d  bionic  us-east-1e  running
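If you want to wait for the deployment from a script rather than watch the table, Juju can also print the status as JSON with `--format=json`. The sketch below lists the units whose workload is not yet active. The JSON layout it relies on (`applications`, then `units`, then `workload-status.current`) is an assumption based on Juju 2.x output, and the sample document is hand-written from the table above rather than captured from a real model.

```python
"""List units that are not yet active in 'juju status --format=json'."""
import json


def pending_units(status):
    """Return {unit name: workload state} for every unit that is not active."""
    pending = {}
    for app in status.get("applications", {}).values():
        # Applications without their own units simply have no "units" key.
        for unit_name, unit in app.get("units", {}).items():
            state = unit.get("workload-status", {}).get("current", "unknown")
            if state != "active":
                pending[unit_name] = state
    return pending


# Hand-written sample that mirrors part of the status table above.
SAMPLE_STATUS = json.loads("""
{
  "applications": {
    "aws-integrator": {
      "units": {"aws-integrator/0": {"workload-status": {"current": "active"}}}
    },
    "easyrsa": {
      "units": {"easyrsa/0": {"workload-status": {"current": "maintenance"}}}
    },
    "flannel": {},
    "kubernetes-worker": {
      "units": {"kubernetes-worker/0": {"workload-status": {"current": "waiting"}}}
    }
  }
}
""")

for unit_name, state in pending_units(SAMPLE_STATUS).items():
    print(f"{unit_name}: {state}")
```

Against a live model, the input would come from `juju status -m jaas:conjure-canonical-kubern-9dc --format=json`, and an empty result means every unit has settled.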
Once the installation has finished, all GPGPU resources are properly configured and available to the Kubernetes operator. You can check this with:
$ kubectl get no -o wide -L cuda,gpu
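The same check can be scripted against `kubectl get nodes -o json`, which returns the node labels together with the resources each node advertises. The following is a minimal sketch: the sample document is hand-written, the node names are invented, and the label values are illustrative.

```python
"""Summarise GPU labels and capacity from 'kubectl get nodes -o json'."""
import json


def gpu_summary(node_list):
    """Return one row per node with its gpu/cuda labels and GPU count."""
    rows = []
    for node in node_list.get("items", []):
        metadata = node.get("metadata", {})
        labels = metadata.get("labels", {})
        allocatable = node.get("status", {}).get("allocatable", {})
        rows.append({
            "name": metadata.get("name", ""),
            "gpu_label": labels.get("gpu"),
            "cuda_label": labels.get("cuda"),
            # Resource quantities are strings in the API output.
            "gpus": int(allocatable.get("nvidia.com/gpu", "0")),
        })
    return rows


# Hand-written sample with made-up node names and label values.
SAMPLE_NODES = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "gpu-worker-0",
                   "labels": {"gpu": "true", "cuda": "true"}},
      "status": {"allocatable": {"cpu": "4", "nvidia.com/gpu": "1"}}
    },
    {
      "metadata": {"name": "cpu-worker-0", "labels": {}},
      "status": {"allocatable": {"cpu": "2"}}
    }
  ]
}
""")

for row in gpu_summary(SAMPLE_NODES):
    print(f"{row['name']}: gpus={row['gpus']} "
          f"gpu={row['gpu_label']} cuda={row['cuda_label']}")
```

A node that reports zero GPUs here will not be picked for pods that request nvidia.com/gpu, so this is a quick way to confirm that the workers came up as intended.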
Conclusion
Leveraging GPGPU resources in your Kubernetes cluster is automatic and easy to do when using the Charmed Distribution of Kubernetes or MicroK8s.
What do you think? We’d love to hear about your use cases and how CDK and MicroK8s helped with your GPGPU-sensitive workloads.