I recently started a Cardano Stake Pool, [CODER], hosted on Kubernetes. This is the first of several posts on how I set it up. The code/config in this post is taken from my mainnet configuration, though I’d recommend starting on testnet while you find your feet.
Posts in this series:
- Step 1: Setting up Cardano Relays using Kubernetes/microk8s
- Step 2: Monitoring Cardano Relays on Kubernetes with Grafana and Prometheus
- Step 3: Using Kubernetes ConfigMaps for Cardano Node Topology Config
- Step 4: Setting up a Cardano Producer node using Kubernetes/microk8s
Goals
My goals were:
- Use Kubernetes
- Part of the reason for doing this was to have a real project through which to learn more about Kubernetes
- I’d like to manage the nodes without needing to set up individual machines or manage individual containers via Docker
- I’d like to be able to migrate/rebuild the pool (e.g. from Intel NUCs to the cloud) later with as little effort as possible
- Use Grafana/Prometheus to monitor the nodes without having to learn too much about them or set them up myself (specifically via the microk8s addons and ServiceMonitors)
- Secure the pool pledge and rewards with a crypto hardware wallet, so that if any node is compromised or destroyed, neither is lost
This is my first time using Kubernetes, so what’s described here may not be the optimal solution. If you have suggestions or improvements, please comment below!
If you find this post useful or are looking for somewhere to delegate while setting up your own pool, check out my pool [CODER]! 😀
Installing Kubernetes (microk8s)
The first step is setting up Kubernetes. I decided to go with microk8s from Canonical since it seemed to be as well reviewed as any of the alternatives, and I figured that, being from Canonical, it would have fewer potential compatibility issues with Ubuntu Server. It also includes addons for Prometheus/Grafana, which looked like they might simplify setting those up. Installing it turned out to be rather trivial, as the Ubuntu Server installation wizard offers it as an option in the final step (which I believe just installs the Snap package).
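If you skipped that installer option or are adding microk8s to an existing machine, it can also be installed directly from the Snap store. A rough sketch (addon names and channels vary between microk8s versions, so treat this as a starting point rather than my exact setup):
sudo snap install microk8s --classic
sudo usermod -a -G microk8s $USER   # run microk8s without sudo (log out/in afterwards)
microk8s status --wait-ready
# Enable the addons used later in this series (DNS plus the Prometheus/Grafana stack)
microk8s enable dns prometheus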
Node Volumes
By default, changes made to a container's file system are lost when the container is terminated or restarted. Since it takes many hours to sync the Cardano blockchain, I knew I’d need some way to persist the node's db folder. For simplicity, I decided (for now) to just map some folders from the host machine into the containers. While I only have one Kubernetes node right now, that might not always be the case, so the volumes need to be configured to be specific to this node (and if I decide to balance the relays across nodes, I'll manually copy the data and then update the nodeAffinity section).
I created a .yml file to hold the config and applied it with microk8s.kubectl apply -f ...yml:
apiVersion: v1
kind: PersistentVolume
metadata:
  # To provide volumes for multiple nodes, this section is
  # duplicated and each one has a unique name
  name: cardano-mainnet-relay-data-pv-1
spec:
  capacity:
    storage: 25Gi
  accessModes:
    - ReadWriteOnce # Only allow one container to use this volume
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage-mainnet-relay
  # Set the path to the folder on the node that will be mounted for this volume
  # This volume is on the host named "k8snode1"
  local:
    path: /home/danny/cardano/relay-mainnet-1/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            # Restrict this volume to a specific Kubernetes node by hostname
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8snode1 # hostname for this volume
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage-mainnet-relay
provisioner: kubernetes.io/no-provisioner
# Docs say that local bindings should be set to WaitForFirstConsumer.
# https://kubernetes.io/docs/concepts/storage/storage-classes/#local
volumeBindingMode: WaitForFirstConsumer
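The local volume above just points at a folder on the host, so that folder needs to exist before a pod can use it. Something like this works (using the path from my manifest; relay-volumes.yml is a placeholder for whatever you named the file):
# Create the host directory that backs the PersistentVolume
mkdir -p /home/danny/cardano/relay-mainnet-1/data
# Apply the volume/storage class definitions and check the PV is registered
microk8s.kubectl apply -f relay-volumes.yml
microk8s.kubectl get pv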
Node Configuration Files
To run a node we’ll need some config files. For mainnet they can be downloaded like this:
wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-config.json
wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-byron-genesis.json
wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-shelley-genesis.json
wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-topology.json
I renamed the downloaded mainnet-topology.json file to relay-topology.json since we'll eventually also have a producer-topology.json that will be different, and I want to be able to collect/back up the files in one place.
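The node will look for these files under /data/configuration inside the container (see the StatefulSet args below), which on my single-node setup means the configuration folder lives inside the directory backing the PersistentVolume. A quick sketch of the rename and copy, assuming the same host path as the volume definition above:
# Rename the topology file for the relay role
mv mainnet-topology.json relay-topology.json
# Put the config files where the pod will see them as /data/configuration
mkdir -p /home/danny/cardano/relay-mainnet-1/data/configuration
cp mainnet-config.json mainnet-byron-genesis.json mainnet-shelley-genesis.json relay-topology.json /home/danny/cardano/relay-mainnet-1/data/configuration/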
StatefulSet/Pod Definition
StatefulSets in Kubernetes are like Deployments but with sticky identities. Rather than pod names being suffixed with a random key that changes on every destroy/recreate, the pods in a StatefulSet are numbered, and those identities are reused when a pod is recreated. This helps if other services need to know the identity of each pod, but it also makes it easier to use the pod names in kubectl commands without having to keep looking them up if you’re constantly recreating them 😀
Again, I created a .yml file and applied it with microk8s.kubectl apply -f.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cardano-mainnet-relay-deployment
  labels:
    app: cardano-mainnet-relay-deployment
spec:
  serviceName: cardano-mainnet-relay
  # Control the number of relays here (you'll need enough volumes to cover them!)
  replicas: 1
  selector:
    matchLabels:
      app: cardano-mainnet-node
      cardano-mainnet-node-type: relay
  template:
    metadata:
      labels:
        # I added two labels to make it easy to select either "all nodes",
        # "just relays", or "just the producer node"
        app: cardano-mainnet-node
        cardano-mainnet-node-type: relay
    spec:
      containers:
        - name: cardano-mainnet-relay
          # I'm using the official IOHK/cardano-node image to avoid needing to build anything
          image: inputoutput/cardano-node
          # Expose both the cardano-node port and the /metrics endpoint port.
          ports:
            - containerPort: 12798
            - containerPort: 4000
          # Mount the data volume at /data (see the Volume Claim below)
          volumeMounts:
            - name: data
              mountPath: /data
          # My configuration lives inside the mounted /data folder, and that's also where
          # the db data should be written
          args: ["run", "--config", "/data/configuration/mainnet-config.json", "--topology", "/data/configuration/relay-topology.json", "--database-path", "/data/db", "--socket-path", "/data/node.socket", "--port", "4000"]
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        # This storageClassName is used on the previously defined volume that
        # provides storage on the host machine
        storageClassName: local-storage-mainnet-relay
        resources:
          requests:
            storage: 25Gi
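Once this is applied, the first relay pod shows up with the predictable -0 suffix described above, and its logs should show the node connecting to peers and syncing the chain. A couple of commands I find handy for checking on it (the pod name follows from the StatefulSet name, so adjust it if you renamed anything):
# The pods get stable, numbered names thanks to the StatefulSet
microk8s.kubectl get pods -l cardano-mainnet-node-type=relay
# Follow the node's logs while it syncs
microk8s.kubectl logs -f cardano-mainnet-relay-deployment-0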
Exposing the Relay with NodePort
Next, we need to make the node accessible to the outside world: other peers need to be able to connect to us and exchange mempool transactions (which we'll ultimately need to include in any blocks produced by our producer). This uses a NodePort service, just like the Grafana service in the monitoring post. One unresolved niggle I have is that this exposes only a single port and balances across all matching relay pods, which means right now I can only give other peers a single relay hostname/port even though in reality I may be running multiple relays (if anyone has a solution to this, I'd love to know!).
apiVersion: v1
kind: Service
metadata:
  name: cardano-mainnet-relay-service
spec:
  type: NodePort
  selector:
    app: cardano-mainnet-node
    cardano-mainnet-node-type: relay
  ports:
    # Expose node port 30801, pointing at port 4000 on the pod(s)
    - port: 30801
      nodePort: 30801
      targetPort: 4000
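Before handing the address out, it's worth confirming that the NodePort is registered and reachable from outside. A quick check (relay.example.com is a placeholder for your own public hostname or IP):
# Confirm the service and its node port
microk8s.kubectl get service cardano-mainnet-relay-service
# From another machine, check that the port is open
nc -vz relay.example.com 30801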
With the service applied and reachable, I shared my relay details with a few friendly SPOs and used the log files to verify that I had incoming connections and transactions. Step one, complete!
If you find this post useful or are looking for somewhere to delegate while setting up your own pool, check out my pool [CODER]! 😀