Woohoo! I got the whole stack running in k8s on DigitalOcean

Now ready to customize it for our project using the custom deployment project templates and guides…

I will get this polished and publish our terraform configs at some point, so others have a guide for deploying a custom OpenRemote with a k8s-based infra setup in non-AWS environments.

Achievements:

  • translated the single-host stack example in docker-compose.yml to a k8s cluster configuration, with Service, Ingress, and StatefulSet resources implementing the infrastructure
  • infrastructure is fully managed by terraform, with terragrunt on top to provide parameterized customizations on a per-environment or per-deployment basis (see the sketch after this list)
  • a DigitalOcean load balancer integrates with the k8s Ingress for inbound web traffic to the cluster and terminates SSL with a DigitalOcean-managed certificate. This eliminates the HAProxy container and the Let's Encrypt config from the typical deployment
  • our build pipeline for shipping customized manager Docker images works: we publish to a DigitalOcean-managed private registry and pull those images down in our StatefulSet container specs
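
Roughly, the per-environment parameterization lives in a stage's terragrunt.hcl; a minimal sketch (the module path and input values here are illustrative, not our exact config):

terraform {
  source = "../../modules/openremote-doks" # hypothetical shared module
}

include {
  path = find_in_parent_folders()
}

inputs = {
  frontend_hostname          = "dev.example.org"
  loadbalancer_friendly_name = "openremote-dev-lb"
  do_region                  = "nyc3"
}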


Congrats on this! Eliminating HAProxy is a good improvement: one point of worry is removed, and you are probably much better protected against DDoS. Deployment through a private repo is another good decision. I’m eager to see your configs. Have you done any performance, scaling, or resilience testing?


Not yet; it will be nice to run some tests once we get more of the asset model and deployment customization figured out!

One big thing that remains for me is setting up a host with NFS for the volume mounts that manager and keycloak are using… presently we are limited to the ReadWriteOnce access mode, which only lets a reusable volume be attached to a single instance. After that is resolved, I can start to scale the backend Java processes horizontally.
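
The likely shape of the fix is an NFS-backed storage class plus ReadWriteMany claims; a minimal sketch, assuming an NFS provisioner is installed in the cluster and exposes a storage class (the "nfs-client" name below is hypothetical):

resource "kubernetes_persistent_volume_claim" "deployment_data_shared" {
  metadata {
    name      = "deployment-data-shared"
    namespace = "default"
  }
  spec {
    # ReadWriteMany lets several pods mount the volume, unlike do-block-storage
    access_modes       = ["ReadWriteMany"]
    storage_class_name = "nfs-client" # hypothetical NFS-backed class
    resources {
      requests = {
        storage = "5Gi"
      }
    }
  }
}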

Are you running with dev mode set to true?

OR_DEV_MODE is 0

Here’s the web StatefulSet as of now. I have plans to separate keycloak and manager so I can handle their resource and scaling needs separately, but that depends on the aforementioned challenge of binding storage to multiple containers.

resource "kubernetes_stateful_set" "web" {
  metadata {
    name = "web"
    namespace = "default"
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        app = "web"
      }
    }
    service_name = "web"
    template {
      metadata {
        labels = {
          app = "web"
        }
      }
      spec {
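        # Init container: pre-creates /deployment/manager and opens up permissions
        # so the keycloak and manager containers can write to the fresh volumes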
        init_container {
          name = "mounts-perms-fix"
          image = "busybox"
          command = ["/bin/sh", "-c", "/bin/mkdir -p /deployment/manager && /bin/chmod -R 777 /deployment && /bin/chmod -R 777 /storage"]
          volume_mount {
            mount_path = "/deployment"
            name = "deployment-data"
          }
          volume_mount {
            mount_path = "/storage"
            name = "manager-data"
          }
        }  
        container {
          image = "openremote/keycloak:latest"
          name = "keycloak"
          port {
            container_port = 8080
            name = "http-keycloak"
          }
          volume_mount {
            mount_path = "/deployment"
            name = "deployment-data"
          }
          env {
            name = "KEYCLOAK_ADMIN"
            value = "admin"
          }
          env {
            name = "KEYCLOAK_ADMIN_PASSWORD"
            value = "password"
          }
          env {
            name = "KC_HOSTNAME"
            value = var.frontend_hostname
          }
          env {
            name = "KC_HOSTNAME_PATH"
            value = "auth"
          }
          env {
            name = "KC_HOSTNAME_ADMIN_URL"
            value = "https://${var.frontend_hostname}/auth"
          }
          env {
            name = "KC_DB_URL_HOST"
            value = "postgresql.backend"
          }
          env {
            name = "KC_HOSTNAME_STRICT_HTTPS"
            value = "true"
          }
          env {
            name = "KC_PROXY"
            value = "edge"
          }
          env {
            name = "KC_DB_URL"
            value = "jdbc:postgresql://postgresql.backend:5432/openremote?currentSchema=public"
          }
          env {
            name = "PROXY_ADDRESS_FORWARDING"
            value = "true"
          }
        }
        container {
          image = "registry.digitalocean.com/sk8net/openremote/manager:may25test00"
          name = "manager"
          port {
            container_port = 8090
            name = "http-manager"
          }
          port {
            container_port = 8443
            name = "https"
          }
          port {
            container_port = 8883
            name = "mqtt"
          }
          volume_mount {
            mount_path = "/storage"
            name = "manager-data"
          }
          volume_mount {
            mount_path = "/deployment"
            name = "deployment-data"
          }
          env {
            name = "OR_DB_HOST"
            value = "postgresql.backend"
          }
          env {
            name = "OR_ADMIN_PASSWORD"
            value = "password"
          }
          env {
            name = "OR_HOSTNAME"
            value = var.frontend_hostname
          }
          env {
            name = "OR_SSL_PORT"
            value = "-1"
          }
          env {
            name = "OR_WEBSERVER_LISTEN_PORT"
            value = "8090"
          }
          env {
            name = "OR_DEV_MODE"
            value = "0"
          }
          env {
            name = "KEYCLOAK_AUTH_PATH"
            value = "auth"
          }
          env {
            name = "OR_KEYCLOAK_HOST"
            value = "web.default"
          }
          env {
            name = "OR_KEYCLOAK_PORT"
            value = "8080"
          }
        }
        termination_grace_period_seconds = 10
      }
    }
    volume_claim_template {
      metadata {
        name = "deployment-data"
      }
      spec {
        access_modes = [
          "ReadWriteOnce",
        ]
        volume_name = "deployment-data"
        resources {
          requests = {
            storage = "5Gi"
          }
        }
        storage_class_name = "do-block-storage"
      }
    }
    volume_claim_template {
      metadata {
        name = "manager-data"
      }
      spec {
        volume_name = "manager-data"
        access_modes = [
          "ReadWriteOnce",
        ]
        resources {
          requests = {
            storage = "5Gi"
          }
        }
        storage_class_name = "do-block-storage"
      }
    }
  }
}

I did some secops verification today as well with this test cluster:
On initial cold startup of a new deployment, the manager uses the hard-coded plaintext admin user & password to authenticate with keycloak, but then creates a new manager-keycloak user and writes a credential file with a securely generated random password.

The test:
I changed the keycloak admin user’s password and replaced our manager backend container in k8s. The new instance authenticated with keycloak successfully after reading the secure credential file as the manager-keycloak user:

Loading OR_KEYCLOAK_GRANT_FILE: /deployment/manager/keycloak.json
Found stored credentials so attempting to use them
Keycloak proxy URI set to: http://web.default:8080/auth
Validating keycloak credentials
Credentials are valid

We also have a fully hygienic approach to secrets management for devops, using the pass command-line tool for local GPG-encrypted secret storage:

Prepare your command line ENV with exports and apply terraform configs:

export TF_VAR_do_token=$(pass my_project/do_token)
export AWS_ACCESS_KEY_ID=$(pass my_project/spaces_access_id)
export AWS_SECRET_ACCESS_KEY=$(pass my_project/spaces_secret_key)
terragrunt apply

The AWS env vars are used for our DigitalOcean Spaces bucket, which serves as the terraform state backend:

terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
      version = "~> 2.8.0"
    }
  }
 
  backend "s3" {
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    endpoint                    = "https://nyc3.digitaloceanspaces.com"
    region                      = "us-east-1" // required by the s3 backend, ignored by Spaces
    bucket                      = "terraform-states" // name of your space
    key                         = "infrastructure/terraform.tfstate"
  }
}

provider "digitalocean" {
  token = var.do_token
}

Ultimately it seems we will run HAProxy within the cluster. After much research, I found that there probably is some way to configure DOKS (Kubernetes on DigitalOcean) load balancers to do what we want, namely terminate both HTTPS and MQTT/TLS… but it would require a big effort and extensive expertise in working with and customizing the IngressClass for ingress-nginx in Kubernetes.

So for me, the cost of running another container to do TLS termination correctly with HAProxy is much lower than the DevOps cost of tinkering with obscure LB configs to get TLS termination working for 8883/TCP.

I’ll provide more details and complete configuration examples once this piece is sorted out!


Here are the changes I’ve made to the HAProxy Docker image to get it running in k8s.

Again, the kubernetes Ingress resource is meant for inbound traffic, but solely for HTTP(S), not for other TCP protocols. As a result, I was in a lot of pain trying to use it for both HTTPS and MQTT. Now the approach has changed: we simply define a kubernetes Service of type “LoadBalancer” and specify the TCP ports that we want to pass on, and which of them carry TLS:

resource "kubernetes_service" "load_balancer" {
  metadata {
    name = "load-balancer"
    namespace = "frontend"
    labels = {
      app = "web"
    }
    annotations = {
      "service.beta.kubernetes.io/do-loadbalancer-tls-passthrough" = "true"
      "service.beta.kubernetes.io/do-loadbalancer-name" = var.loadbalancer_friendly_name
      "service.beta.kubernetes.io/do-loadbalancer-tls-ports" = "443,8883"
    }
  }
  
  spec {
    type = "LoadBalancer"
    selector = {
      app = "web"
    }
    port {
      name = "http"
      port = 80
      target_port = "http-haproxy"
      protocol = "TCP"
    }
    port {
      name = "http-stats"
      port = 8404
      target_port = "stats-haproxy"
      protocol = "TCP"
    }
    port {
      name = "https"
      port = 443
      target_port = "https-haproxy"
      protocol = "TCP"
    }
    port {
      name = "mqtt"
      port = 8883
      target_port = "mqtt-haproxy"
      protocol = "TCP"
    }
  }
}

Specifying a name for the corresponding DigitalOcean load balancer appliance ensures that it maintains a consistent identity (and IP address) when you destroy & replace the k8s resources that configure it, as long as you keep using the same name.
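
That name is just a plain terraform variable, so each environment can pin its own appliance via its terragrunt inputs (illustrative definition, not the exact file):

variable "loadbalancer_friendly_name" {
  type        = string
  description = "Stable name for the DigitalOcean load balancer appliance, reused across cluster rebuilds"
}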

Here’s the resulting DigitalOcean appliance as configured via sync of the kubernetes resource definition:

The wide open unencrypted stats port is obviously temporary, for debugging purposes.

The target_port references in the above service resource definition point to the named ports in my StatefulSet; here’s the configuration for the proxy StatefulSet in terraform:

resource "kubernetes_stateful_set" "proxy" {
  metadata {
    name = "proxy"
    namespace = "frontend"
    labels = {
      web_dependency = kubernetes_stateful_set.web.metadata.0.name
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        app = "web"
      }
    }
    service_name = "proxy"

    template {
      metadata {
        labels = {
          app = "web"
        }
      }
      spec {
        init_container {
          name = "mounts-perms-fix"
          image = "busybox"
          command = [
            "/bin/sh",
            "-c",
            "/bin/chmod -R 777 /proxy"
          ]
          volume_mount {
            mount_path = "/proxy"
            name = "proxy-data"
          }
        }
        container {
          image = "registry.digitalocean.com/sk8net/openremote/proxy:cfdafc5c9c40eff8f82ac5224a0d8f2ab90362b1"
          name = "haproxy"
          volume_mount {
            mount_path = "/deployment"
            name = "proxy-data"
          }
          port {
            container_port = 8080
            name = "http-haproxy"
          }
          port {
            container_port = 8443
            name = "https-haproxy"
          }
          port {
            container_port = 8404
            name = "stats-haproxy"
          }
          port {
            container_port = 8883
            name = "mqtt-haproxy"
          }
          env {
            name = "MANAGER_HOST"
            value = "web.default.svc.cluster.local"
          }
          env {
            name = "MANAGER_MQTT_PORT"
            value = "1883"
          }
          env {
            name = "MANAGER_WEB_PORT"
            value = "8090"
          }
          env {
            name = "KEYCLOAK_HOST"
            value = "web.default.svc.cluster.local"
          }
          env {
            name = "KEYCLOAK_PORT"
            value = "8080"
          }
          env {
            name = "LE_EMAIL"
            value = "admin@sk8net.org"
          }
          env {
            name = "DOMAINNAME"
            value = var.frontend_hostname
          }
          env {
            name = "CERT_DIR"
            value = "/deployment/certs"
          }
        }
        termination_grace_period_seconds = 10
      }
    }
    volume_claim_template {
      metadata {
        name = "proxy-data"
      }
      spec {
        volume_name = "proxy-data"
        access_modes = [
          "ReadWriteOnce",
        ]
        resources {
          requests = {
            storage = "5Gi"
          }
        }
        storage_class_name = "do-block-storage"
      }
    }
  }
}

A look at how my deployed services are organized

The services forward traffic to the named ports defined in the StatefulSets:
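
For example, the web Service is roughly this shape (a sketch reconstructed from the named ports above, not the exact file):

resource "kubernetes_service" "web" {
  metadata {
    name      = "web"
    namespace = "default"
  }
  spec {
    selector = {
      app = "web"
    }
    # target_port references the named container ports in the web StatefulSet
    port {
      name        = "http-keycloak"
      port        = 8080
      target_port = "http-keycloak"
    }
    port {
      name        = "http-manager"
      port        = 8090
      target_port = "http-manager"
    }
  }
}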

I tore down and rebuilt my cluster, reusing the load balancer and storage volumes… everything worked flawlessly! The system came back up with the same DB and config, on a completely new cluster. This is the beauty of going to great lengths (it was not easy) to wrap everything in k8s & terraform.

I’ll put up a guide ASAP describing the process and providing everything you need to build what you see above.

Here’s the extent of what it takes to spin up an entire cluster for one stage of our CI/CD environment pipeline, once we configure the project parameters in terragrunt.hcl:

export TF_VAR_do_token=$(pass sk8net/do_token)
export AWS_ACCESS_KEY_ID=$(pass sk8net/spaces_access_id)
export AWS_SECRET_ACCESS_KEY=$(pass sk8net/spaces_secret_key)
terragrunt apply -target=digitalocean_kubernetes_cluster.primary # bootstrap it

doctl kubernetes cluster kubeconfig save shared-dev # save config for kubectl

# human do this: go into the DigitalOcean dashboard, Container Registry, click Edit and enable integration for the newly created k8s cluster (or see the terraform sketch below)

terragrunt apply # this spins up the entire infrastructure!

# if a new loadbalancer was created (first time you deploy this env), you need to point a DNS record at it now
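
In principle, that one manual registry click could also be expressed in terraform; a sketch using the DigitalOcean and kubernetes providers (secret name and namespace here are illustrative):

resource "digitalocean_container_registry_docker_credentials" "sk8net" {
  registry_name = "sk8net"
}

# Expose the registry credentials as an image pull secret; pods would then
# reference it via image_pull_secrets in their templates.
resource "kubernetes_secret" "registry_pull" {
  metadata {
    name      = "do-registry"
    namespace = "default"
  }
  type = "kubernetes.io/dockerconfigjson"
  data = {
    ".dockerconfigjson" = digitalocean_container_registry_docker_credentials.sk8net.docker_credentials
  }
}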

Behold, it is gorgeous:

Plan: 12 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

kubernetes_namespace.backend: Creating...
kubernetes_namespace.frontend: Creating...
kubernetes_service.postgresql: Creating...
kubernetes_service.web: Creating...
kubernetes_service.load_balancer: Creating...
kubernetes_persistent_volume.postgresql_data: Creating...
kubernetes_persistent_volume.proxy_data: Creating...
kubernetes_persistent_volume.manager_data: Creating...
kubernetes_persistent_volume.deployment_data: Creating...
kubernetes_stateful_set.pgsql: Creating...
....
kubernetes_stateful_set.pgsql: Creation complete after 1m27s [id=backend/pgsql]
kubernetes_stateful_set.web: Creating...
kubernetes_stateful_set.web: Still creating... [10s elapsed]
kubernetes_stateful_set.web: Still creating... [20s elapsed]
kubernetes_stateful_set.web: Still creating... [30s elapsed]
kubernetes_stateful_set.web: Still creating... [40s elapsed]
kubernetes_stateful_set.web: Still creating... [50s elapsed]
kubernetes_stateful_set.web: Still creating... [1m0s elapsed]
kubernetes_stateful_set.web: Still creating... [1m10s elapsed]
kubernetes_stateful_set.web: Creation complete after 1m17s [id=default/web]
kubernetes_stateful_set.proxy: Creating...
kubernetes_stateful_set.proxy: Still creating... [10s elapsed]
kubernetes_stateful_set.proxy: Still creating... [20s elapsed]
kubernetes_stateful_set.proxy: Creation complete after 27s [id=frontend/proxy]

Apply complete! Resources: 12 added, 0 changed, 0 destroyed.