Migrating OpenRemote to a New AWS Instance — Need Advice

I’m in the process of migrating our OpenRemote deployment to a new AWS EC2 instance, following these steps:

  1. Pulled the latest OpenRemote code using the Quick Start guide.
  2. Modified docker-compose.yml to reflect the public IP address of the new instance.
  3. Followed the Backup/Restore OpenRemote DB procedure from the Developer Guide: Useful Commands and Queries (rough sketch of the commands below).
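
For step 3, the dump/restore I ran was roughly the following (a sketch rather than the exact documented commands; I’m assuming the default postgres user and openremote database, and a compose project named openremote, so adjust the container name to whatever docker ps shows):

```sh
# On the old instance: dump the database to a file
docker exec -i openremote-postgresql-1 pg_dump -Fc -U postgres -d openremote > openremote-db.backup

# Copy openremote-db.backup to the new instance, then restore it into the running postgresql container
docker exec -i openremote-postgresql-1 pg_restore -c -U postgres -d openremote < openremote-db.backup
```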

I also updated the following values in docker-compose.yml to match the values used in the previous production system:

KEYCLOAK_ADMIN_PASSWORD
OR_ADMIN_PASSWORD
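
Since the compose file reads these from the environment (e.g. ${OR_ADMIN_PASSWORD:-password}), another way to supply them, rather than editing the defaults in the file, is to pass them when bringing the stack up. A sketch with placeholder values:

```sh
# Placeholder values; OR_ADMIN_PASSWORD also feeds KEYCLOAK_ADMIN_PASSWORD
# in the keycloak service via the compose file's variable substitution.
OR_HOSTNAME='<new-instance-public-ip>' \
OR_ADMIN_PASSWORD='<admin-password-from-the-previous-deployment>' \
docker compose up -d
```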

After restoring the database and starting the stack, the openremote-manager-1 container remains unhealthy and fails with the message:

‘dependency failed to start: container openremote-manager-1 is unhealthy’

I’d appreciate advice on the best practice for migrating OpenRemote to a new instance.
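
For reference, I’ve been watching the manager while it tries to start with commands along these lines:

```sh
# Tail the manager container's output while it starts
docker compose logs -f manager

# Show per-container status, including health, for the whole stack
docker compose ps
```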

Depending on the password settings in docker-compose.yml, I have seen this error in the logs:

manager-1     | 2025-04-06 07:36:14.316  SEVERE  [main                          ] org.openremote.container.Container       : >>> Runtime container startup failed
manager-1     | java.io.IOException: Integrity check failed: java.security.UnrecoverableKeyException: Failed PKCS12 integrity checking
manager-1     |         at java.base/sun.security.pkcs12.PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2162)
manager-1     |         at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:228)
manager-1     |         at java.base/java.security.KeyStore.load(KeyStore.java:1500)
manager-1     |         at java.base/java.security.KeyStore.getInstance(KeyStore.java:1828)
manager-1     |         at java.base/java.security.KeyStore.getInstance(KeyStore.java:1709)
manager-1     |         at org.openremote.manager.security.KeyStoreServiceImpl.start(KeyStoreServiceImpl.java:107)
manager-1     |         at org.openremote.container.Container.start(Container.java:179)
manager-1     |         at org.openremote.container.Container.startBackground(Container.java:223)
manager-1     |         at org.openremote.manager.Main.main(Main.java:36)
manager-1     | Caused by: java.security.UnrecoverableKeyException: Failed PKCS12 integrity checking
manager-1     |         at java.base/sun.security.pkcs12.PKCS12KeyStore.lambda$engineLoad$2(PKCS12KeyStore.java:2156)
manager-1     |         at java.base/sun.security.pkcs12.PKCS12KeyStore$RetryWithZero.run(PKCS12KeyStore.java:257)
manager-1     |         at java.base/sun.security.pkcs12.PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2140)
manager-1     |         ... 8 more
manager-1 exited with code 0

Here is my docker-compose.yml:

# OpenRemote v3
#
# Profile that runs the stack by default on https://localhost using a self-signed SSL certificate,
# but optionally on https://$OR_HOSTNAME with an auto generated SSL certificate from Letsencrypt.
#
# It is configured to use the AWS logging driver.
#
volumes:
  proxy-data:
  manager-data:
  postgresql-data:

services:

  proxy:
    image: openremote/proxy:${PROXY_VERSION:-latest}
    restart: always
    depends_on:
      manager:
        condition: service_healthy
    ports:
      - "80:80" # Needed for SSL generation using letsencrypt
      - "${OR_SSL_PORT:-443}:443"
      - "8883:8883"
      - "127.0.0.1:8404:8404" # Localhost metrics access
    volumes:
      - proxy-data:/deployment
    environment:
      LE_EMAIL: ${OR_EMAIL_ADMIN:-}
      DOMAINNAME: ${OR_HOSTNAME:-0.123.456.789}
      DOMAINNAMES: ${OR_ADDITIONAL_HOSTNAMES:-}
      # USE A CUSTOM PROXY CONFIG - COPY FROM https://raw.githubusercontent.com/openremote/proxy/main/haproxy.cfg
      #HAPROXY_CONFIG: '/data/proxy/haproxy.cfg'

  postgresql:
    restart: always
    image: openremote/postgresql:${POSTGRESQL_VERSION:-latest}
    shm_size: 128mb
    volumes:
      - postgresql-data:/var/lib/postgresql/data
      - manager-data:/storage

  keycloak:
    restart: always
    image: openremote/keycloak:${KEYCLOAK_VERSION:-latest}
    depends_on:
      postgresql:
        condition: service_healthy
    volumes:
      - ./deployment:/deployment
    environment:
      KEYCLOAK_ADMIN_PASSWORD: ${OR_ADMIN_PASSWORD:-password}
      KC_HOSTNAME: ${OR_HOSTNAME:-0.123.456.789}
      KC_HOSTNAME_PORT: ${OR_SSL_PORT:--1}


  manager:
    privileged: true
    restart: always
    image: openremote/manager:${MANAGER_VERSION:-latest}
    depends_on:
      keycloak:
        condition: service_healthy
    ports:
      - "127.0.0.1:8405:8405" # Localhost metrics access
    environment:
      OR_SETUP_TYPE:
      OR_ADMIN_PASSWORD: ${OR_ADMIN_PASSWORD:-password}
      OR_SETUP_RUN_ON_RESTART:
      OR_EMAIL_HOST:
      OR_EMAIL_USER:
      OR_EMAIL_PASSWORD:
      OR_EMAIL_X_HEADERS:
      OR_EMAIL_FROM:
      OR_EMAIL_ADMIN:
      OR_METRICS_ENABLED: ${OR_METRICS_ENABLED:-true}
      OR_HOSTNAME: ${OR_HOSTNAME:-0.123.456.789}
      OR_ADDITIONAL_HOSTNAMES:
      OR_SSL_PORT: ${OR_SSL_PORT:--1}
      OR_DEV_MODE: ${OR_DEV_MODE:-false}

      # The following variables will configure the demo
      OR_FORECAST_SOLAR_API_KEY:
      OR_OPEN_WEATHER_API_APP_ID:
      OR_SETUP_IMPORT_DEMO_AGENT_KNX:
      OR_SETUP_IMPORT_DEMO_AGENT_VELBUS:
    volumes:
      - manager-data:/storage
      - ./deployment:/deployment

Hey @Clint,

This happens because the KeystoreService doesn’t have the correct password to unlock the keystore that OpenRemote creates. If you changed the keystore’s password using OR_KEYSTORE_PASSWORD, use that value.

Basically, the keystore password is OR_KEYSTORE_PASSWORD; if that isn’t set, it falls back to OR_ADMIN_PASSWORD, and if that isn’t set either, it falls back to secret. Whichever one applies needs to be the same as in the previous deployment.
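
If you want to check which of those passwords the existing keystore actually accepts, keytool (bundled with any JDK) can test it. A sketch; adjust the path to wherever the manager-data volume lives on your host:

```sh
# Try a candidate store password against the existing keystore; keytool reports
# an error if it is wrong. Repeat with OR_KEYSTORE_PASSWORD, OR_ADMIN_PASSWORD
# and "secret" to see which one matches.
keytool -list -storetype PKCS12 \
  -keystore /var/lib/docker/volumes/openremote_manager-data/_data/keystores/client_keystore.p12 \
  -storepass 'secret'
```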

If you haven’t used X.509 authentication in the MQTT client, you can also delete the keystores in the deployment or tmp directory; they’re called client_keystore.p12 and client_truststore.p12. The manager will recreate them on the next start.
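
The exact path depends on how the volumes are mounted; on a standard Docker volume setup, something like this should locate them:

```sh
# Find the generated keystores on the Docker host (volume name may differ)
sudo find /var/lib/docker/volumes -name 'client_keystore.p12' -o -name 'client_truststore.p12'
```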

Let me know if that works!

Thanks Panos

I deleted the following files and it then seemed to work very quickly: all the containers came up healthy and I could log in.

rm /var/lib/docker/volumes/openremote_manager-data/_data/manager/keycloak-credentials.json
rm /var/lib/docker/volumes/openremote_manager-data/_data/keystores/client_keystore.p12
rm /var/lib/docker/volumes/openremote_manager-data/_data/keystores/client_truststore.p12

I then tried to repeat the process and was back to the same problem, with the openremote-manager-1 container waiting and then failing as unhealthy.
I have tried repeating the process a number of times since, but no luck.
I am pretty sure the passwords in docker-compose.yml match the previous deployment.

Any assistance much appreciated.

Hey @Clint,

I’m not sure what you mean by repeating the process. Since you deleted the keystores, they were recreated, and they now use either OR_KEYSTORE_PASSWORD or OR_ADMIN_PASSWORD.

Hi @panosp

As per below, I seem to be getting through the credential validation process, but the openremote-manager-1 container never reaches ‘healthy’, so I can’t log in.

I can see rules etc. being output in the logs, but the manager container seems to be waiting on something.

manager-1     | 2025-04-07 07:18:53.448  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : No stored credentials so using OR_ADMIN_PASSWORD
manager-1     | 2025-04-07 07:18:53.454  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Keycloak proxy URI set to: http://keycloak:8080/auth
manager-1     | 2025-04-07 07:18:53.454  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Validating keycloak credentials
manager-1     | 2025-04-07 07:18:57.284  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Credentials are valid
manager-1     | 2025-04-07 07:18:57.288  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : OR_ADMIN_PASSWORD credentials are valid so creating/recreating stored credentials
manager-1     | 2025-04-07 07:19:00.125  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Stored credentials successfully generated so using them
manager-1     | 2025-04-07 07:19:00.127  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Keycloak proxy URI set to: http://keycloak:8080/auth
manager-1     | 2025-04-07 07:19:00.130  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Validating keycloak credentials
manager-1     | 2025-04-07 07:19:00.500  INFO    [main                          ] curity.keycloak.KeycloakIdentityProvider : Credentials are valid

Please use code quotation marks for logs etc.; it makes things easier to read.

So the system just hangs after the ‘Credentials are valid’ log statement?
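
While it’s sitting in that state, it’s also worth checking what the container’s health check itself reports, e.g.:

```sh
# Dump the manager container's health-check state and recent probe output
docker inspect --format '{{json .State.Health}}' openremote-manager-1
```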

Hi Rich

Please see the recent topic ‘Procedure to Migrate from OpenRemote Version 1.3.4 to 1.3.5’.
I uploaded the full logs there as per your suggestion.

I suspect the issue may be related to rules we have created slowing things down; if I delete the rules, the restore works. In any case, the full logs may provide some insight.

Thanks

https://forum.openremote.io/t/procedure-to-migrate-from-openremote-version-1-3-4-to-1-3-5/