Drools lagging behind schedule

I am seeing a weird issue after updating packages on my Ubuntu 14.04 box (running OpenRemote 2.6.0 B3).
Drools is running behind schedule, and eventually it seems to give up, missing events completely.

The log shows that the timers start out firing on time but slowly lag further and further behind. Restarting OpenRemote does not fix the issue; only a full reboot of the machine helps, and only for a while. For clarity, the Nissan Leaf command is a simple Python script.

Here are the relevant rules:

rule "Turn Fan on the Hour Every Hour"
// cron fields: second minute hour day-of-month month day-of-week
timer (cron: 00 00 * * * ? )  // fires at minute 00 of every hour
when
    eval (true)
then
    execute.command("Node 5 (ON)");
end

rule "Turn Fan off after 10 Minutes of Runtime"
timer (cron: 00 10 * * * ? )  // fires at minute 10 of every hour
when
    eval (true)
then
    execute.command("Node 5 (OFF)");
end

rule "Ping the Nissan Leaf for an update every 5 minutes"
timer (cron: 0 0/5 * * * ? )  // fires every 5 minutes, starting at minute 0
when
    eval (true)
then
    execute.command("Update Nissan Leaf Info");
end

Here is part of the log:

DEBUG 2018-02-12 20:58:12,727 (Drools): rule "Ping the Nissan Leaf for an update every 5 minutes" // (package org.openremote.controller.model.event)
    Declarations
    LHS objects(antecedents)
            Class: "InitialFactImpl"
            Fields:
                    org.drools.core.reteoo.InitialFactImpl@4dde85f0

DEBUG 2018-02-12 21:05:22,303 (Drools): rule "Turn Fan on the Hour Every Hour" // (package org.openremote.controller.model.event)
    Declarations
    LHS objects(antecedents)
            Class: "InitialFactImpl"
            Fields:
                    org.drools.core.reteoo.InitialFactImpl@4dde85f0

DEBUG 2018-02-12 21:05:22,304 (Drools): rule "Ping the Nissan Leaf for an update every 5 minutes" // (package org.openremote.controller.model.event)
    Declarations
    LHS objects(antecedents)
            Class: "InitialFactImpl"
            Fields:
                    org.drools.core.reteoo.InitialFactImpl@4dde85f0

DEBUG 2018-02-12 21:08:19,931 (Drools): rule "Ping the Nissan Leaf for an update every 5 minutes" // (package org.openremote.controller.model.event)
    Declarations
    LHS objects(antecedents)
            Class: "InitialFactImpl"
            Fields:
                    org.drools.core.reteoo.InitialFactImpl@4dde85f0

DEBUG 2018-02-12 21:09:07,246 (Drools): rule "Ping the Nissan Leaf for an update every 5 minutes" // (package org.openremote.controller.model.event)
    Declarations
    LHS objects(antecedents)
            Class: "InitialFactImpl"
            Fields:
                    org.drools.core.reteoo.InitialFactImpl@4dde85f0

Hi

This is purely a guess, but could it be the use of cron in the timer?

I only use:

timer (int: 15s) // 15 seconds interval timer

So is it worth trying this in your rules?

rule "Ping the Nissan Leaf for an update every 5 minutes"
timer (int: 5m)
when
    eval (true)
then
    execute.command("Update Nissan Leaf Info");
end

Regarding the fan part…

You could try using the Alarm Protocol and set up a cron that triggers every hour.

https://github.com/openremote/Documentation/wiki/Alarm-Protocol

Then list two commands.

The first starts the fan

Wait 30 minutes

The second stops the fan.

Something like this (the XML tags did not survive in this post; only the cron expression and a true value remain):

<?xml version="1.0" encoding="UTF-8"?>
0 0 * ? * SUN,MON,TUE,WED,THU,FRI,SAT
true

One small detail: you must set the device group name.

I have never had an issue with cron before, but since this seems to be a dependency issue, it could be the cause. I use cron so that, no matter when the controller is started, the data readings always line up with the clock, and downstream graphs don't need a timestamp on each data point.
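To illustrate the alignment point, here is a minimal Python sketch. It is not OpenRemote or Drools code, and the function name next_aligned is made up for the example. It compares a clock-aligned schedule (what the cron timer gives me) with a schedule that simply counts 5 minutes from whenever the controller happened to start.

```python
from datetime import datetime, timedelta


def next_aligned(now: datetime, period_s: int) -> datetime:
    """Return the next wall-clock boundary strictly after `now` that is a
    multiple of `period_s` seconds past midnight (cron-style alignment)."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((now - midnight).total_seconds())
    return midnight + timedelta(seconds=(elapsed // period_s + 1) * period_s)


# Hypothetical controller start time, chosen to be off the 5-minute grid.
started = datetime(2018, 2, 12, 20, 58, 12)

print(next_aligned(started, 300))        # 2018-02-12 21:00:00 (on the grid)
print(started + timedelta(seconds=300))  # 2018-02-12 21:03:12 (depends on start time)
```

With the aligned schedule every reading lands on :00, :05, :10 and so on, so the position of a sample in the series already tells you its time.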

Fair comment :)

It was only a suggestion. :)

I appreciate the help! However, I am looking for the root cause here, not a workaround.

Just to close the loop here:

The issue I was having comes down to how often the scheduler runs the command versus how long the command takes to run. The command was set to run every 5 minutes. Inside the command, there is a TCP connection request with a 120-second timeout and a retry counter of 5. That means the script could take up to 10 minutes (5 x 120 seconds) to run, twice the scheduling interval.
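For anyone who wants to check their own setup, the arithmetic can be sketched in a few lines of Python. The numbers are the ones from my case; the function names are only illustrative, and I am assuming the retry counter of 5 means five attempts in total, which is what gives the 10-minute figure.

```python
TIMEOUT_S = 120       # per-attempt TCP connect timeout
ATTEMPTS = 5          # assumed: retry counter of 5 means 5 attempts in total
INTERVAL_S = 5 * 60   # the rule fires every 5 minutes


def worst_case_runtime(timeout_s: int, attempts: int) -> int:
    """Longest the command can block if every attempt times out."""
    return timeout_s * attempts


def max_safe_timeout(interval_s: int, attempts: int) -> float:
    """Largest per-attempt timeout whose worst case still fits in one interval."""
    return interval_s / attempts


worst = worst_case_runtime(TIMEOUT_S, ATTEMPTS)
print(worst, worst > INTERVAL_S)               # 600 True
print(max_safe_timeout(INTERVAL_S, ATTEMPTS))  # 60.0
```

This ignores everything else the script does, so in practice the per-attempt timeout would need to sit comfortably below 60 seconds.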

The exact times don't matter. If a command is triggered again while the previous run is still in progress (even if it is 1 second from finishing), the whole thing crashes and causes the symptoms above. The only recovery is to reboot the controller. It would be nice if the commands were queued, but I believe that is beyond the scope of the scheduler interface.
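One way to keep a run from spilling into the next trigger is to give the script an overall time budget instead of a fixed timeout per attempt. The following is a rough Python sketch, not my actual script: the function name connect_with_deadline, the 240-second budget, and the injectable connect and clock parameters are all assumptions made for the example.

```python
import socket
import time


def connect_with_deadline(host, port, budget_s=240.0, attempts=5,
                          connect=socket.create_connection,
                          clock=time.monotonic):
    """Try to connect up to `attempts` times without ever spending more
    than `budget_s` seconds in total. Returns the socket, or None if
    every attempt failed or the budget ran out."""
    deadline = clock() + budget_s
    for attempt in range(attempts):
        remaining = deadline - clock()
        if remaining <= 0:
            break
        # Split whatever time is left evenly over the attempts still to come.
        timeout = remaining / (attempts - attempt)
        try:
            return connect((host, port), timeout=timeout)
        except OSError:  # covers timeouts and refused connections
            continue
    return None
```

With a 5-minute schedule, a budget of 240 seconds leaves a minute of slack for the rest of the script, so the command should have finished before the scheduler fires it again.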