modbus plugin can get out of sync WRT transaction IDs #14317

cmock · 2023-11-18T18:29:48Z

Relevant telegraf.conf

[[inputs.modbus]]
name = "pv_inverter"
slave_id = 1
timeout = "5s"
controller = "tcp://192.168.66.3:502"
configuration_type = "metric"
optimization = "max_insert"

[inputs.modbus.workarounds]
pause_after_connect = "2000ms"

[[inputs.modbus.metric]]
slave_id = 1
measurement = "pv_inverter" # Huawei Sun2000
# cf https://github.com/wlcrs/huawei_solar/wiki
fields = [
  { address=32064, name="P_in", type="INT32", scale=1.0 },
  { address=32069, name="L1_U", type="UINT16", scale=0.1 },
  { address=32070, name="L2_U", type="UINT16", scale=0.1 },
  { address=32071, name="L3_U", type="UINT16", scale=0.1 },
  { address=32072, name="L1_I", type="INT32", scale=0.001 },
  { address=32074, name="L2_I", type="INT32", scale=0.001 },
  { address=32076, name="L3_I", type="INT32", scale=0.001 },
  { address=32080, name="P_out", type="INT32", scale=1.0 },
  { address=32082, name="Q_out", type="INT32", scale=1.0 },
  { address=32084, name="Power_Factor", type="INT16", scale=0.001 },
  { address=32085, name="Frequency", type="UINT16", scale=0.01 },
  { address=32086, name="Efficiency", type="UINT16", scale=0.01 },
  { address=32087, name="Temperature", type="INT16", scale=0.1 },
]

  [inputs.modbus.metric.tags]
  null="maybe"


[[inputs.modbus.metric]]
slave_id = 1
measurement = "pv_inverter" # Huawei Sun2000
# cf https://github.com/wlcrs/huawei_solar/wiki
fields = [
  { address=32000, name="state_1", type="UINT16" },
  { address=32002, name="state_2", type="UINT16" },
  { address=32003, name="state_3", type="UINT16" },
  { address=32008, name="alarm_1", type="UINT16" },
  { address=32009, name="alarm_2", type="UINT16" },
  { address=32010, name="alarm_3", type="UINT16" },
  { address=32016, name="PV1_U", type="INT16", scale=0.1 },
  { address=32017, name="PV1_I", type="INT16", scale=0.01 },
  { address=32018, name="PV2_U", type="INT16", scale=0.1 },
  { address=32019, name="PV2_I", type="INT16", scale=0.01 },
  { address=32078, name="Peak_Power_today", type="INT32", scale=1.0 },
  { address=32088, name="R_iso", type="UINT16", scale=1000.0 },
  { address=32089, name="device_status", type="UINT16" },
  { address=32090, name="fault_code", type="UINT16" },
  { address=32106, name="energy_total", type="UINT32", scale=10.0},
]

Logs from Telegraf

Nov 18 07:02:05 rockpro telegraf[1449366]: 2023-11-18T06:02:05Z E! [inputs.modbus] Error in plugin: slave 1: read tcp 192.168.57.7:56970->192.168.66.3:502: i/o timeout
Nov 18 07:02:10 rockpro telegraf[1449366]: 2023-11-18T06:02:10Z E! [inputs.modbus] Error in plugin: slave 1: modbus: response transaction id '7616' does not match request '7617'
Nov 18 07:02:20 rockpro telegraf[1449366]: 2023-11-18T06:02:20Z E! [inputs.modbus] Error in plugin: slave 1: modbus: response transaction id '7617' does not match request '7618'
Nov 18 07:02:30 rockpro telegraf[1449366]: 2023-11-18T06:02:30Z E! [inputs.modbus] Error in plugin: slave 1: modbus: response transaction id '7618' does not match request '7619'
Nov 18 07:02:40 rockpro telegraf[1449366]: 2023-11-18T06:02:40Z E! [inputs.modbus] Error in plugin: slave 1: modbus: response transaction id '7619' does not match request '7620'
Nov 18 07:02:50 rockpro telegraf[1449366]: 2023-11-18T06:02:50Z E! [inputs.modbus] Error in plugin: slave 1: modbus: response transaction id '7620' does not match request '7621'

[these lines repeat until telegraf is restarted]

System info

telegraf 1.28.5-1, Debian 12.2

Docker

No response

Steps to reproduce

The issue occurs occasionally; telegraf runs fine for weeks on end, but if this error is occuring, it does not recover by itself.

Expected behavior

I guess closing/reopening the Modbus connection after an i/o error, or at least resetting the internal counters for the transaction IDs, would be a solution.

Actual behavior

After this desynchronization the connection never recovers, until telegraf is restarted.

Additional info

The device being monitored is a Huawei Sun2000 solar inverter, whose modbus implementation is a bit flakey.

Not every "i/o timeout" message in the logs leads to this behaviour.

The text was updated successfully, but these errors were encountered:

srebhan · 2023-11-20T23:22:22Z

@cmock this seems to be an error in the upstream library. Can you please report the error there and add a link to this issue!?

cmock · 2023-11-21T11:44:26Z

done: grid-x/modbus#78

cmock added the bug unexpected problem or unintended behavior label Nov 18, 2023

srebhan added the upstream bug or issues that rely on dependency fixes label Nov 20, 2023

srebhan self-assigned this Nov 20, 2023

cmock mentioned this issue Nov 21, 2023

Gets out of sync WRT transaction IDs grid-x/modbus#78

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

modbus plugin can get out of sync WRT transaction IDs #14317

modbus plugin can get out of sync WRT transaction IDs #14317

cmock commented Nov 18, 2023 •

edited

Loading

srebhan commented Nov 20, 2023

cmock commented Nov 21, 2023

modbus plugin can get out of sync WRT transaction IDs #14317

modbus plugin can get out of sync WRT transaction IDs #14317

Comments

cmock commented Nov 18, 2023 • edited Loading

Relevant telegraf.conf

Logs from Telegraf

System info

Docker

Steps to reproduce

Expected behavior

Actual behavior

Additional info

srebhan commented Nov 20, 2023

cmock commented Nov 21, 2023

cmock commented Nov 18, 2023 •

edited

Loading