[collectd] Let IPMI plugin skip "unsupported state" periods
Bruno Prémont
bonbons at linux-vserver.org
Wed Feb 4 10:11:24 CET 2009
On a server, IPMI readings suddenly failed, causing the plugin to stop
collecting data.
The following entries got logged:
ipmi plugin: sensor_read_handler: Removing sensor Temp 7 processor (3.6), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 6 processor (3.5), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 5 power_supply (10.5), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 4 processor (3.4), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 3 processor (3.3), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 2 external_environment (39.1), because it failed with status 0x10000d5.
ipmi plugin: sensor_read_handler: Removing sensor Temp 1 system_internal_expansion_board (16.1), because it failed with status 0x10000d5.
This patch attempts to provide a slightly better error message and to just
skip the current reading iteration if the error
IPMI_NOT_SUPPORTED_IN_PRESENT_STATE_CC is indicated.
This error happens e.g. when the iLO firmware is upgraded (or reboots?) on
HP servers.
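
For illustration, the status 0x10000d5 from the log entries can be decoded
by hand. The following is a minimal sketch, not the plugin code: it assumes
the error class sits in the upper 24 bits and the error code in the low
byte, which is what the logged value suggests. All SKETCH_* and sketch_*
names are hypothetical stand-ins; real code should use the macros from the
OpenIPMI headers (IPMI_IS_IPMI_ERR, IPMI_GET_IPMI_ERR) as the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, chosen so that 0x10000d5 splits into an "IPMI error"
 * class and the completion code 0xd5. Stand-ins for the OpenIPMI macros. */
#define SKETCH_ERR_CLASS_MASK 0xffffff00u
#define SKETCH_IPMI_ERR_TOP   0x01000000u
#define SKETCH_NOT_SUPPORTED_IN_PRESENT_STATE_CC 0xd5u

typedef enum { SKETCH_OK, SKETCH_SKIP, SKETCH_REMOVE } sketch_action_t;

/* True if the status belongs to the IPMI error class. */
static bool sketch_is_ipmi_err (unsigned int err)
{
  return (err & SKETCH_ERR_CLASS_MASK) == SKETCH_IPMI_ERR_TOP;
}

/* Extract the low byte, i.e. the error code within its class. */
static unsigned int sketch_get_ipmi_err (unsigned int err)
{
  return err & 0xffu;
}

/* Decide what to do with a sensor for a given read status:
 *   0 (assumed to mean success)       -> SKETCH_OK
 *   IPMI class + "not supported in
 *   present state" code               -> SKETCH_SKIP (keep the sensor)
 *   any other failure                 -> SKETCH_REMOVE */
static sketch_action_t sketch_classify (unsigned int err)
{
  if (err == 0)
    return SKETCH_OK;
  if (sketch_is_ipmi_err (err)
      && sketch_get_ipmi_err (err) == SKETCH_NOT_SUPPORTED_IN_PRESENT_STATE_CC)
    return SKETCH_SKIP;
  return SKETCH_REMOVE;
}

/* Sanity check against the status value seen in the log entries. */
static void sketch_self_check (void)
{
  assert (sketch_classify (0x10000d5u) == SKETCH_SKIP);
}
```

Checking the class as well as the low byte matters: under these assumptions
a status of 0xd5 in a different error class would still lead to removal of
the sensor, and only the IPMI-class code 0xd5 is treated as "not ready".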
---
--- collectd-4.5.2/src/ipmi.c 2009-02-02 13:14:13.898736713 +0100
+++ collectd-4.5.2/src/ipmi.c 2009-02-02 13:21:58.987738365 +0100
@@ -148,11 +148,33 @@ static void sensor_read_handler (ipmi_se
}
}
}
+ else if (IPMI_IS_IPMI_ERR(err) && IPMI_GET_IPMI_ERR(err) == IPMI_NOT_SUPPORTED_IN_PRESENT_STATE_CC)
+ {
+ INFO ("ipmi plugin: sensor_read_handler: Sensor %s not ready",
+ list_item->sensor_name);
+ }
else
{
- INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
- "because it failed with status %#x.",
- list_item->sensor_name, err);
+ if (IPMI_IS_IPMI_ERR(err))
+ INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
+ "because it failed with IPMI error %#x.",
+ list_item->sensor_name, IPMI_GET_IPMI_ERR(err));
+ else if (IPMI_IS_OS_ERR(err))
+ INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
+ "because it failed with OS error %#x.",
+ list_item->sensor_name, IPMI_GET_OS_ERR(err));
+ else if (IPMI_IS_RMCPP_ERR(err))
+ INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
+ "because it failed with RMCPP error %#x.",
+ list_item->sensor_name, IPMI_GET_RMCPP_ERR(err));
+ else if (IPMI_IS_SOL_ERR(err))
+ INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
+ "because it failed with SOL error %#x.",
+ list_item->sensor_name, IPMI_GET_SOL_ERR(err));
+ else
+ INFO ("ipmi plugin: sensor_read_handler: Removing sensor %s, "
+ "because it failed with error %#x of class %#x.",
+ list_item->sensor_name, err & 0xff, err & 0xffffff00);
sensor_list_remove (sensor);
}
return;