Trouble Shooting Apcupsd
Testing
The first step in trouble shooting apcupsd is to read the Testing Apcupsd section of this manual.
Network Problems with Master/Slave Configurations
When working with a master/slave configuration (one UPS powering more than one computer), the master and slave communicate via the network. In many configurations, apcupsd is started before the network is initialized. In this case, it is possible that the master will be unable to contact the slave. On apcupsd versions prior to 3.8.0, this could cause apcupsd to exit with an error. The solution is to force apcupsd to start after the network and DNS are up (by adjusting the symbolic links in /etc/rc.d), or to put the names of the slave machines in your /etc/hosts file, or, preferably, to use IP addresses rather than machine names. On some configurations, you may need to use fully qualified names (host.domain.xxx) rather than simple host names.
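To see whether a slave name can be resolved at the point where apcupsd starts, a small script along the following lines may help. It is only a sketch and not part of apcupsd: the function name resolve_host is hypothetical, and the script reports only what the system resolver returns.

```python
import socket


def resolve_host(name):
    """Return the sorted list of IP addresses that `name` resolves to.

    An empty list means the lookup failed, for example because DNS is
    not reachable yet or the name is missing from /etc/hosts.
    """
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); the
    # address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})
```

A numeric address such as resolve_host("192.0.2.10") is returned unchanged without any DNS lookup, which is why using IP addresses in the configuration avoids the problem entirely.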
Error Messages from a Master Configuration
In a master/slave configuration, you can get the following error messages from a master. The error message is followed by a possible explanation:
Cannot resolve slave name XXX
To contact the slave, the slave name given in the configuration file must be resolved to an IP address. In this case, apcupsd could not get the IP address. Either the slave name is incorrect, your DNS is not working, or you have started apcupsd during the boot process before the network is operational.
Got slave shutdown from SSS
This message should not be printed as it is not yet used.
Cannot write to slave SSS
This message occurs when the master attempts to send a message to the slave SSS and gets an error. It indicates that either the slave machine is not responding (apcupsd died, the system crashed, ...) or that the network is down.
Cannot read magic from slave SSS
This message indicates that the master attempted to read the code key from the slave SSS and it did not match the value expected. A common cause of this problem is that the master and slave versions of apcupsd are not the same. Please be sure you are running the same version of apcupsd on all your master and slave machines.
Connect to slave SSS failed
This message is logged when the master attempts to connect to slave SSS and no connection is accepted. The most common cause of this problem is that the slave copy of apcupsd is not yet ready to accept connections or is not running. Generally, apcupsd will retry the connection a bit later. If the problem persists, it can indicate a network problem, or that the slave name on the SLAVE directive of the master's configuration file is incorrect.
Cannot open stream socket
This indicates a fundamental networking problem on your system: either a lack of sufficient resources, or TCP/IP has not been configured.
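When the master reports that it cannot connect or write to a slave, it can be useful to test from the master machine whether the slave accepts TCP connections at all. The following is a minimal sketch under stated assumptions, not apcupsd's own code: the function name can_connect is hypothetical, and the port you pass must be the one set in your own configuration files.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable networks,
        # and name-resolution failures (socket.gaierror is an OSError).
        return False
```

A result of False while the slave's apcupsd is known to be running points toward the network or the slave name rather than toward apcupsd itself.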
Error Messages from a Slave Configuration
In a master/slave configuration, you can get the following error messages from a slave. The error message is followed by a possible explanation:
Can't resolve master name MMM
This message is logged when the slave attempts to resolve the name given on the MASTER configuration directive to an IP address. It probably means that the master name MMM is not defined, your DNS is not properly working, or you have started apcupsd in the boot process before the network is initialized. Check the name MMM, or use an explicit IP address on the MASTER configuration directive in the slave's configuration file.
Cannot bind local address, probably already in use
This means that the slave has attempted to bind the port number so that it can listen for messages from the master. This can occur if you already have a copy of apcupsd running, or if you have run apcupsd within the past 5 to 10 minutes, because occasionally the operating system does not release a port for 5 to 10 minutes after a program exits. In this case, you can either wait a few minutes for the problem to go away, or use a different port in both your master and slave configuration files.
Socket accept error
The slave got an error waiting on the accept() system call. This is probably due to a fundamental networking problem.
Unauthorised attempt from master MMM
The master named MMM (probably an IP address) contacted the slave, but MMM is not the master listed on the MASTER configuration directive in /etc/apcupsd.conf; consequently, it is not authorized to communicate with the slave. Please check that the MASTER and SLAVE names in your slave and master configuration files, respectively, are correct.
Read failure from socket
The slave got an error reading the socket open to the master. This indicates a fundamental networking problem.
Bad APC magic from master: MMM
The slave received a code key from the master that does not correspond to the one expected by the slave. The most common cause of this problem is that you are running a different version of apcupsd on the master and the slave. Please ensure that you are running the same version of apcupsd on all your master and slave machines.
Bad user magic from master: MMM
This message indicates that the master and slave have previously communicated, but that the code key transmitted with the most recent message from the master does not correspond to what the slave expects. This problem is probably due to a network error or to some other user or machine contacting the slave on the network port.
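For the "Cannot bind local address" message above, you can check whether the port is still held before restarting apcupsd on the slave. The sketch below is an illustration only: the function name can_bind is hypothetical, it does not reproduce the socket options apcupsd itself uses, and the port number must be taken from your own configuration.

```python
import errno
import socket


def can_bind(port, host=""):
    """Return True if a TCP socket can currently be bound to (host, port).

    Returns False when the address is already in use; any other error
    (for example, insufficient privileges) is re-raised unchanged.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        # The probe socket is always released so it cannot itself
        # block a later start of apcupsd.
        sock.close()
```

If the check keeps returning False after several minutes, another process (possibly a second copy of apcupsd) is probably still holding the port.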
Master/Slave Connection Not Working
Master/slave problems are usually related to one of the following items:
CGI Programs Do Not Work
Try checking the following:
Battery Problems
Please see the Battery Chapter of this document for more details.
Cable or Connection Problems
Frequently during the initial installation, users don't know what cable they have or have problems connecting to the serial port. If this is your case, one way to diagnose the problem is to use the apctest program. To do so, you must first build it with:
    make apctest
Then, you simply execute it with:
    ./apctest
and follow the instructions. It will place the output from the session in the file apctest.output. If you are not able to resolve your problem, we can sometimes help if you email us this output file along with your apcupsd.conf file. Please see the Testing Chapter of this document for additional details on how to build and use apctest.
Bizarre Intermittent Behavior
In one case, a user reported receiving random incorrect values from the UPS in the status output. It turned out that gpm, the mouse control program for command windows, was using the serial port without using the standard Unix locking mechanism. As a consequence, both apcupsd and gpm were reading the serial port. If you are running gpm, please ensure that it is not configured with a serial port mouse on the same serial port that apcupsd uses.
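To see which process, if any, has registered a lock on a serial port, you can inspect the lock file directly. The sketch below assumes the common UUCP-style convention of a file named LCK.. followed by the device's base name, containing the owner's PID as ASCII text; the function name lock_holder is hypothetical, and the lock directory (assumed here to be /var/lock) varies between systems.

```python
import os


def lock_holder(device, lock_dir="/var/lock"):
    """Return the PID recorded in the lock file for `device`, or None.

    None is returned when no lock file exists or when its contents are
    not an ASCII decimal number.
    """
    lock_path = os.path.join(lock_dir, "LCK.." + os.path.basename(device))
    try:
        with open(lock_path, "rb") as handle:
            raw = handle.read()
    except FileNotFoundError:
        return None
    text = raw.decode("ascii", errors="replace").strip()
    try:
        return int(text)
    except ValueError:
        return None
```

Note that a result of None does not prove the port is unused: as in the gpm case described above, a program that ignores the locking mechanism leaves no lock file behind.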