The logs check searches logfiles for unknown or "bad" entries. For each logfile you can assign an errorlevel to every message, where each message is described by a regular expression. Logs written by WebSphere, WebSphere MQ and DB2 can be checked as well.
LOGS is the enclosing tag for all logfile entries.
0 or 1. If you don't define LOGS, no log check will run.
<LOGS>
<LOGFILE>
<LOGFILENAME>/var/log/messages</LOGFILENAME>
<!-- if the message file is older than 300 minutes,
something is going wrong with the syslog daemon -->
<AGE>
<MAXAGE>300</MAXAGE>
<ERRORLEVEL>ERROR</ERRORLEVEL>
</AGE>
<LOGFILTER><REGEX>syslogd.*restart</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
<!-- If you don't define ERRORLEVEL: Default is NORMAL -->
<!-- If you don't define PRIORITY: Default is 100 -->
<LOGFILTER><REGEX>-- MARK --</REGEX></LOGFILTER>
<LOGFILTER>
<REGEX>.*</REGEX>
<!-- That is AFTER all default priorities! -->
<PRIORITY>1000</PRIORITY>
<ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
</LOGFILE>
<LOGFILE>
<LOGFILENAME>/home/oracle/product/9.2.0/rdbms/log/alert_osmart.log</LOGFILENAME>
<LOGFILTER>
<REGEX>.*</REGEX>
<PRIORITY>1000</PRIORITY>
<ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
</LOGFILE>
</LOGS>
For every logfile you want to check, you define a LOGFILE-entry.
The common XML tags as described in Section 9.8, “Tags Common to All Checks and/or Checkpoints”
As many as you like.
Look at LOGS
"Normal" logfiles are line-oriented: one message, one line. Some IBM products (notably DB2 and WebSphere MQ) also write ASCII logfiles, but these are paragraph-oriented: one message, one paragraph.
logs has to know which kind of logfile it is dealing with.
For "normal" logfiles two variations are possible:
Check only complete lines in the logfile. A line is complete when it ends with NL or CR/NL.
This is what you want for most logfiles, and it is the default.
Check everything after the last NL, too.
UDB
MQS
NLALL (everything after the last NL, too)
NORMAL (this is the default)
0 or 1. If you don't define LOGFILETYPE, NORMAL is used.
<LOGS>
<LOGFILE>
<LOGFILENAME>/opt/IBM/db2/V8.1/db2inst2/sqllib/db2dump/db2diag.log</LOGFILENAME>
<LOGFILETYPE>UDB</LOGFILETYPE>
<!-- exclude your unimportant log messages (blocks) here -->
<LOGFILTER><REGEX>database started</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
<LOGFILTER>
<REGEX>.*</REGEX>
<!-- That is AFTER all default priorities! -->
<PRIORITY>1000</PRIORITY>
<ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
</LOGFILE>
</LOGS>
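As a further sketch, a line-oriented logfile whose last line may not yet end with NL could be checked with NLALL. The path /var/log/myapp/current.log is hypothetical, the filter is only illustrative, and the tag name LOGFILETYPE is taken from the DB2 example above:
<LOGS>
<LOGFILE>
<!-- hypothetical logfile; replace with your own -->
<LOGFILENAME>/var/log/myapp/current.log</LOGFILENAME>
<!-- also check the text after the last NL -->
<LOGFILETYPE>NLALL</LOGFILETYPE>
<LOGFILTER>
<REGEX>.*</REGEX>
<PRIORITY>1000</PRIORITY>
<ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
</LOGFILE>
</LOGS>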
What do you want to filter or mark in your logfile? Normally you will want many LOGFILTER entries in your config file.
If you don't configure an ERRORLEVEL, 'NORMAL' is used.
Many
Look at LOGFILE
logs has to save your non-NORMAL entries for the next run. Otherwise your logfile would be reported as NORMAL again after 5 minutes.
These entries are saved in ERRORFILE or, if ERRORFILE isn't given, in /home/osmart/var/$CHECKPOINTNAME.error (if you have OpenSMART installed in /home/osmart/).
Any valid filename you can write to.
0 or 1. If you don't configure an ERRORFILE, /home/osmart/var/$CHECKPOINTNAME.error (if you have OpenSMART installed in /home/osmart/) is used.
<LOGS>
<LOGFILE>
<LOGFILENAME>/var/log/messages</LOGFILENAME>
<ERRORFILE>/home/osmart/var/message_logfile.error</ERRORFILE>
<OFFSETFILE>/home/osmart/var/message_logfile.offset</OFFSETFILE>
<!-- If you don't add an ERRORLEVEL to LOGFILTER, the default is
NORMAL -->
<!-- If you don't add a priority to LOGFILTER, the default is 100
-->
<LOGFILTER><REGEX>syslogd.*restart</REGEX></LOGFILTER>
<LOGFILTER>
<REGEX>.*</REGEX><!-- Everything unknown -->
<PRIORITY>1000</PRIORITY>
<!-- That is AFTER all default priorities! -->
<ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
</LOGFILE>
</LOGS>
logs has to save the position in your logfile up to which it has already checked (the offset). Otherwise your logfile would be scanned completely every 5 minutes.
These entries are saved in OFFSETFILE or, if OFFSETFILE isn't given, in /home/osmart/var/$CHECKPOINTNAME.offset (if you have OpenSMART installed in /home/osmart/).
Any valid filename you can write to.
0 or 1. If you don't configure OFFSETFILE, /home/osmart/var/$CHECKPOINTNAME.offset (if you have OpenSMART installed in /home/osmart/) will be used.
Look at ERRORFILE
Maximum age of a logfile. This helps to check that the application you monitor is still working correctly.
Why would you need such a "maxage" function? Imagine you have a running syslog daemon (listed in ps -ef), but the process no longer writes any log lines (maybe the disk is full). You check the logfile against your regular expressions, but nobody writes to it any more, so no bad entry can ever show up.
ERRORLEVEL
0 or 1. If you don't configure AGE, no max age will be checked.
Look at LOGS
Which entries in your logfile do you want to mark as good or bad?
Write a Perl regular expression. "^" and "$" are possible. If you want to learn about Perl regular expressions, look at perldoc perlretut.
Anything Perl parses as a regular expression.
1
Look at ERRORFILE
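A minimal sketch of anchored expressions follows. The message texts are hypothetical and only assume the classic syslog line layout (timestamp, host, program). Assuming the configuration is read by a standard XML parser, a literal < or & inside REGEX would have to be written as &lt; or &amp;:
<!-- "^" anchors the match at the start of the line -->
<LOGFILTER><REGEX>^\w{3} +\d+ [\d:]+ \S+ sshd\[\d+\]: Accepted</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
<!-- "$" anchors the match at the end of the line -->
<LOGFILTER><REGEX>(error|failed)$</REGEX><ERRORLEVEL>ERROR</ERRORLEVEL></LOGFILTER>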
logs tests every line in your logfile against every regex until the first one matches. But in which order are these regexes tried?
You define the order with these priority values. Regexes with low priority values are tried first.
Mostly you can use the default priority (100, which is always used if you don't write a PRIORITY tag) and just give the "all the rest" regex (usually ".*") a higher priority to ensure that it is tried last.
Any number
0 or 1. If you don't define a PRIORITY the default 100 is used.
Look at ERRORFILE
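The ordering can be sketched with a LOGFILE fragment (to be placed inside LOGS). The expressions are illustrative and not taken from a real configuration:
<LOGFILE>
<LOGFILENAME>/var/log/messages</LOGFILENAME>
<!-- priority 10: tried before all default-priority filters -->
<LOGFILTER><REGEX>disk full</REGEX><PRIORITY>10</PRIORITY><ERRORLEVEL>ERROR</ERRORLEVEL></LOGFILTER>
<!-- no PRIORITY given: default 100 -->
<LOGFILTER><REGEX>disk</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
<!-- priority 1000: catch-all, tried last -->
<LOGFILTER><REGEX>.*</REGEX><PRIORITY>1000</PRIORITY><ERRORLEVEL>ERROR</ERRORLEVEL></LOGFILTER>
</LOGFILE>
With this fragment, a line containing "disk full" is reported as ERROR because the priority-10 filter matches first. Any other line mentioning "disk" is NORMAL, and everything else falls through to the catch-all and is reported as ERROR.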