9.23. Configuration for the logs check.

The logs check searches logfiles for unknown or "bad" entries. For each logfile, you can assign an errorlevel to every message, where each message is described by a regular expression. Logs from WebSphere, WebSphere MQ and DB2 can be checked as well.

LOGS

Explanation

LOGS is the enclosing tag for all logfile entries.

Parent

OSAGENT

allowed values

Count

0 or 1. If you don't define LOGS, no log check will run.

Example
<LOGS>
  <LOGFILE>
    <LOGFILENAME>/var/log/messages</LOGFILENAME>
    <!-- if message file is older than 300 minutes,
         something is going wrong with the syslog daemon -->
    <AGE>
      <MAXAGE>300</MAXAGE>
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </AGE>

    <LOGFILTER><REGEX>syslogd.*restart</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
    <!-- If you don't define ERRORLEVEL: Default is NORMAL -->
    <!-- If you don't define PRIORITY: Default is 100 -->
    <LOGFILTER><REGEX>-- MARK --</REGEX></LOGFILTER>
    <LOGFILTER>
      <REGEX>.*</REGEX>
      <!-- That is AFTER all default priorities! -->
      <PRIORITY>1000</PRIORITY>
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </LOGFILTER>
  </LOGFILE>

  <LOGFILE>
    <LOGFILENAME>/home/oracle/product/9.2.0/rdbms/log/alert_osmart.log</LOGFILENAME>
    <LOGFILTER>
      <REGEX>.*</REGEX>
      <PRIORITY>1000</PRIORITY>
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </LOGFILTER>
  </LOGFILE>
</LOGS>  
        

LOGFILE

Explanation

For every logfile you want to check, you define a LOGFILE-entry.

Parent

LOGS

allowed values

Count

As many as you like.

Example

Look at LOGS

LOGFILENAME

Explanation

Name of the logfile to be checked. The OpenSMART user must have read permission for that file.

Parent

LOGFILE

allowed values

Valid filenames in your system.

Count

1

Example

Look at LOGFILE

LOGTYPE

Explanation

"Normal" logfiles are line-oriented: one message, one line. Some IBM products (notably DB2 and WebSphere MQ) also write ASCII logfiles, but these are paragraph-oriented: one message, one paragraph.

The logs check has to be told which format a logfile uses.

For "normal" logfiles two variations are possible:

  • Check only complete lines in the logfile. A line is complete when it ends with NL or CR/NL.

    This is what you want for most logfiles, and it is the default.

  • Also check everything after the last NL.

Parent

LOGFILE

allowed values

  • UDB

  • MQS

  • NLALL (everything after the last NL, too)

  • NORMAL (this is the default)

Count

0 or 1. If you don't define LOGTYPE, then NORMAL is used.

Example
<LOGS>
  <LOGFILE>
    <LOGFILENAME>/opt/IBM/db2/V8.1/db2inst2/sqllib/db2dump/db2diag.log</LOGFILENAME>
    <LOGFILETYPE>UDB</LOGFILETYPE>
    <!-- exclude your unimportant log messages (blocks) here -->
    <LOGFILTER><REGEX>database started</REGEX><ERRORLEVEL>NORMAL</ERRORLEVEL></LOGFILTER>
    <LOGFILTER>
      <REGEX>.*</REGEX>
      <!-- That is AFTER all default priorities! -->
      <PRIORITY>1000</PRIORITY>
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </LOGFILTER>
  </LOGFILE>
</LOGS>
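
For a line-oriented logfile whose last line may be written without a trailing NL, a configuration could look like the following sketch. The path /var/log/myapp/app.log is hypothetical. The element name LOGFILETYPE is taken from the example above; this section's heading calls it LOGTYPE, so check which spelling your OpenSMART version accepts.

<LOGS>
  <LOGFILE>
    <LOGFILENAME>/var/log/myapp/app.log</LOGFILENAME>
    <!-- NLALL: also check the text after the last NL -->
    <LOGFILETYPE>NLALL</LOGFILETYPE>
    <LOGFILTER>
      <REGEX>.*</REGEX>
      <PRIORITY>1000</PRIORITY>
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </LOGFILTER>
  </LOGFILE>
</LOGS>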
        

LOGFILTER

Explanation

What do you want to filter or mark in your logfile? Normally, you will want many LOGFILTER entries in your config file.

Parent

LOGFILE

allowed values

Count

Many

Example

Look at LOGFILE

ERRORFILE

Explanation

The logs check has to save all entries that are not NORMAL for the next run. Otherwise, your logfile would be reported as NORMAL again after 5 minutes.

These entries are saved in ERRORFILE or, if ERRORFILE isn't given, in /home/osmart/var/$CHECKPOINTNAME.error (if you have OpenSMART installed in /home/osmart/).

Parent

LOGFILE

allowed values
  • any valid filename you can write to.

Count

0 or 1. If you don't configure an ERRORFILE, /home/osmart/var/$CHECKPOINTNAME.error (if you have OpenSMART installed in /home/osmart/) is used.

Example
<LOGS>
  <LOGFILE>
    <LOGFILENAME>/var/log/messages</LOGFILENAME>
    <ERRORFILE>/home/osmart/var/message_logfile.error</ERRORFILE>
    <OFFSETFILE>/home/osmart/var/message_logfile.offset</OFFSETFILE>
    <!-- If you don't add an ERRORLEVEL to LOGFILTER, the default is
    NORMAL -->    
    <!-- If you don't add a priority to LOGFILTER, the default is 100
    -->
    <LOGFILTER><REGEX>syslogd.*restart</REGEX></LOGFILTER>
    <LOGFILTER>
      <REGEX>.*</REGEX><!-- Everything unknown -->
      <PRIORITY>1000</PRIORITY>
      <!-- That is AFTER all default priorities! -->
      <ERRORLEVEL>ERROR</ERRORLEVEL>
    </LOGFILTER>
  </LOGFILE>
</LOGS>
        

OFFSETFILE

Explanation

The logs check has to save the position in your logfile up to which it has already checked (the offset). Otherwise, your logfile would be scanned completely every 5 minutes.

The offset is saved in OFFSETFILE or, if OFFSETFILE isn't given, in /home/osmart/var/$CHECKPOINTNAME.offset (if you have OpenSMART installed in /home/osmart/).

Parent

LOGFILE

allowed values
  • any valid filename you can write to.

Count

0 or 1. If you don't configure OFFSETFILE, /home/osmart/var/$CHECKPOINTNAME.offset (if you have OpenSMART installed in /home/osmart/) will be used.

Example

Look at ERRORFILE

AGE

Explanation

Maximum age of a logfile. This helps to check that the applications you monitor are working correctly.

Why would you need such a "maxage" function? Imagine you have a running syslog daemon (listed in ps -ef), but the process no longer writes any log lines (maybe the disk is full). You check the logfile with regular expressions, but nobody writes to it any more, so no filter will ever match.

Parent

LOGFILE

allowed values

Count

0 or 1. If you don't configure AGE, no max age will be checked.

Example

Look at LOGS

MAXAGE

Explanation

Maximum age of a logfile (in minutes).

Parent

AGE

allowed values
  • integer value (minutes)

Count

0 or 1. If you don't configure AGE, no max age will be checked.

Example

Look at LOGS
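
As a worked example of choosing a value: for a logfile that is written once a day, 24 hours correspond to 24 * 60 = 1440 minutes. The sketch below adds 60 minutes of slack. The path /var/log/nightly_backup.log and the amount of slack are assumptions for illustration only.

<LOGFILE>
  <LOGFILENAME>/var/log/nightly_backup.log</LOGFILENAME>
  <AGE>
    <!-- 24 h * 60 min = 1440 min, plus 60 min slack -->
    <MAXAGE>1500</MAXAGE>
    <ERRORLEVEL>ERROR</ERRORLEVEL>
  </AGE>
  <LOGFILTER>
    <REGEX>.*</REGEX>
    <PRIORITY>1000</PRIORITY>
    <ERRORLEVEL>ERROR</ERRORLEVEL>
  </LOGFILTER>
</LOGFILE>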

REGEX

Explanation

Which entries in your logfile do you want to mark as good or bad?

Write a Perl regular expression. The anchors "^" and "$" can be used. If you want to learn about Perl regular expressions, look at perldoc perlretut.

Parent

LOGFILTER

allowed values

Anything Perl parses as a regular expression.

Count

1

Example
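
A minimal sketch of two filters that use the anchors. The patterns are illustrative only and are not taken from a shipped configuration. If the configuration is read by a standard XML parser (an assumption), the characters & and < inside a regex have to be written as &amp; and &lt;.

<LOGFILTER>
  <!-- "^" anchors the match at the start of the line -->
  <REGEX>^ORA-00600</REGEX>
  <ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
<LOGFILTER>
  <!-- "$" anchors the match at the end of the line -->
  <REGEX>connection closed$</REGEX>
  <ERRORLEVEL>NORMAL</ERRORLEVEL>
</LOGFILTER>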

PRIORITY

Explanation

The logs check tests every line in your logfile against the regexes until the first one matches. But in which order are these regexes tried?

You define the order with the priority values. Regexes with low priority values are tried first.

In most cases you can use the default priority of 100 (always used if you don't write a PRIORITY tag) and just give the "all the rest" regex (usually ".*") a higher priority value, to ensure that it is tried last.

Parent

LOGFILTER

allowed values

Any number

Count

0 or 1. If you don't define a PRIORITY the default 100 is used.

Example
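
The ordering described above can be sketched with three filters. The messages and patterns are hypothetical. The priority values 50 and 1000 are chosen only to fall below and above the default of 100.

<LOGFILTER>
  <!-- tried first: a known harmless variant of the message below -->
  <REGEX>disk quota exceeded.*testuser</REGEX>
  <PRIORITY>50</PRIORITY>
  <ERRORLEVEL>NORMAL</ERRORLEVEL>
</LOGFILTER>
<LOGFILTER>
  <!-- no PRIORITY tag: default priority 100 -->
  <REGEX>disk quota exceeded</REGEX>
  <ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>
<LOGFILTER>
  <!-- tried last: everything that matched none of the above -->
  <REGEX>.*</REGEX>
  <PRIORITY>1000</PRIORITY>
  <ERRORLEVEL>ERROR</ERRORLEVEL>
</LOGFILTER>

With these filters, a line containing "disk quota exceeded for testuser" is matched by the first regex and rated NORMAL. The same message for any other user falls through to the second regex and is rated ERROR. Had the first filter kept the default priority, both would share priority 100, and the text above does not say which of the two would be tried first.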