| 1 |
<HTML> |
|---|
| 2 |
<Head> |
|---|
| 3 |
<Title>The EDDIE-Tool User's Manual</Title> |
|---|
| 4 |
</Head> |
|---|
| 5 |
|
|---|
| 6 |
<Body bgcolor=white> |
|---|
| 7 |
<Center> |
|---|
| 8 |
<H2>The EDDIE-Tool User's Manual</H2> |
|---|
| 9 |
</Center> |
|---|
| 10 |
|
|---|
| 11 |
<p> |
|---|
| 12 |
<b>Contents:</b> |
|---|
| 13 |
<ul> |
|---|
| 14 |
<li> <a href="#introduction">Introduction</a> |
|---|
| 15 |
<li> <a href="#installation">Installation</a> |
|---|
| 16 |
<ul> |
|---|
| 17 |
<li> <a href="#downloading">Downloading</a> |
|---|
| 18 |
<li> <a href="#installing">Installing</a> |
|---|
| 19 |
</ul> |
|---|
| 20 |
<li> <a href="#cmdline">Command-Line Options</a> |
|---|
| 21 |
<li> <a href="#configuration">Configuration</a> |
|---|
| 22 |
<ul> |
|---|
| 23 |
<li> <a href="#config_files">Config Files</a> |
|---|
| 24 |
<li> <a href="#globals">Global Configurables</a> |
|---|
| 25 |
<li> <a href="#config_format">Configuration Format</a> |
|---|
| 26 |
<li> <a href="#simple_config">Simple Configuration</a> |
|---|
| 27 |
<li> <a href="#directives">Directives</a> |
|---|
| 28 |
<ul> |
|---|
| 29 |
<li> <a href="#commondirectiveargs">Common Directive Arguments</a> |
|---|
| 30 |
<li> <a href="#rule_format">Rule Format</a> |
|---|
| 31 |
<li> <a href="#builtindirectives">Built-in Directives</a> |
|---|
| 32 |
<li> <a href="#directivedetails">Directive Details</a> |
|---|
| 33 |
</ul> |
|---|
| 35 |
<li> <a href="#actions">Actions</a> |
|---|
| 36 |
<li> <a href="#notif_msg_objects">Notification and Message objects</a> |
|---|
| 37 |
</ul> |
|---|
| 38 |
<li> <a href="#other_features">Other Features</a> |
|---|
| 39 |
<ul> |
|---|
| 40 |
<li> <a href="#console">Console</a> |
|---|
| 41 |
</ul> |
|---|
| 42 |
</ul> |
|---|
| 43 |
</p> |
|---|
| 44 |
|
|---|
| 45 |
<hr> |
|---|
| 46 |
|
|---|
| 47 |
<p> |
|---|
| 48 |
<H3><a name="introduction">Introduction</a></H3> |
|---|
| 49 |
</p> |
|---|
| 50 |
<p> |
|---|
| 51 |
The EDDIE-Tool (commonly just called EDDIE) |
|---|
| 52 |
is an agent for system, network and security monitoring. |
|---|
| 53 |
It is highly customizable and easily extendable. |
|---|
| 54 |
It has been designed to be as platform-independent as possible, with |
|---|
| 55 |
platform-specific code limited to a small group of modules, making it |
|---|
| 56 |
easily portable to new platforms. |
|---|
| 57 |
It is fully written in <a href="http://www.python.org/">Python</a> and |
|---|
| 58 |
the configuration has a Python "look-and-feel" to it, although no Python |
|---|
| 59 |
or coding skills are necessary to configure it. |
|---|
| 60 |
</p><p> |
|---|
| 61 |
This user's manual is specific to EDDIE-Tool versions 0.29 and above, |
|---|
| 62 |
as some significant changes were made to improve the configuration. |
|---|
| 63 |
These changes can be <a href="changes_ver_029.txt">read here</a>. |
|---|
| 64 |
The user's manual for earlier versions can be |
|---|
| 65 |
<a href="manual-pre-029.html">read here</a>. |
|---|
| 66 |
</p> |
|---|
| 67 |
|
|---|
| 68 |
<hr> |
|---|
| 69 |
|
|---|
| 70 |
<p> |
|---|
| 71 |
<H3><a name="installation">Installation</a></H3> |
|---|
| 72 |
<H4><a name="downloading">Downloading</a></H4> |
|---|
| 73 |
</p> |
|---|
| 74 |
<p> |
|---|
| 75 |
You need to download the following: |
|---|
| 76 |
<ul> |
|---|
| 77 |
<li> <b>Required:</b> |
|---|
| 78 |
<ul> |
|---|
| 79 |
<li> EDDIE - <a href="http://eddie-tool.net/download/">http://eddie-tool.net/download/</a> |
|---|
| 80 |
<li> Python 2.2.1+ - <a href="http://www.python.org/">http://www.python.org/</a> (must support threads) - Python 2.3+ is recommended, but EDDIE has been tested with Python versions 2.2.1 through 2.4.x. On Windows, Python 2.3 or newer is required. |
|---|
| 81 |
</ul> |
|---|
| 82 |
<li> <b>Optional:</b> |
|---|
| 83 |
<ul> |
|---|
| 84 |
<li> Elvin 4 - <a href="http://elvin.dstc.edu.au/">http://elvin.dstc.edu.au/</a> - Elvin is the message system supported by EDDIE and some of its related |
|---|
| 85 |
tools to pass alerts, collected data, etc between them. It is not |
|---|
| 86 |
required for basic monitoring. |
|---|
| 87 |
<li> elvinrrd - <a href="http://www.psychofx.com/elvinrrd/">http://www.psychofx.com/elvinrrd/</a> - |
|---|
| 88 |
the daemon for storing EDDIE-collected data into RRD databases. |
|---|
| 89 |
<li> estored - <i>available soon</i> - |
|---|
| 90 |
the daemon for storing EDDIE-collected data into databases. |
|---|
| 91 |
</ul> |
|---|
| 92 |
</ul> |
|---|
| 93 |
</p> |
|---|
| 94 |
|
|---|
| 95 |
<p> |
|---|
| 96 |
<H4><a name="installing">Installing</a></H4> |
|---|
| 97 |
</p> |
|---|
| 98 |
Follow the <a href="QUICKSTART.txt">QUICKSTART</a> document |
|---|
| 99 |
(also located in the eddie/doc/ directory) or continue with |
|---|
| 100 |
the steps below. |
|---|
| 101 |
<p> |
|---|
| 102 |
<ul> |
|---|
| 103 |
<li> Python - first install Python by following the instructions |
|---|
| 104 |
included in the Python distribution. EDDIE is written in Python |
|---|
| 105 |
and will not work if Python is not available. Python must have |
|---|
| 106 |
been compiled with thread support as EDDIE is fully threaded. |
|---|
| 107 |
EDDIE has been tested with Python versions 2.2.1 through to 2.4.x |
|---|
| 108 |
under Linux, Solaris, HP-UX, OS X, FreeBSD, OpenBSD and Windows. |
|---|
| 109 |
<li> EDDIE - un-tar EDDIE into your favorite directory, eg: /opt |
|---|
| 110 |
<dir> <b>$ cd /opt</b> </dir> |
|---|
| 111 |
<dir> <b>$ gtar xvzf eddie-0.xx.tgz</b> </dir> |
|---|
| 112 |
Edit the main EDDIE file, eddie.py in the bin directory |
|---|
| 113 |
<dir> <b>$ vi eddie-0.xx/bin/eddie.py</b> </dir> |
|---|
| 114 |
and change the first line to point to your Python interpreter, eg: |
|---|
| 115 |
<dir> <b>#!/opt/python/bin/python</b> </dir> |
|---|
| 116 |
Note: DO NOT remove the special '#!' characters. |
|---|
| 117 |
<li> That should be all that is needed to get EDDIE up and running. |
|---|
| 118 |
You can test that EDDIE will work by |
|---|
| 119 |
running it, eg: |
|---|
| 120 |
<dir> <b>$ /installdir/eddie-0.xx/bin/eddie.py</b> </dir> |
|---|
| 121 |
You should see the version of EDDIE and your system type, eg: |
|---|
| 122 |
<dir> <b>EDDIE v0.30<br>systype: Linux/2.2.5-15/i586</b> </dir> |
|---|
| 123 |
You can see what systems are supported by EDDIE by looking in the |
|---|
| 124 |
eddie/lib/ directory. Besides common, the other directories will |
|---|
| 125 |
be specific to different operating systems (e.g., 'Linux', 'SunOS'). |
|---|
| 126 |
If your system is not supported yet you will have to wait for a |
|---|
| 127 |
port of EDDIE to your system or contact the developers to see about |
|---|
| 128 |
porting it yourself. |
|---|
| 129 |
<li> If you see the above output with no major error messages then |
|---|
| 130 |
EDDIE is working (press [CTRL]-C to stop it). However, without any |
|---|
| 131 |
configuration it will not be doing anything. Now the hard^H^H^H^Hfun part: |
|---|
| 132 |
building the configuration. |
|---|
| 134 |
</ul> |
|---|
| 135 |
|
|---|
| 136 |
|
|---|
| 137 |
<hr> |
|---|
| 138 |
|
|---|
| 139 |
<p> |
|---|
| 140 |
<H3><a name="cmdline">Command-Line Options</a></H3> |
|---|
| 141 |
</p> |
|---|
| 142 |
|
|---|
| 143 |
<ul> |
|---|
| 144 |
<li><b>-h</b>, <b>--help</b>: display the help text, then exit |
|---|
| 145 |
<li><b>--version</b>: display the EDDIE-tool version, then exit |
|---|
| 146 |
<li><b>-c FILE</b>, <b>--config=FILE</b>: specify the path to the configuration file<br>The default is <i>(RUNDIR)/../config/eddie.cf</i> |
|---|
| 147 |
<li><b>--showconfig</b>: display the configuration, then exit |
|---|
| 148 |
<li><b>-v</b>, <b>--verbose</b>: display extra messages |
|---|
| 149 |
<li><b>-d</b>, <b>--daemon</b>: run in the background, and return the PID of the daemon process |
|---|
| 150 |
</ul> |
|---|
| 151 |
|
|---|
| 152 |
|
|---|
| 153 |
<hr> |
|---|
| 154 |
|
|---|
| 155 |
<p> |
|---|
| 156 |
<H3><a name="configuration">Configuration</a></H3> |
|---|
| 157 |
|
|---|
| 158 |
<H4><a name="config_files">Config Files</a></H4> |
|---|
| 159 |
</p> |
|---|
| 160 |
<ul> |
|---|
| 161 |
<li> The EDDIE config files should be in /installdir/eddie-0.xx/config. |
|---|
| 162 |
EDDIE begins by looking for an eddie.cf file in this directory, ie: |
|---|
| 163 |
<dir> <b>/installdir/eddie-0.xx/config/eddie.cf</b> </dir> |
|---|
| 164 |
<li> EDDIE is distributed with a config.sample directory containing |
|---|
| 165 |
example configuration files. You should look over these to get |
|---|
| 166 |
an idea of how the configuration works. |
|---|
| 167 |
<li> You must first create a config directory, if one isn't there already, eg: |
|---|
| 168 |
<dir> <b>$ mkdir /installdir/eddie-0.xx/config</b> </dir> |
|---|
| 169 |
then create an eddie.cf file in this new directory. You can copy |
|---|
| 170 |
and change the config.sample/eddie.cf file if you like. |
|---|
| 171 |
<li> It is common for the EDDIE Directives (the rules which EDDIE |
|---|
| 172 |
uses to collect data and perform actions) to be written in |
|---|
| 173 |
separate rules files. These are then INCLUDEd in the main eddie.cf |
|---|
| 174 |
config file. The rules files can be in the same directory as eddie.cf |
|---|
| 175 |
or can be in subdirectories under it. It is common for a complex |
|---|
| 176 |
setup to split the rules into multiple files grouped by commonality. |
|---|
| 177 |
</ul> |
|---|
| 178 |
|
|---|
| 179 |
<p> |
|---|
| 180 |
<H4><a name="globals">Global Configurables</a></H4> |
|---|
| 181 |
</p> |
|---|
| 182 |
<p> |
|---|
| 183 |
The global configurables are usually in eddie.cf and are listed below: |
|---|
| 184 |
<ul> |
|---|
| 185 |
<li> <b>LOGFILE</b> - the file to log to (this should be the first thing set so that |
|---|
| 186 |
any problems are logged to the right place). |
|---|
| 187 |
<li> <b>LOGLEVEL</b> - how much detail to log to the above logfile. |
|---|
| 188 |
<li> <b>ADMIN</b> - the administrator's email address. |
|---|
| 189 |
<li> <b>ADMINLEVEL</b> - how much log detail to send to the admin. |
|---|
| 190 |
<li> <b>ADMIN_NOTIFY</b> - how often to send admin log emails. |
|---|
| 191 |
<li> <b>NUMTHREADS</b> - the maximum number of threads (including supporting threads |
|---|
| 192 |
and directive threads) EDDIE will attempt to limit itself to using. This |
|---|
| 193 |
should be a minimum of 5. |
|---|
| 194 |
<li> <b><a name="SCANPERIOD">SCANPERIOD</a></b> |
|---|
| 195 |
- the default time to wait between checks; can be overridden by |
|---|
| 196 |
each directive. |
|---|
| 197 |
See <a href="#timedef">Time Definition</a> for definition of time format. |
|---|
| 198 |
<li> <b>CONSOLE_PORT</b> - the TCP port that EDDIE listens on to provide a console interface |
|---|
| 199 |
to the directive states; this defaults to 33343, and can be set to 0 to disable |
|---|
| 200 |
this feature (see <a href="#console">Console</a>). |
|---|
| 201 |
<li> <b>EMAIL_FROM</b> - the From: address to be used by the email action. Will default to the user EDDIE is run as. |
|---|
| 202 |
<li> <b>EMAIL_REPLYTO</b> - the Reply-To: address to be used by the email action. Defaults to empty string, "". |
|---|
| 203 |
<li> <b>SENDMAIL</b> - the location of the sendmail binary which EDDIE uses to send all emails. This is usually '/usr/lib/sendmail' (Solaris) or '/usr/sbin/sendmail' (Red Hat Linux). Defaults to '/usr/lib/sendmail'. |
|---|
| 204 |
<li> <b>SMTP_SERVERS</b> - a comma-separated list of SMTP servers to use to make SMTP connections for email sending. Defaults to 'localhost'. |
|---|
| 205 |
<li> <b>ELVINURL</b> - the URL of an Elvin server (if required). |
|---|
| 206 |
<li> <b>ELVINSCOPE</b> - the scope of an Elvin server (if required). |
|---|
| 207 |
<li> <b><a name="INTERPRETERS">INTERPRETERS</a></b> |
|---|
| 208 |
- commands classed as "interpreters" by process checking rules |
|---|
| 209 |
(see the <a href="#PROC">PROC</a> directive). |
|---|
| 210 |
<li> <b><a name="WORKDIR">WORKDIR</a></b> |
|---|
| 211 |
- directory where Eddie can store temporary files. [Eddie 0.35+] |
|---|
| 212 |
<li> <b><a name="RESCANCONFIGS">RESCANCONFIGS</a></b> |
|---|
| 213 |
- flag indicating whether EDDIE should continually rescan the config files and reload them when they change. Defaults to true. |
|---|
| 214 |
<li> <b>CLASS</b> - a grouping of hosts which share the same directives. |
|---|
| 215 |
<li> <b>ALIAS</b> - global aliases can be set here, if required. |
|---|
| 216 |
<li> <b>INCLUDE</b> - include another configuration file; it is common to split the rules |
|---|
| 217 |
into different files in a large installation. |
|---|
| 218 |
</ul> |
|---|
| 219 |
eddie.cf is well documented, so read through the file and modify the |
|---|
| 220 |
settings to suit your environment. |
|---|
| 221 |
</p> |
|---|
| 222 |
|
|---|
| 223 |
<p> |
|---|
| 224 |
<H4><a name="config_format">Configuration Format</a></H4> |
|---|
| 225 |
</p> |
|---|
| 226 |
<p> |
|---|
| 227 |
The EDDIE configuration follows the standard Python code format. |
|---|
| 228 |
Just as methods or child objects of a Python object are indicated by |
|---|
| 229 |
indenting them beneath the parent object definition, sub-objects |
|---|
| 230 |
or parameters of a directive object are similarly indicated by indenting |
|---|
| 231 |
them beneath the parent object definition. For example, a part of the |
|---|
| 232 |
configuration may look like: |
|---|
| 233 |
<pre> |
|---|
| 234 |
group testing: |
|---|
| 235 |
    PING testping: |
|---|
| 236 |
        host="10.0.0.1" |
|---|
| 237 |
        numpings=10 |
|---|
| 238 |
        rule="not alive" |
|---|
| 239 |
        action=email("chris", "%(host)s failed ping") |
|---|
| 240 |
|
|---|
| 241 |
FILE file1: |
|---|
| 242 |
    file='/tmp/file1.tmp' |
|---|
| 243 |
    scanperiod='2m' |
|---|
| 244 |
    rule='not exists' |
|---|
| 245 |
    action=ticker("%(file)s does not exist", timeout=1) |
|---|
| 246 |
    act2ok=ticker("%(file)s now exists", timeout=1) |
|---|
| 247 |
</pre> |
|---|
| 248 |
|
|---|
| 249 |
A config group called "testing" is defined, then the PING |
|---|
| 250 |
directive "testping" is configured inside this group because |
|---|
| 251 |
it is indented. Similarly, all testping's arguments are |
|---|
| 252 |
indented as they belong to the PING directive configuration. |
|---|
| 253 |
The second directive, FILE called "file1", is at the same |
|---|
| 254 |
indentation level as the group definition (i.e., not indented) |
|---|
| 255 |
and is therefore a global directive. Thus, all hosts using |
|---|
| 256 |
this example config would execute the FILE directive, but only |
|---|
| 257 |
those hosts in the "testing" group would execute the PING |
|---|
| 258 |
directive. |
|---|
| 259 |
<br><br> |
|---|
| 260 |
|
|---|
| 261 |
If you are used to Python coding this will be second nature to you. |
|---|
| 262 |
If you are not, it will not be hard to pick up. |
|---|
| 263 |
<br><br> |
|---|
| 264 |
|
|---|
| 265 |
The above example also introduces the format of directive |
|---|
| 266 |
definitions. Directives are the rules which do "something". |
|---|
| 267 |
More often than not, they will perform system or network checks |
|---|
| 268 |
of some sort. But they are very flexible and could be configured |
|---|
| 269 |
to do more than simple checks. |
|---|
| 270 |
<br><br> |
|---|
| 271 |
|
|---|
| 272 |
In any case, the format of directive definitions is: |
|---|
| 273 |
<pre> |
|---|
| 274 |
DIRECTIVE name: |
|---|
| 275 |
    argument1=value1 |
|---|
| 276 |
    [argument2=value2 |
|---|
| 277 |
    ...] |
|---|
| 278 |
</pre> |
|---|
| 279 |
|
|---|
| 280 |
where "DIRECTIVE" is the directive name, like PROC or FS, and |
|---|
| 281 |
"name" is the user-defined, unique name of this directive object. The |
|---|
| 282 |
arguments customize the directive appropriately. Some arguments |
|---|
| 283 |
are directive-specific while others are common to all directives. |
|---|
| 284 |
<br><br> |
|---|
| 285 |
|
|---|
| 286 |
Example: |
|---|
| 287 |
<pre> |
|---|
| 288 |
PROC test: |
|---|
| 289 |
    name='syslogd' |
|---|
| 290 |
    rule='not exists' |
|---|
| 291 |
    scanperiod='30s' |
|---|
| 292 |
    action=email("alert@my.domain","syslogd is not running") |
|---|
| 293 |
</pre> |
|---|
| 294 |
This is an example definition of a PROC directive, called 'test'. |
|---|
| 295 |
It contains the PROC-specific argument, 'name'. 'rule', |
|---|
| 296 |
'scanperiod' and 'action' are arguments which are common to all |
|---|
| 297 |
directives. Some arguments are optional while others are required, |
|---|
| 298 |
and errors will be raised if they are missing. In this example |
|---|
| 299 |
'name' and 'rule' are required. 'scanperiod' and 'action' are optional. |
|---|
| 300 |
</p> |
|---|
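<p>
To illustrate how indentation decides which group a directive belongs to, here is a minimal Python sketch. It is not EDDIE's own parser: the function name <i>directive_scopes</i> and the simplified line handling are assumptions made for this example only. It builds directive IDs in the "DIRECTIVE.name" form and reports the enclosing group, or None for a global directive.
</p>

```python
def directive_scopes(config_text):
    """Return (directive_id, group) pairs; group is None for global directives."""
    scopes = []
    stack = []  # (indent, group_name) for every group we are currently inside
    for raw in config_text.splitlines():
        line = raw.rstrip()
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if not line.endswith(":"):
            continue  # argument line such as host="10.0.0.1"
        indent = len(line) - len(line.lstrip())
        # A header at the same or a shallower indent closes the open groups.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        kind, name = line.strip()[:-1].split(None, 1)
        if kind == "group":
            stack.append((indent, name))
        else:
            group = stack[-1][1] if stack else None
            scopes.append((kind + "." + name, group))
    return scopes


EXAMPLE = '''\
group testing:
    PING testping:
        host="10.0.0.1"
        rule="not alive"

FILE file1:
    file='/tmp/file1.tmp'
    rule='not exists'
'''

print(directive_scopes(EXAMPLE))
# [('PING.testping', 'testing'), ('FILE.file1', None)]
```

<p>
The PING directive is indented beneath the group header and is therefore scoped to "testing"; the FILE directive starts at the left margin and is global.
</p>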
| 301 |
|
|---|
| 302 |
<p> |
|---|
| 303 |
<H4><a name="simple_config">Simple Configuration</a></H4> |
|---|
| 304 |
</p> |
|---|
| 305 |
<p> |
|---|
| 306 |
An EDDIE configuration can be simple to get basic monitoring started |
|---|
| 307 |
quickly and made as complicated as required to perform advanced operations. |
|---|
| 308 |
A simple example rules file is shown below to monitor basic services on a host. |
|---|
| 309 |
This rules file, named simple.rules, would be placed in the same directory |
|---|
| 310 |
as eddie.cf and eddie.cf would contain the entry |
|---|
| 311 |
<dir> INCLUDE 'simple.rules' </dir> |
|---|
| 312 |
The file simple.rules contains |
|---|
| 313 |
<pre> |
|---|
| 314 |
# Process checks |
|---|
| 315 |
PROC syslogd: |
|---|
| 316 |
    name='syslogd' |
|---|
| 317 |
    rule='not exists' |
|---|
| 318 |
    action=email('root', '%(name)s is not running on %(h)s') |
|---|
| 319 |
PROC inetd: |
|---|
| 320 |
    name='inetd' |
|---|
| 321 |
    rule='not exists' |
|---|
| 322 |
    action=email('root', '%(name)s is not running on %(h)s') |
|---|
| 323 |
PROC sshd: |
|---|
| 324 |
    name='sshd' |
|---|
| 325 |
    rule='not exists' |
|---|
| 326 |
    action=email('root', '%(name)s is not running on %(h)s') |
|---|
| 327 |
|
|---|
| 328 |
# Filesystem checks |
|---|
| 329 |
FS root: |
|---|
| 330 |
    fs='/' |
|---|
| 331 |
    rule='pctused > 90' |
|---|
| 332 |
    action=email('root', '%(mountpt)s over 90%% on %(h)s') |
|---|
| 333 |
FS varlog: |
|---|
| 334 |
    fs='/var/log' |
|---|
| 335 |
    rule='pctused > 90' |
|---|
| 336 |
    action=email('root', '%(mountpt)s over 90%% on %(h)s') |
|---|
| 337 |
|
|---|
| 338 |
# Service Port checks |
|---|
| 339 |
SP smtp_port: |
|---|
| 340 |
    port='smtp' |
|---|
| 341 |
    protocol='tcp' |
|---|
| 342 |
    bindaddr='0.0.0.0' |
|---|
| 343 |
    rule='not exists' |
|---|
| 344 |
    action=email('root', '%(protocol)s/%(port)s on %(h)s is not listening') |
|---|
| 345 |
SP http_port: |
|---|
| 346 |
    port='http' |
|---|
| 347 |
    protocol='tcp' |
|---|
| 348 |
    bindaddr='0.0.0.0' |
|---|
| 349 |
    rule='not exists' |
|---|
| 350 |
    action=email('root', '%(protocol)s/%(port)s on %(h)s is not listening') |
|---|
| 351 |
|
|---|
| 352 |
# System statistics checks |
|---|
| 353 |
SYS loadaverage: |
|---|
| 354 |
    rule="loadavg1 > 3.00" |
|---|
| 355 |
    scanperiod='1m' |
|---|
| 356 |
    action=email('root', '%(h)s load-average > 3.00')</pre> |
|---|
| 357 |
</p> |
|---|
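<p>
The action strings in simple.rules use placeholders such as %(name)s and %(h)s, which have the same form as Python's dictionary-based string formatting. The short sketch below assumes the placeholders are filled in that way; the variable values (and the reading of <i>h</i> as the hostname) are invented for illustration. It also shows why a literal percent sign is written as %% in these strings.
</p>

```python
# Hypothetical values a filesystem check might make available to its actions.
variables = {"mountpt": "/var/log", "pctused": 93.5, "h": "host1"}

template = "%(mountpt)s over 90%% on %(h)s"

# Each %(key)s is replaced by variables[key]; %% becomes a single % sign.
message = template % variables
print(message)  # /var/log over 90% on host1
```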
| 358 |
|
|---|
| 359 |
|
|---|
| 360 |
<p> |
|---|
| 361 |
<H4><a name="directives">Directives</a></H4> |
|---|
| 362 |
</p> |
|---|
| 363 |
The directives are the configuration commands which tell EDDIE |
|---|
| 364 |
what to do. |
|---|
| 365 |
They are of the form: |
|---|
| 366 |
<pre> |
|---|
| 367 |
DIRECTIVE name: |
|---|
| 368 |
    arg1=value1 |
|---|
| 369 |
    arg2=value2 |
|---|
| 370 |
    argn=valuen </pre> |
|---|
| 371 |
Where "DIRECTIVE" is the name of the directive itself (see |
|---|
| 372 |
<a href="#builtindirectives">Built-in Directives</a>); |
|---|
| 373 |
"name" is a user-defined name of the directive definition (the directive ID is usually |
|---|
| 374 |
constructed as "DIRECTIVE.name", e.g., "FS.root", and will appear in the logs, console, |
|---|
| 375 |
etc); |
|---|
| 376 |
"args" are arguments to define what the directive should do and how it should do it. |
|---|
| 377 |
Some arguments are common to all directives and others are |
|---|
| 378 |
specific to that type of directive. |
|---|
| 379 |
|
|---|
| 380 |
<p> |
|---|
| 381 |
<H5><a name="commondirectiveargs">Common Directive Arguments</a></H5> |
|---|
| 382 |
</p> |
|---|
| 383 |
<ul> |
|---|
| 384 |
<li> <b><a name="rule">rule=<rule string></a></b> |
|---|
| 385 |
- a string defining a rule that will be evaluated |
|---|
| 386 |
(in a Python environment) using variables specific to the current directive. |
|---|
| 387 |
The rule should evaluate to 1 (true) or 0 (false). If the rule is true, the |
|---|
| 388 |
directive state will be set to "fail" and the <a href="#action">action</a> |
|---|
| 389 |
(if any) will be executed. |
|---|
| 390 |
If false, the state will be "ok" and the <a href="#actelse">actelse</a> |
|---|
| 391 |
(if any) will |
|---|
| 392 |
be executed. The rule argument may be optional to some directives if they |
|---|
| 393 |
provide a default rule. |
|---|
| 394 |
<li> <b><a name="scanperiod">scanperiod=<time></a></b> |
|---|
| 395 |
- this changes how often the directive will be run |
|---|
| 396 |
(the default is the time period specified by <a href="#SCANPERIOD">SCANPERIOD</a> |
|---|
| 397 |
in eddie.cf). See |
|---|
| 398 |
<a href="#timedef">Time Definition</a> for formats of <time>. |
|---|
| 399 |
<li> <b><a name="numchecks">numchecks=<integer></a></b> |
|---|
| 400 |
- the number of checks to perform before the |
|---|
| 401 |
state will be set to fail and actions called. The time between these "re-checks" |
|---|
| 402 |
is set by the <a href="#checkwait">checkwait</a> argument. |
|---|
| 403 |
<li> <b><a name="checkwait">checkwait=<time></a></b> |
|---|
| 404 |
- the time between "re-checks" if the <a href="#numchecks">numchecks</a> argument |
|---|
| 405 |
is greater than 1. See <a href="#timedef">Time Definition</a> for formats of <time>. |
|---|
| 406 |
<li> <b><a name="action">action=<actionlist></a></b> |
|---|
| 407 |
- this is a list of actions (see <a href="#actions">Actions</a>) |
|---|
| 408 |
to be performed if the directive enters the failed state, i.e., if the directive |
|---|
| 409 |
is performing a check and the check fails, these actions will be called. |
|---|
| 410 |
<li> <b><a name="act2ok">act2ok=<actionlist></a></b> |
|---|
| 411 |
- this is a list of actions (see <a href="#actions">Actions</a>) |
|---|
| 412 |
to be performed when the directive state changes from failed to ok. |
|---|
| 413 |
<li> <b><a name="actelse">actelse=<actionlist></a></b> |
|---|
| 414 |
- this is a list of actions (see <a href="#actions">Actions</a>) |
|---|
| 415 |
to be performed when the directive state was ok and is still ok after running |
|---|
| 416 |
any checks. |
|---|
| 417 |
<li> <b><a name="console_arg">console=<display_string></a></b> - define what is output to |
|---|
| 418 |
<a href="#console">EDDIE Console</a> connections for the current directive. |
|---|
| 419 |
Set to None to hide this directive from the Console. |
|---|
| 420 |
<li> <b><a name="checkdependson">checkdependson=<directive(s)></a></b> - don't |
|---|
| 421 |
perform the check if any of these directives are in a failed state. [Eddie 0.31+] |
|---|
| 422 |
<li> <b><a name="actiondependson">actiondependson=<directive(s)></a></b> - skip |
|---|
| 423 |
action execution if any of these directives are in a failed state. [Eddie 0.31+] |
|---|
| 424 |
<li> <b><a name="actionperiod">actionperiod=<time expression></a></b> - an expression |
|---|
| 425 |
which defines how long to wait before calling the next consecutive action. The |
|---|
| 426 |
variable 't' is the current time between action calls, and defaults to the |
|---|
| 427 |
scanperiod after the first action call. The scan period is also available in the |
|---|
| 428 |
expression with the <i>scanperiod</i> variable. Example: actionperiod='t * 2' |
|---|
| 429 |
- double the time between consecutive action calls. [Eddie 0.31+] |
|---|
| 430 |
|
|---|
| 431 |
<li> <b><a name="history">history=<integer></a></b> - request directive to |
|---|
| 432 |
remember this many previous data samples. Access them in rules using the terminology |
|---|
| 433 |
'history[n].dataname', where n is how many samples ago, and dataname is the name of |
|---|
| 434 |
the data to retrieve. Example: alert if a filesystem grows by over 5% between checks: |
|---|
| 435 |
<pre> |
|---|
| 436 |
FS export00_grow: |
|---|
| 437 |
    fs='/export/00' |
|---|
| 438 |
    scanperiod='1m' |
|---|
| 439 |
    history = 1 |
|---|
| 440 |
    rule='(pctused - history[1].pctused) > 5' |
|---|
| 441 |
    action=email('root','%(mountpt)s grew to %(pctused)s')</pre> |
|---|
| 442 |
Example 2: alert if average filesystem growth over last 3 sample periods is too high: |
|---|
| 443 |
<pre> |
|---|
| 444 |
FS export01_avg_grow: |
|---|
| 445 |
    fs='/export/01' |
|---|
| 446 |
    scanperiod='1m' |
|---|
| 447 |
    history = 3 |
|---|
| 448 |
    rule='(pctused - history[3].pctused) / 3.0 > 10' |
|---|
| 449 |
    action=email('root','%(mountpt)s grew to %(pctused)s')</pre> |
|---|
| 450 |
Note that at startup, rules will not run until enough history data is available. |
|---|
| 451 |
So, in the previous example, the directive would wait for three scanperiods before |
|---|
| 452 |
it had enough history data (three sample periods) to be able to evaluate the rule. |
|---|
| 453 |
[Eddie 0.31+] |
|---|
| 454 |
|
|---|
| 455 |
<li> <b><a name="excludehosts">excludehosts=<hostlist></a></b> - |
|---|
| 456 |
do not execute this directive on any of the specified hosts. |
|---|
| 457 |
Hosts are specified as a string of comma-separated hostnames. |
|---|
| 458 |
[Eddie 0.32+] |
|---|
| 459 |
|
|---|
| 460 |
<li> <b><a name="actionmaxcalls">actionmaxcalls=<integer></a></b> - |
|---|
| 461 |
define the maximum number of times actions will be called for a particular |
|---|
| 462 |
failure. |
|---|
| 463 |
[Eddie 0.32+] |
|---|
| 464 |
|
|---|
| 465 |
<li> <b><a name="disabled">disabled=<integer></a></b> - |
|---|
| 466 |
set to 1 to force a directive to be disabled. If disabled, a directive |
|---|
| 467 |
still exists in the configuration, but no checks will be executed for it. |
|---|
| 468 |
[Eddie 0.33+] |
|---|
| 469 |
|
|---|
| 470 |
<li> <b><a name="checktime">checktime=<expression></a></b> - |
|---|
| 471 |
if specified, the directive will only be executed if the expression |
|---|
| 472 |
evaluates to true. The expression can make use of the current time and |
|---|
| 473 |
day with the variables: day ('mon', 'tue', etc); time (HHMM); hour (0-23); |
|---|
| 474 |
minute (0-59); second (0-59). |
|---|
| 475 |
As shorthand, two fixed lists are also available: weekdays ('mon' - 'fri'), |
|---|
| 476 |
weekend ('sat', 'sun'). |
|---|
| 477 |
<br> Some Examples: |
|---|
| 478 |
<pre> |
|---|
| 479 |
checktime='day=="mon" or day=="tue"' |
|---|
| 480 |
checktime='day in weekdays and hour > 18' |
|---|
| 481 |
</pre> |
|---|
| 482 |
[Eddie 0.33+] |
|---|
| 483 |
|
|---|
| 484 |
</ul> |
|---|
| 485 |
</p> |
|---|
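<p>
The <i>actionperiod</i> argument can be pictured with a small Python sketch. This is one plausible reading of the description above, not EDDIE's implementation: it assumes 't' starts at the scanperiod and that the expression is re-evaluated before every following action call. The function name <i>action_intervals</i> is made up for this example.
</p>

```python
def action_intervals(expression, scanperiod, calls):
    """Return the waits (in seconds) before each of the next `calls` actions."""
    waits = []
    t = scanperiod  # assumption: t equals the scanperiod after the first action
    for _ in range(calls):
        # The expression may refer to both 't' and 'scanperiod'.
        t = eval(expression, {"__builtins__": {}}, {"t": t, "scanperiod": scanperiod})
        waits.append(t)
    return waits


print(action_intervals("t * 2", 60, 4))           # [120, 240, 480, 960]
print(action_intervals("t + scanperiod", 60, 3))  # [120, 180, 240]
```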
| 486 |
|
|---|
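<p>
The <i>history</i> argument can be sketched in Python as follows. This is an illustrative model only, with hypothetical names (<i>History</i>, <i>check</i>): it keeps a fixed number of past samples, exposes them as history[n].dataname, and skips rule evaluation until enough samples have been collected, as the note above describes.
</p>

```python
from collections import deque
from types import SimpleNamespace


class History:
    """Keep the last `size` samples; history[1] is the previous sample."""

    def __init__(self, size):
        self._samples = deque(maxlen=size)

    def push(self, **data):
        # The newest sample goes to the front; the oldest falls off the end.
        self._samples.appendleft(SimpleNamespace(**data))

    def full(self):
        return len(self._samples) == self._samples.maxlen

    def __getitem__(self, n):
        return self._samples[n - 1]


def check(rule, history, current):
    """Evaluate the rule once enough history exists, then record the sample."""
    result = None  # None means "not evaluated yet"
    if history.full():
        names = dict(current, history=history)
        result = bool(eval(rule, {"__builtins__": {}}, names))
    history.push(**current)
    return result


history = History(1)
rule = "(pctused - history[1].pctused) > 5"
results = [check(rule, history, {"pctused": p}) for p in (40.0, 42.0, 50.0)]
print(results)  # [None, False, True]
```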
| 487 |
|
|---|
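<p>
A <i>checktime</i> expression can be modelled in Python as shown below. The sketch is hedged: the helper name <i>checktime_allows</i> is hypothetical, and it assumes the 'time' variable is the integer HHMM (for example 1930), which the description above does not state explicitly.
</p>

```python
from datetime import datetime

WEEKDAYS = ["mon", "tue", "wed", "thu", "fri"]
WEEKEND = ["sat", "sun"]
DAY_NAMES = WEEKDAYS + WEEKEND  # datetime.weekday() returns 0 for Monday


def checktime_allows(expression, now):
    """Return True if the directive should run at the moment `now`."""
    variables = {
        "day": DAY_NAMES[now.weekday()],
        "time": now.hour * 100 + now.minute,  # assumption: HHMM as an integer
        "hour": now.hour,
        "minute": now.minute,
        "second": now.second,
        "weekdays": WEEKDAYS,
        "weekend": WEEKEND,
    }
    return bool(eval(expression, {"__builtins__": {}}, variables))


monday_evening = datetime(2024, 1, 1, 19, 30)  # 1 January 2024 was a Monday
saturday_noon = datetime(2024, 1, 6, 12, 0)

print(checktime_allows('day in weekdays and hour > 18', monday_evening))  # True
print(checktime_allows('day=="mon" or day=="tue"', saturday_noon))        # False
```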
| 488 |
<p> |
|---|
| 489 |
<H5><a name="rule_format">Rule Format</a></H5> |
|---|
| 490 |
</p> |
|---|
| 491 |
|
|---|
| 492 |
The format of rules is very simple. |
|---|
| 493 |
For those familiar with Python, rules are simply Python expressions |
|---|
| 494 |
which are evaluated on-the-fly with the variables set by the directive |
|---|
| 495 |
at the time. |
|---|
| 496 |
For those unfamiliar with Python, the expressions are almost English-like |
|---|
| 497 |
using operators such as: not, and, or; and mathematical operators such as: |
|---|
| 498 |
==, !=, >, <, >=, <=. |
|---|
| 499 |
Use these operators to evaluate the variables you are interested in. |
|---|
| 500 |
The whole expression should evaluate to 1 (i.e., true) if the actions |
|---|
| 501 |
set by the |
|---|
| 502 |
<a href="#action">action</a> |
|---|
| 503 |
argument should be executed. If it evaluates to 0 |
|---|
| 504 |
(i.e., false) only the actions set by the |
|---|
| 505 |
<a href="#actelse">actelse</a> argument are executed. |
|---|
| 506 |
|
|---|
| 507 |
<p> |
|---|
| 508 |
As rule expressions are evaluated in a Python environment, links to |
|---|
| 509 |
related Python documentation are provided below. |
|---|
| 510 |
<ul><li> <a href="http://www.python.org/doc/current/lib/boolean.html">Boolean Operations</a> |
|---|
| 511 |
<li> <a href="http://www.python.org/doc/current/lib/comparisons.html">Comparisons</a> |
|---|
| 512 |
<li> <a href="http://www.python.org/doc/current/lib/string-methods.html">String Methods</a></ul> |
|---|
| 513 |
</p> |
|---|
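<p>
Since rules are Python expressions evaluated against the directive's variables, the idea can be sketched in a few lines of Python. This is a simplified illustration rather than EDDIE's actual code; the function name <i>evaluate_rule</i> and the sample variable values are assumptions.
</p>

```python
def evaluate_rule(rule, variables):
    """Return 'fail' if the rule expression is true, otherwise 'ok'."""
    result = eval(rule, {"__builtins__": {}}, variables)
    return "fail" if result else "ok"


# A true rule puts the directive into the "fail" state (action is run).
print(evaluate_rule("pctused > 90", {"pctused": 95.2}))  # fail

# A false rule leaves the directive "ok" (actelse is run, if set).
print(evaluate_rule("not exists", {"exists": 1}))        # ok

# Operators can be combined with and/or/not.
print(evaluate_rule("loadavg1 > 3.00 and loadavg1 < 10", {"loadavg1": 1.5}))  # ok
```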
| 514 |
|
|---|
| 515 |
|
|---|
| 516 |
<p> |
|---|
| 517 |
<H5><a name="builtindirectives">Built-in Directives</a></H5> |
|---|
| 518 |
</p> |
|---|
| 519 |
The built-in directives are grouped roughly into categories and are as follows: |
|---|
| 520 |
<p> |
|---|
| 521 |
<i>System Monitoring:</i> |
|---|
| 522 |
<ul> |
|---|
| 523 |
<li> <b><a href="#COM">COM</a></b> - define custom rules. |
|---|
| 524 |
<li> <b><a href="#FS">FS</a></b> - perform filesystem checks. |
|---|
| 525 |
<li> <b><a href="#IF">IF</a></b> - perform checks on local network interfaces. |
|---|
| 526 |
<li> <b><a href="#METASTAT">METASTAT</a></b> - some tests for Solaris Disksuite. |
|---|
| 527 |
<li> <b><a href="#NET">NET</a></b> - define a network statistics rule. |
|---|
| 528 |
<li> <b><a href="#PID">PID</a></b> - perform checks on PID-files. |
|---|
| 529 |
<li> <b><a href="#PROC">PROC</a></b> - perform checks on active processes. |
|---|
| 530 |
<li> <b><a href="#SP">SP</a></b> - define a local service port checking rule. |
|---|
| 531 |
<li> <b><a href="#DBI">DBI</a></b> - execute a database query. |
|---|
| 532 |
<li> <b><a href="#STORE">STORE</a></b> - define a data storage rule. |
|---|
| 533 |
<li> <b><a href="#SYS">SYS</a></b> - perform system statistics checks. |
|---|
| 534 |
<li> <b><a href="#DISK">DISK</a></b> - perform disk I/O statistics checks. |
|---|
| 535 |
<li> <b><a href="#TAPE">TAPE</a></b> - perform tape I/O statistics checks. |
|---|
| 536 |
</ul> |
|---|
| 537 |
|
|---|
| 538 |
<i>Network Monitoring:</i> |
|---|
| 539 |
<ul> |
|---|
| 540 |
<li> <b><a href="#HTTP">HTTP</a></b> - perform remote HTTP/HTTPS tests. |
|---|
| 541 |
<li> <b><a href="#PING">PING</a></b> - network host ping testing. |
|---|
| 542 |
<li> <b><a href="#POP3TIMING">POP3TIMING</a></b> - time and check POP3 connections. |
|---|
| 543 |
<li> <b><a href="#PORT">PORT</a></b> - remote TCP port checking rule. |
|---|
| 544 |
<li> <b><a href="#RADIUS">RADIUS</a></b> - perform radius authentication tests. |
|---|
| 545 |
<li> <b><a href="#SMTP">SMTP</a></b> - test and measure SMTP connections to servers. |
|---|
| 546 |
<li> <b><a href="#SNMP">SNMP</a></b> - retrieve and test SNMP data from remote hosts and devices. |
|---|
| 547 |
</ul> |
|---|
| 548 |
|
|---|
| 549 |
<i>Security Monitoring:</i> |
|---|
| 550 |
<ul> |
|---|
| 551 |
<li> <b><a href="#FILE">FILE</a></b> - for testing file stats or changes. |
|---|
| 552 |
<li> <b><a href="#LOGSCAN">LOGSCAN</a></b> - define a log scanning rule. |
|---|
| 553 |
</ul> |
|---|
| 554 |
|
|---|
| 555 |
Note that there may be many more directives depending on the version |
|---|
| 556 |
of EDDIE or any new or optional directives which may have been added |
|---|
| 557 |
to the distribution. See the EDDIE-Tool developer's guide for more |
|---|
| 558 |
information on creating new directives. |
|---|
| 559 |
</p> |
|---|
| 560 |
|
|---|
| 561 |
|
|---|
| 562 |
<p> |
|---|
| 563 |
<H5><a name="directivedetails">Directive Details</a></H5> |
|---|
| 564 |
</p> |
|---|
| 565 |
|
|---|
| 566 |
|
|---|
| 567 |
<p> |
|---|
| 568 |
<b><a name="COM">COM</a></b><br> |
|---|
| 569 |
|
|---|
| 570 |
COM is a generic directive used to perform custom |
|---|
| 571 |
checks that other directives are not available for. It simply executes the |
|---|
| 572 |
given command in a sub-shell, and captures the stdout/stderr and |
|---|
| 573 |
return value for testing by the directive rule. |
|---|
| 574 |
<br><br> |
|---|
| 575 |
Security note: if EDDIE is run as root, the config files should not |
|---|
| 576 |
be world-writable, because directives such as COM can execute arbitrary |
|---|
| 577 |
commands on the system. |
|---|
| 578 |
|
|---|
| 579 |
<br><br> |
|---|
| 580 |
COM-specific Arguments: |
|---|
| 581 |
<ul> |
|---|
| 582 |
<li> <b>cmd=&lt;command&gt;</b> <i>(required)</i>: |
|---|
| 583 |
<br> |
|---|
| 584 |
- specifies the command to be executed in a sub-shell. |
|---|
| 585 |
<br><br> |
|---|
| 586 |
<i>Example:</i> |
|---|
| 587 |
<pre> |
|---|
| 588 |
cmd="/bin/ls /tmp/*.tmp | wc -l"</pre> |
|---|
| 589 |
</ul><br> |
|---|
| 590 |
|
|---|
| 591 |
Rule Variables: |
|---|
| 592 |
<ul> |
|---|
| 593 |
<li> <b>out</b> (string) |
|---|
| 594 |
<br> |
|---|
| 595 |
- the out variable contains the standard output (stdout) of the executed |
|---|
| 596 |
command string. |
|---|
| 597 |
<li> <b>outfields<i>n</i></b> (int) |
|---|
| 598 |
<br> |
|---|
| 599 |
- the number of whitespace-separated fields found in the standard output. |
|---|
| 600 |
<li> <b>outfield<i>n</i></b> (auto-typed) |
|---|
| 601 |
<br> |
|---|
| 602 |
- the standard output is also split (by whitespace) and stored in variables |
|---|
| 603 |
outfield1, outfield2, etc, to simplify rule creation. |
|---|
| 604 |
<li> <b>err</b> (string) |
|---|
| 605 |
<br> |
|---|
| 606 |
- the err variable contains the standard error (stderr) of the executed |
|---|
| 607 |
command string. |
|---|
| 608 |
<li> <b>ret</b> (int) |
|---|
| 609 |
<br> |
|---|
| 610 |
- the ret variable contains the return code of the executed command string. |
|---|
| 611 |
<br><br> |
|---|
| 612 |
<li> <i>Examples:</i> |
|---|
| 613 |
<pre> |
|---|
| 614 |
rule='out == "test"' # true if stdout is just "test" |
|---|
| 615 |
rule='out.find("test") != -1' # true if stdout contains "test" |
|---|
| 616 |
rule='int(out) > 5' # true if out (converted to an integer) is > 5 |
|---|
| 617 |
rule='int(ret) != 0' # true if return value of the cmd is not 0 |
|---|
| 618 |
rule='int(outfield1) != 0' # true if stdout field 1 is not 0 |
|---|
| 619 |
</pre> |
|---|
| 620 |
</ul> |
|---|
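The rule variables above can be pictured with a small Python sketch. This is not EDDIE's own code: the function names build_variables, run_check and rule_matches are hypothetical, the trimming of surrounding whitespace is an assumption, and the fields are kept as strings here rather than auto-typed. It shows a command being run in a sub-shell, its stdout being split by whitespace into outfield1, outfield2, and so on, and a rule being evaluated as a Python expression.

```python
import subprocess


def build_variables(stdout, stderr, returncode):
    """Build COM-style rule variables from a finished command (illustrative only)."""
    variables = {
        "out": stdout.strip(),   # assumption: surrounding whitespace is trimmed
        "err": stderr.strip(),
        "ret": returncode,
    }
    fields = stdout.split()      # split on any run of whitespace
    for n, field in enumerate(fields, start=1):
        variables["outfield%d" % n] = field   # kept as strings in this sketch
    return variables


def run_check(cmd):
    """Run cmd in a sub-shell and return its rule variables."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return build_variables(proc.stdout, proc.stderr, proc.returncode)


def rule_matches(rule, variables):
    """Evaluate a rule expression against the variables (trusted config text only)."""
    allowed = {"__builtins__": {}, "int": int, "float": float}
    return bool(eval(rule, allowed, variables))
```

One thing the sketch makes visible: Python's str.find returns -1 when the substring is absent and 0 when it sits at the very start of the string, so a bare out.find("test") is false only for a match at position 0. Comparing the result against -1 gives the usual "contains" test.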
| 621 |
|
|---|
| 622 |
Action Variables: |
|---|
| 623 |
<ul> |
|---|
| 624 |
<li> <b>cmd</b> (string) |
|---|
| 625 |
<br> |
|---|
| 626 |
- the command string as specified by the cmd argument. |
|---|
| 627 |
<li> Plus all the rule variables as usual. |
|---|
| 628 |
<li> <i>Examples:</i> |
|---|
| 629 |
<pre> |
|---|
| 630 |
action=email("alert", "the command '%(cmd)s' failed.")</pre> |
|---|
| 631 |
</ul> |
|---|
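The %(cmd)s placeholders in action strings have the same form as Python's mapping-key string formatting. The sketch below assumes that action messages are expanded this way; the helper name expand_message and the sample values (including the host name) are hypothetical and only illustrate how the action and rule variables could be substituted into a message.

```python
def expand_message(template, variables):
    """Expand %(name)s placeholders using Python mapping-key formatting."""
    return template % variables


# Hypothetical variable values for a COM directive whose command failed.
variables = {
    "cmd": "/bin/ls /tmp/*.tmp | wc -l",
    "out": "12",
    "ret": 1,
    "h": "host1",
}

msg = expand_message("the command '%(cmd)s' failed.", variables)
note = expand_message("There are %(out)s files on %(h)s (ret=%(ret)d)", variables)
```

With formatting of this kind, a placeholder that names a variable missing from the mapping raises a KeyError, so it is worth checking placeholder spelling against the variable lists above.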
| 632 |
|
|---|
| 633 |
<i>Directive Examples:</i> |
|---|
| 634 |
<pre> |
|---|
| 635 |
# Check load average (the hard way, without using SYS) |
|---|
| 636 |
COM loadavg: |
|---|
| 637 |
cmd="uptime | cut -d, -f4 | awk '{print $3}'" |
|---|
| 638 |
rule="float(out) > 6.0" |
|---|
| 639 |
action=email("alert", "Load on %(h)s is > 6.0") |
|---|
| 640 |
|
|---|
| 641 |
# Check number of netscapes running |
|---|
| 642 |
COM count_ns: |
|---|
| 643 |
cmd="ps -ef | grep netscape | wc -l" |
|---|
| 644 |
rule="int(out) > 3" |
|---|
| 645 |
action=email("alert", "There are %(out)s netscapes running on %(h)s") |
|---|
| 646 |
|
|---|
| 647 |
# A variation on checking load average, using 'outfield' variables |
|---|
| 648 |
COM loadavg: |
|---|
| 649 |
cmd="uptime | cut -d, -f4" |
|---|
| 650 |
rule="float(outfield3) > 6.0" |
|---|
| 651 |
action=ticker("Load on %(h)s is %(outfield3)s", timeout=1) |
|---|
| 652 |
</pre> |
|---|
| 653 |
</p> |
|---|
| 654 |
|
|---|
| 655 |
|
|---|
| 656 |
|
|---|
| 657 |
<p> |
|---|
| 658 |
<b><a name="FILE">FILE</a></b><br> |
|---|
| 659 |
|
|---|
| 660 |
This is a directive for performing checks on files or changes to files. |
|---|
| 661 |
Rules can be written based on any changes to the file metadata, like |
|---|
| 662 |
modification date, size, ownership, permissions, etc. It can also |
|---|
| 663 |
pick up changes to the file itself, which can be useful as a security |
|---|
| 664 |
check. |
|---|
| 665 |
|
|---|
| 666 |
<br><br> |
|---|
| 667 |
FILE-specific Arguments: |
|---|
| 668 |
<ul> |
|---|
| 669 |
<li> <b>file=&lt;file name&gt;</b> <i>(required)</i>: |
|---|
| 670 |
<br> |
|---|
| 671 |
- the file to be checked. |
|---|
| 672 |
<li> <b>keepdiff=true | false</b> <i>(optional)</i>: |
|---|
| 673 |
<br> |
|---|
| 674 |
- if true, EDDIE will generate diffs of changed files. |
|---|
| 675 |
Previous copies are stored in the <a href="#WORKDIR">WORKDIR</a> |
|---|
| 676 |
directory. [Eddie 0.35+] |
|---|
| 677 |
<li> <b>difftype=context | unified | full</b> <i>(optional)</i>: |
|---|
| 678 |
<br> |
|---|
| 679 |
- the type of diff to generate. [Eddie 0.35+] |
|---|
| 680 |
<li> <b>context_lines=&lt;integer&gt;</b> <i>(optional)</i>: |
|---|
| 681 |
<br> |
|---|
| 682 |
- how |
|---|