Purpose/scope
* The purpose is to provide the ability to detect events, investigate them, and determine the correct control action.
* Events (warnings and exceptions) can be used to automate many routine activities.
* Event Management can be applied to any aspects of
Event handling
Event notification and detection
Event notifications can be proprietary, in which case only certain management tools can detect them. Most Configuration Items (CIs) generate event notifications using the open Simple Network Management Protocol (SNMP).
Event filtering
Filtering determines whether an event notification is ignored or communicated to the management tool. If it is ignored, the event is usually recorded in a log file on the device, but no further action is taken.
Significance of event
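The filtering decision described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the event dictionary layout, the `FORWARDED_SOURCES` rule, and the in-memory `device_log` list (standing in for the log file on the device) are all assumptions made for the example.

```python
# Hypothetical filter rule: only events from these sources are forwarded.
FORWARDED_SOURCES = {"web-server", "core-router"}

# Stands in for the log file kept on the device itself.
device_log: list[dict] = []


def filter_event(event: dict) -> bool:
    """Return True if the event is communicated to the management tool.

    An ignored event is still recorded in the device log,
    but no further action is taken on it.
    """
    if event["source"] in FORWARDED_SOURCES:
        return True
    device_log.append(event)
    return False
```

In practice the rule would more likely depend on the event type or content than on a fixed list of sources; the list is used here only to keep the sketch short.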
Events are commonly categorized by significance:
* Informational (INFO): the event does not require any immediate action and does not represent an exception. Such events are recorded in log files and kept for a predetermined period. They are used to check the status of a device or service, to confirm the state of an activity, or to generate statistics (user login, batch job completed, device power-up, number of users logged into an application).
* Warning (WARN / ALERT): the event is generated when a device or service (application or utility) is approaching an agreed threshold (KPI). Warnings notify the responsible group, process, or tool so that action can be taken to prevent an exception.
* Exception (ERROR): the service or device is currently operating below its predefined normal parameters or indicators. This means that the business service is impacted and that the device or service shows a failure, performance degradation, or loss of functionality (web server down, CS coverage lost for several sites). A device failure is an error.
Note that the following is not an event type but an analysis that can be carried out on the event logs:
* Trend analysis: the event logs should be analyzed regularly for event patterns (INFO, WARN, ALERT, ERROR) that may indicate an underlying Problem, which can then be addressed before it causes a serious service disruption.
Response
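The significance categories above can be illustrated with a small Python sketch. It assumes a single numeric metric where higher readings are worse (for example, disk usage in percent); the threshold parameters, the `classify` function, and the `warning_counts` helper are hypothetical names chosen for this example.

```python
from collections import Counter
from enum import Enum


class Significance(Enum):
    INFO = "informational"
    WARN = "warning"
    ERROR = "exception"


def classify(value: float, warn_threshold: float, error_threshold: float) -> Significance:
    """Classify a reading against agreed thresholds (higher is worse)."""
    if value >= error_threshold:
        return Significance.ERROR  # operating outside normal parameters
    if value >= warn_threshold:
        return Significance.WARN   # approaching the agreed threshold
    return Significance.INFO       # no immediate action required


def warning_counts(event_log: list[tuple[str, Significance]]) -> Counter:
    """Simple trend analysis: count WARN events per source.

    A source that keeps producing warnings may point to an underlying Problem.
    """
    return Counter(source for source, sig in event_log if sig is Significance.WARN)
```

For instance, with a warning threshold of 80 and an error threshold of 95, a reading of 85 would be classified as a warning, and repeated warnings from the same source would show up in `warning_counts`.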
At this point in the process, a number of response options are available, including:
* Event logging: regardless of the event type, it is good practice to record the event and the actions taken. The event can be logged as an Event Record or left as an entry in the system log of the device.
* Alert and human intervention: events that require human intervention need to be escalated. The purpose of the alert is to notify the correct resource (person) to handle the event.
* Incident Record: an incident can be generated when an exception is detected.
* RFC: a Request for Change (RFC) can be raised in two scenarios:
** For an exception (for example, two new network devices have been added without the necessary authorization)
** For a change (for example, a server needs to be upgraded to prevent a file system failure; the change may take a while to take effect)
Close event
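The response options listed above could be combined along the following lines. This Python sketch is only one possible reading: the function name `select_responses`, its flags, and the response labels are assumptions for illustration and do not come from ITIL itself.

```python
def select_responses(
    significance: str,
    needs_human: bool = False,
    needs_change: bool = False,
) -> list[str]:
    """Pick response options for an event.

    `significance` is one of "INFO", "WARN" or "ERROR".
    """
    # Good practice: always record the event and the actions taken.
    responses = ["log_event"]
    if needs_human:
        # Escalate so that the correct person is notified.
        responses.append("raise_alert")
    if significance == "ERROR":
        # An exception can generate an Incident Record.
        responses.append("open_incident")
    if needs_change:
        # Either an unauthorized change was detected or a preventive change is needed.
        responses.append("raise_rfc")
    return responses
```

Under these assumptions, an informational event yields only a log entry, while an exception that needs human attention yields a log entry, an alert, and an incident.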
* In the case of events that generated an
See also
* Information Technology Infrastructure Library
* Incident management (ITSM)