Enterprise Management 

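The definition of what constitutes Enterprise Management depends largely on your environment, but could include the following:
- Network Management & Monitoring
- Systems Management & Monitoring
- Database Management & Monitoring
- Application Management & Monitoring
- Application Integrations
- Incident Management
- Notification Management & Strategy
- Performance & Capacity Management
- Service Management
- Reporting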
All of this depends on a number of factors, such as your environment, your products, and the scope of what you are trying to achieve.  Regardless, I'll run through each of these areas briefly to give you an idea of the things I think about when implementing an Enterprise Management solution using HP OpenView.  I then set out my recommended 3-phase approach to effective Enterprise Management, and finally I summarise my thoughts & areas of potential concern.
 Enterprise Management - The Building Blocks 
Network Management & Monitoring
 
The ability to monitor your network devices & network links is the foundation of any Enterprise Management task.  It is often overlooked or considered (assumed?) to be the responsibility of another team within the organisation.  Despite this, network monitoring is a relatively easy task given the appropriate focus, time & resources.  This would usually involve the identification & classification of network devices, configuration of SNMP traps, loading & configuring MIB files, and then configuring the alerting conditions with HP OpenView.  An effective Enterprise Management system must include network management & fault reporting, otherwise the whole "enterprise" monitoring concept is undermined.  What is the point in monitoring thousands of servers, all showing a status of "green" with no associated faults, if your underlying network is down?
Systems Management & Monitoring
 
The monitoring of servers is critical to the success of any Enterprise Monitoring strategy.  The scope of this monitoring depends heavily on your environment, but you might want to consider the monitoring of Windows, HP-UX, Solaris, AIX and Linux platforms, both physical & virtual.  Monitoring of servers will typically involve the installation of HP OpenView monitoring agents which proactively monitor the systems according to well-defined & agreed monitoring baselines.  Some examples of this are the monitoring of filesystems, disks, critical processes and important logfiles, as well as availability monitoring of the servers themselves.  All servers of importance should be monitored effectively within your environment.
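To make the idea of a monitoring baseline a little more concrete, here is a minimal Python sketch of a filesystem threshold check.  It is not how the HP OpenView agents are implemented; the 80%/90% thresholds and the names `classify_usage` and `check_filesystem` are my own assumptions, chosen purely for illustration.

```python
import shutil
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    severity: str
    message: str


def classify_usage(path: str, used_pct: float,
                   warning_pct: float = 80.0,
                   critical_pct: float = 90.0) -> Optional[Alert]:
    """Return an Alert if used_pct breaches a baseline threshold, else None.

    The default thresholds are illustrative assumptions, not agreed values.
    """
    if used_pct >= critical_pct:
        return Alert("critical", f"{path} is {used_pct:.1f}% full")
    if used_pct >= warning_pct:
        return Alert("warning", f"{path} is {used_pct:.1f}% full")
    return None


def check_filesystem(path: str) -> Optional[Alert]:
    """Measure real usage for path and classify it against the baseline."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; nothing to classify.
        return None
    used_pct = 100.0 * usage.used / usage.total
    return classify_usage(path, used_pct)
```

The point of separating the measurement from the classification is that the thresholds become the "baseline" you agree with each team, and they can be reviewed and refined without touching the collection logic.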
Database Management & Monitoring
 
Once you have your servers monitored using a well-defined and agreed strategy, the natural progression is to consider the next tier, for example databases such as MS SQL, Oracle, Sybase, or DB2.  All critical databases should be monitored to some degree, whether this is via in-house scripting and basic logfile/process monitoring, or via a more advanced/complete mechanism such as that provided by the HP OpenView Database Smart Plug-Ins (SPIs).  You may also consider integration with tools such as Oracle Enterprise Manager (Grid Control), but I don't recommend this being the sole method of database monitoring for many reasons (for example, it becomes a single point of failure, or SPOF).
Application Management & Monitoring
 
Now that you have considered network devices, servers and databases, the next thing to consider is application monitoring.  This does not need to be complicated; it could mean the monitoring of simple application processes, logfiles, or the running of scripts.  You should identify the critical applications in use within your organisation (or consider the top 5 initially) and engage with those teams to ascertain what level of monitoring would be useful.  If they can identify current issues, or where they have had problems/outages in the past, this can help drive the requirements for providing some basic, but ultimately proactive, monitoring in the application space.  Once you have engaged with a team in this manner it might become apparent that more detailed monitoring is required, which may involve using an HP OpenView SPI module, integrating with a 3rd party tool, or writing some more complex monitoring functionality.
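As an illustration of how simple basic application monitoring can be, the following Python sketch scans logfile lines for known failure patterns.  The patterns, the severities and the function name `scan_lines` are hypothetical; in practice they would come from the application team's knowledge of past problems/outages.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical patterns agreed with the application team.
# They are checked in order, so the most severe pattern comes first.
PATTERNS = [
    ("critical", re.compile(r"FATAL|OutOfMemoryError")),
    ("warning", re.compile(r"\bERROR\b")),
]


def scan_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (severity, line) for every line matching a known pattern."""
    matches = []
    for line in lines:
        for severity, pattern in PATTERNS:
            if pattern.search(line):
                matches.append((severity, line.rstrip("\n")))
                break  # report each line once, at its highest severity
    return matches
```

For example, passing an open logfile (`with open(path) as f: scan_lines(f)`) returns only the lines worth alerting on, which could then be handed to whatever alerting mechanism you already use.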
Application Integrations
 
To further enhance your monitoring capability you might also consider integrating applications within your environment into your Enterprise Monitoring system.  For example, where an application is already performing a monitoring function, you might want these alerts integrated with your primary monitoring tool.  This could take the form of an application forwarding alerts into HP OpenView, where the application is providing additional benefits such as local suppression, correlation or enhanced functionality.  With any single-point integration you should always consider the risks associated with such configurations and try to mitigate these where possible - single-point integration can sometimes mean single-point-of-failure!  An alternative integration method can be to utilise a local monitoring agent (e.g. HP OpenView), so that an application agent sends alerts locally to the resident HP OpenView agent, which in turn alerts to your central monitoring console using standard mechanisms.  The benefit of this approach is that you can eliminate the single-point-of-failure scenario, and also reduce network traffic.  Examples of possible applications for integration are as follows: Systems Insight Manager (Windows), EMC Control Centre (SAN), Cisco Call Manager (Voice), SolarWinds (Networks), Oracle Enterprise Manager Grid Control (OEM).  You should investigate which applications are used in your company and review them thoroughly.
Incident Management
 
Incident management, in the context of Enterprise Monitoring, relates to how alerts & faults highlighted in the monitoring environment are escalated and recorded in your incident management system (for example HP Service Centre, BMC Remedy, ServiceNow).  Once an alert from your monitoring environment is produced it should be escalated to the incident management system so that a record is created for the failure and the relevant team can update the trouble-ticket with information on cause, effect, impact and resolution/remediation details (where possible).  Initially this may be a manual process of raising a trouble-ticket to the relevant system, but semi/full automation should be considered to allow datacentre operations teams to right-click to raise trouble-tickets based on the alert.  This could progress naturally to allow those alerts destined for a trouble-ticket only (i.e. no callout required) to be raised as "auto-tickets" - thereby ensuring tickets are created automatically, and reducing the need for datacentre operations teams to even see such alerts, thereby leaving them to concentrate on the more critical issues.  Once such a process is underway, tested and working satisfactorily, you could then consider automating trouble-ticket generation for all alerts seen by the datacentre operations team, thereby reducing the chances of alerts being missed.  This should be a gradual, phased approach to ensure the entire process works as your business requires.
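The gradual move from manual to automated ticketing described above can be expressed as a small routing rule.  This is only a sketch in Python: the stage numbers, the `callout_required` flag and the destination names are assumptions I have made for illustration, and are not part of HP OpenView or of any incident management product.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical destination names.
CONSOLE = "operator_console"
TICKET = "auto_ticket"


@dataclass(frozen=True)
class Alert:
    node: str
    text: str
    callout_required: bool


def route(alert: Alert, stage: int) -> FrozenSet[str]:
    """Decide where an alert goes at each stage of ticket automation.

    Stage 1: operators see every alert and raise tickets by hand.
    Stage 2: ticket-only alerts become auto-tickets and skip the console.
    Stage 3: alerts shown to operators are also ticketed automatically.
    """
    if stage not in (1, 2, 3):
        raise ValueError(f"unknown automation stage: {stage}")
    if stage == 1:
        return frozenset({CONSOLE})
    if not alert.callout_required:
        return frozenset({TICKET})
    if stage == 2:
        return frozenset({CONSOLE})
    return frozenset({CONSOLE, TICKET})
```

Keeping the stage as an explicit parameter makes it easy to test each step, and to roll back to the previous stage if the process does not work as your business requires.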
Notification Management & Strategy
 
Once your Enterprise Monitoring solution has matured you may naturally consider the use of a notification system.  Support teams within your organisation might also be requesting this from an early stage, especially if you have improved the monitoring capabilities for their particular domain!  Your organisation might have some form of notification tool already in use within the business, so this should be investigated and reviewed accordingly.  Examples where this might be used include helpdesk & incident management, business continuity planning (BCP) and indeed, current or previous monitoring systems.  You might decide to leverage the use of the resident notification tool (if it is fit for purpose), or you may consider a wider approach and review whether a new system should be used, phasing out the old system in due course, to create a new corporate notification system.  From an Enterprise Monitoring point of view, you might want to configure various HP OpenView alerts to automatically notify teams using SMS messages to mobile phones, pagers etc.  It is also possible to configure responses from such tools, so that the recipient can respond to the notification, escalate issues, or even take remediation steps remotely.  The benefits of automated notifications are numerous - faster alerting of faults to the correct teams, improved accuracy of information received (rather than a confusing telephone call), along with full auditing of the notification process itself.  Any notification strategy should obviously be planned, agreed and implemented in phases to ensure it delivers exactly what is required, and does not become a burden on resources, whether IT or people-based.
Performance & Capacity Management
 
Performance monitoring in this context does not simply mean monitoring servers for CPU or disk utilisation; it is more concerned with creating a function whereby you can deliver performance data to teams if they require it.  This could be in the form of performance graphs/reports for reviewing testing cycle phases, for analysing environments during major faults/incidents, or for providing data evidence to support trending or capacity planning exercises.  The HP OpenView monitoring agents collect and store a core set of metrics by default, which can then be graphed using tools like HP OpenView Performance Manager.  These tools are intended to be used on an ad-hoc basis, as and when the need arises to produce performance graphs or on-demand reports.
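As a simple example of the kind of trending that stored performance data makes possible, the Python sketch below fits a straight line to daily disk-usage samples and estimates how many days remain before the filesystem is full.  It assumes one sample per day and a linear growth trend, which is a deliberate simplification; the function name `days_until_full` is my own and is not part of any HP OpenView tool.

```python
from typing import Optional, Sequence


def days_until_full(used_pct: Sequence[float],
                    limit: float = 100.0) -> Optional[float]:
    """Estimate days until usage reaches limit, from daily samples.

    Uses an ordinary least-squares line through (day index, used %).
    Returns None if there are too few samples or usage is not growing.
    """
    n = len(used_pct)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(used_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_pct))
    slope = sxy / sxx  # percentage points gained per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_at_limit = (limit - intercept) / slope
    # Count from the most recent sample (day index n - 1), never below zero.
    return max(0.0, day_at_limit - (n - 1))
```

With five daily samples of 70, 72, 74, 76 and 78 percent, the fitted growth is 2 points per day, so the estimate is 11 days from the last sample.  A figure like this is only evidence to support a capacity planning discussion, not a prediction to rely on blindly.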
Service Management

So often senior management are desperate for a Service Management view of their environment.  They are keen to be able to show the critical services within the organisation, and show alerts & status to indicate service health/degradation, but this is sometimes considered prematurely.  There is no benefit in showing services as "green" (traffic light indicators are often a request!) if the underlying infrastructure components are not managed and monitored completely and effectively.  Showing a service as "healthy" is inaccurate if you are not monitoring the network (or other critical components).  Once you have a mature Enterprise Management and monitoring solution deployed, or at the very least the structure and plans to deliver such a solution, then consideration can be given to providing a service management capability.  There are various tools that can help with providing a view of the health of your core business services, but a very simple method is to utilise the data held within your incident management system, and represent this data graphically.  The benefit of using the incident management system itself is that this will record alarms originating from your Enterprise Monitoring solution (e.g. HP OpenView Operations Manager) but will also record incidents created from other sources, including those raised by clients and users.
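The "very simple method" of deriving service health from incident data might look something like the following Python sketch.  The severity names, the traffic-light colours and the shape of the incident records are all assumptions for illustration; a real incident management system will have its own fields and export format.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical severity scale and traffic-light mapping.
SEVERITY_RANK = {"normal": 0, "warning": 1, "critical": 2}
STATUS_COLOUR = {0: "green", 1: "amber", 2: "red"}


def service_health(services: Iterable[str],
                   open_incidents: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Roll open incidents up into one colour per business service.

    open_incidents holds (service name, severity) pairs.  Each service
    takes the colour of its worst open incident; incidents for services
    not in the list are ignored.
    """
    worst = {name: 0 for name in services}
    for service, severity in open_incidents:
        if service in worst:
            worst[service] = max(worst[service], SEVERITY_RANK[severity])
    return {name: STATUS_COLOUR[rank] for name, rank in worst.items()}
```

Note that a service with no open incidents shows as "green" here, which is exactly the trap described above: the colour is only meaningful if the underlying components are actually being monitored and incidents are actually being recorded.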

Reporting
 
Reports are of no use if nobody is going to read them, but that does not mean reporting should be dismissed out of hand.  Effective reporting on metrics or KPIs can help identify trends, show project benefits & ultimately justify projects & resourcing.  Examples of some useful types of reporting include incident reports (tickets per team, for example), notification reports (frequency of out-of-hours callouts), performance & capacity reports (CPU utilisation, free disk space etc.) and ultimately service reports (indicating service levels, SLAs etc.).  Reporting may not need to be considered at the outset of an Enterprise Management programme, but it should be considered carefully and investigated completely when the requirement presents itself.
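Two of the report types mentioned above, tickets per team and out-of-hours callouts, can be sketched in a few lines of Python.  The record layout and the definition of working hours (08:00 to 18:00, Monday to Friday) are assumptions for illustration only; your own support hours and ticket fields will differ.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def tickets_per_team(tickets: Iterable[Dict[str, str]]) -> List[Tuple[str, int]]:
    """Count tickets by their 'team' field, busiest team first."""
    return Counter(ticket["team"] for ticket in tickets).most_common()


def is_out_of_hours(when: datetime) -> bool:
    """True outside the assumed working hours of 08:00-18:00, Mon-Fri."""
    return when.weekday() >= 5 or not (8 <= when.hour < 18)


def out_of_hours_callouts(
        callouts: Iterable[Tuple[str, datetime]]) -> List[Tuple[str, int]]:
    """Count out-of-hours callouts per team from (team, timestamp) pairs."""
    counts = Counter(team for team, when in callouts if is_out_of_hours(when))
    return counts.most_common()
```

Even reports this basic can be enough to show which teams carry the heaviest load, and whether improved monitoring is reducing the number of out-of-hours callouts over time.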
 Enterprise Management - My Recommended Approach 
Phase 1
 
- Review the current environment, including hardware, software, personnel, processes & procedures
- Define the project, the plan, the migration strategy (if applicable) and start to engage teams
- Purchase new software as required (for example agents, SPI modules, new products)
- Deliver the monitoring infrastructure - build, install, patch, review, refine, test, document
- Create standardised OS monitoring baselines for all platforms - plan, test, review, refine
- Deploy OS monitoring baselines to all platforms - plan, deploy, review, refine, document
- Monitor network devices (and any other SNMP based devices)
- Monitor databases - plan, define baselines, test, review, refine, deploy, document
- Define & document all processes & procedures
Phase 2
 
- Define incident management requirements & strategy and deliver appropriate solution
- Define notification requirements & strategy and deliver appropriate solution
- Monitor applications (basic)
- Monitor applications (advanced)
- Application integrations
- Investigate all other critical areas that require monitoring and plan/address accordingly
Phase 3
 
- Define patching strategy, product roadmaps, long-term plans etc.
- Review support - review agreements, consolidate as required, assess support effectiveness & suitability
- Review licensing - bulk purchases, license management, true-up exercises etc.
- Define performance management requirements & strategy and deliver appropriate solution
- Define service management requirements & strategy and deliver appropriate solution
- Define reporting requirements & strategy and deliver appropriate solution
 Enterprise Management - Summary 
In my experience, if you can achieve all of these things as part of your Enterprise Management & Monitoring solution you are in a very good place, and you are in a very small minority!  I have seen very few organisations deliver a complete Phase 1 monitoring solution, and even fewer fully implement a Phase 2 level solution.  I have not seen ANY organisation successfully deliver a complete Enterprise Management solution covering all of the phases and scope described here.
 
Many organisations simply do not address all components of the solution, either with disparate teams not integrating with the overall solution or with core areas not being considered at all.
 
However, if you take the time to ensure you consider all of these aspects within your environment, coupled with ensuring you have the right people on-board to drive the solution forward into reality, you will have a firm foundation on which to build your Enterprise Management & Monitoring programme.


Copyright © Protocol Limited 2012
Registered in England No. 3182190 | VAT No. 677 7764 63