How to Organize for ITIL

Organizational structure plays a significant role in the success or failure of ITIL adoption. The correct organizational structure is critical to your success -- but you probably should NOT reorganize to achieve it!

The IT Infrastructure Library™ (ITIL®) describes process, not performance. ITIL V3 expands the coverage of V2 to offer limited guidance regarding organizational structures and how they contribute to performing the processes. However, these two are often confused, with practitioners frequently transforming ITIL processes into organizational structure.

As a result, many new implementations quickly stall upon trying to map the ITIL processes into organizational structures.

While most leaders understand that organizational structure affects IT's ability to deliver required performance, they often are puzzled when the changes they make do not deliver the expected results.

In 2006 Gartner predicted that 45% of IT organizational realignments would fail due to confusing process with performance. This confusion about structure, in turn, generates resistance within organizations.

A study released in February 2006 by Evergreen Systems Inc. found that 72% of respondents to a survey claimed the biggest barrier to ITIL adoption in their business was organizational resistance.

Trying to map ITIL processes functionally (that is, creating an Incident group, a Problem group, and so on) quickly becomes problematic. Processes describe the work performed by people, not organizational structures. What follows is a description of how to organize for effective ITIL adoption.

Process vs. Performance

Traditional IT organizational structure makes little distinction between process (task descriptions) and performance (task execution).

Traditional IT structure is silo based, with technical expertise concentrated into self-contained organizational units. This silo structure is a legacy inherited from early IT operations, which collected technology and its technologists into manageable units.

For example, the telephone department operates the telephone system, provides support to users, and performs system maintenance. The mainframe department does the same, as does the network department, and so on. Each silo has its own versions of similar processes, and there is often little coordination between silos.

There is nothing wrong with silos per se, and they exist in most organizations and professions. Medicine, education, the military, and virtually every business use silos to collect specialists into manageable groups.

However, while the scope of IT has since expanded beyond disconnected standalone systems, silo management has not kept pace. Today's IT services are complex communications and collaboration systems spanning multiple silos. This expansion of IT services is the cause of the dilemma around staffing for a process-driven operation.

To visualize an effective organization using a process-driven operation with dynamic teams composed of resources coming from various locations, consider how a typical volunteer fire department operates:

  1. Members work their regular jobs until they hear the fire alarm; then they all meet at the firehouse.
  2. Each fire fighter knows his or her role -- one takes command, one starts the truck, others gather tools and load.
  3. They all know their jobs, and there is no confusion about who is to do what, when, how, or why.
  4. Working together, they put out the fire.
  5. When the fire is out, they return to the firehouse to clean their equipment and complete documentation.
  6. Then they all go back to their regular jobs and await the next call.

While some organizational change to support the reality of modern IT is inevitable, the real changes are not in where IT workers sit or to whom they report. The real change is in how they dynamically organize and collaborate to accomplish an objective -- just like the volunteer fire department in the analogy.

Where to Begin Organizing

Many companies start implementing ITIL around the ITIL V3 Service Operation phase (the Service Desk function, and the Incident and Problem Management processes). Commonly these companies have existing staff, but have neither written distinctions nor clear organizational boundaries between the activities performed.

Without any defined process there is often no regard for process boundaries -- whoever takes a call works the issue through to resolution. With no distinction between process and performance, workers within these self-contained units operate within self-contained roles.

The lack of a process for collaboration diminishes organizational learning and IT performance. Customer satisfaction tends to be low due to lack of follow up, miscommunication, and poor coordination -- the classic symptoms of a reactive organization with little process control.

Within the ITIL process framework, the previously mentioned tasks of answering the call and working an issue to resolution traverse four to five distinct process boundaries. Service Desk governs user contact; Incident Management governs data collection and triage; Problem Management governs root-cause analysis; Change Management governs any changes required, and might invoke Release & Deployment Management if the change is large or complicated.
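The boundary crossings described above can be sketched in a few lines of Python. This is only an illustration: the Stage enum, the stages_for_issue function, and its three flags are hypothetical names invented for this sketch, not anything ITIL defines. It simply shows that one piece of work can pass under four or five kinds of governance without changing hands.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    """Hypothetical labels for the governance stages named in the text."""
    SERVICE_DESK = "Service Desk"
    INCIDENT = "Incident Management"
    PROBLEM = "Problem Management"
    CHANGE = "Change Management"
    RELEASE = "Release & Deployment Management"


def stages_for_issue(needs_root_cause: bool,
                     needs_change: bool,
                     change_is_large: bool = False) -> List[Stage]:
    """Return the stages one issue passes through, in order (assumed flow)."""
    stages = [Stage.SERVICE_DESK, Stage.INCIDENT]  # every call starts here
    if needs_root_cause:
        stages.append(Stage.PROBLEM)
    if needs_change:
        stages.append(Stage.CHANGE)
        if change_is_large:
            # Release & Deployment is invoked only for large or complicated changes
            stages.append(Stage.RELEASE)
    return stages


simple = stages_for_issue(needs_root_cause=True, needs_change=True)
complex_ = stages_for_issue(True, True, change_is_large=True)
print([s.value for s in simple])
print(len(complex_))
```

Note that nothing in the sketch refers to a department or a reporting line; the stages describe the governance over the work, not who performs it.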

The question arising from this situation is how to allocate resources functionally to perform the work within such a process framework. One solution many companies arrive at is to split technical silos into multiple groups -- one performing triage (Incident Management), another providing root-cause analysis (Problem Management), etc.

The problem with this approach is the creation of more organizational silos resulting in the loss of organizational knowledge, decreased team building, fractured management, human resource issues, increased costs, and reduced communications -- exactly what we want to avoid.

A better approach is to consider the ITIL processes as oversight or governance over the work performed by staff. ITIL processes do not describe an organization, but rather the work an organization must perform.

World Class IT

Another industry has already faced and overcome many of the issues facing IT today -- manufacturing. Over the last 300 years or so manufacturing has solved many organizational and process-related problems. Taking a cue from manufacturing operations can provide clear and proven methods for IT.

World Class Manufacturing (WCM) describes how a manufacturing operation aligns closely coupled processes -- eliminating waste and bridging organizational silos in the process. WCM is a process-driven approach to improving manufacturing operations wherein each process begins to work only when a downstream process requires it, pulling rather than pushing work (inventory) through a production line.

This is in direct conflict with the traditional capacity-driven manufacturing mentality found in most western cultures and virtually every IT organization. However, this is exactly what ITIL describes with regard to IT! We need to (re)organize for World Class IT. Using the WCM pull (Kanban, lean, just-in-time) concept as a model, we can leverage existing technical silos as shared resource pools that come together when needed.

Within the World Class IT organization, pre-ITIL organizational reporting structures remain largely the same. However, the expected workflow alignment changes from vertical to horizontal processes, and defines roles, responsibilities, and common oversight. Then, as required, workers are pulled into virtual organizations to address the need.

Just as the volunteer fire fighters work as a team, so must IT. Following is a dramatization of an effectively running IT organization following ITIL best practices:

  1. Engineering and technical workers perform their day-to-day administration, maintenance, and installation activities in various technical areas (i.e., silos) like hardware, software, network, etc.
  2. A blade server fails, leading to an Incident. The Incident does not necessitate a Problem because the resolution is already known (replace the blade). The fix is to remove the failed blade and insert a replacement blade of the same type and configuration, so there is no need for a Request for Change (RFC) either. The worker simply removes and replaces the blade as a part of his/her daily work (Standard Change), while keeping accurate records and updating the Configuration Management Database (CMDB) as required.
  3. However, upon the invocation of "the Problem Management process," various members of required technical groups come together like volunteer fire fighters. Designated team members from the various functional areas and technical departments coalesce into a team, assuming new roles with unique responsibilities and authorities.
  4. These cross-silo teams come together with each member knowing his/her duties and place within the virtual organization. They perform the required tasks, resolve the Problem, and then disband until required again.
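The way a problem team forms from existing silos can be sketched as a small Python model. Everything here is an assumption made for illustration: the Worker record, the on_call_for_problems flag, and the assemble_problem_team function are hypothetical, and a real organization would define designation and roles in its own process documents.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Worker:
    name: str
    silo: str                    # permanent reporting line; never changed here
    on_call_for_problems: bool   # designated "volunteer" for problem teams


def assemble_problem_team(workers: Iterable[Worker],
                          silos_needed: Iterable[str]) -> Dict[str, Worker]:
    """Pull one designated member from each required silo into a virtual team."""
    needed = set(silos_needed)
    team: Dict[str, Worker] = {}
    for worker in workers:
        if (worker.silo in needed
                and worker.on_call_for_problems
                and worker.silo not in team):
            team[worker.silo] = worker
    missing = needed - set(team)
    if missing:
        raise LookupError(f"no designated member for: {sorted(missing)}")
    return team


staff = [
    Worker("Ana", "network", True),
    Worker("Ben", "network", False),
    Worker("Chen", "server", True),
    Worker("Dee", "telephony", True),
]
team = assemble_problem_team(staff, ["network", "server"])
print({silo: member.name for silo, member in team.items()})
```

The team is just a temporary mapping; each Worker keeps the same silo value before and after, which mirrors the point that reporting structures stay largely the same while the team forms and disbands.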

This is only possible using a combination of strong management controls, best practices like ITIL and a WCM/Lean manufacturing philosophy.

Understanding ITIL Workflow

As mentioned previously, many companies begin ITIL implementation with the Service Desk function, and the Incident and Problem Management processes. The following example shows how these groups might work together.

Call Center - Call Centers direct calls to the appropriate Service Desk or Service Desk staff member based on call attributes. For example, a user requests assistance using an application. This call is then routed by the Call Center to an appropriate agent at the Service Desk.

Service Desk - The Service Desk agent then takes ownership of the Incident. At the point of logging the incident in preparation to provide direct support to the user, the Service Desk staff member smoothly transitions into performing the tasks of the Incident Management process.

Incident Management - The agent, now an Incident handler operating under the auspices of the Incident Management process, collects information and attempts triage for the user. Should the agent be unable to assist the user, he or she may escalate, or open a Problem record.

Escalation is another point of confusion in many organizations planning ITIL. When the first-level Incident handler exhausts all information sources available to him/her, determines he/she is unlikely to resolve the Incident within the required time frame, or is otherwise unable to assist, the Incident must either route to another person (2nd level) or result in opening a Problem.

Assuming the Incident routes to second level, the Incident handler, regardless of his/her location, still performs tasks under the auspices of Incident Management. The 2nd-level Incident handler can physically reside anywhere, be in any IT department, and even be outside of the IT organization, but as long as he/she works the Incident, he/she is working under Incident Management control.

There is no requirement that any level of support beyond the Service Desk agent (e.g., 2nd, 3rd, ...nth) be a part of any function or department, and any line of support may perform investigations, perhaps even using test procedures, diagnosis tools, etc.

For example, depending on the Incident, second level could be a technician in the network group or an administrator working in the telephone department. It is the work performed, not the organizational reporting structure, that matters. This often requires Operational Level Agreements (OLA) between IT and various processes, as well as functional groups.

Our example is still routine day-to-day work and considered an Incident, however. If second level determines it is unlikely or unable to resolve the issue, it may also escalate. The number of support levels is undefined, being unique to each company's infrastructure. It is not uncommon to have two, three, or four levels of Incident support, each with increasingly focused knowledge and expertise.
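The escalation path described above might be modeled as follows. This is a minimal sketch under stated assumptions: the Incident record, its field names, the escalate function, and the default of three support levels are all hypothetical choices for illustration, since the number of levels is unique to each company.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    summary: str
    level: int = 1                            # 1st level is the Service Desk agent
    governed_by: str = "Incident Management"  # does not change on escalation
    handled_in: List[str] = field(default_factory=lambda: ["Service Desk"])


def escalate(incident: Incident, to_department: str, max_level: int = 3) -> Incident:
    """Route the Incident to the next support level, wherever that person sits."""
    if incident.level >= max_level:
        raise ValueError("no higher support level; consider raising a Problem record")
    incident.level += 1
    incident.handled_in.append(to_department)  # location changes, governance does not
    return incident


incident = Incident("User cannot reach the application")
escalate(incident, "network group")
escalate(incident, "telephone department")
print(incident.level, incident.governed_by, incident.handled_in)
```

The design choice worth noticing is that escalate records where the handler sits but never touches governed_by: the work stays under Incident Management control regardless of the department performing it.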

Problem Management

At any point, if it seems the issue is complex or requires involvement of multiple technology departments or silos, the Incident may require raising a Problem Record. Some organizations choose to raise a Problem whenever an Incident cannot be matched in the Knowledge Base.

When working a Problem Record, staff now operates under the governance of the Problem Management process. This normally means appointment of a "Problem Manager" or "Problem Coordinator" to lead the work and supervise cross-department interactions.

It is at this point that a "problem team" of staff members forms from various technical groups. The Service Desk staff (performing 1st-level Incident handling) now takes on the responsibility of Incident oversight and monitors the resolution of the Problem, coordinating with the problem team leader and keeping the user informed as required.

Summary

It is precisely these roles and their responsibilities that create a process. As you can see from these examples, structuring for process does not always require reorganizing departments. It often does require training and organizing dynamic teams so that they know how and when to perform (e.g., using diagnostic scripts in Incident Management), and may require Operational Level Agreements.
