Conversation Driven Collaborative Triaging


By – Trupti Kolaskar
(Product Owner | ignio™ Experience and Collaboration | Digitate)

IT outages can significantly hurt an enterprise's productivity and profitability, and long outages also attract negative publicity. According to industry research, an outage can cost an enterprise over $100,000 per hour. Approximately 35 percent of companies take one to 12 hours to fix an infrastructure outage, and about 17 percent need two to seven days to resolve the issue. [1] Around 60 percent of the surveyed companies took more than 15 minutes just to identify the right team to work on the issue, which translated into a loss of $25,000 before the ticket was even claimed and acknowledged. [2]
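The figures above can be related with simple arithmetic. The sketch below assumes a linear cost model, in which every minute of outage costs the same; that is an illustrative simplification, not something the cited surveys state. The function name `outage_cost` is hypothetical.

```python
# Assumed linear cost model: cost grows in proportion to outage duration.
HOURLY_OUTAGE_COST_USD = 100_000  # per-hour figure quoted in the text


def outage_cost(minutes: float, hourly_cost: float = HOURLY_OUTAGE_COST_USD) -> float:
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    return hourly_cost * minutes / 60


# 15 minutes spent finding the right team, at $100,000 per hour
print(outage_cost(15))  # 25000.0
```

Under this assumption, the 15 minutes spent locating the right team accounts for the $25,000 quoted above, before any actual triaging has started.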

Let us consider a hypothetical enterprise and its incident management journey.

Amid all the hoopla, ABC Inc. endured substantial losses: the outage had a major business impact and hurt profitability. This might not have been the first time the application went down because of a disk issue, yet it still took hours to debug and resolve.

Looking back, the situation could have been improved in several ways. For instance:

  • Ashley was probably not the right person to resolve the issue. Assigning it to Antonio earlier would have expedited the triaging.
  • If Joe had had a separate mechanism for getting regular incident updates, rather than interrupting the operations team, the process would have been less stressful.
  • If there had been a collaborative triaging setup in which the Application, DB and Server management teams could jointly triage and debug the issue, the overall resolution time would have dropped drastically.
  • Easy access to automation for resolving the disk issue would have reduced the mean time to fix it.

In a nutshell, for effective incident handling, enterprises like ABC Inc. need:

  • Minimum time for an expert to acknowledge the incident
  • An easy mechanism to send continuous updates to the business stakeholders
  • Reduced triage time through collaborative triaging
  • Minimum time to take corrective action

Who can help do it all?

ignio™ provides an Integrated Collaboration Room solution – the platform to kick-start the transformation in the issue resolution process.


ignio’s Integrated Collaboration Room

Benefits of the ignio™ Integrated Collaboration Room for Enterprises

Right Assignment and Collaborative Triaging

Using Machine Learning (ML), ignio™ identifies the right expert for incident assignment. The assignment notification and the incident details are sent through the enterprise messenger, which enables the team to connect for collaborative triaging and resolution. The ignio™ Collaboration Room connects all the triaging experts and resolvers on the enterprise messenger to ensure transparency in the incident resolution process. It also minimizes the time spent on reassigning and acknowledging the issue and on sharing the right context across teams.
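To make the idea of expert identification concrete, here is a minimal sketch that suggests a resolver by matching the words of a new incident against the incidents each person resolved in the past. This is a deliberately simple stand-in and not ignio's actual model, which the article does not describe. The ticket history, the function names and the scoring rule are all assumptions made for illustration; only the names Ashley and Antonio come from the ABC Inc. story.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic words longer than two letters."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2]


def build_profiles(history):
    """Count, per resolver, the words in the incidents they resolved."""
    profiles = defaultdict(Counter)
    for description, resolver in history:
        profiles[resolver].update(tokenize(description))
    return profiles


def suggest_resolver(profiles, description):
    """Return the resolver whose past incidents best overlap the new one."""
    words = tokenize(description)

    def score(resolver):
        counts = profiles[resolver]
        total = sum(counts.values())
        # Share of the resolver's past vocabulary matched by the new incident
        return sum(counts[w] for w in words) / total if total else 0.0

    return max(profiles, key=score)


# Hypothetical ticket history: (incident description, person who resolved it)
history = [
    ("disk usage above threshold on db server", "Antonio"),
    ("disk full on application server", "Antonio"),
    ("user cannot login to portal", "Ashley"),
    ("password reset request for portal", "Ashley"),
]

profiles = build_profiles(history)
print(suggest_resolver(profiles, "Application down, disk full on server"))
```

With this toy history, the disk-related incident is routed to Antonio rather than Ashley. A production system would need far more history and a proper text classifier, but the routing principle is the same.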

Automated notifications and the ability to send regular updates keep business users and other stakeholders posted on recent activity.

Right Context to Assist Faster Triaging

For an operations team, the ignio™ Collaboration Room provides information that can expedite the resolution. For example:

  • Triaging Details: The Collaboration Room gives the operations team the details of the automated triaging that ignio™ performed for all the modeled parameters, along with the details of any fixes applied for the identified faults. These details help the operations team make a seamless transition from ignio™ to manual support, considerably reducing the triaging time. The Collaboration Room also expedites triaging by running a system check on the entire application topology and giving the team access to the report.
  • Current Context: Knowing which critical components are impacted by the incident helps the team prioritize incidents. When ignio™ creates the incident collaboration room, this information is shared with the resolvers through collaboration channels such as Slack, Teams and so on.
  • Just in Time Context: Real-time notifications about duplicate issues flowing in, or about issues correlated to the current one, help the team understand the gravity of the situation.
  • Historical Context: Knowing the periodicity and the profile of the current issue is useful because it provides the detailed behavioral context associated with the issue. Some resolutions are based purely on these aspects, and easy access to these details helps in understanding the issue precisely. Other key specifics, such as the time taken to resolve similar issues in the past and the standard steps followed for resolution, are very useful for the ITOps team. However, extracting these details requires extensive mining of historical data and tickets.
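As a rough illustration of the historical context described above, the following sketch derives the periodicity of an issue and the typical time to resolve it from past tickets. The ticket records and the `summarize` helper are hypothetical; a real deployment would mine these values from the ticketing system.

```python
from datetime import datetime
from statistics import mean, median


def summarize(tickets):
    """Summarize past occurrences given (opened, resolved) datetime pairs."""
    opened = sorted(o for o, _ in tickets)
    # Hours between consecutive occurrences approximate the periodicity
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(opened, opened[1:])]
    # Hours from opening to resolution for each ticket
    durations = [(r - o).total_seconds() / 3600 for o, r in tickets]
    return {
        "occurrences": len(tickets),
        "median_gap_hours": median(gaps) if gaps else None,
        "mean_resolution_hours": mean(durations),
    }


# Hypothetical history of the same disk issue: (opened, resolved)
tickets = [
    (datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 5, 0)),
    (datetime(2024, 1, 8, 2, 0), datetime(2024, 1, 8, 4, 0)),
    (datetime(2024, 1, 15, 2, 0), datetime(2024, 1, 15, 6, 0)),
]

print(summarize(tickets))
```

For this sample, the issue recurs every 168 hours (weekly) and takes three hours on average to resolve, which is the kind of summary a resolver would want to see as soon as the collaboration room opens.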

Facilitating Corrective Actions

The ability to execute automated fixes on selected targets from the Collaboration Room is pivotal in making the entire process agile and efficient. It is one of the critical features in an effective implementation of the Collaboration Room solution, and it adds directly to operational agility.

Business User and Stakeholder Updates

When a critical issue flows to the manual queue for resolution, it becomes extremely important to notify the business users who will be impacted. While the operations team is working on the issue, the ignio™ Collaboration Room allows team members to send regular updates to the business users on the enterprise messenger.

Thus, any enterprise planning to make its operations agile, highly collaborative, transparent and efficient, with continuous learning, should embrace the Collaboration Room solution offered by ignio™. It transforms the issue resolution process and delivers fast-track benefits such as improved MTTR, effective management of outages and SLAs, and knowledge building. The idea of Conversation Driven Operations can now help enterprises reduce outages and, in turn, improve profits significantly.

Sources:
1 https://www.devopsdigest.com/idc-survey-appdynamics-devops-application-performance
2 https://info.xmatters.com/rs/alarmpoint/images/xMatters-2015-Survey-Report.pdf
