
By Prem Naraindas

Are IT operations drowning?

We can all agree on one thing: every business is becoming a digital business. The pandemic has accelerated digital transformation for many companies, and business leaders are calling on the CIO to push the innovation envelope to meet changing customer demands and expectations.

To drive digital transformation, businesses are rapidly evolving their IT stack to become more agile, scalable, and cost-effective. They are adopting new tools and platforms, such as public cloud, microservices, containers, and serverless technologies.

Businesses are also adopting DevOps and CI/CD practices to allow them to move faster in the marketplace. At the same time, enterprises continue to retain many of their legacy and homegrown technologies that were accumulated over many decades. Unfortunately, for most businesses, these changes have dramatically increased the scale, fragmentation, and complexity of their IT stack.

This has created a large volume of data that overwhelms traditional ITSM/ESM (ITESM) tools and makes it impossible for IT operations to become proactive. IT operations can feel like a thankless job. Teams may have the knowledge and experience to be a strategic partner to the business, but demonstrating value can be an uphill task.

The daily grind of chasing down alerts and patching problems can lock IT personnel into a cycle in which they are continually playing catch-up instead of preventing problems from arising. Increasing the use of the cloud can alleviate some of these issues, but it doesn’t make the operational complexity go away: someone still has to manage those cloud services and organizational interconnections.

With workloads rapidly growing and with no consistent, effective way to prioritize activities, IT operations are constantly on the back foot, perpetuating a stereotype that the function is reactive and slow to move – the very perceptions that IT functions have tried so hard to shake.


IT operations must deal with a number of key challenges:

Escalating Service Expectations, with Little Margin for Error.

As end-users’ expectations continue to grow, service-level agreement (SLA) requirements have become more stringent. At the same time, organizations increasingly expect IT operations to deliver both near-perfect service availability and a shorter mean time to recovery when incidents do occur.
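
To make "little margin for error" concrete, here is a minimal sketch of the arithmetic behind an availability target. The function names, the 30-day period, and the example figures are illustrative assumptions, not numbers taken from any particular SLA.

```python
def downtime_budget_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Minutes of downtime a service may accumulate in the period
    while still meeting the given availability target (in percent)."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def max_incidents(availability_pct: float, mttr_minutes: float, period_days: int = 30) -> int:
    """How many full incidents fit in the downtime budget, assuming each
    incident takes mttr_minutes (mean time to recovery) to resolve."""
    budget = downtime_budget_minutes(availability_pct, period_days)
    return int(budget // mttr_minutes)


if __name__ == "__main__":
    # Hypothetical targets and a hypothetical 30-minute MTTR.
    for target in (99.0, 99.9, 99.99):
        budget = downtime_budget_minutes(target)
        incidents = max_incidents(target, mttr_minutes=30)
        print(f"{target}% over 30 days: {budget:.1f} min budget, "
              f"{incidents} incident(s) at 30 min MTTR")
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43 minutes of allowable downtime, so a single 30-minute incident uses most of the month's budget. This is why tighter availability targets and shorter recovery times tend to be demanded together.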

A Dizzying Number of Services, Released at Faster Rates.

Although modular architectures are a boon to innovation, they’ve contributed to the creation of hundreds of new APIs and microservices that IT operations must monitor and maintain. Complex interdependencies among these services make finding the root cause of outages or other issues exponentially more difficult. In addition, the release cycle has accelerated as agile development practices and iterative launches become mainstream.

Torrents of Data and Alerts Without an Easy, Reliable Way to Filter Them.

The manual and rules-based monitoring systems that most IT operations now have in place can’t cope with the demands of today’s complex and dynamic environments. Chasing down thousands of alerts—many of which turn out to be false positives—often leads to “alert fatigue,” which may result in more fire drills and actual emergencies down the road.

A Fragmented and Increasingly Borderless IT Operations Landscape.

With the introduction of DevOps, more operations activities are now being managed by feature teams, whose members lack the specialized operations experience needed to address the nonautomated portion of those activities. And because those activities are less centralized, they’re harder for IT functions to coordinate. Likewise, as companies have expanded their partner ecosystems, IT operations have had to extend their monitoring activities across company boundaries and look for new ways to measure—and sometimes charge for—IT service consumption.

These demands are growing at a time when IT operations budgets are under increasing pressure. The only way for IT leaders to deliver the stability and cost-effectiveness that their budgets demand is to make their operations more predictive, proactive, and automated.

In the next post, we will look at the root cause of these challenges, how emerging technologies can make life better for IT operations, and how OpenTelemetry, along with open source, will disrupt the AI operations (AIOps) space.