Setting Sail with Kentik NMS: Unified Network Telemetry

(This post originally appeared on the Kentik Blog)

A view from the prow as we set sail on the rolling seas of SNMP

So you may have heard by now that Kentik has released a network monitoring system, commonly known as an “NMS” in the smoky rooms of observability aficionados. This is more than just a cute little add-on to our robust flow-based monitoring capabilities. This is a stem-to-stern product that could stand alone if we wanted it to.

But the question that arises in many people’s minds, like the first light peeking over a distant horizon, is: Why?

Why, when the discipline of network monitoring is solidly into its third decade, would we think its shores are yet uncharted?

More to the point, is Kentik trying to imply that – in this age of ubiquitous cloud, containers, microservices, and APIs – the regular old route-and-switch network even matters anymore?

The time is right for modern network monitoring

Given that we’re releasing Kentik NMS, the answer is obviously “yes.” But in this blog post, I need to get at why we’re doing it. Besides, you know, “customers keep asking for it” (although, admittedly, that is a pretty good reason).

First, traditional monitoring still matters. Hardware, in terms of both availability and performance, still matters. On-premises systems still matter. And “the network” (meaning anything from bare metal packet pushers in a closet all the way up to a Kubernetes cluster in the cloud) matters.

All of those things I’ve just named, along with a myriad of other infrastructure elements, are still critical components for organizations large and small. Being able to collect telemetry and visualize it effectively, turning data into information that drives action, is a core capability for any — and every — business.

Second, the people responsible for running and maintaining their networks keep telling us that the current solutions on the market have either failed to keep up, or that keeping up costs so much that adoption is unacceptably slow.

Before you interpret what I just said as an insult to existing vendors, let me be clear: I have a tremendous amount of respect, and even love, for the existing monitoring tools on the market. They do many things well, and in some cases, they were the first to do those things. They blazed trails, educated consumers, and established whole markets and sub-specialties within IT.

But pivoting an entire product line is almost impossibly complicated. An established tool has existing customers who cannot be abandoned, which means keeping the current solutions more or less the same. Adding new capabilities is predicated on the ability of the tool to accommodate those new functions without breaking existing ones.

For example, let’s look at collecting network metrics via API rather than a more traditional method like using SNMP. And please understand the irony of calling SNMP “traditional” versus APIs when network devices have included API options for the better part of a decade.

While it’s been possible to collect data from hardware via API calls for quite some time, precious few network monitoring tools support this capability or do it particularly well. To be sure, the solutions that focus on application monitoring do it better, but even there, it’s in the context of the application rather than hardware.

The reason isn’t that monitoring solution vendors are lazy or uninspired. It’s that the work of adding an API collector is hard: different vendors have implemented their API interfaces in just different enough ways to create additional hurdles, and normalizing the API data with the other telemetry presents challenges of its own.

This difficulty stems from the fact that hardware didn’t support APIs when the tools were conceived and written.
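To make that normalization hurdle concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the two vendor payload shapes, the field names, and the `InterfaceMetric` record are hypothetical, not any real device's API and not how Kentik NMS works internally. It only shows why each vendor ends up needing its own translation step before the data can sit alongside other telemetry.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceMetric:
    """Hypothetical common record that every collector output is mapped into."""
    device: str
    interface: str
    in_octets: int
    out_octets: int


def from_vendor_a(device: str, payload: dict) -> list[InterfaceMetric]:
    # Assumed shape for "vendor A": a list of interfaces, counters nested, values in bytes.
    return [
        InterfaceMetric(
            device,
            item["name"],
            int(item["counters"]["rx_bytes"]),
            int(item["counters"]["tx_bytes"]),
        )
        for item in payload["interfaces"]
    ]


def from_vendor_b(device: str, payload: dict) -> list[InterfaceMetric]:
    # Assumed shape for "vendor B": a dict keyed by interface name,
    # counters reported as strings in kilobytes (1 KB = 1024 bytes here).
    return [
        InterfaceMetric(
            device,
            name,
            int(stats["inKB"]) * 1024,
            int(stats["outKB"]) * 1024,
        )
        for name, stats in payload["ifStats"].items()
    ]


# One translation function per vendor; adding a vendor means adding an entry.
NORMALIZERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalize(vendor: str, device: str, payload: dict) -> list[InterfaceMetric]:
    """Dispatch a raw API payload to the matching vendor-specific translator."""
    return NORMALIZERS[vendor](device, payload)
```

Even in this toy version, the two translators disagree on structure, naming, data types, and units. Multiply that by every vendor and every metric, and the scale of the work becomes clearer.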

And if all of that is true for a 10-year-old technology like RESTful APIs, how much more so is it true for OpenTelemetry and its network-device cousin, streaming telemetry?

Keep your network on an even keel

Kentik Network Monitoring System dashboard

Kentik NMS in action

All of this is my way of saying that Kentik realized the world needed a new NMS because:
A) That data still matters, and
B) Creating an NMS from the ground up was actually easier than bolting additional capabilities onto an existing tool.

This brings us to the point we find ourselves at today: Kentik NMS has launched and is setting sail in familiar waters. Monitoring with SNMP and streaming telemetry is only the first leg of the journey. In short order, we’ll unfurl additional options, increasing NMS’s velocity and maneuverability.

So, now that the ship has set sail, I hope you come aboard and have a look around.