Kubernetes Monitoring Needs Maps
Microservices applications are an intricate web of service interactions. All transactions are fulfilled through sequences of API and DB calls that span multiple services. It is critical to understand the service dependencies in microservices applications such as those running on Kubernetes. This is where application maps, which capture all the services and their dependencies in real time, come into play.
In a previous blog post, we defined application maps and compared various techniques for generating them. The Netsil Application Operations Center (AOC) generates auto-discovered application maps by analyzing service interactions without requiring any code change. Users can visualize and understand their applications from multiple perspectives by using the AOC-generated maps. For example, the maps below show a Kubernetes cluster at the host, namespace and pod levels.
This blog is intended as a walk-through tutorial that can help you to create maps for your Kubernetes clusters. We also highlight specific use cases for leveraging the dependency chains in maps to help with incident response and production deployments.
We will be using the sock-shop app running on a Kubernetes cluster as our target application for mapping and monitoring. The AOC is installed on a VM and the collectors are installed as a DaemonSet on each of the Kubernetes worker nodes (see figure below).
Discovering Your Application Using Default Map
Once the AOC and collectors are installed, log in to the webapp and switch to the Map Sandbox. This will load the Default Map.
The Default Map uses an internal algorithm (AutoGroup) to identify services based on the protocol and on protocol attributes such as HTTP URIs, DB queries, etc. For example, in the picture below you see the auto-discovered HTTP, DNS and MySQL services.
The zoom and pan features of the map help you move around and visualize the discovered services. You can also search for specific services. The picture below shows a search for MySQL services in the application. In addition to discovering services, the Default Map also captures dependencies and key performance metrics (latency and throughput) for services.
Summary Action Items:
* Log in to the Netsil webapp
* Select Map Sandbox from left navigation
* Use Default Map to discover and understand services in your application
* Use search to locate HTTP, DNS, MySQL, etc. services on the map
Creating Your First Map
The Default Map is a great way to get started quickly and gain visibility into all the services that make up your application. But if you are responsible for a specific subset of services, you can create a map containing just that set of services.
Let’s say we are responsible for the sock-shop application. We can create a map for sock-shop consisting of all the pods and their dependencies. We will use the Filters and GroupBy features to customize the map. Netsil automatically collects Kubernetes metadata such as pod names, namespaces, service names, etc. So all we need to do is select the right grouping and apply the right filters.
- Start with the Map Sandbox in the left navigation. This will load the default map.
- Apply a filter to restrict pods to the sock-shop namespace. Use the tags.kube_namespace attribute and set it to the sock-shop namespace.
- Since we want the sock-shop map to be at the pod level, change the grouping criteria from AutoGroup to pod_name.
- Name the map and save it. That’s it, we are done!
- The figure below displays the sock-shop map at the pod-level.
Summary Action Items:
* Load a Default Map from Map Sandbox
* Change the GroupBy from AutoGroup to pod_name
* Apply filter by using tags.kube_namespace attribute
* Name and save the map
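To make the filter and GroupBy steps more concrete, here is a minimal Python sketch of the idea behind them. It is not Netsil's implementation or API: the flow record layout, the `build_map` helper and the sample pod names are all assumptions made for illustration. Only the `kube_namespace` and `pod_name` attribute names come from the steps above.

```python
from collections import defaultdict

# Hypothetical flow records (assumed layout): one per observed service interaction.
flows = [
    {"client": {"pod_name": "front-end-1", "kube_namespace": "sock-shop"},
     "server": {"pod_name": "orders-1", "kube_namespace": "sock-shop"},
     "latency_ms": 20.0},
    {"client": {"pod_name": "front-end-1", "kube_namespace": "sock-shop"},
     "server": {"pod_name": "orders-1", "kube_namespace": "sock-shop"},
     "latency_ms": 30.0},
    {"client": {"pod_name": "orders-1", "kube_namespace": "sock-shop"},
     "server": {"pod_name": "shipping-1", "kube_namespace": "sock-shop"},
     "latency_ms": 10.0},
    {"client": {"pod_name": "coredns-1", "kube_namespace": "kube-system"},
     "server": {"pod_name": "metrics-1", "kube_namespace": "kube-system"},
     "latency_ms": 5.0},
]


def build_map(flows, namespace, group_by="pod_name"):
    """Filter flows to one namespace, then group them into map edges."""
    totals = defaultdict(lambda: {"requests": 0, "latency_ms": 0.0})
    for flow in flows:
        client, server = flow["client"], flow["server"]
        # Filter: keep an interaction if either endpoint is in the namespace,
        # so that dependencies of the selected pods stay on the map.
        if namespace not in (client["kube_namespace"], server["kube_namespace"]):
            continue
        # GroupBy: the chosen attribute decides what a node on the map is.
        edge = (client[group_by], server[group_by])
        totals[edge]["requests"] += 1
        totals[edge]["latency_ms"] += flow["latency_ms"]
    return {
        edge: {"requests": t["requests"],
               "avg_latency_ms": t["latency_ms"] / t["requests"]}
        for edge, t in totals.items()
    }


sock_shop_map = build_map(flows, "sock-shop")
for (src, dst), stats in sorted(sock_shop_map.items()):
    print(f"{src} -> {dst}: {stats}")
```

Changing `group_by` to `"kube_namespace"` in this sketch would collapse the same flows into a namespace-level view, which mirrors how a different grouping gives a different perspective on the same application.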
Understanding Impact of Deployment
The rate of production deployments has increased significantly as a result of DevOps and microservices. Unfortunately, deployment and code changes are also among the top causes for production issues. By using application maps you can evaluate the impact of deployments in the complete application context and prevent costly incidents.
Let’s use a concrete example of updating the shipping pod in our sock-shop application. In the figure below, (a) shows the sock-shop map before and (b) shows the same map after deploying the new shipping pod image. Even though the throughput, shown in requests per second (rps), remains roughly the same, there is an almost 2x jump in latency across all the pods in the dependency chain front-end -> orders -> shipping. This is a good reason to take a second look at the changes made to the shipping pod before it hits production!
Summary Action Items:
* Use dependency chains in Netsil maps to evaluate the impact of deployments on other dependent services
* Ensure there are no performance drifts before the deployment hits production
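The before/after comparison can also be thought of as a simple check over the edges of the two maps. The sketch below is illustrative only: the `latency_drift` helper, the 1.5x threshold and the latency values are assumptions, not numbers taken from the figure or from Netsil.

```python
def latency_drift(before, after, threshold=1.5):
    """Return edges whose latency grew by at least `threshold` times."""
    drifted = {}
    for edge, latency_before in before.items():
        latency_after = after.get(edge)
        # Skip edges that disappeared or have no usable baseline.
        if latency_after is None or latency_before <= 0:
            continue
        ratio = latency_after / latency_before
        if ratio >= threshold:
            drifted[edge] = ratio
    return drifted


# Made-up average latencies in milliseconds, keyed by (caller, callee).
before = {
    ("front-end", "orders"): 40.0,
    ("orders", "shipping"): 20.0,
    ("front-end", "catalogue"): 15.0,
}
after = {
    ("front-end", "orders"): 80.0,
    ("orders", "shipping"): 41.0,
    ("front-end", "catalogue"): 15.5,
}

drifted = latency_drift(before, after)
for (src, dst), ratio in sorted(drifted.items()):
    print(f"{src} -> {dst}: latency grew {ratio:.2f}x")
```

With these sample values, only the edges on the front-end -> orders -> shipping chain are flagged, while the unrelated catalogue edge stays below the threshold.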
Accelerating Root Cause Analysis in Dependency Chains
Another natural use case for maps is to expedite root cause analysis. Let’s say we are monitoring the latency of the front-end pod, since that is the service exposed to end users. We have set an alert on latency spikes in front-end, and the pager goes off. In the figure below, we can compare the before and after maps.
A quick scan of the dependencies reveals a spike in latency on the dependency chain leading up to the catalogue items database service. If the metrics on the other dependencies look normal, then the catalogue database service is a good candidate to diagnose further. The maps promptly reveal a very promising candidate for the root cause. In the absence of maps, such analysis would involve correlating metrics by hand or chasing tcpdump captures across multiple machines. Netsil maps greatly accelerate root cause analysis, saving time and money and, best of all, delivering extra sleep for your on-call teams!
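The "quick scan" of a dependency chain can be sketched as a walk over the map that follows only the edges showing a latency spike and reports where each spiking chain ends. This is a hedged illustration rather than a description of how Netsil works: the `find_suspects` helper, the service names and the set of spiking edges are all assumed.

```python
def find_suspects(dependencies, spiked, start):
    """Follow spiking edges from `start`; return services where the chains end."""
    suspects = []
    visited = set()
    stack = [start]
    while stack:
        service = stack.pop()
        if service in visited:
            continue
        visited.add(service)
        # Only descend into dependencies whose edge shows a latency spike.
        spiking = [dep for dep in dependencies.get(service, [])
                   if (service, dep) in spiked]
        if not spiking and service != start:
            # The spike stops here, so this service is worth diagnosing first.
            suspects.append(service)
        stack.extend(spiking)
    return suspects


# Hypothetical dependency map: service -> services it calls.
dependencies = {
    "front-end": ["catalogue", "orders", "user"],
    "catalogue": ["catalogue-db"],
    "orders": ["shipping"],
}
# Edges assumed to show a latency spike in the "after" map.
spiked = {("front-end", "catalogue"), ("catalogue", "catalogue-db")}

suspects = find_suspects(dependencies, spiked, "front-end")
print(suspects)
```

The result is a list of candidates to inspect, not a proven root cause; in this sample data the walk ends at the hypothetical `catalogue-db` service because the other dependencies of front-end look normal.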
Microservices applications rely heavily on service interactions (API calls, DB queries, DNS lookups, etc.) to fulfill transactions. Additionally, because teams are free to develop in parallel, microservices applications change very frequently. These characteristics greatly complicate root cause analysis during incidents and make it difficult to evaluate the impact of deployments.
Application maps generated by the Netsil AOC help you understand microservices dependencies, gain visibility and reliably run microservices applications in production. Best of all, generating maps doesn’t require any code or container changes!
You can easily get started with Netsil in your Kubernetes cluster. Check it out.