Using node local caching on your Kubernetes nodes to reduce CoreDNS traffic
Kubernetes 1.18 was recently released, and with it came a slew of super useful features! One feature that hit GA is node local caching. This allows each node in your cluster to cache DNS queries,...
Improving my Linux diff experience with icdiff
I recently came across icdiff. This little gem allows you to see the difference between two files, but what makes it special is its ability to highlight the differences (sdiff, which was my go-to diff...
Collecting Nginx metrics with the Prometheus nginx_exporter
Over the past year I’ve rolled out numerous Prometheus exporters to provide visibility into the infrastructure I manage. Exporters are server processes that interface with an application (HAProxy,...
Decoding JSON Web Tokens (JWTs) from the Linux command line
Over the past few months I’ve been spending some of my spare time trying to understand OAuth2 and OIDC. At the core of OAuth2 is the concept of a bearer token. The most common form of bearer token is...
Using dockle to check docker containers for known issues
As an SRE, I’m always on the lookout for tooling that can help me do my job better. The Kubernetes ecosystem is filled with amazing tools, especially ones that can validate that your clusters and...
Using the sslsplit MITM proxy to capture Docker registry communications
This past weekend I got to debug a super fun issue! One of my Kubernetes clusters was seeing a slew of ErrImagePull errors. When I logged into one of the Kubernetes workers, the dockerd debug logs...
How the docker pull command works under the covers (with HTTP headers to...
I talked previously about needing to decode docker HTTP headers to debug a registry issue. That debugging session was super fun, but I had a few questions about how that interaction actually works. So...
Controlling the inventory order when running an Ansible playbook
This week I was updating some Ansible application and OS update playbooks. By default, when you run ansible-playbook it will apply your desired configuration to hosts in the order they are listed in...
Debugging Kubernetes network issues with nsenter, dig and tcpdump
As a Kubernetes administrator I frequently find myself needing to debug application and system issues. Most of the issues I encounter can be solved with Grafana dashboards and Prometheus metrics, or by...
Using the Ansible uri module to test web services during playbook execution
Ansible has amazing support for testing services during playbook execution. This is super useful for validating your services are working after a set of changes take place, and when combined with...
Using Kubernetes affinity rules to control where your pods are scheduled
Kubernetes has truly revolutionized distributed computing. While it solves a number of super hard problems, it also adds a number of new challenges. One of these challenges is ensuring your Kubernetes...
Upgrading an RPM to a specific version with yum
This past week I got to spend some time upgrading my CI/CD systems. The GitLab upgrade process requires stepping to a specific version when you upgrade major versions, which can be a problem if the...
Using the Kubernetes K14S kapp utility to view deployment manifest changes...
If you’ve worked with Kubernetes for any length of time, you are probably intimately familiar with deployment manifests. If this concept is new to you, deployment manifests are used to add resources to...
Ways to debug Kubernetes pods without shells
Debugging production issues can sometimes be a challenge in Kubernetes environments. One specific challenge is debugging containers that don’t contain a shell. You may have seen the following when...
Using the Kubernetes can-i subcommand to debug authentication issues
When I was first getting started with Kubernetes, RBAC was one of the topics that took me the longest to grok. Not because the resources (Roles, ClusterRoles, etc.) are hard to interpret, but learning...
Using tfswitch to manage Terraform versions
The growth of the Terraform community is absolutely astounding. New providers are constantly popping up, providers are being upgraded at a feverish pace, and amazing new features are constantly being...
Understanding cloud spend in your Terraform workflows
Having worked in the “cloud” for several years, one thing that I’m super conscious about is our cloud bill. There are tons of subtleties associated with billing, such as AZ-to-AZ traffic costs or how...
Using terrascan to detect compliance and security violations
Over the past several years I’ve read numerous horror stories about cloud deployments gone wrong. S3 buckets with PCI data left open to the raw Internet, EC2 instance profiles that weren’t scoped...