Scaling Prometheus With Thanos – DZone

Observability is a crucial pillar of any application, and monitoring is an essential part of it. Having a well-suited, robust monitoring system is crucial. It can help you detect issues in your application and provide insights once it is deployed. It aids in performance, resource management, and observability. Most importantly, it can help you save costs by identifying issues in your infrastructure. One of the most common tools in monitoring is Prometheus.

It sets a de facto standard with its simple and powerful query language, PromQL, but it has limitations that make it unsuitable for long-term monitoring. Querying historical metrics in Prometheus is challenging because it is not designed for this purpose. Obtaining a global metrics view in Prometheus can be complex. While Prometheus can scale horizontally with ease on a small scale, it faces challenges when dealing with hundreds of clusters. In such scenarios, Prometheus requires significant disk space to store metrics, typically retaining data for around 15 days. For instance, generating 1TB of metrics per week can lead to increased costs when scaling horizontally, especially with the Horizontal Pod Autoscaler (HPA). Additionally, querying data beyond 15 days without downsampling further escalates these costs.

There are various projects that address these issues, such as Thanos, M3, Cortex, and VictoriaMetrics. Thanos is among the most popular of these. It is well suited to scaling Prometheus in environments with extensive metrics or multiple clusters where we require a global view of historical metrics. In this blog, we will explore the components of Thanos and try to simplify its architecture by building it step by step, starting with the main components. We will also have a demo using k6-metrics. Before diving into Thanos, I recommend reading “Monitoring with Prometheus” if you are not already familiar with Prometheus.

Thanos

Started in November 2017, Thanos is an open-source CNCF incubating project with over 12.8k stars on GitHub. Built on top of Prometheus, Thanos aims to provide a highly available Prometheus environment with long-term storage support and a global view of metrics. Companies like Disney, Adobe, eBay, SoundCloud, and ByteDance use Thanos for monitoring at scale. However, setting up Thanos can be complex and requires expertise with Prometheus and industry experience.

Now, let’s delve into the components of Thanos and understand its full architecture.

Thanos Components and Architecture

Thanos Query/Querier

Thanos Query is the query layer of Thanos. It uses the gRPC StoreAPI to retrieve data from the other components. It is completely stateless and horizontally scalable, and it can query multiple sources and merge their results into one, deduplicating metrics along the way. Below is an example of retrieving data from a Thanos Sidecar.


Thanos Query
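
To make the idea of merging several sources while avoiding duplicate metrics more concrete, here is a minimal sketch. It is not Thanos's actual deduplication algorithm. It assumes that duplicates differ only by a `replica` label, and the function names and data shapes are hypothetical, chosen purely for illustration.

```javascript
// Simplified illustration of merging series from several sources and
// dropping duplicates that differ only by a replica label.
// Not Thanos's real algorithm; names and data shapes are hypothetical.

const REPLICA_LABEL = 'replica'; // assumed label that distinguishes HA replicas

// Build a stable key from every label except the replica label.
function seriesKey(labels) {
  return Object.keys(labels)
    .filter((name) => name !== REPLICA_LABEL)
    .sort()
    .map((name) => `${name}=${labels[name]}`)
    .join(',');
}

// Merge series from all sources; for each timestamp keep the first sample seen.
function mergeAndDeduplicate(sources) {
  const merged = new Map();
  for (const source of sources) {
    for (const series of source) {
      const key = seriesKey(series.labels);
      if (!merged.has(key)) {
        const labels = { ...series.labels };
        delete labels[REPLICA_LABEL];
        merged.set(key, { labels, samples: new Map() });
      }
      const target = merged.get(key).samples;
      for (const [timestamp, value] of series.samples) {
        if (!target.has(timestamp)) target.set(timestamp, value);
      }
    }
  }
  // Return plain objects with samples ordered by timestamp.
  return [...merged.values()].map(({ labels, samples }) => ({
    labels,
    samples: [...samples.entries()].sort((a, b) => a[0] - b[0]),
  }));
}

// Two hypothetical sources reporting the same series from different replicas.
const sidecarA = [
  { labels: { __name__: 'up', job: 'api', replica: 'a' }, samples: [[1000, 1], [2000, 1]] },
];
const sidecarB = [
  { labels: { __name__: 'up', job: 'api', replica: 'b' }, samples: [[2000, 1], [3000, 0]] },
];

const result = mergeAndDeduplicate([sidecarA, sidecarB]);
console.log(JSON.stringify(result));
```

In this sketch the two replicas collapse into a single series whose samples cover the union of both timelines, which is the effect a global, deduplicated view is meant to give.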

Prometheus is unaware of the StoreAPI, so Thanos Query requests metrics from the Thanos Sidecar. This way, Thanos Query indirectly communicates with the Prometheus instance in a sidecar architecture. It is possible to deploy Thanos Query without a sidecar model, but first let’s explore the benefits and functionality of the sidecar model.

Thanos Sidecar

The Thanos Sidecar can do more than just retrieve metrics from Prometheus. It can also store these metrics in an Object Store. Thanos Query can then use the Store Gateway component to fetch that data directly from the Object Store, so data that has already been uploaded no longer needs to be requested from the Sidecar. This allows for reduced retention in Prometheus, resulting in lower disk space usage and cost savings. By default, the Sidecar sends TSDB block data from Prometheus to the Object Store every two hours, which reduces Prometheus’s resource consumption.

To avoid data loss during the two-hour window, Prometheus should remain stateful. However, to make Prometheus stateless, Thanos offers a component called Thanos Receiver. Using Thanos Receiver, we can eliminate the sidecar model. Before delving into the Receiver, let’s explore the functionality of the Thanos Store Gateway.

Thanos Store Gateway

Thanos Store Gateway implements the Store API, enabling Thanos Query to retrieve data from the remote Object Store. Acting as an API gateway between the Object Store and Thanos Query, the Thanos Store facilitates efficient data access. The Thanos Sidecar can push data directly to this Object Store. The Store Gateway keeps a small amount of data from the Object Store on its local disk and keeps it in sync with the Object Store. Check out the illustration below.

Thanos Store
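
One reason a gateway helps is that it does not need to read every block for every query. The sketch below illustrates that idea under stated assumptions: the block list is hypothetical, each block is described only by a `minTime`/`maxTime` range in milliseconds (loosely modelled on the metadata a TSDB block carries), and `blocksForRange` is an illustrative helper, not a Thanos API.

```javascript
// Hypothetical block metadata with time ranges in milliseconds.
// Each block is assumed to cover the half-open range [minTime, maxTime).
const HOUR = 60 * 60 * 1000;
const blocks = [
  { id: 'block-a', minTime: 0 * HOUR, maxTime: 2 * HOUR },
  { id: 'block-b', minTime: 2 * HOUR, maxTime: 4 * HOUR },
  { id: 'block-c', minTime: 4 * HOUR, maxTime: 6 * HOUR },
];

// A block is relevant only when its time range overlaps the query range.
function blocksForRange(allBlocks, start, end) {
  return allBlocks.filter((block) => block.minTime < end && block.maxTime > start);
}

// A query for hours 3 to 5 only needs the blocks that overlap that window.
const selected = blocksForRange(blocks, 3 * HOUR, 5 * HOUR).map((block) => block.id);
console.log(selected); // [ 'block-b', 'block-c' ]
```

Keeping this kind of metadata close at hand is what lets a gateway skip irrelevant blocks instead of downloading them from the Object Store.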

The use of an Object Store eliminates the need to store large amounts of data on disk, helping us save on costs. Whenever we require any data, we can query it using Thanos Query. Thanos Query offers a dashboard similar to that of Prometheus, where users can enter a PromQL query, and the Thanos Query Frontend component can be placed in front of it. Thanos Query then uses the gRPC Store API to retrieve the data via the Thanos Store.

Thanos Compactor

While we can store practically unlimited amounts of data in an Object Store, long-term storage can become costly. Downsampling our data helps mitigate this issue. When we downsample a block of data, we increase the time interval between data points, for example, from one-minute resolution to five-minute resolution. This makes PromQL queries over long time ranges faster. Note that downsampled blocks are stored alongside the raw ones, so storage costs only drop when raw data is given a shorter retention than the downsampled data.

Thanos Compactor
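
As a rough illustration of what downsampling from one-minute to five-minute resolution means, the sketch below groups raw samples into fixed windows and keeps a few aggregates per window. It is a simplification, not the Compactor's implementation; the `downsample` function and the choice of aggregates (count, sum, min, max) are assumptions made for the example.

```javascript
// Group raw [timestampMs, value] samples into fixed windows and keep
// count, sum, min and max for each window. Illustrative only.
function downsample(samples, windowMs) {
  const buckets = new Map();
  for (const [timestamp, value] of samples) {
    // Align every sample to the start of its window.
    const start = Math.floor(timestamp / windowMs) * windowMs;
    const bucket = buckets.get(start);
    if (bucket === undefined) {
      buckets.set(start, { start, count: 1, sum: value, min: value, max: value });
    } else {
      bucket.count += 1;
      bucket.sum += value;
      bucket.min = Math.min(bucket.min, value);
      bucket.max = Math.max(bucket.max, value);
    }
  }
  return [...buckets.values()].sort((a, b) => a.start - b.start);
}

const MINUTE = 60 * 1000;
// Ten one-minute samples with values 0..9.
const raw = Array.from({ length: 10 }, (_, i) => [i * MINUTE, i]);
const fiveMinute = downsample(raw, 5 * MINUTE);
console.log(fiveMinute);
```

Here ten raw points become two aggregated points. A query spanning months touches far fewer points at the coarser resolution, which is where the speed-up comes from, while the aggregates still allow averages (sum divided by count), minimums, and maximums to be answered.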

The Compactor is the only component in Thanos that should delete data from the Object Store; the other components only read from it or write to it. The Compactor consolidates multiple blocks of data into one, optimizing storage efficiency. It is best practice to run only one instance of the Compactor against an Object Store.

Thanos Ruler

Thanos Ruler evaluates Prometheus recording and alerting rules against a configured query endpoint and can be used for alerting. By default, the results evaluated by Thanos Ruler are written back to disk. Thanos Ruler can also be configured to store these results in a remote Object Store.

Thanos Ruler

Thanos Receiver

Using Thanos Receiver avoids some of the complexity associated with the Thanos Sidecar. When using the sidecar, permissions must be granted for sidecar components to push metrics to the object store, which involves opening a new port for communication with the store. Thanos Receiver eliminates this complexity.

With Thanos Receiver, Prometheus is configured to use its remote write feature to send metrics directly to the receiver. The Thanos Receiver then pushes these metrics to the object store. The diagram below illustrates this setup. Prometheus continuously writes metrics to the Thanos Receiver, which, by default, pushes these metrics to the object store every two hours. To query metrics in real time, the Thanos Receiver exposes a Store API for Thanos Query, which can be useful for developers who want to see live metrics after a deployment.

Thanos Receiver

Thanos Receiver needs to determine how to distribute incoming time-series data across different nodes. To handle this, Thanos Receiver employs a hashring mechanism. When Thanos Receiver is deployed on Kubernetes, it relies on the Thanos Receive Controller, which automates hashring management. This component keeps the hashring up to date when the Thanos Receiver is scaled using HPA or other autoscalers.
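
The distribution step can be sketched as follows. This is a minimal hash-and-modulo illustration, not the hashing scheme Thanos itself uses. The endpoint names, the tenant value, and the helper functions `fnv1a` and `pickEndpoint` are all hypothetical.

```javascript
// 32-bit FNV-1a hash over the UTF-16 code units of a string.
// Used here only as a simple, deterministic hash for illustration.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Choose a receiver for a series from its tenant and its sorted labels,
// so the same series always lands on the same endpoint.
function pickEndpoint(tenant, labels, endpoints) {
  const key = tenant + '|' + Object.keys(labels)
    .sort()
    .map((name) => `${name}=${labels[name]}`)
    .join(',');
  return endpoints[fnv1a(key) % endpoints.length];
}

// Hypothetical receiver endpoints that make up the hashring.
const endpoints = [
  'thanos-receive-0:10901',
  'thanos-receive-1:10901',
  'thanos-receive-2:10901',
];

// Count how many of 100 test series each endpoint would receive.
const counts = {};
for (let vu = 1; vu <= 100; vu++) {
  const labels = { __name__: `test_metric_${vu}`, service: 'bar' };
  const endpoint = pickEndpoint('default-tenant', labels, endpoints);
  counts[endpoint] = (counts[endpoint] || 0) + 1;
}
console.log(counts);
```

A plain modulo like this reassigns most series whenever the number of endpoints changes, which is one reason consistent-hashing schemes are preferred when receivers scale up and down, and why keeping the hashring definition current matters during autoscaling.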

Thanos Question Frontend

Thanos Query Frontend sits in front of Thanos Query. It exposes a dashboard that is similar to the Prometheus dashboard and also uses PromQL as its query language. Users send their queries to this component, which forwards them to Thanos Query; it can also split and cache queries to improve read performance.

Installation and Demo

In this demo, we will test Thanos and scale the Thanos Receiver using k6s-metrics.

  1. Installing Minio for object storage
  2. Installing Thanos and Prometheus
  3. Load test using k6s-metrics

Let’s start by creating a kind cluster.

kind create cluster --name my-cluster --config=

Installing Minio (Object Store)

Minio is a popular open-source object store, an alternative to AWS S3 that we are using here in our local setup. If you have S3 or similar storage, you can use it here.

  1. Run the script below to install Minio in the thanos-test namespace.
#!/bin/bash

set -e
kubectl create ns thanos-test
echo "Installing Minio using Helm charts..."
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install minio bitnami/minio --version 14.2.0 -n thanos-test
sleep 40
echo "Exposing Minio on 127.0.0.1:8080"
echo "Username for Minio: admin"
echo "Password for Minio: $(kubectl get secrets -n thanos-test minio  -o json | jq -r '.data."root-password"' | base64 -d)"
kubectl port-forward svc/minio 8080:9001 -n thanos-test &
echo
  2. Access the Minio dashboard at port 8080 and create a new bucket named “thanos”. Also, create an access key and secret. Once done, create a Secret as shown below and replace the access key and secret key fields.
apiVersion: v1
kind: Secret
metadata:
  name: minio-thanos
  namespace: thanos-test
stringData:
  objstore.yml: |
    type: S3
    config:
      bucket: "thanos"
      endpoint: "minio.thanos-test.svc.cluster.local:9000"
      insecure: true
      access_key: 
      secret_key: 

Installing Thanos and Prometheus

Please execute the following script to install Thanos and Prometheus.

#!/bin/bash

echo "Installing Thanos in $(kubectl config current-context)"
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install thanos bitnami/thanos --version 15.1.0 -n thanos-test
sleep 60
echo "thanos is installed"
kubectl get all -n thanos-test
echo "Exposing thanos on 127.0.0.1:8081"
kubectl port-forward svc/thanos-query-frontend -n thanos-test 8081:9090 &
echo "Exposing grafana on 127.0.0.1:8082"
kubectl port-forward svc/grafana -n thanos-test 8082:3000 &
echo "Password for grafana: $(kubectl get secrets -n thanos-test grafana-admin -o json | jq -r '.data."GF_SECURITY_ADMIN_PASSWORD"' | base64 -d)"
echo "Username for grafana: admin"
echo "For monitoring purposes, installing kube-prometheus-stack"
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --version 58.2.1 -n thanos-test -f kube-prometheus-stack-values.yaml
sleep 60
echo "Prometheus installed connect with grafana at port 8082"

Testing Using k6s-metrics

Use the script below to test Thanos. You can change the virtual users and other fields.

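import { check, sleep } from 'k6';
// Remote-write client from a k6 extension; it needs a k6 build that includes it.
import remote from 'k6/x/remotewrite';

export let options = {
  vus: 100, // number of virtual users
  duration: '800s', // total test duration
};

// Remote-write endpoint of the Thanos Receiver; it must be reachable at this address.
const client = new remote.Client({
  url: 'http://127.0.0.1:8085/api/v1/receive',
});

export default function () {
  // Each virtual user writes one random sample to its own metric.
  let res = client.store([
    {
      labels: [
        { name: '__name__', value: `test_metric_${__VU}` },
        { name: 'service', value: 'bar' },
      ],
      samples: [{ value: Math.random() * 100 }],
    },
  ]);
  // Verify that the receiver accepted the write.
  check(res, {
    'is status 200': (r) => r.status === 200,
  });
  sleep(1);
}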

You can use Grafana to visualize the Thanos Receiver’s resource consumption. More Grafana dashboards are available here.

Conclusion

Some of the benefits of using Thanos are:

  • Long-term metrics storage
  • Cost savings from using an Object Store
  • Efficient querying with a global view
  • Highly available (HA) Prometheus setup
  • Data deduplication

Integrating Thanos into your monitoring setup can enhance your application by providing access to historical data and overcoming the limitations of a standalone Prometheus setup. Additionally, Thanos can help reduce the costs associated with Prometheus. However, Thanos is not the ideal solution for everyone. Determine what is best for your infrastructure.
