An introduction to Loki

Published by Florian Stoeber on

Many enterprise customers use ELK to store the logs from their Kubernetes clusters. Others use managed solutions offered by their cloud provider, such as Stackdriver or CloudWatch. While ELK is difficult to configure and operate, managed solutions are easy to use but lead to vendor lock-in. Both approaches become expensive if you want to store a large volume of logs or operate more than a few Kubernetes clusters. These are good reasons to have a look at Grafana Loki, which was released last year. It is licensed under an open-source license and is easy to set up.

This blog post is the third one in our observability series. It gives you an overview of which components are necessary to run Loki, what they do, and how to set up and configure the complete stack.

Modes

Loki can run in two different modes. The "microservice mode" splits Loki into several components that can be scaled independently, so it is worth considering for environments that need this scalability. The other mode is called "monolithic mode" and consists of a single Loki instance. It is suitable for small environments and proofs of concept, but it is hard to scale a Loki instance running in this mode. In monolithic mode the installation of Loki is simple; you can find an installation guide in the official documentation: https://grafana.com/docs/loki/latest/installation/. In this blog post I will explain all the components and show how to install Loki in the distributed (microservice) mode.
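For reference, a monolithic installation can be as small as a single Helm release. The following is only a sketch, assuming the Grafana chart repository and the loki-stack chart; check the linked documentation for the chart that matches your Loki version:

# Register the Grafana chart repository
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Single-binary Loki plus promtail as the log shipper
helm install loki grafana/loki-stack --set promtail.enabled=true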

Components

Loki Architecture

In the graphic above you can see that the read path and the write path are decoupled. If we flood the read path with a very big query, it does not influence the write path: ingestion keeps working, which is one of the big benefits of this setup, besides the better scalability.

Distributor

The distributor is the first component that handles logs on the write path. It receives logs from various sources, e.g. fluentd, fluent-bit or promtail, validates them for correctness and forwards them to one or more ingester components.
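Clients reach the distributor through Loki's HTTP push API. A minimal sketch of such a push request; the host name and labels are placeholders, and in our later setup the gateway exposes this endpoint:

curl -s -X POST "http://loki-gateway.example.local/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  --data-raw '{"streams": [{"stream": {"app": "demo", "namespace": "default"}, "values": [["'"$(date +%s%N)"'", "hello from the push API"]]}]}'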

Ingester

The ingester receives the logs from the distributors and writes the data from the incoming streams to long-term storage, such as S3 or GCS. Besides that, it keeps recent data in memory so that these logs can be queried quickly without accessing a database or a bucket. The ingester also verifies that log entries arrive in the correct time order; entries that arrive out of order are rejected.

Compactor

The compactor is used to deduplicate the index. Because all the ingesters in our cluster write many index files per day, it is worth deduplicating and consolidating them into a single file per period. In the end we can query our logs faster.

Querier

The querier executes the actual log query. It is connected to the ingesters (for short-term data) and the object storage bucket (for long-term data). After fetching the data from both sources, it deduplicates the result, because logs that are still held by the ingesters may already exist in long-term storage as well. It then returns the data to the requester (either Grafana or a query-frontend).
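Queries reach the querier through Loki's HTTP query API, normally via the gateway or the query-frontend. A minimal sketch, with host and labels as placeholders:

curl -G -s "http://loki-gateway.example.local/loki/api/v1/query_range" \
  --data-urlencode 'query={namespace="default"}' \
  --data-urlencode 'limit=10'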

Query-Frontend

The query-frontend was introduced in summer 2020 and is an optional component in the distributed setup. You can think of it as a kind of proxy service: it receives requests from Grafana, performs validation, splitting and caching, and then forwards the queries to the queriers.

Installation and configuration

The distributed setup was tricky to install for a long time. The developers provide a Tanka setup to install and configure Loki. This is a nice and very flexible way to install applications on a Kubernetes cluster, but most people use Helm charts for that. Our basic setup is built on Helm charts too, so we had to find a way to use Helm for this. Around the time we started evaluating Loki, the community built a Helm chart that implements the distributed setup. With this chart we were able to stick to our basic GitOps workflow and set up Loki in distributed mode.

In the following we will show how we set this up on a GKE cluster. On the Kubernetes cluster I have already installed Grafana, a log generator and the banzaicloud/logging-operator (learn how to use this operator), and we will use GCS to store our data. A sketch of how the logging-operator can be pointed at Loki follows below.
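To forward logs to Loki, the logging-operator needs an output and a flow. The following is only a sketch: the field names follow the logging-operator v1beta1 API at the time of writing, and the URL assumes the gateway service that our Helm release creates later in the default namespace; verify both against the versions you run.

apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: loki
  # must match the controlNamespace of your Logging resource
  namespace: logging
spec:
  loki:
    url: http://loki-loki-distributed-gateway.default.svc.cluster.local
    configure_kubernetes_labels: true
    buffer:
      timekey: 1m
      timekey_wait: 30s
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: all-logs
  namespace: logging
spec:
  globalOutputRefs:
    - loki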

Custom values

When you install a Helm chart you often have to provide custom values to adapt the application to your own needs. For Loki it is necessary to provide a dedicated configuration that contains the bucket information, the schema, various limits and other settings.

We are using the basic configuration, but have changed the storage to GCS:

compactor:
  enabled: true
ingester:
  replicas: 2
distributor:
  replicas: 2
querier:
  replicas: 2
queryFrontend:
  replicas: 2
loki:
  config: |
    auth_enabled: false
    server:
      log_level: info
      # Must be set to 3100
      http_listen_port: 3100

    distributor:
      ring:
        kvstore:
          store: memberlist

    ingester:
      # Disable chunk transfer which is not possible with statefulsets
      # and unnecessary for boltdb-shipper
      max_transfer_retries: 0
      chunk_idle_period: 1h
      chunk_target_size: 1536000
      max_chunk_age: 1h
      lifecycler:
        join_after: 0s
        ring:
          kvstore:
            store: memberlist

    memberlist:
      join_members:
        - {{ include "loki.fullname" . }}-memberlist

    limits_config:
      ingestion_rate_mb: 10
      ingestion_burst_size_mb: 20
      max_concurrent_tail_requests: 20
      max_cache_freshness_per_query: 10m

    schema_config:
      configs:
        - from: 2020-09-07
          store: boltdb-shipper
          object_store: gcs
          schema: v11
          index:
            prefix: loki_index_
            period: 24h

    storage_config:
      gcs:
        bucket_name: loki-demo-test
      boltdb_shipper:
        active_index_directory: /var/loki/index
        shared_store: gcs
        cache_location: /var/loki/cache

    query_range:
      # make queries more cache-able by aligning them with their step intervals
      align_queries_with_step: true
      max_retries: 5
      # parallelize queries in 15min intervals
      split_queries_by_interval: 15m
      cache_results: true

      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            validity: 24h

    frontend_worker:
      frontend_address: {{ include "loki.queryFrontendFullname" . }}:9095

    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
    compactor:
      shared_store: gcs
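
The configuration above expects a GCS bucket named loki-demo-test and credentials that Loki can use to access it. A minimal sketch of the bucket setup; project, region and service-account names are placeholders, and on GKE you would typically bind the service account via Workload Identity or mount a key file:

# Create the bucket (name must match storage_config.gcs.bucket_name)
gsutil mb -l europe-west1 gs://loki-demo-test

# Dedicated service account for Loki (placeholder names)
gcloud iam service-accounts create loki-demo --project my-project

# Allow the service account to read and write objects in the bucket
gsutil iam ch \
  serviceAccount:loki-demo@my-project.iam.gserviceaccount.com:roles/storage.objectAdmin \
  gs://loki-demo-test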

Now we can install the Helm chart with the custom values file:

helm install loki grafana/loki-distributed --values=loki.yml
NAME: loki
LAST DEPLOYED: Sun Feb 21 19:28:54 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
 Welcome to Grafana Loki
 Chart version: 0.25.0
 Loki version: 2.1.0
***********************************************************************

Installed components:
* gateway
* ingester
* distributor
* querier
* query-frontend
* compactor

Now we can list all the pods and see that Loki has been installed successfully:

kubectl get pods
NAME                                                    READY   STATUS    RESTARTS   AGE
grafana-76899cd4f-7tvqd                                 1/1     Running   0          2m2s
loki-loki-distributed-compactor-68cb5787ff-67spm        1/1     Running   0          2m17s
loki-loki-distributed-distributor-665bd5694-dgfdf       1/1     Running   0          2m17s
loki-loki-distributed-distributor-665bd5694-m9gct       1/1     Running   0          2m17s
loki-loki-distributed-gateway-7cd4b5446f-rfwrl          1/1     Running   0          2m17s
loki-loki-distributed-ingester-0                        1/1     Running   0          2m16s
loki-loki-distributed-ingester-1                        1/1     Running   0          66s
loki-loki-distributed-querier-0                         1/1     Running   0          2m16s
loki-loki-distributed-querier-1                         1/1     Running   0          96s
loki-loki-distributed-query-frontend-7b59fb7875-4vrv5   1/1     Running   0          2m17s
loki-loki-distributed-query-frontend-7b59fb7875-6nk26   1/1     Running   0          2m16s

The last thing we have to do is connect Grafana to the Loki instance. To do this, we open the Grafana web frontend and go to the “Data Sources” tab. There we can add the new configuration:

Configure Loki in Grafana
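
Instead of clicking through the UI, the data source can also be provisioned declaratively. A minimal sketch, assuming the gateway service created by our Helm release in the default namespace is used as the entry point:

# grafana/provisioning/datasources/loki.yml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-loki-distributed-gateway.default.svc.cluster.local
    isDefault: false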

After saving this configuration, we can open the “Explore” tab in Grafana and use the LogQL syntax to send queries to Loki. For example, let's fetch all the application logs in the default namespace:
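The corresponding query is a simple label selector; the exact label name depends on how your log shipper labels the streams (with the logging-operator and configure_kubernetes_labels, a namespace label is available):

{namespace="default"}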

Scrape all logs in default namespace
Expand one log line

If you want to learn the syntax of LogQL you might be interested in the short introduction in the official documentation: https://grafana.com/docs/loki/latest/logql/
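
To give a first impression of the syntax, here are two more examples (label names again depend on your setup): a line filter and a metric query that computes the per-second log rate over a five-minute window:

{namespace="default"} |= "error"
rate({namespace="default"}[5m])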

Conclusion

All in all, it is very easy to set up and configure a Loki instance using Kubernetes and Helm. It lets you avoid vendor lock-in, and you can use cheap and fast object storage that is available in multiple cloud environments. Maybe you would like to see how we use Loki at scale in a multi-cluster environment or dig deeper into the configuration of the logging-operator. Besides that, the code used in this tutorial is available on GitHub.


References

Loki-Logo: https://github.com/grafana/loki/blob/master/docs/sources/logo_and_name.png

https://github.com/grafana/loki

https://grafana.com/docs/loki/latest/architecture

https://grafana.com/docs/loki/latest/storage/

https://grafana.com/docs/loki/latest/logql/

https://banzaicloud.com/docs/one-eye/logging-operator/

https://github.com/banzaicloud/logging-operator

https://github.com/Liquid-Reply/blogpost-resources


Florian Stoeber

Florian works as a Kubernetes Engineer for Liquid Reply. After his apprenticeship and his studies he specialized in Kubernetes technologies and worked on several projects building monitoring and logging solutions.