Analyze Your Logs with Ease with CloudWatch Insights


TLDR; Infographic

AWS CloudWatch Logs Insights Infographic

Introduction

Amazon CloudWatch is used in almost every AWS application. Still, many developers and administrators are not familiar with how to analyze their logs efficiently, which is one reason so many companies rely on third-party monitoring solutions. While these solutions can make sense, they are often expensive.

CloudWatch Logs Insights helps you analyze your logs with a SQL-like query language. Its strengths show especially in a distributed cloud architecture with several services and functions, because it lets you query multiple log groups at once.

CloudWatch Insights querying multiple log groups

This post dives deep into CloudWatch Logs Insights, shows you how to use it, and gives you several real-world examples. Let's dive straight in 🏊🏽‍♂️

CloudWatch Logs

CloudWatch Logs is the core product of the CloudWatch suite and the centralized logging place of AWS. Native services like Lambda, ECS, or API Gateway log directly to CloudWatch. This makes CloudWatch a really powerful monitoring and analysis tool.

Lambda and Fargate log Application logs to CloudWatch

To analyze logs efficiently, you need one common place where all your logs are stored. This is already a huge benefit of using CloudWatch Logs. But how can you actually analyze the logs?

CloudWatch Logs Insights

This is where Logs Insights comes into play. Logs Insights lets you query your logs with its own query language.

The query language is somewhat similar to SQL (Structured Query Language). With SQL you query your database in a declarative manner. That means you only describe what you want from the database instead of telling the program how to retrieve it.

CloudWatch Logs Insights works the same way. If you open it for the first time, you see the following query:

fields @timestamp, @message, @logStream, @log
| sort @timestamp desc
| limit 20

CloudWatch Insights Query Definition

Let's look at the query in a bit more detail. The query consists of the following commands, chained with the pipe character (|):

  • fields: These are the fields you want to display

  • sort: This is the field and direction you want to sort your data by. In our example, we sort descending by the timestamp

  • limit: We limit the results to 20 to not be overwhelmed with data.

But there are many more commands to choose from!

  • stats: Calculate aggregations such as counts, averages, medians, etc.

  • display: Show only certain fields, for example, after parsing certain data from a message

  • parse: Parse lets you extract data from log messages

  • unmask: With unmask, you can show data that was masked due to data privacy

With these 7 commands, you can build very powerful queries.
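To make the pipe structure concrete, here is a minimal Node.js sketch that assembles a query from individual commands. The helper buildQuery and the message format "user=* action=*" are assumptions made for illustration; they are not part of CloudWatch or the AWS SDK.

```javascript
// Hypothetical helper: joins Logs Insights commands into one query string.
// Each command becomes one stage of the pipeline, separated by "|".
function buildQuery(commands) {
  return commands.join("\n| ");
}

const query = buildQuery([
  "fields @timestamp, @message",
  // parse extracts values from an assumed message format "user=<id> action=<name>"
  'parse @message "user=* action=*" as user, action',
  // stats aggregates: count the events per extracted user
  "stats count(*) as actions by user",
  "sort actions desc",
  "limit 10",
]);

// Prints the query, one command per line
console.log(query);
```

The printed string can be pasted into the Logs Insights query editor, provided your log messages actually follow the assumed format.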

Discovered Fields

As mentioned above, the fields command lets you select fields from your logs. But which fields are actually available in your logs?

To help you find out, Logs Insights shows the Discovered Fields tab in the right pane.

CloudWatch Insights Discovered Fields

To see it, open CloudWatch Logs Insights and select at least one log group. If you now click on Discovered Fields, you will see the default fields, which are:

  • @log

  • @logStream

  • @message

  • @timestamp

These fields are available in every log message. The percentage next to each field shows in how many log messages the field was found. In this case, it is 100% because these fields are always available.

You can even click on the field to see in which log groups the field was found, in case you have selected more than one log group.

CloudWatch Insights Field Details
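The idea behind the percentage can be sketched in a few lines of Node.js. The events below are made-up sample data, and fieldCoverage is a hypothetical helper; this only mirrors the idea described above, not how CloudWatch computes the value internally.

```javascript
// Made-up log events, already parsed into objects.
const sampleEvents = [
  { "@timestamp": 1, "@message": "a", userId: "u1" },
  { "@timestamp": 2, "@message": "b" },
  { "@timestamp": 3, "@message": "c", userId: "u2", orderId: "o1" },
  { "@timestamp": 4, "@message": "d" },
];

// For each field, compute the share of events (in percent) that contain it.
function fieldCoverage(events) {
  const counts = {};
  for (const event of events) {
    for (const field of Object.keys(event)) {
      counts[field] = (counts[field] || 0) + 1;
    }
  }
  const coverage = {};
  for (const [field, count] of Object.entries(counts)) {
    coverage[field] = Math.round((count / events.length) * 100);
  }
  return coverage;
}

console.log(fieldCoverage(sampleEvents));
```

In this sample, the default fields reach 100%, while userId appears in half of the events and orderId in a quarter of them.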

Find more fields

If your logs contain more fields, run a sample query once. CloudWatch Logs Insights will then scan the logs for all the fields it can find and add them to the right pane.

CloudWatch Insights discovered fields after run

In the quick video, you can see how many more fields were found after scanning the logs. Make sure to scan a longer time range of your logs so that enough log events are covered.

Sample Queries from AWS

Creating queries can be quite difficult at the beginning. This is why AWS provides you with a set of sample queries. If you head to the right side you see the tab Sample Queries:

Sample queries in CloudWatch Insights

These sample queries can help you a lot to get started. Let's look at a few of them in more detail:

Lambda - Find the most expensive requests

filter @type = "REPORT"
| fields @requestId, @billedDuration
| sort by @billedDuration desc

This query shows you the most expensive requests in Lambda. Lambda has its own field @billedDuration. The query selects the fields @requestId, @billedDuration and sorts them by @billedDuration descending.

The output looks like this:

Output of Lambda most expensive queries

Lambda - Determine the amount of overprovisioned memory

A second interesting query is to find overprovisioned memory in Lambda.

filter @type = "REPORT"
| stats max(@memorySize / 1000 / 1000) as provisionedMemoryMB,
  min(@maxMemoryUsed / 1000 / 1000) as smallestMemoryRequestMB,
  avg(@maxMemoryUsed / 1000 / 1000) as avgMemoryUsedMB,
  max(@maxMemoryUsed / 1000 / 1000) as maxMemoryUsedMB,
  provisionedMemoryMB - maxMemoryUsedMB as overProvisionedMB

This query uses the statistical functions max, min, and avg. It compares how much memory was provisioned with how much was actually used. The difference (overProvisionedMB) indicates how far you could lower the memory setting.

CloudWatch Insights that shows the query for Lambda Overprovisioned Memory

This Lambda function, for example, has 957 MB more memory than it needs.
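The arithmetic of this query can be reproduced in a small Node.js sketch. The division by 1000 / 1000 in the query suggests the raw values are in bytes. The report values below are hypothetical and were chosen only so that the result matches the 957 MB from the example; they are not real measurements.

```javascript
// Hypothetical REPORT values in bytes (assumed unit, see the lead-in).
const memoryReports = [
  { memorySize: 1024000000, maxMemoryUsed: 61000000 },
  { memorySize: 1024000000, maxMemoryUsed: 67000000 },
  { memorySize: 1024000000, maxMemoryUsed: 64000000 },
];

// Same conversion as in the query: value / 1000 / 1000
const toMB = (bytes) => bytes / 1000 / 1000;

// Mirrors the stats command: max, min, and avg over all REPORT lines.
function memoryStats(reports) {
  const usedMB = reports.map((r) => toMB(r.maxMemoryUsed));
  const provisionedMemoryMB = Math.max(...reports.map((r) => toMB(r.memorySize)));
  const maxMemoryUsedMB = Math.max(...usedMB);
  return {
    provisionedMemoryMB,
    smallestMemoryRequestMB: Math.min(...usedMB),
    avgMemoryUsedMB: usedMB.reduce((sum, mb) => sum + mb, 0) / usedMB.length,
    maxMemoryUsedMB,
    overProvisionedMB: provisionedMemoryMB - maxMemoryUsedMB,
  };
}

console.log(memoryStats(memoryReports));
```

Note that overProvisionedMB is based on the highest memory usage seen in the selected time range, so a longer range gives a safer number.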

Lambda - View Duration Statistics

The last Lambda sample query is to display latencies in your Lambda function:

filter @type = "REPORT"
| stats avg(@duration), max(@duration), min(@duration) by bin(5m)

The output looks like this:

CloudWatch Insights Latency

This shows you the average, maximum, and minimum duration of your Lambda function in 5-minute bins.
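If the bin(5m) grouping is unclear, the following Node.js sketch imitates it on in-memory data: every timestamp is rounded down to the start of its 5-minute window, and the statistics are calculated per window. The function statsByBin and the sample events are assumptions for illustration, not CloudWatch code.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Hypothetical REPORT events: timestamp in ms since epoch, duration in ms.
const durationEvents = [
  { timestamp: 0, duration: 100 },
  { timestamp: 120000, duration: 300 }, // minute 2, same bin as the first event
  { timestamp: 300000, duration: 50 }, // minute 5, starts the next bin
];

// Groups events into fixed-size time bins and computes avg, max, and min per bin.
function statsByBin(events, binMs) {
  const bins = new Map();
  for (const { timestamp, duration } of events) {
    // Round the timestamp down to the start of its bin.
    const binStart = Math.floor(timestamp / binMs) * binMs;
    const bin = bins.get(binStart) || { count: 0, sum: 0, max: -Infinity, min: Infinity };
    bin.count += 1;
    bin.sum += duration;
    bin.max = Math.max(bin.max, duration);
    bin.min = Math.min(bin.min, duration);
    bins.set(binStart, bin);
  }
  return [...bins.entries()]
    .sort(([a], [b]) => a - b)
    .map(([binStart, b]) => ({ binStart, avg: b.sum / b.count, max: b.max, min: b.min }));
}

console.log(statsByBin(durationEvents, FIVE_MINUTES_MS));
```

The first two events fall into the same bin, so the output contains two rows, one per 5-minute window.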

More Queries

There are many more sample queries out there, for example:

  • Find errors in your functions

  • 25 recent log events

  • Trim whitespaces

  • AppSync - Requests by status code

You can find even more query ideas on Serverless Land.

Multiple Log Groups

CloudWatch Logs Insights is especially useful if you have multiple log groups. A single user session normally logs to several log groups. For example, a user journey could look like this:

  1. User signs up: API Gateway, Cognito, Lambda

  2. User creates a first item in your application: Lambda, DynamoDB

  3. User receives welcome mail: SES, Lambda, DynamoDB

CloudWatch Insights queries multiple log groups

Just to understand this whole flow, you need to query at least 3-5 log groups. With Logs Insights, this is possible in a single query.
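If you want to run such a query from code instead of the console, a request could be prepared roughly like this. This is a sketch under assumptions: the parameter names follow the CloudWatch Logs StartQuery API as we understand it, so please verify them against the current SDK documentation. The log group names and the userId field are hypothetical, and the sketch only builds the request object without calling AWS.

```javascript
// Builds request parameters for one Logs Insights query across several log groups.
// In a real application, userId must be validated before it is put into the query.
function buildStartQueryParams(logGroupNames, userId, startDate, endDate) {
  return {
    logGroupNames,
    // Start and end of the time range as epoch seconds
    startTime: Math.floor(startDate.getTime() / 1000),
    endTime: Math.floor(endDate.getTime() / 1000),
    queryString: [
      "fields @timestamp, @log, @message",
      `filter userId = "${userId}"`, // assumes your logs contain a userId field
      "sort @timestamp asc",
      "limit 100",
    ].join("\n| "),
  };
}

const params = buildStartQueryParams(
  ["/aws/lambda/signup", "/aws/lambda/create-item", "/aws/lambda/send-welcome-mail"], // hypothetical names
  "u-123",
  new Date(1700000000000),
  new Date(1700003600000)
);

console.log(params);
```

Selecting @log in the query is useful here because it tells you which log group each result line came from.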

Excursus: Structured Logs

One important thing to mention is how you actually log in your applications. CloudWatch Logs Insights works best if you use a structured logging approach, which means logging in a structured format such as JSON. Custom fields in unstructured messages are not discovered automatically, so you have to extract them yourself with parse.

CloudWatch Logs Insights can automatically parse this JSON and discover the fields.

Here is an example of logging in a non-structured way in Node.js:

console.log(`UserID ${userId} has faced error ${error}`)

Here is an example of a structured log:

logger.info("User faced error", {userId, error})

We highly recommend using some sort of structured logger (e.g. Lambda Powertools) that takes care of that automatically.
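To show what such a logger does under the hood, here is a minimal sketch that writes one JSON object per log line. It is not the Lambda Powertools implementation; formatLog and the field names level and message are assumptions for illustration.

```javascript
// Serializes a log entry as a single JSON line.
// Pass errors as strings (e.g. error.message); Error objects serialize to {}.
function formatLog(level, message, context = {}) {
  return JSON.stringify({ level, message, ...context });
}

// Tiny logger sketch with two levels.
const logger = {
  info: (message, context) => console.log(formatLog("INFO", message, context)),
  error: (message, context) => console.error(formatLog("ERROR", message, context)),
};

logger.info("User faced error", { userId: "u-123", error: "ItemNotFound" });
// prints {"level":"INFO","message":"User faced error","userId":"u-123","error":"ItemNotFound"}
```

Because every line is valid JSON, fields such as userId and error can show up under Discovered Fields, and you can filter on them directly, for example with filter userId = "u-123".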

Thanks!

That's it. Thank you so much for reading the article.

Do you have any queries you always use and would like to add here? Add them to the comments 👇🏽
