Deep Dive into Microsoft Sentinel Summary Rules

Neurology, technology, and Sentinel Summary rules. Explaining the intuition and process of building summary rules from scratch.

We’re going to dive deep into Microsoft Sentinel summary rules.

But first, let’s try to find out why they’re so important.

To do that, we’ll take a small detour into the human brain.

Neurology inspires technology

…and not just by making infrastructure so complex that it takes centuries to understand.

The best example of this is artificial neural networks, originally modeled after real neural networks in the brain.

And yes, those are the same artificial neural networks that are the backbone of today’s AI wave.

So it’s only natural we look to the brain for more inspiration. Let’s start by looking at memory.

Here’s an interesting idea about memory:

Our brain doesn’t store memory as a continuous stream of film.

Instead, scientists found evidence that supports the “event segmentation” theory. 

The theory says that our brain stores memory as distinct moments grouped into events. They even found “boundary cells” that activate when our memory switches from one event to the next.

Illustration: distinct moments are grouped into event folders, like a photo album, and recalling one key image brings back the rest of its folder.

Event Segmentation Theory of Memory

They theorize that memory works like photos on a computer. Photos taken at the same time and place are grouped together, and a key photo represents the group.

When we want to expand on an event, we pick the “key photo” from the group. Then we find photos similar to that key photo.

Ok, that’s kinda cool.. but so what? How do we translate that to SIEM design?

Well.. what if we implemented event segmentation for a SIEM?

Hear me out.

What if we had “key records” that each represented a bundle of events, and when we wanted to investigate further, we pivoted from the key record to the underlying logs?

That way we:

  • Keep a cleaner SIEM

  • Triage incidents much quicker

  • Make it easier to surface anomalies

  • Save on storage costs by tiering key events

Here’s how: summary rules.

What are Microsoft Sentinel Summary Rules?

Summary rules aggregate a group of logs.

Specifically, we’ll use Microsoft Sentinel summary rules to give concrete examples of building summaries.

Here’s how they work:

  • Choose a source table

  • Craft a KQL query

  • Choose the destination table

  • Set the frequency that the rule runs

Creating a Summary Rule in Microsoft Sentinel
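
As a rough sketch of the query step above, a summary rule’s KQL can be as small as one aggregation. The source table (SigninLogs) and the output column names here are assumptions picked for illustration, not something the rule requires:

```kql
// Illustrative only: SigninLogs and the output column names are assumptions.
// The rule runs this query on its schedule and writes the result rows
// to the destination table.
SigninLogs
| summarize
    SignInCount = count(),
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated)
    by UserPrincipalName, IPAddress, ResultType
```

Each output row stands in for every sign-in that shares the same user, IP address, and result code within the window the rule processed.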

It’s that simple.. or is it?

Creating the rule is the easy part. The key is choosing what goes inside.

How to build a Microsoft Sentinel summary rule

I think of summary rules as a form of compression.

Here’s why:

We want to make data smaller but keep security value.

But what is security value? How do we measure it? How do we maximize it?

Answering those questions leads us to my favorite method of designing summaries.

Always start with use cases, then work back to the data you need.

Use cases can be:

  • Hunting for anomalies in sign-in activity

  • Analytic rules that detect threats in network logs

  • Workbooks that report on trends over time in web app access

The use case will give you a clear outcome to work towards.
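
To make that concrete, take the first use case: hunting for anomalies in sign-in activity. One way to work backwards is to write the hunting query first, against the summary table you wish you had, and then see which columns it needs. In this sketch, SigninSummary_CL and its columns are hypothetical, and the 14-day baseline is an arbitrary choice:

```kql
// SigninSummary_CL is a hypothetical summary table; its columns are assumptions.
let Baseline = 14d;
let Recent = 1d;
// User/IP pairs already seen during the baseline period
let KnownPairs =
    SigninSummary_CL
    | where TimeGenerated between (ago(Baseline) .. ago(Recent))
    | distinct UserPrincipalName, IPAddress;
// Pairs from the last day that never appeared in the baseline
SigninSummary_CL
| where TimeGenerated > ago(Recent)
| join kind=leftanti KnownPairs on UserPrincipalName, IPAddress
| summarize
    SignInCount = sum(SignInCount),
    FirstSeen = min(FirstSeen)
    by UserPrincipalName, IPAddress
```

This hunt only needs a user, an IP address, a count, and a first-seen time, so that’s the minimum the summary has to keep.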

Here are a few other lessons I’ve learned building summaries for over a petabyte of logs…

Microsoft Sentinel summary rules best practices

Not future-proof, just future-resistant

When creating a summary rule, you select the destination table: either one that already exists or a new one.

Your table will have a schema. The schema defines the columns and their data types.

Sample table columns in Microsoft Sentinel

When writing your query or creating your table, consider whether your schema will hold up against the test of time.

Here are some questions to ask:

  • Could we add additional columns that will cover a whole different set of use cases without creating a whole new summary table?

  • Could we add additional columns that provide immediate hunting value, will save analysts time, and don’t incur too much storage (maybe by sampling some data)?

  • What if more data sources are added that need to be included in your summary? (Hint: writing summaries on top of ASIM could be an easy solution here)

The answer to these questions is the core of making summaries, and that is your ability to balance two things:

Cutting storage while keeping security use cases.

We talk about this tradeoff below in The future of summaries.
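
For the sampling idea in the second question, one option is to keep a small, capped sample of a high-cardinality field instead of every value. This is a sketch under assumptions: it uses the ASIM network session parser, and the cap of 5 ports is arbitrary:

```kql
_ASim_NetworkSession()
| summarize
    FirstSeen = min(TimeGenerated),
    LastSeen = max(TimeGenerated),
    EventCount = sum(EventCount),
    // Keep at most 5 distinct destination ports per group (the cap is an arbitrary choice)
    SampleDstPorts = make_set(DstPortNumber, 5)
    by SrcIpAddr, DstIpAddr, DvcAction
```

The sample column gives an analyst something to pivot on right away, while the cap keeps each row’s size bounded.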

It’s not just about the output

You’re probably not the only person that will see your code.

Other team members will need to understand, explain, and modify your KQL, so don’t make their lives harder.

Which one of these do you prefer to look at?

_ASim_NetworkSession() | summarize min = min(TimeGenerated), max = max(TimeGenerated), sum(EventCount) by IpAddr = DstIpAddr, NetworkDirection, DvcAction | extend type = "Dst"|
union (_ASim_NetworkSession() | summarize min = min(TimeGenerated), max = max(TimeGenerated), sum(EventCount) by IpAddr = SrcIpAddr, NetworkDirection, DvcAction | extend type = "Src")

Or:

let DestinationSummary = _ASim_NetworkSession()
    | summarize
        FirstSeen = min(TimeGenerated),
        LastSeen = max(TimeGenerated),
        // Cannot use count, each row != one event
        EventCount = sum(EventCount)
        by IpAddr = DstIpAddr, NetworkDirection, DvcAction 
    // Need to identify if IP is src or dest
    | extend type = "Dst";
let SourceSummary = _ASim_NetworkSession()
    | summarize
        FirstSeen = min(TimeGenerated),
        LastSeen = max(TimeGenerated),
        EventCount = sum(EventCount)
        by IpAddr = SrcIpAddr, NetworkDirection, DvcAction
    | extend type = "Src";
SourceSummary
| union DestinationSummary

You probably said the second one…

Even though they achieve the same outcome, the second query is easier to read and understand.

Here’s why:

  • Uses short but descriptive variable and column names

  • Adds comments to explain “the why” behind the code

  • Is well formatted (hint: use Alt + Shift + F in Sentinel to auto-format)

If you’re working as part of a team, push code that’s hard to write but easy to read, not the opposite.

Summaries don’t have to be the end

Just because you’re using summaries doesn’t mean you’re stuck with summary data.

Here’s an example:

  • You have logs coming into auxiliary tier (think of this kinda like cold tier) in Sentinel

  • You run summaries on these logs and store the summaries in analytics tier (think of this as hot tier)

  • An alert triggers on summary data

  • You start to investigate, but realize the summary data isn’t enough. The data you need lives in auxiliary tier…

What can you do?

Well, if it’s your first time encountering this, you’ll probably go to the original table and search for those logs.

If you want to avoid that, here’s what you do:

Create automation that “rehydrates” the data as-needed.

That way, an incident that bubbles up from summary data is automatically enriched with more granular context!

Logic App with Sentinel Incident Trigger
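
As a sketch, the query such an automation runs could look like the one below. The table name, IP address, and time window are all hypothetical placeholders that the automation would fill in from the incident. Depending on the tier the source table lives in, this might have to run as a search job rather than an interactive query:

```kql
// Hypothetical values; the automation would take these from the incident's
// entities and timestamps.
let SuspectIp = "203.0.113.10";
let WindowStart = datetime(2024-01-01T00:00:00Z);
let WindowEnd = datetime(2024-01-01T01:00:00Z);
// RawNetworkLogs_CL is a made-up name for the original (non-summary) table
RawNetworkLogs_CL
| where TimeGenerated between (WindowStart .. WindowEnd)
| where SrcIpAddr == SuspectIp or DstIpAddr == SuspectIp
```

Scoping the lookup to one entity and a tight time window keeps the rehydration small, which is the point of summarizing in the first place.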

Now that you’re becoming an expert on summary design, let’s get a bit more advanced.

The future of summaries

Let’s take the analogy between summaries and compression to the next level.

Compression has a north star: lossless compression.

In compression, an algorithm is lossless if the original data can be rebuilt exactly from the compressed version, so no information is lost.

For summaries, the information we care about is the set of use cases a table covers (maybe tracked with something like the number of TTPs covered).

What if we could get to lossless compression for summaries?

Summary Rules Efficiency Curves

Maybe, one day we could.

And the answer might lie in neural networks themselves.

Wait, neural networks, for compression?

Yes, and here’s why:

One of the interpretations of neural networks is that they condense a lot of data into a model. Then we can use the model to get back information about the initial data.

That sounds almost like… compression!

Maybe the brain has more secrets than we expected…

This won’t be the last time I’ll talk about this. Subscribe with the link below so you don’t miss my article on advanced summary techniques.

Microsoft Sentinel Summary Rules Cheat Sheet

Want a quick reference for the material in this article?

Check out the cheat sheet below, it’s all yours!

Microsoft Sentinel Summary Rules Cheat Sheet

I’m writing an in-depth post about summary logic and techniques.

A post where I give you sample queries with in-depth KQL explanations.

Don’t want to miss it? Subscribe with the link below.

Enjoyed the article (even a little bit)? Follow me on LinkedIn to hear more of my rants: https://www.linkedin.com/in/nouraie/

Disclaimer: written by a Microsoft employee. All opinions are my own.
