Anomaly-based detection

How it works

An anomaly-detection engine observes a stream of telemetry over time and constructs a baseline of typical behavior for the entities it sees. The baseline can be statistical (a distribution of login times, packet sizes, or process counts), machine-learned (clustering, autoencoders), or a hybrid of the two. Once the baseline is stable, the engine flags any observation that falls far enough outside the baseline to be considered an outlier.
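As a rough illustration of the statistical case, the sketch below builds a baseline from historical values and flags outliers with a robust z-score based on the median and the median absolute deviation (MAD). The function names, the sample history, and the 3.5 threshold are assumptions chosen for the example; real engines use richer models and per-entity baselines.

```python
from statistics import median

def build_baseline(history):
    """Summarize historical values as (median, median absolute deviation)."""
    center = median(history)
    mad = median(abs(x - center) for x in history)
    return center, mad

def is_outlier(value, baseline, threshold=3.5):
    """Flag a value whose robust z-score exceeds the threshold.

    0.6745 scales the MAD so the score is comparable to a standard
    z-score for normally distributed data. The threshold is an assumption.
    """
    center, mad = baseline
    if mad == 0:
        # Degenerate baseline: every historical value was identical.
        return value != center
    robust_z = 0.6745 * (value - center) / mad
    return abs(robust_z) > threshold

# Hypothetical daily outbound volumes in MB for one server.
history = [4, 6, 5, 7, 6, 8, 5, 6, 9, 6]
baseline = build_baseline(history)

print(is_outlier(7, baseline))     # a value close to the baseline
print(is_outlier(2100, baseline))  # a value far outside the baseline
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme day in the history distorts them far less.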

The defining property of the family is that it does not need a prior example of the attack. Signature detection needs to have seen the attack before. Anomaly detection only needs to know what your environment usually does. That makes anomaly detection the strongest family against truly novel attacks, and the noisiest family during periods of organizational change.

What anomaly detection catches

🕒 Unusual authentication times

An administrative login at 03:00 from an account that historically logs in between 09:00 and 17:00. The deviation is from this user’s historical pattern, not a global rule.
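A minimal sketch of a per-user time-of-day baseline, assuming login history is available as a list of hours (0 to 23). The function names, the sample history, and the 1% threshold are hypothetical.

```python
from collections import Counter

def hour_profile(login_hours):
    """Return the share of historical logins that fell in each hour of the day."""
    counts = Counter(login_hours)
    total = len(login_hours)
    return {hour: counts[hour] / total for hour in range(24)}

def is_unusual_hour(hour, profile, min_share=0.01):
    """Flag an hour that accounts for less than min_share of this user's history."""
    return profile[hour] < min_share

# Hypothetical history: 30 days of logins spread across 09:00 to 17:00.
history = [hour for _ in range(30) for hour in range(9, 18)]
profile = hour_profile(history)

print(is_unusual_hour(3, profile))   # 03:00 login
print(is_unusual_hour(10, profile))  # 10:00 login
```

The profile is built per user, so the same 03:00 login would not be flagged for an account that routinely works night shifts.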

📤 Atypical data transfer volumes

A file server moves 2 GB outbound when its baseline is single-digit megabytes per day. The traffic itself is over an allowed channel; the volume is the deviation.

⚙️ Abnormal process behavior

powershell.exe launching python.exe on an endpoint where that combination has never been observed. Both binaries are legitimate; the relationship is what is new.
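One simple way to model "never observed here" is a per-host set of parent-child pairs. The class and host names below are hypothetical, and the sketch assumes process-creation events already carry the parent and child image names.

```python
class ProcessPairBaseline:
    """Track which parent -> child process pairs each host has already produced."""

    def __init__(self):
        self.seen = {}  # host -> set of (parent, child) pairs

    def observe(self, host, parent, child):
        """Record the pair and return True if it is new for this host."""
        pair = (parent.lower(), child.lower())
        known = self.seen.setdefault(host, set())
        is_new = pair not in known
        known.add(pair)
        return is_new

baseline = ProcessPairBaseline()
# During the learning period every pair is new, so results are not alerted on.
baseline.observe("ws-014", "explorer.exe", "powershell.exe")
# After learning, a first-seen pair becomes a candidate anomaly.
print(baseline.observe("ws-014", "powershell.exe", "python.exe"))
```

A first-seen pair is only a candidate. Whether it matters depends on the same corroboration an analyst would apply to any other anomaly alert.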

🚪 Authentication pattern shifts

Rapid lateral authentication attempts across multiple endpoints within seconds. Each attempt is legitimate-looking; the rate and breadth are anomalous.
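Rate and breadth can be approximated with a sliding window that counts distinct target hosts per account. The sketch below is illustrative only: the class name, the 10-second window, and the limit of 5 hosts are assumptions, and a real engine would learn the limit from each account's history.

```python
from collections import defaultdict, deque

class FanOutDetector:
    """Flag accounts that authenticate to many distinct hosts in a short window."""

    def __init__(self, window_seconds=10.0, max_hosts=5):
        self.window = window_seconds
        self.max_hosts = max_hosts
        self.events = defaultdict(deque)  # account -> deque of (timestamp, host)

    def observe(self, account, host, timestamp):
        """Record one authentication; return True if the fan-out limit is exceeded."""
        queue = self.events[account]
        queue.append((timestamp, host))
        # Drop events that have aged out of the window.
        while queue and timestamp - queue[0][0] > self.window:
            queue.popleft()
        distinct_hosts = {h for _, h in queue}
        return len(distinct_hosts) > self.max_hosts

detector = FanOutDetector()
# Hypothetical burst: one account touches seven hosts at half-second intervals.
burst = [detector.observe("svc-admin", f"host-{i}", i * 0.5) for i in range(7)]
print(burst)
```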

Representative platforms: XDR (UEBA), , Advanced Analytics, ML, Securonix.

Example: a 2 AM file transfer

Walk through a typical anomaly alert

A UEBA platform produces the following alert:

Alert: Data transfer volume deviation
Severity: Medium
Confidence: 64%
Entity: srv-fileshare-01
Observation: 2.1 GB outbound to ext-archive[.]example over HTTPS
Baseline: median 6 MB/day, 99th percentile 80 MB/day
Window: 02:14 - 02:31 UTC

The analyst’s first question is which family fired. The “Confidence: 64%” and the baseline statistics make it obvious this is an anomaly engine. The engine is reporting a real deviation, not a verdict on whether the deviation is malicious.
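To put the deviation in numbers, the arithmetic below uses only the figures from the alert. It assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
from datetime import datetime

observed_mb = 2100        # 2.1 GB outbound, assuming 1 GB = 1000 MB
median_mb_per_day = 6     # baseline median from the alert
p99_mb_per_day = 80       # baseline 99th percentile from the alert

# Alert window: 02:14 - 02:31 UTC
start = datetime.strptime("02:14", "%H:%M")
end = datetime.strptime("02:31", "%H:%M")
window_seconds = (end - start).total_seconds()

times_over_median = observed_mb / median_mb_per_day
times_over_p99 = observed_mb / p99_mb_per_day
avg_rate_mb_s = observed_mb / window_seconds

print(f"{times_over_median:.0f}x the median day")
print(f"{times_over_p99:.2f}x the 99th-percentile day")
print(f"{avg_rate_mb_s:.2f} MB/s averaged over {window_seconds / 60:.0f} minutes")
```

In 17 minutes the server moved about 26 times what it sends on its busiest typical day. That is a large deviation, but the numbers alone still say nothing about intent.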

What the analyst checks next:

Is the destination legitimate? Look up ext-archive[.]example in DNS history, threat intel, and asset inventory. A long-standing partner is one story; a recently registered domain is another.
Is there a corresponding business reason? Check change management, scheduled backup jobs, and ITSM tickets for the 02:14 window. A documented backup window neutralizes the alert.
Is the process source consistent? The transfer must have been initiated by a process on the file server. Correlating process telemetry from that window will identify whether a known backup agent or an unexpected process started the transfer.
Are there sibling anomalies? Did the same server also show unusual authentication or process activity in the same window? Multiple anomalies converging on one entity is a stronger signal than any single one.
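The sibling-anomaly check can be sketched as a filter over other alerts for the same entity in or near the alert window. The alert records, field names, the date, and the 30-minute slack below are all hypothetical; a real platform would expose this through its own query language.

```python
from datetime import datetime, timedelta

def sibling_anomalies(alerts, entity, start, end, slack=timedelta(minutes=30)):
    """Return alerts for the same entity that fall inside the padded window."""
    return [
        alert for alert in alerts
        if alert["entity"] == entity
        and start - slack <= alert["time"] <= end + slack
    ]

# Hypothetical alert store; the date is arbitrary.
day = datetime(2024, 1, 15)
alerts = [
    {"entity": "srv-fileshare-01", "type": "new process pair", "time": day.replace(hour=2, minute=9)},
    {"entity": "srv-fileshare-01", "type": "unusual login hour", "time": day.replace(hour=14, minute=0)},
    {"entity": "ws-014", "type": "new process pair", "time": day.replace(hour=2, minute=20)},
]

window_start = day.replace(hour=2, minute=14)
window_end = day.replace(hour=2, minute=31)
siblings = sibling_anomalies(alerts, "srv-fileshare-01", window_start, window_end)
print([alert["type"] for alert in siblings])
```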

Anomaly detection is rarely decisive on its own. The job is to gather the corroborating context that either explains the deviation away or escalates it.

Strengths and limitations

Strengths

Catches zero-day attacks and novel techniques the signature world has no name for yet.
Adapts to environment-specific patterns. Every organization’s “normal” is different, and anomaly engines learn it from that organization’s own telemetry.
Identifies subtle or gradual changes that single-event detection would miss.
Resilient to evasion tactics that target static signatures or static rules.

Limitations

High false-positive rate during baseline-learning periods, environment shifts, mergers, migrations, or new tool rollouts.
Computationally expensive. Sustained ML inference on streaming telemetry has real cost.
Requires ongoing tuning. A baseline that has not been refreshed in six months starts to lie about what normal is.
Decision logic can be opaque. ML-driven alerts can be hard to explain to stakeholders or auditors.

Operational considerations

Allow a meaningful baseline period before trusting alerts. Several weeks of stable telemetry is a reasonable starting point. Less than that and the baseline is noise.
Monitor for baseline drift during operational change. A merger, a new SaaS rollout, or a major upgrade will all invalidate prior baselines temporarily.
Tune sensitivity continuously. Anomaly detection is never “set and forget.” Treat the engine like a living asset that needs review.
Pair with corroborating data sources. Anomaly alerts are most useful when they can be joined to identity, process, and network telemetry inside a single investigation surface.
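A simple drift check compares a recent window against the stored baseline and signals when the two have moved apart. This is a sketch under assumptions: the function names, the use of medians, and the 2x tolerance are illustrative, not a recommended setting.

```python
from statistics import median

def drift_ratio(baseline_values, recent_values):
    """Ratio of the recent median to the baseline median."""
    base = median(baseline_values)
    recent = median(recent_values)
    if base == 0:
        return float("inf") if recent else 1.0
    return recent / base

def needs_rebaseline(baseline_values, recent_values, tolerance=2.0):
    """True when the recent median has moved by more than the tolerance factor."""
    ratio = drift_ratio(baseline_values, recent_values)
    return ratio > tolerance or ratio < 1 / tolerance

# Hypothetical daily outbound MB before and after a new SaaS rollout.
before = [5, 6, 6, 7, 6, 5, 8]
after = [40, 38, 45, 42, 39, 41, 44]

print(needs_rebaseline(before, before))
print(needs_rebaseline(before, after))
```

A sustained shift like this one suggests the old baseline no longer describes normal behavior and should be relearned, ideally after confirming that the change has a known operational cause.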

Next up

Rule-based detection

The family that encodes institutional knowledge: if-then logic combining multiple indicators into scenarios worth flagging.