Signature-based detection

How it works

A signature engine maintains a library of known-bad indicators and compares observed telemetry against that library in near-real time. The comparison is deterministic: either the input matches a signature in the library, or it does not. There is no probability and no learning. The simplicity of the model is the source of both its strengths and its blind spots.
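A minimal sketch of that deterministic comparison, assuming the library can be modeled as a set of (indicator type, value) pairs. The names SIGNATURE_LIBRARY and matches are hypothetical, and the entries use reserved documentation values rather than real indicators.

```python
# Hypothetical signature library: a set of (indicator_type, value) pairs.
# The values are reserved documentation names and addresses, not real IoCs.
SIGNATURE_LIBRARY = {
    ("domain", "c2.example.invalid"),
    ("ip", "203.0.113.7"),
}


def matches(indicator_type: str, value: str) -> bool:
    """Return True if the observed indicator is in the library, else False.

    There is no score and no threshold: the lookup either hits or misses.
    """
    return (indicator_type, value.lower()) in SIGNATURE_LIBRARY


print(matches("domain", "C2.example.invalid"))  # listed domain, case-insensitive
print(matches("ip", "198.51.100.20"))           # address that is not listed
```

A set lookup is used here because it mirrors the model: membership is a yes-or-no question, and the cost of asking it does not grow with the size of the list.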

Modern signature engines extend the basic pattern-match approach with hierarchical classification, enrichment, mapping, and automated response orchestration. The underlying mechanism is still “compare against the list,” but the list is richer, the matching is faster, and the response can be automated.

What signatures look like

The library is not a single thing. A signature engine typically maintains several kinds of indicators side by side, each useful for catching a different layer of intrusion.

📄 File hashes

SHA-256 and MD5 fingerprints for known malware samples. Extremely specific, and trivially evaded for the same reason: a single byte change produces a different hash and therefore no alert.
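The effect of a one-byte change can be illustrated with Python's standard hashlib module. This is a sketch only: the sample bytes are a harmless placeholder, and known_bad_hashes is a hypothetical one-entry library.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"placeholder sample bytes"      # harmless stand-in, not malware
known_bad_hashes = {sha256_hex(original)}   # hypothetical one-entry library

mutated = bytearray(original)
mutated[0] ^= 0x01                          # flip a single bit in the first byte
variant = bytes(mutated)

print(sha256_hex(original) in known_bad_hashes)  # the original sample matches
print(sha256_hex(variant) in known_bad_hashes)   # the altered copy does not
```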

🧬 Byte sequences

Shellcode patterns, exploit-specific opcodes, and packer signatures. More robust to trivial changes than hashes because they match patterns inside the file.
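A rough sketch of why a byte-sequence signature survives changes that defeat a hash. BYTE_SIGNATURE is a made-up placeholder pattern, not a signature from any real ruleset, and production engines such as YARA support far richer matching than a plain substring search.

```python
import hashlib

# Placeholder byte pattern standing in for a shellcode fragment (hypothetical).
BYTE_SIGNATURE = bytes.fromhex("90909090cc")


def contains_signature(data: bytes) -> bool:
    """Return True if the byte pattern appears anywhere inside data."""
    return BYTE_SIGNATURE in data


sample = b"header" + BYTE_SIGNATURE + b"trailer"
variant = sample + b"\x00padding"  # trivial change: extra bytes appended

# The two files hash differently, so a hash signature for `sample` misses `variant`.
same_hash = hashlib.sha256(sample).digest() == hashlib.sha256(variant).digest()
print(same_hash)

# The byte pattern is still present in both files.
print(contains_signature(sample), contains_signature(variant))
```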

🗃️ Registry artifacts

Persistence keys, configuration footprints, and known autorun entries under paths like HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run.

🌐 Network indicators

Domain names, IP addresses, JA3/JA3S hashes, and certificate fingerprints associated with command-and-control infrastructure.
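Domain indicators are often matched against parent domains as well as the exact hostname, so that a new subdomain of known infrastructure still hits. The sketch below assumes that behavior. C2_DOMAINS and domain_matches are hypothetical names, and the entries are reserved example domains.

```python
# Hypothetical blocklist of command-and-control domains (reserved example names).
C2_DOMAINS = {"c2.example.invalid", "bad.example"}


def domain_matches(hostname: str) -> bool:
    """Match the hostname, or any of its parent domains, against the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # For "a.bad.example" this checks "a.bad.example", "bad.example", "example".
    return any(".".join(labels[i:]) in C2_DOMAINS for i in range(len(labels)))


print(domain_matches("beacon.bad.example"))  # subdomain of a listed domain
print(domain_matches("notbad.example"))      # similar name, but not listed
```

Matching on whole labels rather than on a raw string suffix avoids flagging unrelated names such as notbad.example.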

Representative platforms include Singularity, network IDS engines, YARA for content-based matching, and most commercial AV products.

Example: a known commodity malware hash

Walk through a typical signature hit

An EDR endpoint reports the following alert:

Alert: Malicious file detected
Severity: High
Detection: Signature match
Hash (SHA-256): 4f5ee...c8d3 (matches "Win32.Trojan.GenericKD")
Path: C:\Users\jdoe\Downloads\invoice.exe
Process: explorer.exe spawned the write

The analyst’s first question is which detection family fired. The “Detection: Signature match” field answers it. This is a deterministic match against a known-bad hash, which means the engine is highly confident the file is what the signature says it is.

What the analyst checks next:

Confirm the hash. Cross-reference the SHA-256 against threat-intelligence sources (for example Mandiant or an internal lookup) to see what family this hash belongs to and what behavior it typically produces.
Trace the source. The file is in Downloads, so the user (or something acting as the user) downloaded it. Mail logs, browser history, and proxy logs help reconstruct how it got there.
Check for execution. A signature hit on a file does not always mean the file ran. Process telemetry (was invoice.exe ever executed) decides whether this is a contained problem or an active one.
Look for sibling infections. A “sibling” here is another host that shows the same file or the same download source. One infected machine is a contained problem. Several hosts with the same signature hit point to a campaign, and the response changes accordingly. The analyst queries the EDR or SIEM for the same hash across the fleet, the same source URL in proxy logs, and the same user across other endpoints.
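The sibling hunt in the last step can be sketched as a filter over fleet telemetry. Everything here is an assumption for illustration: the event fields, host names, hash labels, and URLs are invented, and a real investigation would run the equivalent query in the EDR or SIEM rather than over an in-memory list.

```python
# Hypothetical fleet telemetry; field names and values are illustrative only.
events = [
    {"host": "ws-014", "user": "jdoe", "sha256": "hash-a",
     "source_url": "http://files.example/invoice.exe"},
    {"host": "ws-027", "user": "asmith", "sha256": "hash-a",
     "source_url": "http://files.example/invoice.exe"},
    {"host": "ws-031", "user": "jdoe", "sha256": "hash-b",
     "source_url": "http://other.example/tool.exe"},
]


def find_affected_hosts(events, sha256, source_url):
    """Return hosts that saw the same hash or fetched from the same URL."""
    return {
        e["host"]
        for e in events
        if e["sha256"] == sha256 or e["source_url"] == source_url
    }


def classify(hosts):
    """One host is a contained problem; more than one suggests a campaign."""
    return "campaign" if len(hosts) > 1 else "contained"


affected = find_affected_hosts(events, "hash-a", "http://files.example/invoice.exe")
print(sorted(affected), classify(affected))
```

The same pattern extends to pivoting on the user: filtering the events on the user field would surface other endpoints where that account was active.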

Notice what the signature did not tell the analyst: it did not say whether the file was actually executed, whether other hosts received the same file, or whether the campaign has a custom payload variant that the library does not yet cover. Those questions belong to the other detection families and to the rest of the investigation.

Strengths and limitations

Strengths

Very low false-positive rate for known threats. When it fires, it is almost always real.
Lightweight at runtime, with minimal performance impact on endpoints and network sensors.
Transparent and auditable. The analyst can point at the signature and explain exactly why the alert fired.
Fast. A signature match gives a high-confidence answer immediately, which is critical during fast-moving incidents.

Limitations

Blind to novel threats. Zero-day exploits and polymorphic malware slip past until a signature is published.
Heavily dependent on the cadence and quality of threat-intelligence feeds that populate the signature library.
Trivially evaded by adversaries who alter their payload. A different hash means a different alert, or no alert at all.
Narrow by design. Signatures detect known-bad artifacts. They say nothing about behavior, intent, or context.

Operational considerations

Maintain a robust signature distribution pipeline. Stale signatures are dead weight, and the pipeline that updates them is part of the security boundary.
Validate signature updates before deployment. Supply-chain compromise of a signature vendor is a real risk, so updates should be authenticated and tested.
Layer with the other families. Signature detection alone cannot stop polymorphic or living-off-the-land attacks. The other three families compensate for that.
Track coverage gaps. The time between an IoC emerging and a signature covering it is a useful operational metric. The longer the gap, the more exposure.
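One way to put a number on the coverage gap is to subtract the date an indicator was first seen from the date a signature for it was deployed. The sketch below assumes both timestamps are recorded. The record layout, indicator names, and dates are invented for illustration.

```python
from datetime import datetime, timedelta


def coverage_gap(first_seen: datetime, signature_deployed: datetime) -> timedelta:
    """Time during which the indicator was known but not yet covered."""
    return signature_deployed - first_seen


# Hypothetical records: (indicator name, first seen, signature deployed).
records = [
    ("ioc-1", datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 21, 0)),
    ("ioc-2", datetime(2024, 1, 5, 8, 0), datetime(2024, 1, 7, 8, 0)),
]

gaps = [coverage_gap(seen, deployed) for _, seen, deployed in records]
mean_gap = sum(gaps, timedelta()) / len(gaps)

print(max(gaps))  # worst-case exposure window
print(mean_gap)   # average exposure window
```

Tracking both the mean and the maximum is a design choice: the mean shows how the pipeline performs in general, while the maximum exposes the single longest window of exposure.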


Next up

Anomaly-based detection

The opposite of signature-based detection. Where signature-based detection needs to have seen the threat before, anomaly-based detection only needs to know what normal looks like in this environment.
