Signature-based detection
How it works
A signature engine maintains a library of indicators of compromise (IoCs) and compares observed telemetry against that library in near-real time. An IoC is an artifact that, when observed, suggests intrusion: a specific file hash, a domain name, an IP address, a registry key value, or a byte sequence. IoCs are curated from threat intelligence, reverse engineering, and prior incident response.
The comparison is deterministic. Either the input matches a signature in the library, or it does not. There is no probability, no learning, no fuzzy match. (A fuzzy match returns a degree of similarity rather than a yes/no. It can catch slightly modified malware that an exact-match signature would miss, but it also produces uncertainty about how close a match needs to be before it counts.) The simplicity of the model is the source of both its strengths and its blind spots.
Modern signature engines extend the basic pattern-match approach with hierarchical threat classification, metadata enrichment (file timestamps, owner, size, and hash; a process's parent, command line, and signing certificate), MITRE ATT&CK mapping, and automated response orchestration. MITRE ATT&CK is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations, used for threat modeling and security operations. The underlying mechanism is still "compare against the list," but the list is richer, the matching is faster, and the response can be automated.
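The deterministic comparison described above can be sketched as a plain set lookup. This is a minimal illustration under stated assumptions, not how any particular product is built: the `IocLibrary` class, its method names, and the sample indicator values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IocLibrary:
    """Hypothetical in-memory signature library, keyed by indicator type."""

    indicators: dict[str, set[str]] = field(default_factory=dict)

    def add(self, kind: str, value: str) -> None:
        # Normalize case so "C2.bad.example" and "c2.bad.example" are one entry.
        self.indicators.setdefault(kind, set()).add(value.lower())

    def matches(self, kind: str, value: str) -> bool:
        # Exact membership test: the answer is yes or no, never a score.
        return value.lower() in self.indicators.get(kind, set())


library = IocLibrary()
library.add("domain", "c2.bad.example")  # made-up indicator
library.add("ip", "203.0.113.7")         # address from a documentation-only range

print(library.matches("domain", "C2.bad.example"))  # True
print(library.matches("domain", "c3.bad.example"))  # False
```

The second lookup differs from a listed indicator by one character and simply does not match, which is the behavior the "no fuzzy match" point describes.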
What signatures look like
The library is not a single thing. A signature engine typically maintains several kinds of indicators side by side, each useful for catching a different layer of intrusion.
File hashes
SHA-256 and MD5 fingerprints for known malware samples. Highly specific, and brittle for the same reason: a single-byte change produces a different hash and therefore no alert.
Byte sequences
Shellcode patterns, exploit-specific opcodes, and packer signatures. More robust to trivial changes than hashes because they match patterns inside the file.
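The difference between hash signatures and byte-sequence signatures can be illustrated with a few lines of Python. This is a sketch only: the sample bytes and the `MARKER` sequence are invented for the example and do not correspond to any real malware.

```python
import hashlib

# Invented sample: filler bytes around a made-up marker sequence.
MARKER = b"\xde\xad\xbe\xef\x90\x90"
original = b"HEADER" + MARKER + b"PAYLOAD"
variant = original + b"\x00"  # one appended byte, a trivial evasion attempt

known_bad_hashes = {hashlib.sha256(original).hexdigest()}


def hash_match(data: bytes) -> bool:
    # Whole-file fingerprint: any change to the bytes changes the digest.
    return hashlib.sha256(data).hexdigest() in known_bad_hashes


def byte_sequence_match(data: bytes) -> bool:
    # Pattern inside the file: survives changes made elsewhere in the file.
    return MARKER in data


print(hash_match(original), byte_sequence_match(original))  # True True
print(hash_match(variant), byte_sequence_match(variant))    # False True
```

The byte-sequence check is only more robust, not immune: a variant that alters or re-encodes the marker itself would evade it as well.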
Registry artifacts
Persistence keys, configuration footprints, and known autorun entries under paths like HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run.
Network indicators
Domain names, IP addresses, JA3 / JA4 hashes, and certificate fingerprints associated with command-and-control infrastructure. JA3 is a fingerprint of a TLS ClientHello, used to identify the software making a TLS connection (a specific malware family, for example) without decrypting the traffic. JA4, released in 2023 by FoxIO, is its modernized successor: it is designed to hold up better with TLS 1.3 clients, covers QUIC as well as TLS over TCP, and belongs to a wider fingerprint suite that includes HTTP client (JA4H) and TLS server (JA4S) fingerprints. Many threat-intel feeds publish JA3 and JA4 hashes side by side during the transition.
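As a rough sketch of how a JA3 hash is derived: the publicly documented method joins five ClientHello fields into one string and takes its MD5 digest. The function name and the field values below are illustrative assumptions, and a real implementation also has to parse the handshake and drop GREASE values, which this sketch leaves to the caller.

```python
import hashlib


def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields.

    Values inside a field are joined with "-" and the five fields are
    joined with ",". GREASE values are assumed to be filtered out already.
    """
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()


# Illustrative values only, not taken from a real capture.
ja3_string, digest = ja3_fingerprint(771, [4865, 4866, 49195], [0, 10, 11, 13], [29, 23], [0])
_, other_digest = ja3_fingerprint(771, [4865, 49195], [0, 10, 11, 13], [29, 23], [0])

print(ja3_string)              # 771,4865-4866-49195,0-10-11-13,29-23,0
print(digest != other_digest)  # True: a different cipher list gives a different hash
```

The resulting digest is what a feed would list, and matching it is the same exact-lookup operation used for file hashes.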
Representative platforms:
- CrowdStrike Falcon: cloud-native EDR with a lightweight agent; combines behavioral analytics, threat intel, and ML.
- Microsoft Defender for Endpoint: antivirus, endpoint protection, and EDR, with XDR ties to Sentinel and Entra ID.
- SentinelOne Singularity: unified EPP/EDR with behavioral AI, autonomous response, rollback on Windows, and threat-hunting telemetry.
- Snort: open-source network IDS with deep packet inspection and a large community rule set maintained by Cisco Talos.
- Suricata: open-source network IDS/IPS with multi-threaded inspection, file extraction, TLS inspection, and JSON output.
- YARA for content-based matching.
- Most commercial AV products.
Example: a known commodity malware hash
Walk through a typical signature hit
An EDR endpoint reports the following alert:
Alert: Malicious file detected
Severity: High
Detection: Signature match
Hash (SHA-256): 4f5ee...c8d3 (matches "Win32.Trojan.GenericKD")
Path: C:\Users\jdoe\Downloads\invoice.exe
Process: explorer.exe spawned the write
The analyst's first question is which detection family fired. The "Detection: Signature match" field answers it. This is a deterministic match against a known-bad hash, which means the engine is highly confident the file is what the signature says it is.
What the analyst checks next:
- Confirm the hash. Cross-reference the SHA-256 against threat-intelligence sources (VirusTotal, which aggregates verdicts from many AV engines; Mandiant; an internal lookup) to see what malware family this hash belongs to and what behavior it typically produces.
- Trace the source. The file is in Downloads, so the user (or something acting as the user) downloaded it. Mail logs, browser history, and proxy logs help reconstruct how it got there.
- Check for execution. A signature hit on a file does not always mean the file ran. Process telemetry (was invoice.exe ever executed?) decides whether this is a contained problem or an active one.
- Look for sibling infections. A "sibling" here is another host that shows the same file, the same user account, or the same download source. One infected machine is a contained problem. Several with the same signature is a campaign, and the response changes accordingly. The analyst queries the EDR or SIEM for the same hash across the fleet, the same source URL in proxy logs, and the same user across other endpoints.
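The execution check and the sibling search can be sketched as a simple filter over telemetry. Everything here is an assumption made for illustration: the event fields, host names, action labels, and the placeholder hash values do not reflect any specific EDR or SIEM schema, where the same question would normally be asked in the product's own query language.

```python
# Hypothetical, simplified telemetry rows; real schemas will differ.
events = [
    {"host": "WS-014", "user": "jdoe", "sha256": "HASH_A", "action": "file_write"},
    {"host": "WS-014", "user": "jdoe", "sha256": "HASH_A", "action": "process_start"},
    {"host": "WS-022", "user": "asmith", "sha256": "HASH_A", "action": "file_write"},
    {"host": "WS-031", "user": "jdoe", "sha256": "HASH_B", "action": "file_write"},
]


def triage(events, sha256):
    """Return (hosts that saw the hash, hosts that also executed it)."""
    seen, executed = set(), set()
    for event in events:
        if event["sha256"] != sha256:
            continue
        seen.add(event["host"])
        if event["action"] == "process_start":
            executed.add(event["host"])
    return sorted(seen), sorted(executed)


seen, executed = triage(events, "HASH_A")
print(seen)      # ['WS-014', 'WS-022']
print(executed)  # ['WS-014']
```

In this made-up data, two hosts hold the file but only one ran it, which is the distinction between a contained problem and an active one.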
Notice what signature-based detection did not tell the analyst: it did not say whether the file was actually executed, whether other hosts received the same file, or whether the campaign has a custom payload variant that the library does not yet cover. Those questions belong to the other detection families and to the rest of the investigation.
Strengths and limitations
Strengths
- Very low false-positive rate for known threats. When it fires, it is almost always real.
- Lightweight at runtime, with minimal performance impact on endpoints and network sensors.
- Transparent and auditable. The analyst can point at the signature and explain exactly why the alert fired.
- Fast. A signature match gives a high-confidence answer immediately, which is critical during fast-moving incidents.
Limitations
- Blind to novel threats. Zero-day exploits and polymorphic malware slip past until a signature is published.
- Heavily dependent on the cadence and quality of threat-intelligence feeds that populate the signature library.
- Trivially evaded by adversaries who alter their payload. A different hash means a different alert, or no alert at all.
- Narrow by design. Signatures detect known-bad artifacts. They say nothing about behavior, intent, or context.
Operational considerations
- Maintain a robust signature distribution pipeline. Stale signatures are dead weight, and the pipeline that updates them is part of the security boundary.
- Validate signature updates before deployment. Supply-chain compromise of a signature vendor is a real risk, so updates should be authenticated and tested.
- Layer with the other families. Signature detection alone cannot stop polymorphic or living-off-the-land attacks. The other three families compensate for that.
- Track coverage gaps. Time-to-signature for emerging IoCs is a useful operational metric: the time between when a new threat appears in the wild and when a detection signature for it lands in the engine's library. A short time-to-signature means the SOC catches new threats quickly. A long one is exposure.
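Time-to-signature can be tracked with simple timestamp arithmetic. The sketch below assumes the team records, for each threat, when it was first seen in the wild and when the matching signature was deployed. The record layout, the family names, and the dates are all invented for the example.

```python
from datetime import datetime, timezone
from statistics import median

UTC = timezone.utc

# Hypothetical records: (threat label, first seen in the wild, signature deployed).
records = [
    ("family-a", datetime(2024, 3, 1, 8, 0, tzinfo=UTC), datetime(2024, 3, 1, 20, 0, tzinfo=UTC)),
    ("family-b", datetime(2024, 3, 4, 9, 0, tzinfo=UTC), datetime(2024, 3, 6, 9, 0, tzinfo=UTC)),
    ("family-c", datetime(2024, 3, 10, 0, 0, tzinfo=UTC), datetime(2024, 3, 10, 6, 0, tzinfo=UTC)),
]

# Time-to-signature in hours for each record.
hours = [
    (deployed - first_seen).total_seconds() / 3600
    for _, first_seen, deployed in records
]

print(hours)          # [12.0, 48.0, 6.0]
print(median(hours))  # 12.0
print(max(hours))     # 48.0
```

Reporting the worst case alongside the median is a design choice: the median shows the typical gap, while the maximum shows the longest window of exposure.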
Next up
Anomaly-based detection
The opposite of signature-based. Where signature-based detection needs to have seen the threat before, anomaly-based detection only needs to know what normal looks like in this environment.