Schema normalization
Why alert fatigue is a schema problem
Analysts face a flood of alerts from multiple systems, each using distinct field names, timestamp conventions, and nested structures. Without normalization (transforming data into a standard format to improve analysis and comparison):
Signal buried in noise
Critical indicators get lost in irrelevant fields and inconsistent naming.
Investigations slow down
Analysts spend cycles reconciling field formats instead of investigating.
MTTR inflates
Mean time to respond grows linearly with alert volume.
A malicious PowerShell execution detected by an EDR generates an alert with proprietary fields. The related network connection logged by a firewall or cloud proxy appears in a completely different format. Linking them manually requires effort that scales with volume. Volume keeps going up.
The schemas that matter
Three open standards are worth knowing. Implementations differ; the goal is the same.
OCSF
Open Cybersecurity Schema Framework. Vendor-neutral, cloud-friendly, increasingly the default for cross-tool integration. If your platform supports it, prioritize it.
STIX 2.1
Structured Threat Information Expression. Designed for sharing threat intelligence. Strong for indicators and TTPs. Less ergonomic for raw telemetry.
OpenC2
Open Command and Control. Designed for expressing response actions in a portable way. Pairs well with STIX for indicators and OCSF for events.
Core normalization principles
Whatever schema you adopt, normalization maps diverse vendor-specific fields into four buckets:
Detection metadata
Alert name, detection logic, severity, confidence. Timestamps standardized to UTC in ISO 8601.
Affected entities
Hostnames, IPs, user accounts, cloud assets, containers. Anything the alert references.
Primary observables
File hashes, domains, URLs, registry keys, and other indicators of compromise (IoCs). The artifacts to correlate against.
Contextual data
MITRE ATT&CK mappings, enrichment tags, kill-chain stage, threat-intelligence annotations.
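As a minimal sketch of the four buckets, the Python below maps one raw alert into detection metadata, affected entities, primary observables, and contextual data. The raw field names (`RuleName`, `Sev`, `EventTimeEpoch`, and so on) and the output key names are invented for illustration; they do not follow any particular vendor's format or OCSF's actual attribute names.

```python
from datetime import datetime, timezone

# Hypothetical raw EDR alert. Every field name here is an assumption.
RAW_EDR_ALERT = {
    "RuleName": "Suspicious PowerShell",
    "Sev": "HIGH",
    "Confidence": 85,
    "EventTimeEpoch": 1700000000,  # seconds since the Unix epoch
    "Host": "WS-042",
    "User": "jdoe",
    "Sha256": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
    "RemoteDomain": "Evil.Example.com",
    "Technique": "T1059.001",
}


def to_utc_iso8601(epoch_seconds):
    """Convert epoch seconds to an ISO 8601 timestamp in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


def normalize_edr_alert(raw):
    """Map the hypothetical EDR fields into the four buckets."""
    return {
        "detection": {
            "name": raw["RuleName"],
            "severity": raw["Sev"].lower(),
            "confidence": raw["Confidence"],
            "time": to_utc_iso8601(raw["EventTimeEpoch"]),
        },
        "entities": {
            "hosts": [raw["Host"].lower()],
            "users": [raw["User"].lower()],
        },
        "observables": {
            # Lowercase hashes and domains so string comparison works across tools.
            "file_hashes": [raw["Sha256"].lower()],
            "domains": [raw["RemoteDomain"].lower()],
        },
        "context": {
            "attack_techniques": [raw["Technique"]],
        },
    }


normalized = normalize_edr_alert(RAW_EDR_ALERT)
print(normalized["detection"]["time"])  # 2023-11-14T22:13:20+00:00
```

A second source, such as a firewall, would get its own mapping function that emits the same four keys, so downstream logic never needs to know which tool produced the event.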
Normalization also requires flattening nested structures where useful and harmonizing identifiers so automated cross-tool correlation works.
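Flattening and identifier harmonization could look roughly like the sketch below. The helper names (`flatten`, `canonical_host`, `canonical_user`) and the canonical forms they produce are assumptions chosen for illustration; a real deployment would pick rules that suit its own naming conventions.

```python
def flatten(record, parent_key="", sep="."):
    """Collapse nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def canonical_host(name):
    """Reduce a hostname or FQDN to a lowercase short name."""
    return name.strip().lower().split(".")[0]


def canonical_user(name):
    """Reduce DOMAIN\\user or user@domain to a lowercase bare username."""
    name = name.strip().lower()
    if "\\" in name:
        name = name.split("\\", 1)[1]
    if "@" in name:
        name = name.split("@", 1)[0]
    return name


nested = {"device": {"host": {"name": "WS-042.corp.example.com"}}, "sev": 3}
flat = flatten(nested)
print(flat)  # {'device.host.name': 'WS-042.corp.example.com', 'sev': 3}
print(canonical_host(flat["device.host.name"]))  # ws-042
print(canonical_user("CORP\\jdoe"))              # jdoe
```

Dropping the domain part is a deliberate simplification: in environments where the same short name exists in several domains, the canonical form should keep the domain so distinct accounts or hosts are not merged.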
Operational benefits
Holistic visibility. Events from endpoint, network, and cloud correlate into a unified timeline.
Faster response. Standardized structures let SOAR triage, enrich, and escalate at speed.
Advanced analytics. Consistent fields make ML, anomaly detection, and threat hunting possible.
Improved scoring. Standardized observables get reliable threat-intelligence enrichment.
Simpler reporting. Predictable fields streamline dashboards and compliance reports.
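To illustrate the unified-timeline benefit, here is a small sketch that groups already-normalized events by host and orders them by time. The event shape (`source`, `time`, `host`, `summary`) and the sample records are hypothetical; the point is only that once fields agree, correlation reduces to a group-and-sort.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical normalized events from three tools, deliberately out of order.
EVENTS = [
    {"source": "firewall", "time": "2023-11-14T22:13:25+00:00",
     "host": "ws-042", "summary": "outbound connection to evil.example.com"},
    {"source": "edr", "time": "2023-11-14T22:13:20+00:00",
     "host": "ws-042", "summary": "PowerShell execution"},
    {"source": "proxy", "time": "2023-11-14T22:10:00+00:00",
     "host": "ws-107", "summary": "blocked download"},
]


def build_timelines(events):
    """Group events by host, then sort each group chronologically."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["host"]].append(event)
    for host_events in timelines.values():
        host_events.sort(key=lambda e: datetime.fromisoformat(e["time"]))
    return dict(timelines)


timelines = build_timelines(EVENTS)
for event in timelines["ws-042"]:
    print(event["time"], event["source"], event["summary"])
```

For host `ws-042`, the EDR event prints before the firewall event, which is the ordering an analyst would otherwise reconstruct by hand from two differently formatted logs.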
Governance and sustainability
Normalization is not a one-time project. As new data sources arrive, as vendors update their formats, as new schemas emerge, the normalization layer needs maintenance.
Tools that help: Logstash (a server-side pipeline that ingests, transforms, and forwards data; central to the Elastic Stack), Fluentd (an open-source data collector that unifies log collection and processing across distributed systems via plugins), Cribl (data routing and processing for observability pipelines), and the native normalization capabilities in modern SIEMs.
Within ASSURED, prioritize normalization of detection metadata, affected entities, and primary observables. Those are the fields downstream phases depend on.
Next up
Alert working example
Two cases worked side by side through the Alert phase: a real phishing intrusion and a developer-workstation false positive. Both threads continue through every later chapter's example.