Parsing alerts
This matters because adversaries hide their intent inside the very fields you would parse. An EDR alert might report powershell.exe executing with an encoded command parameter. Without parsing, the underlying command (and the attacker's intent) stays obscured.
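As a minimal sketch of the powershell.exe case, the snippet below pulls the payload that follows an encoded-command flag and decodes it. PowerShell's -EncodedCommand parameter takes Base64 over a UTF-16LE string. The function names are hypothetical, and the flag matching (any prefix of "encodedcommand", plus "-ec") is a heuristic assumption that does not handle quoted or split payloads.

```python
import base64


def find_encoded_payload(command_line):
    """Return the token following an encoded-command flag, or None."""
    tokens = command_line.split()
    for flag, value in zip(tokens, tokens[1:]):
        if not flag.startswith(("-", "/")):
            continue
        name = flag[1:].lower()
        # Heuristic: accept commonly seen abbreviations such as -e, -ec, -enc.
        if name and (name == "ec" or "encodedcommand".startswith(name)):
            return value
    return None


def decode_alert_command(command_line):
    """Decode the hidden script in a PowerShell command line, or return None."""
    payload = find_encoded_payload(command_line)
    if payload is None:
        return None
    try:
        # -EncodedCommand carries Base64 over a UTF-16LE string.
        return base64.b64decode(payload, validate=True).decode("utf-16-le")
    except ValueError:
        # Covers invalid Base64 and byte sequences that are not valid UTF-16LE.
        return None
```

Returning None instead of raising keeps the original alert usable when decoding fails, so the raw command line can still travel downstream.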
What parsing has to handle
Modern SOCs ingest logs in everything from legacy syslog (a standard protocol for message logging in network devices and systems) all the way to cloud-native JSON. Parsing frameworks have to normalize that into queryable, structured data, and they have to do it while the adversary is actively trying to hide.
Obfuscated scripts
PowerShell or Python encoded to dodge plain-text matching.
Fileless malware
Payloads that live in memory with minimal on-disk footprint.
Living off the land
Native binaries executing malicious operations under their legitimate signature.
Partial metadata
Alerts with truncated command lines, missing parent process information, or incomplete context.
Good parsing decodes these patterns while preserving the context downstream phases need.
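One way to preserve context when metadata is partial is to record the gaps explicitly rather than drop them silently. The sketch below is illustrative only: the field names, the 1024-character limit, and the trailing "..." marker are all assumptions, since real sensors truncate and name fields differently.

```python
# Hypothetical field names; map these to whatever your alert source provides.
REQUIRED_FIELDS = ("command_line", "parent_process", "process_path")

# Assumed truncation limit; actual limits vary by product and configuration.
MAX_COMMAND_LENGTH = 1024


def find_metadata_gaps(alert):
    """Return a list of labels describing missing or truncated fields."""
    gaps = []
    for name in REQUIRED_FIELDS:
        if not alert.get(name):
            gaps.append("missing:" + name)
    command_line = alert.get("command_line") or ""
    # Heuristic: an ellipsis suffix or a length at the limit suggests truncation.
    if command_line.endswith("...") or len(command_line) >= MAX_COMMAND_LENGTH:
        gaps.append("truncated:command_line")
    return gaps
```

Attaching the returned labels to the parsed alert lets later phases tell "no parent process" apart from "parent process not reported".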
The six parsing surfaces
Field extraction is the floor. Advanced parsing reconstructs adversary behavior across six telemetry surfaces (telemetry being the collection and transmission of security-relevant data from remote sources for monitoring and analysis). Each gets its own page.
Multi-format standardization
Windows Event Logs, syslog, CEF, JSON. The four formats parsing has to live with, and how to normalize them.
Command line analysis
Decode the true intent of process execution. Base64, concatenation, variable expansion. Plus how to read execution context.
Process relationships
Parent-child chains and lineage reconstruction. Recognize macro-execution chains, persistence, hollowing, token theft.
Network correlation
Tie host activity to outbound traffic. C2 detection through timing, TLS, DNS, and exfiltration indicators.
File system activity
Persistence, payload deployment, ransomware staging, exfiltration prep. Tracking the file lifecycle and metadata.
Schema normalization
Unified shape for heterogeneous alerts. OCSF, STIX, OpenC2. Governance and sustainability.
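To make the idea of a unified shape concrete, here is a rough sketch that maps a CEF record and a JSON record onto the same dictionary layout. The output keys are an illustrative assumption, not OCSF or any other published schema, and the JSON input keys (vendor, product, name, severity) are hypothetical. The CEF handling follows the pipe-delimited header layout but treats escaping only approximately.

```python
import json
import re


def normalize_cef(line):
    """Map a CEF record (optionally behind a syslog prefix) to the common shape."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF record")
    # Split on pipes that are not escaped with a backslash; 7 header fields + extension.
    parts = re.split(r"(?<!\\)\|", line[start:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    header = [part.replace("\\|", "|") for part in parts[:7]]
    extension = parts[7] if len(parts) == 8 else ""
    # Simplified key=value parsing: a value runs until the next " key=" or the end.
    fields = dict(re.findall(r"(\w+)=(.*?)(?=\s+\w+=|\s*$)", extension))
    return {
        "source_format": "cef",
        "vendor": header[1],
        "product": header[2],
        "event_name": header[5],
        "severity": header[6],
        "fields": fields,
    }


def normalize_json(line):
    """Map a JSON record with hypothetical top-level keys to the common shape."""
    record = json.loads(line)
    known = ("vendor", "product", "name", "severity")
    return {
        "source_format": "json",
        "vendor": record.get("vendor"),
        "product": record.get("product"),
        "event_name": record.get("name"),
        "severity": str(record["severity"]) if "severity" in record else None,
        "fields": {k: v for k, v in record.items() if k not in known},
    }


def normalize(line):
    """Dispatch on a cheap format check, then normalize."""
    stripped = line.lstrip()
    if stripped.startswith("{"):
        return normalize_json(stripped)
    if "CEF:" in stripped:
        return normalize_cef(stripped)
    raise ValueError("unrecognized format")
```

Keeping unmapped keys under "fields" avoids losing source-specific detail while the shared keys stay queryable across formats.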
Three categories of information from every alert
Whatever surface you are parsing, the output should populate three buckets:
Who saw it, when, how confident, and what category it fits.
- Alert ID and timestamp for unique identification and temporal sequencing.
- Source system and detection mechanism (which engine fired, under what logic).
- Severity and confidence to guide triage priority.
- Category and classification (privilege escalation, lateral movement, exfiltration).
The investigative substance.
- Affected systems and accounts (endpoints, servers, identities involved).
- Network information (IPs, ports, protocols, session IDs).
- File and process information (paths, hashes, command-line parameters).
- Registry and configuration changes (persistence, policy mods, drift).
The organizational lens.
- Business impact (what does this potentially disrupt).
- Regulatory implications (HIPAA, PCI DSS, GDPR triggers).
- Historical correlation (does this fit a recurring pattern or campaign).
- Threat intelligence (known IoCs, adversary TTPs, campaign signatures).
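The three groups above could be carried as one structured record. The sketch below uses Python dataclasses. All class and field names are assumptions chosen for illustration and do not come from any particular schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class DetectionMetadata:
    """Who saw it, when, how confident, and how it is classified."""
    alert_id: str
    timestamp: datetime
    source_system: str
    detection_mechanism: str
    severity: str
    confidence: float
    category: str


@dataclass
class TechnicalDetails:
    """The investigative substance."""
    hosts: List[str] = field(default_factory=list)
    accounts: List[str] = field(default_factory=list)
    network: List[Dict[str, str]] = field(default_factory=list)
    processes: List[Dict[str, str]] = field(default_factory=list)
    registry_changes: List[str] = field(default_factory=list)


@dataclass
class BusinessContext:
    """The organizational lens."""
    business_impact: Optional[str] = None
    regulatory_flags: List[str] = field(default_factory=list)
    related_alert_ids: List[str] = field(default_factory=list)
    threat_intel_matches: List[str] = field(default_factory=list)


@dataclass
class ParsedAlert:
    """One parsed alert, with every group present even when empty."""
    metadata: DetectionMetadata
    technical: TechnicalDetails = field(default_factory=TechnicalDetails)
    business: BusinessContext = field(default_factory=BusinessContext)
```

Only the detection metadata is required at construction time here, on the assumption that technical and organizational details are often filled in by later enrichment.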
Next up
Multi-format standardization
Start with the formats. Windows Event Logs, syslog, CEF, JSON, and the data quality patterns that follow from each.