# Day 29: SOC Thinking with Linux Pipelines, Pivots, and Process Chains
## Goal
Build a stronger SOC analyst mental model using Linux command-line workflows by learning how to turn raw output into evidence through:
- pivots (investigation anchors)
- filtering, extraction, and grouping
- parent → child process relationships
- process-name vocabulary (especially common Windows admin tools and LOLBins)
I also practiced this thinking using a fake endpoint process log lab with `awk`, `sort`, and `uniq`.
## What I Did
### Reinforced the core SOC idea: raw events are material, not the answer
A key idea that became clearer today:
Raw events are narrative fragments.
Counting and grouping transforms them into evidence.
To analyze data effectively, I focused on three questions:
- What part matters? Which field answers the question.
- How do I isolate it? Filter noise and extract relevant fields.
- How do I detect patterns? Count, group, and compare behavior.
This thinking directly connects Linux pipelines with SOC triage workflows.
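The three questions above can be walked through as one small pipeline. This is a sketch against a hypothetical pipe-delimited mini-log (`host|user|process`) invented for illustration; the file name and sample rows are not from the real lab:

```shell
# Hypothetical mini-log with fields host|user|process (invented sample data)
cat > /tmp/events.log <<'EOF'
web01|alice|ssh
web01|bob|curl
db01|alice|ssh
web01|alice|ssh
EOF

# 1) What part matters?       -> the process field ($3)
# 2) How do I isolate it?     -> awk -F'|' extracts only that field
# 3) How do I detect patterns? -> sort | uniq -c | sort -nr ranks by frequency
awk -F'|' '{print $3}' /tmp/events.log | sort | uniq -c | sort -nr
```

The last command prints each process with its count, most frequent first, which is exactly the "counting and grouping turns fragments into evidence" step.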
### Learned what a "pivot" means in investigations
A pivot is a value used to move an investigation forward across events.
Common pivots include:
- IP address
- username
- hostname
- process name
- parent process
- command line
- domain
- file hash
- URL path
- event result or status code
Two clarifications that helped:
- hash: a fingerprint of a file's contents, used to identify known malware or known-good files
- status/result: an outcome field such as success, failure, or an HTTP status code
Without pivots, logs are just text.
With pivots, they become investigation paths.
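A minimal sketch of pivoting in practice, assuming a made-up auth log with fields `timestamp|ip|user|result` (the file, field layout, and values are invented for this example):

```shell
# Synthetic auth log: timestamp|ip|user|result (invented for this example)
cat > /tmp/auth.log <<'EOF'
09:01|10.0.0.5|root|failure
09:02|10.0.0.5|root|failure
09:03|10.0.0.5|root|success
09:04|10.0.0.9|alice|success
EOF

# Pivot on one value (the IP 10.0.0.5) to pull every related event,
# then read the result field: repeated failures ending in a success
grep -F '10.0.0.5' /tmp/auth.log | awk -F'|' '{print $1, $4}'
```

The same pattern works for any pivot in the list above: anchor on the value, collect its events, then read the surrounding fields.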
### Studied process names and their investigative value
Process names are clues, not conclusions.
Examples studied: `ssh`, `bash`, `powershell.exe`, `cmd.exe`, `curl`, `python`, `rundll32.exe`, `mshta.exe`, `wmic.exe`, `certutil.exe`, `wget`, `nc`.
Many attackers abuse legitimate tools known as LOLBins (Living Off the Land Binaries): legitimate system utilities used for malicious actions.
Seeing `powershell.exe` alone means nothing.
Context matters:
- parent process
- command line
- user
- host
- timestamp
- follow-on activity
### Built a working vocabulary strategy
Instead of memorizing thousands of process names, I started grouping them by function.
Example categories:
- admin tools
- scripting engines and runtimes
- browsers and Office apps (important parent processes)
- LOLBins
- security tools
This reduces cognitive load and improves pattern recognition.
### Learned why parent → child relationships matter
Parent-child process relationships are one of the fastest ways to spot suspicious behavior.
Example comparison: `powershell.exe` on its own might be normal, but `winword.exe → powershell.exe` is far more suspicious.
Examples of suspicious combinations studied:
- `winword.exe → powershell.exe`
- `excel.exe → cmd.exe`
- `outlook.exe → mshta.exe`
- `chrome.exe → powershell.exe`
- `msedge.exe → mshta.exe`
- `w3wp.exe → cmd.exe`
- `mshta.exe → powershell.exe`
Follow-up investigation should check:
- command line arguments
- user context
- host type
- process path
- execution time
- subsequent processes
- network activity
- baseline frequency
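Suspicious pairs like these can be surfaced directly from a process log. A sketch, assuming a hypothetical `time|host|user|parent|child` layout and invented sample rows:

```shell
# Synthetic process log: time|host|user|parent|child (invented layout)
cat > /tmp/proc.log <<'EOF'
10:00|pc1|alice|explorer.exe|winword.exe
10:01|pc1|alice|winword.exe|powershell.exe
10:02|pc2|bob|explorer.exe|chrome.exe
EOF

# Flag Office parents spawning shells; the match is a lead, not a verdict,
# and the printed time/host/user are the context to pivot on next
awk -F'|' '$4 ~ /^(winword|excel|outlook)\.exe$/ && $5 ~ /^(powershell|cmd)\.exe$/ \
  {print $1, $2, $3, $4 " -> " $5}' /tmp/proc.log
```

The output hands you the follow-up checklist above for free: it shows which host, user, and time to investigate next.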
### Built a fake endpoint process log lab
To practice analysis, I created a synthetic log file called `proc_events.log`.
The file contained pipe-delimited fields:
- timestamp
- host
- user
- parent process
- child process
- command line
Correct parsing required `awk -F'|'`.
Important lesson: do not assume spaces are delimiters, especially when command lines contain spaces.
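The delimiter lesson can be shown in two lines. The event below is invented, but it has the same shape as the lab's records, with a command-line field that contains spaces:

```shell
# One synthetic event whose command-line field contains spaces
line='10:00|pc1|alice|winword.exe|powershell.exe|powershell.exe -nop -w hidden'

# Splitting on spaces grabs a fragment of the command line, not a field
echo "$line" | awk '{print $2}'        # prints "-nop" -- wrong
# Splitting on the real delimiter returns the intended field
echo "$line" | awk -F'|' '{print $5}'  # prints "powershell.exe"
```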
### Practiced SOC-style analysis pipelines
Using Linux pipelines, I answered investigative questions such as:
- Which parent → child combinations appear most often?
- Which processes appear rarely?
- Which hosts execute suspicious commands?
- Which users trigger unusual activity?
Example pipeline pattern:

```shell
awk -F'|' '{print $4 " -> " $5}' proc_events.log | sort | uniq -c | sort -nr
```
Other exercises included:
- filtering suspicious process names
- grouping by host
- grouping by user
- identifying URLs inside command lines
- finding rare processes
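Sketches of those other exercises, run against a few invented rows in the lab's assumed field order (`time|host|user|parent|child|cmdline`); the hosts, users, and URL are made up:

```shell
# Synthetic rows in the lab's field order: time|host|user|parent|child|cmdline
cat > /tmp/proc_events.log <<'EOF'
10:00|pc1|alice|explorer.exe|chrome.exe|chrome.exe
10:01|pc1|alice|explorer.exe|chrome.exe|chrome.exe
10:02|pc2|bob|winword.exe|certutil.exe|certutil.exe -urlcache http://x.test/a.dll
10:03|pc2|bob|explorer.exe|chrome.exe|chrome.exe
EOF

# Rare child processes: ascending counts put the outlier first
awk -F'|' '{print $5}' /tmp/proc_events.log | sort | uniq -c | sort -n

# Events grouped by host
awk -F'|' '{print $2}' /tmp/proc_events.log | sort | uniq -c

# URLs hiding inside command lines
grep -oE 'https?://[^ ]+' /tmp/proc_events.log
```

The same three shapes (rank rarest, group by a field, extract a pattern) cover most of the exercise list.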
### Reconstructed attack narratives
By examining suspicious chains in the synthetic dataset, I could reconstruct realistic attack sequences.
Example attack story:
- Office application launches PowerShell
- reconnaissance commands run (`whoami`, `ipconfig`)
- payload download using `certutil`
- DLL execution via `rundll32`
- persistence established with `reg.exe` and `schtasks.exe`
This made the phrase "raw events are narrative fragments" feel very real.
## Key Cybersecurity Connections
### Linux pipelines mirror SIEM queries
Linux command-line workflows use the same logic as SIEM queries:
filter → extract → group → rank → interpret
Practicing pipelines builds the same reasoning required for SOC analysis.
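One way to see the mirror: each stage of that chain maps onto one command in a pipeline. A sketch against an invented web log (`ip|method|path|status`); only the interpretation step stays human:

```shell
# Synthetic web log: ip|method|path|status (invented for illustration)
cat > /tmp/web.log <<'EOF'
10.0.0.1|GET|/index.html|200
10.0.0.2|GET|/admin|403
10.0.0.2|GET|/admin|403
10.0.0.2|GET|/admin|200
EOF

grep -F '/admin' /tmp/web.log |  # filter: only events touching /admin
  awk -F'|' '{print $1, $4}' |   # extract: client IP and status code
  sort | uniq -c |               # group: count identical (ip, status) pairs
  sort -nr                       # rank: most frequent first
# interpret (the human step): repeated 403s then a 200 from one IP is a lead
```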
### Process triage depends on context
Suspicious tools alone are not enough to confirm malicious behavior.
Signal comes from:
- parent → child relationship
- command-line arguments
- user role
- host type
- baseline frequency
- follow-on activity
### Pivots drive investigations
Pivots allow analysts to follow activity across logs:
- IP
- host
- user
- process
- domain
- hash
- result
Without pivots, analysts only read logs.
With pivots, they investigate behavior.
## Challenges
### Process vocabulary is still developing
Many Windows process names are still unfamiliar, which slows interpretation.
### Avoiding name-based conclusions
It is easy to treat scary-looking process names as proof of compromise.
Reminder:
name = clue
context = verdict
### Parsing discipline
Incorrect assumptions about log structure can produce incorrect results.
Always inspect:
- delimiter
- field positions
- log format
before parsing.
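A quick pre-parse inspection habit, sketched against an invented file (the delimiter guess here is `|`; swap in whatever the real log uses):

```shell
# Synthetic log to inspect (invented contents)
cat > /tmp/unknown.log <<'EOF'
10:00|pc1|alice|bash
10:01|pc2|bob|curl
EOF

# 1) Look at raw lines before assuming anything about the format
head -n 2 /tmp/unknown.log
# 2) Check that every line yields the same field count for the guessed
#    delimiter; a single number in the output means the guess is consistent
awk -F'|' '{print NF}' /tmp/unknown.log | sort -u
```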
## What I Learned
### Technical
- how SOC pivots work
- why parent process and command line matter more than names
- how common Windows utilities are abused
- how to parse pipe-delimited logs with `awk`
- how to construct analysis pipelines
### Analytical mindset
- raw events are not evidence
- grouping reveals patterns
- rare events often contain valuable signals
- process chains reveal activity narratives
## Next Steps
- Repeat the fake process log lab without notes
- Run a challenge round on the same dataset
- Expand process-name vocabulary (5 processes per day)
- Practice additional log formats:
  - authentication logs
  - web access logs
  - JSON logs (`jq`)
- Begin mapping suspicious process chains to detection rules
## Reflection
Today felt like a major SOC foundations day.
The key realization was that Linux command-line skills are not just terminal tricks; they are a method for transforming raw event data into evidence.
Grouping process names into functional categories also made endpoint telemetry easier to interpret.
This type of groundwork directly supports future work in:
- detection engineering
- SOC triage
- incident response
- SIEM query building
## Lessons Learned
### What worked
- grouping processes by function instead of memorizing names
- using parent → child relationships for fast context
- practicing on synthetic logs
- repeating the filter → isolate → group workflow
### What broke
- unfamiliarity with many Windows process names
- risk of treating process names as conclusions
### Why it broke
- early stage of building SOC vocabulary
- endpoint telemetry requires context-rich thinking
### Fix / takeaway
- build a working vocabulary of common processes
- always inspect log format before parsing
- treat process names as starting points, not conclusions
