By the end of this lab, you will be able to:
- Understand the structure and contents of the CIC-IDS2017 dataset
- Use Wireshark to analyze PCAP files from the dataset
- Use Splunk to ingest and query the labeled flow data
- Differentiate between benign traffic and various attack types (Brute Force, DoS, Web Attacks)
This lab will provide hands-on experience with the CIC-IDS2017 dataset, a comprehensive intrusion detection dataset. You will learn to identify and analyze various types of network attacks using this dataset.
The CIC-IDS2017 dataset contains benign traffic alongside common, contemporary attacks and is designed to resemble real-world network data. It includes:
- PCAPs: Raw packet captures of network traffic
- Labeled Flow Data: CSV files with extracted flow features and attack labels
- Attack Types: Brute Force, DoS, DDoS, Web Attacks, Infiltration, Botnet
- Duration: 5 days of network activity (Monday to Friday)
- Total Size: Approximately 8GB
Dataset Location: /home/ubuntu/soc-training-program/Datasets/CIC-IDS2017/
- Wireshark installed on your analysis machine
- Splunk instance running (from Week 9 lab)
- At least 20GB of free disk space
- Basic understanding of network protocols (TCP, UDP, HTTP, DNS)
Estimated time: approximately 4-5 hours
The dataset is available from the Canadian Institute for Cybersecurity (CIC).
- Navigate to the dataset directory:
  cd /home/ubuntu/soc-training-program/Datasets/CIC-IDS2017/
- Read the README file for download instructions:
  cat README.md
- Download the dataset from the official source:
  - Official URL: https://www.unb.ca/cic/datasets/ids-2017.html
  - Download both the PCAPs and GeneratedLabelledFlows directories
- Verify the download:
  ls -lh
The CIC-IDS2017 dataset is organized by day:
CIC-IDS2017/
├── Monday-WorkingHours.pcap       # Benign traffic only
├── Tuesday-WorkingHours.pcap      # FTP-Patator, SSH-Patator
├── Wednesday-workingHours.pcap    # DoS attacks, Heartbleed
├── Thursday-WorkingHours.pcap     # Web attacks, Infiltration
├── Friday-WorkingHours.pcap       # Botnet, DDoS
└── GeneratedLabelledFlows/
    ├── Monday-WorkingHours.pcap_ISCX.csv
    ├── Tuesday-WorkingHours.pcap_ISCX.csv
    ├── Wednesday-workingHours.pcap_ISCX.csv
    ├── Thursday-WorkingHours.pcap_ISCX.csv
    └── Friday-WorkingHours.pcap_ISCX.csv
Attack Schedule:
| Day | Time Period | Attack Type |
|---|---|---|
| Monday | All day | Benign (Normal) traffic only |
| Tuesday | 9:20 AM - 10:20 AM | FTP-Patator (Brute Force) |
| Tuesday | 2:00 PM - 3:00 PM | SSH-Patator (Brute Force) |
| Wednesday | 9:47 AM - 10:10 AM | DoS GoldenEye |
| Wednesday | 10:14 AM - 10:35 AM | DoS Hulk |
| Wednesday | 10:43 AM - 11:00 AM | DoS Slowhttptest |
| Wednesday | 11:10 AM - 11:23 AM | DoS Slowloris |
| Wednesday | 3:00 PM - 4:00 PM | Heartbleed |
| Thursday | 9:20 AM - 10:00 AM | Web Attack - Brute Force |
| Thursday | 10:15 AM - 10:35 AM | Web Attack - XSS |
| Thursday | 10:40 AM - 10:42 AM | Web Attack - SQL Injection |
| Thursday | 2:30 PM - 3:30 PM | Infiltration |
| Friday | 10:00 AM - 11:00 AM | Botnet ARES |
| Friday | 3:00 PM - 4:00 PM | DDoS LOIT |
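The later exercises isolate each attack by comparing frame.time against the window from this schedule. As a minimal sketch, the helper below builds that display filter string from a day and a time range. The function name `window_filter` and the `CAPTURE_DATES` mapping are illustrative; the Tuesday to Thursday dates match the filters used in this lab, and the Friday date is inferred from the same week.

```python
from datetime import datetime

# Capture dates. Tuesday to Thursday match the filters used in this lab;
# Friday (2017-07-07) is an inference from the same capture week.
CAPTURE_DATES = {
    "Tuesday": "2017-07-04",
    "Wednesday": "2017-07-05",
    "Thursday": "2017-07-06",
    "Friday": "2017-07-07",
}


def window_filter(day, start, end, protocol=None):
    """Build a Wireshark display filter for one attack window.

    `start` and `end` are 24-hour "HH:MM" strings.
    """
    date = CAPTURE_DATES[day]
    # Validate the times early so typos fail here rather than in Wireshark.
    for value in (start, end):
        datetime.strptime(value, "%H:%M")
    time_part = (
        f'frame.time >= "{date} {start}:00" && frame.time <= "{date} {end}:00"'
    )
    return f"{protocol} && {time_part}" if protocol else time_part


# FTP-Patator window from the schedule (9:20 AM - 10:20 AM on Tuesday).
print(window_filter("Tuesday", "09:20", "10:20", protocol="ftp"))
```

Wireshark typically interprets these timestamps in the analysis machine's local time zone, so the windows may need shifting if that differs from the time zone of the capture.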
- Navigate to the GeneratedLabelledFlows directory:
  cd GeneratedLabelledFlows/
- View the first few lines of a CSV file:
  head -20 Monday-WorkingHours.pcap_ISCX.csv
- Count the number of flows (the result includes one header line):
  wc -l Monday-WorkingHours.pcap_ISCX.csv
CSV File Structure:
The CSV files contain 79 features extracted from each network flow, including:
- Flow identifiers: Source IP, Destination IP, Source Port, Destination Port, Protocol
- Flow statistics: Duration, packet counts, byte counts
- Timing features: Flow IAT (Inter-Arrival Time), Packet IAT
- Flag counts: FIN, SYN, RST, PSH, ACK, URG, CWE, ECE
- Packet length statistics: Mean, Std, Max, Min
- Label: The attack type or "BENIGN"
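To make the structure concrete, here is a minimal sketch that reads a flow CSV with Python's standard library and counts flows per label. The helper names `read_flows` and `label_counts` are illustrative. The header stripping reflects an assumption that some column names carry leading spaces in the published files, which is worth verifying against your copy. The sample rows are made up.

```python
import csv
import io
from collections import Counter


def read_flows(fileobj):
    """Yield one dict per flow, with whitespace stripped from column names."""
    reader = csv.reader(fileobj)
    header = [name.strip() for name in next(reader)]
    for row in reader:
        if len(row) != len(header):
            continue  # skip blank or malformed lines
        yield dict(zip(header, row))


def label_counts(flows):
    """Count flows per value of the Label column."""
    return Counter(flow["Label"] for flow in flows)


# Tiny in-memory sample standing in for a real CSV (values are made up).
sample = io.StringIO(
    "Source IP, Destination Port, Label\n"
    "10.0.0.5,80,BENIGN\n"
    "10.0.0.5,443,BENIGN\n"
    "10.0.0.9,21,FTP-Patator\n"
)
print(label_counts(read_flows(sample)))
```

For a real file, pass an open file object instead of the sample, for example one opened with `newline=""` and an explicit encoding such as `latin-1` if the default encoding fails on a label.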
Objective: Establish a baseline of normal network activity.
- Open Monday's PCAP file in Wireshark (run this from the CIC-IDS2017 directory, where the PCAP files are stored):
  wireshark Monday-WorkingHours.pcap &
- Analyze Protocol Distribution:
  - Go to Statistics → Protocol Hierarchy
  - Document the percentage of each protocol (HTTP, HTTPS, DNS, etc.)
- Identify Top Talkers:
  - Go to Statistics → Conversations
  - Sort by "Bytes" to find the most active hosts
  - Document the top 5 IP addresses and their traffic volume
- Examine HTTP Traffic:
  - Apply filter: http
  - Analyze typical HTTP requests
  - Document common User-Agents and requested URLs
- Examine DNS Traffic:
  - Apply filter: dns
  - Analyze DNS query patterns
  - Document frequently queried domains
Questions to Answer:
- What is the most common protocol in benign traffic?
- What are the top 5 destination ports?
- What is the average packet size?
- What User-Agents are present in HTTP traffic?
- Are there any unusual patterns in the benign traffic?
Objective: Import and analyze benign flow data in Splunk.
- Import CSV into Splunk:
  - Log in to your Splunk instance
  - Go to Settings → Add Data → Upload
  - Upload Monday-WorkingHours.pcap_ISCX.csv
  - Set source type to csv
  - Create a new index called cic_ids_2017
- Basic Splunk Queries:
  Query 1: Count total flows
  index=cic_ids_2017 source="*Monday*" | stats count
  Query 2: Count flows by label
  index=cic_ids_2017 source="*Monday*" | stats count by Label
  Query 3: Top source IPs
  index=cic_ids_2017 source="*Monday*" | stats count by "Source IP" | sort -count | head 10
  Query 4: Protocol distribution
  index=cic_ids_2017 source="*Monday*" | stats count by Protocol
  Query 5: Average flow duration
  index=cic_ids_2017 source="*Monday*" | stats avg("Flow Duration") as avg_duration
- Create a Baseline Dashboard:
  - Create a new dashboard called "CIC-IDS2017 Baseline"
  - Add the following panels:
    - Protocol distribution (pie chart)
    - Top source IPs (bar chart)
    - Flow duration over time (line chart)
    - Packet size distribution (histogram)
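The baseline queries above can be cross-checked outside Splunk. The sketch below computes the same figures (total flows, top source IPs, protocol mix, average flow duration) from flow records held as dicts. The function name `baseline_summary` is illustrative, and it assumes column names with surrounding whitespace already stripped; the sample values are made up.

```python
from collections import Counter
from statistics import mean


def baseline_summary(flows, top_n=10):
    """Summarize flows: total count, top source IPs, protocol mix, mean duration."""
    flows = list(flows)
    durations = [float(f["Flow Duration"]) for f in flows]
    return {
        "total_flows": len(flows),
        "top_source_ips": Counter(f["Source IP"] for f in flows).most_common(top_n),
        "protocols": Counter(f["Protocol"] for f in flows),
        "avg_flow_duration": mean(durations) if durations else 0.0,
    }


# Made-up records standing in for rows of Monday's CSV.
sample = [
    {"Source IP": "10.0.0.1", "Protocol": "6", "Flow Duration": "100"},
    {"Source IP": "10.0.0.1", "Protocol": "17", "Flow Duration": "300"},
    {"Source IP": "10.0.0.2", "Protocol": "6", "Flow Duration": "200"},
]
print(baseline_summary(sample))
```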
Deliverable: Screenshot of your Splunk dashboard showing benign traffic analysis.
Attack Description: FTP-Patator is a brute force attack tool that attempts to guess FTP credentials by trying multiple username/password combinations.
- Open Tuesday's PCAP in Wireshark:
  wireshark Tuesday-WorkingHours.pcap &
- Filter for FTP traffic during the attack window:
  ftp && frame.time >= "2017-07-04 09:20:00" && frame.time <= "2017-07-04 10:20:00"
- Analyze the attack:
  - Identify the attacker's IP address
  - Count the number of FTP login attempts
  - Examine the FTP responses (530 Login incorrect)
  - Look for successful logins (230 Login successful)
- Follow an FTP stream:
  - Right-click on an FTP packet → Follow → TCP Stream
  - Observe the brute force attempts
Wireshark Questions:
- What is the attacker's IP address?
- What is the target's IP address?
- How many FTP login attempts were made?
- What usernames were tried?
- Were any login attempts successful?
- Filter for SSH traffic during the attack window:
  ssh && frame.time >= "2017-07-04 14:00:00" && frame.time <= "2017-07-04 15:00:00"
- Analyze SSH connection attempts:
  - Look for multiple SSH handshakes from the same source
  - Identify failed authentication attempts
  - Calculate the rate of connection attempts
Splunk Analysis:
- Import Tuesday's CSV into Splunk
- Detect FTP Brute Force:
  index=cic_ids_2017 source="*Tuesday*" Label="FTP-Patator"
  | stats count by "Source IP", "Destination IP"
  | sort -count
- Detect SSH Brute Force:
  index=cic_ids_2017 source="*Tuesday*" Label="SSH-Patator"
  | stats count by "Source IP", "Destination IP"
  | sort -count
- Create a detection rule that does not rely on the label (the parentheses make the grouping of the two port conditions explicit):
  index=cic_ids_2017 ("Destination Port"=21 OR "Destination Port"=22)
  | stats count by "Source IP", "Destination IP", "Destination Port"
  | where count > 100
  | eval attack_type=if('Destination Port'=21, "FTP Brute Force", "SSH Brute Force")
- Save as alert: Save this search as an alert that triggers when count > 100
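The logic of the detection rule can also be expressed in a few lines of Python, which is handy for testing thresholds offline. This is a sketch under the same assumptions as the rule (ports 21 and 22, more than 100 flows per source and destination pair). The function name `brute_force_candidates` is illustrative, and the input is assumed to be dicts keyed by stripped column names.

```python
from collections import Counter

# Destination ports watched by the rule, mapped to the label it assigns.
PORT_NAMES = {"21": "FTP Brute Force", "22": "SSH Brute Force"}


def brute_force_candidates(flows, threshold=100):
    """Return source/destination pairs with more than `threshold` flows to port 21 or 22."""
    counts = Counter(
        (f["Source IP"], f["Destination IP"], f["Destination Port"].strip())
        for f in flows
        if f["Destination Port"].strip() in PORT_NAMES
    )
    return [
        {
            "source": src,
            "destination": dst,
            "attack_type": PORT_NAMES[port],
            "flows": n,
        }
        for (src, dst, port), n in counts.most_common()
        if n > threshold
    ]
```

A fixed threshold of 100 is a starting point only; comparing it against Monday's benign baseline gives a better sense of how many FTP or SSH flows a normal host pair generates.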
Deliverable:
- Wireshark screenshots showing brute force attempts
- Splunk query results identifying the attacks
- A written analysis of the attack patterns
Attack Description: GoldenEye is an HTTP DoS tool that floods the target server with legitimate-looking HTTP requests.
- Open Wednesday's PCAP in Wireshark:
  wireshark Wednesday-workingHours.pcap &
- Filter for the attack window:
  frame.time >= "2017-07-05 09:47:00" && frame.time <= "2017-07-05 10:10:00"
- Analyze HTTP traffic:
  - Apply filter: http
  - Go to Statistics → HTTP → Requests
  - Identify the target web server
  - Count the request rate
- Examine packet patterns:
  - Look for high packet rates from specific IPs
  - Analyze HTTP request headers
  - Check for randomized User-Agents
Splunk Analysis:
- Import Wednesday's CSV into Splunk
- Detect DoS GoldenEye:
  index=cic_ids_2017 source="*Wednesday*" Label="DoS GoldenEye"
  | timechart span=1m count by "Source IP"
- Calculate flows per second (each event is one flow record, so this counts flows rather than packets):
  index=cic_ids_2017 source="*Wednesday*" Label="DoS GoldenEye"
  | bin _time span=1s
  | stats count as flows_per_second by _time, "Source IP"
  | stats avg(flows_per_second) as avg_fps, max(flows_per_second) as max_fps by "Source IP"
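The binning step in these queries can be sketched in Python as follows. The helpers `flows_per_minute` and `rate_summary` are illustrative names. They take already-parsed `datetime` objects, because the timestamp format in the CSV is not specified here and parsing it is left to the caller.

```python
from collections import Counter
from datetime import datetime


def flows_per_minute(timestamps):
    """Count flows in each one-minute bin; `timestamps` are datetime objects."""
    return Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)


def rate_summary(per_minute):
    """Return (average, maximum) flows per minute over the non-empty bins."""
    if not per_minute:
        return 0.0, 0
    values = list(per_minute.values())
    return sum(values) / len(values), max(values)


# Made-up flow start times inside the GoldenEye window.
sample = [
    datetime(2017, 7, 5, 9, 47, 1),
    datetime(2017, 7, 5, 9, 47, 30),
    datetime(2017, 7, 5, 9, 48, 0),
]
print(rate_summary(flows_per_minute(sample)))
```

Only bins that contain at least one flow contribute to the average, which matches how the stats-based query behaves but overstates the rate during quiet periods.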
Attack Description: Slowloris is a low-bandwidth DoS attack that keeps many connections to the target server open by sending partial HTTP requests very slowly.
- Filter for Slowloris timeframe:
  frame.time >= "2017-07-05 11:10:00" && frame.time <= "2017-07-05 11:23:00"
- Analyze connection patterns:
  - Go to Statistics → Conversations → TCP
  - Look for many long-lived connections from the same source
  - Examine incomplete HTTP requests
Splunk Analysis:
index=cic_ids_2017 source="*Wednesday*" Label="DoS Slowloris"
| stats avg("Flow Duration") as avg_duration, count by "Source IP"
| where avg_duration > 10000
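The same per-source aggregation can be sketched in Python to experiment with the duration threshold. The function name `long_lived_sources` is illustrative. The default of 10000 simply mirrors the query above. The unit of Flow Duration is not stated in this lab (flow exporters commonly report microseconds), so treat the threshold as something to tune after checking the data.

```python
from collections import defaultdict


def long_lived_sources(flows, min_avg_duration=10000):
    """Return {source_ip: (avg_duration, flow_count)} for sources above the threshold."""
    durations = defaultdict(list)
    for f in flows:
        durations[f["Source IP"]].append(float(f["Flow Duration"]))
    result = {}
    for src, values in durations.items():
        avg = sum(values) / len(values)
        if avg > min_avg_duration:
            result[src] = (avg, len(values))
    return result
```

Running it on Monday's benign flows first shows how many ordinary hosts already exceed the threshold, which helps in choosing a value that separates Slowloris from normal long-lived sessions.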
Deliverable:
- Comparison table of different DoS attack characteristics
- Splunk dashboard showing attack timelines
- Detection rules for each DoS type
Attack Description: SQL Injection attempts to manipulate database queries through web application input fields.
- Open Thursday's PCAP in Wireshark:
  wireshark Thursday-WorkingHours.pcap &
- Filter for HTTP traffic during attack:
  http && frame.time >= "2017-07-06 10:40:00" && frame.time <= "2017-07-06 10:42:00"
- Examine HTTP requests:
  - Right-click on HTTP packets → Follow → HTTP Stream
  - Look for SQL keywords in URLs or POST data:
    - ' OR '1'='1
    - UNION SELECT
    - DROP TABLE
    - -- (SQL comment)
- Extract malicious payloads:
  - Go to File → Export Objects → HTTP
  - Save suspicious requests for analysis
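Once request lines or POST bodies have been saved, a short script can flag the SQL keywords listed above. The sketch below URL-decodes the text and checks it against one regular expression per keyword. The pattern set and the name `sqli_indicators` are illustrative rather than a complete signature list, and the `--` pattern in particular will also match harmless text.

```python
import re
from urllib.parse import unquote_plus

# One pattern per indicator named in this exercise (illustrative, not exhaustive).
SQLI_PATTERNS = {
    "tautology": re.compile(r"'\s*or\s*'1'\s*=\s*'1", re.IGNORECASE),
    "union_select": re.compile(r"\bunion\s+select\b", re.IGNORECASE),
    "drop_table": re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    "comment": re.compile(r"--"),  # noisy on its own; best combined with others
}


def sqli_indicators(raw_request):
    """Return the names of the SQL injection patterns found in a request string."""
    decoded = unquote_plus(raw_request)  # handles %27 and '+' as space
    return [name for name, pattern in SQLI_PATTERNS.items() if pattern.search(decoded)]


# Made-up request path with a URL-encoded payload.
print(sqli_indicators("/item.php?id=1%27+OR+%271%27%3D%271"))
```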
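Splunk Analysis (the wildcard stands in for the dash in the label, which is often mis-encoded when the CSV is imported):
index=cic_ids_2017 source="*Thursday*" Label="Web Attack*Sql Injection"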
| stats count by "Source IP", "Destination IP"
| sort -count
Attack Description: Cross-Site Scripting (XSS) attempts to inject malicious scripts into web pages.
- Filter for XSS timeframe:
  http && frame.time >= "2017-07-06 10:15:00" && frame.time <= "2017-07-06 10:35:00"
- Look for XSS patterns:
  - <script> tags in URLs or POST data
  - JavaScript event handlers (onclick, onerror, etc.)
  - Encoded payloads (%3Cscript%3E)
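The three XSS patterns above can be checked programmatically once the requests are exported as text. This sketch decodes percent-encoding first, so that `%3Cscript%3E` is seen as `<script>`, and then looks for script tags and a handful of event handlers. The handler list and the name `xss_indicators` are illustrative choices, not a full catalogue.

```python
import re
from urllib.parse import unquote

XSS_PATTERNS = {
    # Opening script tag, allowing whitespace after "<".
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    # A few common inline event handlers (illustrative subset).
    "event_handler": re.compile(
        r"\bon(?:click|error|load|mouseover)\s*=", re.IGNORECASE
    ),
}


def xss_indicators(raw_request):
    """Return the names of the XSS patterns found after URL-decoding the request."""
    decoded = unquote(raw_request)
    return [name for name, pattern in XSS_PATTERNS.items() if pattern.search(decoded)]


# Made-up request path carrying an encoded script tag.
print(xss_indicators("/search?q=%3Cscript%3Ealert(1)%3C/script%3E"))
```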
Splunk Detection Rule (the wildcards stand in for the dash in each label, which is often mis-encoded when the CSV is imported):
index=cic_ids_2017 source="*Thursday*"
(Label="Web Attack*XSS" OR Label="Web Attack*Sql Injection" OR Label="Web Attack*Brute Force")
| stats count by Label, "Source IP"
| sort -count
Deliverable:
- List of extracted malicious payloads
- Screenshots of XSS and SQL injection attempts
- Splunk alert for web attack detection
Create a comprehensive Splunk dashboard that displays:
- Overview Panel:
  - Total flows analyzed
  - Benign vs. malicious traffic ratio
  - Attack type distribution
- Timeline Panel:
  - Attacks over time (by day and hour)
- Top Attackers Panel:
  - Source IPs with most malicious flows
  - Geographic location (if available)
- Attack Type Breakdown:
  - Count of each attack type
  - Pie chart visualization
- Target Analysis:
  - Most targeted destination IPs
  - Most targeted ports
Sample Dashboard Query:
index=cic_ids_2017
| eval attack_category=case(
    Label="BENIGN", "Benign",
    Label LIKE "%Patator%" OR Label LIKE "%Brute Force%", "Brute Force",
    Label LIKE "%DDoS%", "DDoS",
    Label LIKE "%DoS%", "Denial of Service",
    Label LIKE "%Web Attack%", "Web Attack",
    Label="Infiltration", "Infiltration",
    Label="Bot", "Botnet",
    1=1, "Other"
)
| stats count by attack_category
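In a first-match classification like this case() expression, rule order matters: the string "DDoS" contains "DoS", so the DDoS rule has to be tested before the DoS rule or it never fires. The Python sketch below shows the same mapping with that ordering. The function name `attack_category` is illustrative, and the "Patator" rule is an added assumption so that the FTP-Patator and SSH-Patator labels land in the brute force category.

```python
# Labels matched exactly.
EXACT = {"BENIGN": "Benign", "Infiltration": "Infiltration", "Bot": "Botnet"}

# Substring rules, checked in order; the first match wins.
SUBSTRING_RULES = [
    ("Patator", "Brute Force"),      # assumption: FTP-Patator, SSH-Patator
    ("Brute Force", "Brute Force"),
    ("DDoS", "DDoS"),                # must precede "DoS", which it contains
    ("DoS", "Denial of Service"),
    ("Web Attack", "Web Attack"),
]


def attack_category(label):
    """Map a flow label to a coarse attack category."""
    label = label.strip()
    if label in EXACT:
        return EXACT[label]
    for needle, category in SUBSTRING_RULES:
        if needle in label:
            return category
    return "Other"


for example in ("DDoS", "DoS Hulk", "FTP-Patator", "Heartbleed"):
    print(example, "->", attack_category(example))
```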
Submit the following:
- Lab Report (PDF or Markdown):
  - Answers to all questions in each exercise
  - Screenshots of Wireshark analysis
  - Screenshots of Splunk queries and results
  - Analysis of each attack type
- Splunk Dashboard:
  - Export your dashboard as XML
  - Include screenshots of the dashboard
- Detection Rules:
  - Document all Splunk queries you created
  - Explain the logic behind each detection rule
- IOC List:
  - Compile a list of all malicious IP addresses
  - Document attack signatures and patterns
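For the IOC list, the labeled flow data already identifies which source IPs generated malicious flows. A minimal sketch for collecting them is shown below. The function name `malicious_sources` is illustrative, and the input is assumed to be flow records as dicts with stripped column names.

```python
from collections import defaultdict


def malicious_sources(flows):
    """Return {source_ip: sorted list of attack labels seen from that IP}."""
    seen = defaultdict(set)
    for f in flows:
        label = f["Label"].strip()
        if label != "BENIGN":
            seen[f["Source IP"]].add(label)
    return {ip: sorted(labels) for ip, labels in seen.items()}
```

Source addresses in flow records can include NAT or firewall addresses rather than the true attacker host, so it is worth confirming each entry against the corresponding PCAP before adding it to the report.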
- Completeness: Did you complete all exercises?
- Technical Accuracy: Are your analyses correct?
- Detection Rules: Are your Splunk queries effective?
- Documentation: Is your report well-organized and detailed?
- Dashboard: Is your dashboard informative and well-designed?
- Create a machine learning model to classify traffic as benign or malicious
- Write a Python script to automatically parse the CSV files and generate statistics
- Compare the CIC-IDS2017 dataset with other datasets (UNSW-NB15, CTU-13)
- Develop custom Wireshark filters for each attack type
After completing this lab, you will have hands-on experience analyzing real-world network attacks. In the next module, you will learn how to use a SIEM to automate the detection of these attacks and create correlation rules to identify complex attack patterns.