Time: TTh 4:15pm - 5:30pm
Location: Herrin T175
Staff:
- Instructor: Bahman Bahmani (bahman@cs). Office Hours: 5:45 - 6:45pm outside the classroom
- TA: Dima Brezhnev (brezhnev@cs). Office Hours: 1:00 - 3:00pm Fridays at the Huang open area on the bottom floor outside of ICME, plus extra office hours before assignment due dates
Piazza: Link
Description
The massive increase in the rate of novel cyber attacks has made data-mining-based techniques a critical component in detecting security threats. The course covers various applications of data mining in computer and network security. Topics include: overview of the state of information security; malware detection; network and host intrusion detection; web, email, and social network security; authentication and authorization anomaly detection; alert correlation; and potential issues such as privacy and adversarial machine learning.
Prerequisites: Data mining / machine learning at the level of CS 246 or CS 229; familiarity with computer systems and networks at least at the level of CS 110; CS 140 and CS 144 strongly recommended; CS 155 recommended but not required.
- Introduction: Overview of information security, current security landscape, the case for security data mining [pdf]
- Botnets: Botnet topologies, botnet detection using NetFlow analysis [pdf]
- Botnets Cont'd, Insider Threats: Botnet detection using DNS analysis, introduction to insider threats, masquerader detection strategies [pdf]
Readings:
- Behavioral Biometrics: Active authentication using behavioral and cognitive biometrics [pdf]
Reading: Ch 4 + Ch 6 of "Behavioral Biometrics: A Remote Access Approach" by Kenneth Revett (2008).
- Behavioral Biometrics Cont'd: Mouse dynamics analysis for active authentication [pdf]
- Security at Wells Fargo: Guest speaker Avi Avivi, VP Enterprise Information Security Architecture at Wells Fargo [pdf]
- Behavioral Biometrics Cont'd: Mouse dynamics analysis cont'd, touch and swipe pattern analysis for mobile active authentication [pdf]
- Web Security: Web threat detection via web server log analysis [pdf]
- Security at Union Bank: Guest speaker Gary Lorenz, Chief Information Security Officer (CISO) and Managing Director at MUFG Union Bank
- Multi-Classifier Systems, Adversarial Machine-Learning: Overview of multi-classifier systems (MCS), advantages of MCS in security analytics, security of machine learning [pdf]
- Security Data Mining at Google: Guest speaker Massimiliano Poletto, head of Google Security Monitoring Tools group [pdf]
- Web Security Cont'd, Deep Packet Inspection: Alert aggregation for web security, packet payload modeling for network intrusion detection [pdf]
- Machine Learning for Security: Challenges in applying machine learning (ML) to security, guidelines for applying ML to security [pdf]
- Polymorphism: Polymorphic blending attacks, infeasibility of modeling polymorphic attacks [pdf]
- Deep Packet Inspection Cont'd: One-class multi-classifier systems, one-class MCS for packet payload modeling and network intrusion detection [pdf]
Note to students: Please also refer to class notes for mathematical derivations of one-class MCS fusion rules.
- Phishing Detection: Phishing email detection, phishing website detection [pdf]
- Industry Perspectives: Q&A with guest speaker Michael Fey, EVP and CTO of Intel Security Group (aka McAfee)
- Student Presentations: [pdf]
- Student Presentations Cont'd: [pdf]
- Automatic Alert Correlation, Final Thoughts: Building attack scenarios from individual alerts, course review, current and future trends in security [pdf]
First homework: Google Doc. It is due on 10/21. Submission instructions will be posted closer to the due date.
Second homework: Google Doc. It is due on the night of 11/5.
Third homework: Google Doc. It is due on the Friday before Thanksgiving break. Note that this assignment requires you to sign up for a presentation before 10/14.
Course Review/Fourth homework: Google Doc. Due Friday 12/12 noon. Early submissions are appreciated.
These titles are available for free online through the Stanford library resources.
- Applications of Data Mining in Computer Security
Daniel Barbara and Sushil Jajodia
- Machine Learning and Data Mining for Computer Security
Marcus A. Maloof
- Enhancing Computer Security with Smart Technology
V Rao Vemuri
- Insider Attack and Cyber Security: Beyond the Hacker
S. Stolfo, S. Bellovin, S. Hershkop, A. Keromytis, S. Sinclair, S. Smith
- Network Anomaly Detection: A Machine Learning Perspective
Dhruba K. Bhattacharyya, Jugal K. Kalita
- Data Warehousing and Data Mining Techniques for Cyber Security
Anoop Singhal
- Crimeware: Understanding New Attacks and Defenses
Markus Jakobsson and Zulfikar Ramzan
- The Art of Computer Virus Research and Defense
Peter Szor
- Introduction: Introduction to Information Security, Introduction to Data Mining for Information Security
- Malware Detection: Obfuscation, Polymorphism, Payload-based detection of worms, Botnet detection/takedown
- Network Intrusion Detection: Signature-based solutions (Snort, etc.), Data-mining-based solutions (supervised and unsupervised), Deep packet inspection
- Host Intrusion Detection: Analysis of shell command sequences, system call sequences, and audit trails, Masquerader/Impersonator/Insider threat detection
- Web Security: Anomaly detection of web-based attacks using web server logs, Anomaly detection in web proxy logs
- Email: Spam detection, Phishing detection
- Social network security: Detecting compromised accounts, detecting social network spam
- Authentication: Anomaly detection of Single Sign-On (Kerberos, Active Directory), Detecting Pass-the-Hash and Pass-the-Ticket attacks
- Automated correlation: Attack trees, Building attack scenarios from individual alerts
- Issues: Privacy issues, Adversarial machine learning (use of machine learning by attackers, how to make ML algorithms robust/secure against adversaries)
- Other potential topics: Fraud detection, IoT/Infrastructure security, Mobile/Wireless security
There will be four homework assignments. Students will design and implement data mining algorithms for various security applications taught in class. Each assignment will have a significant programming component, along with a reading component (mostly research literature) that gives students initial pointers on the problems in the programming component. Assignments will be drawn from the following topics:
- Web attack detection
- User profiling for authentication and authorization
- Network profiling and intrusion detection
- Botnet detection
- Host-based insider threat detection
- Deep packet inspection
- Web proxy log analysis
- Algorithmic alert correlation