Course information

Time TTh 4:15pm - 5:30pm
Location Herrin T175
Instructor Bahman Bahmani bahman@cs Office Hours: 5:45 - 6:45pm outside the classroom
TA Dima Brezhnev brezhnev@cs Office Hours: 1:00 - 3:00pm Fridays @ Huang open area on bottom floor outside of ICME + extra office hours before assignment due dates
Piazza Link

Description The massive increase in the rate of novel cyber attacks has made data-mining-based techniques a critical component in detecting security threats. The course covers various applications of data mining in computer and network security. Topics include: an overview of the state of information security; malware detection; network and host intrusion detection; web, email, and social network security; authentication and authorization anomaly detection; alert correlation; and potential issues such as privacy and adversarial machine learning. Prerequisites: Data mining / machine learning at the level of CS 246 or CS 229; familiarity with computer systems and networks at least at the level of CS 110; CS 140 and CS 144 strongly recommended; CS 155 recommended but not required.


  1. Introduction: Overview of information security, current security landscape, the case for security data mining [pdf]
  2. Botnets: Botnet topologies, botnet detection using NetFlow analysis [pdf]
  3. Botnets Cont'd, Insider Threats: Botnet detection using DNS analysis, introduction to insider threats, masquerader detection strategies [pdf]
  4. Behavioral Biometrics: Active authentication using behavioral and cognitive biometrics [pdf]
    Reading: Ch 4 + Ch 6 of "Behavioral Biometrics, A Remote Access Approach" by Kenneth Revett (2008).
  5. Behavioral Biometrics Cont'd: Mouse dynamics analysis for active authentication [pdf]
  6. Security at Wells Fargo: Guest speaker Avi Avivi, VP Enterprise Information Security Architecture at Wells Fargo [pdf]
  7. Behavioral Biometrics Cont'd: Mouse dynamics analysis cont'd, touch and swipe pattern analysis for mobile active authentication [pdf]
  8. Web Security: Web threat detection via web server log analysis [pdf]
  9. Security at Union Bank: Guest speaker Gary Lorenz, Chief Information Security Officer (CISO) and Managing Director at MUFG Union Bank
  10. Multi-Classifier Systems, Adversarial Machine-Learning: Overview of multi-classifier systems (MCS), advantages of MCS in security analytics, security of machine learning [pdf]
  11. Security Data Mining at Google: Guest speaker Massimiliano Poletto, head of Google Security Monitoring Tools group [pdf]
  12. Web Security Cont'd, Deep Packet Inspection: Alert aggregation for web security, packet payload modeling for network intrusion detection [pdf]
  13. Machine Learning for Security: Challenges in applying machine learning (ML) to security, guidelines for applying ML to security [pdf]
  14. Polymorphism: Polymorphic blending attacks, infeasibility of modeling polymorphic attacks [pdf]
  15. Deep Packet Inspection Cont'd: One-class multi-classifier systems, one-class MCS for packet payload modeling and network intrusion detection [pdf]
    Note to students: Please also refer to the class notes for the mathematical derivations of the one-class MCS fusion rules.
  16. Phishing Detection: Phishing email detection, phishing website detection [pdf]
  17. Industry Perspectives: Q&A with guest speaker Michael Fey, EVP and CTO of Intel Security Group (aka McAfee)
  18. Student Presentations: [pdf]
  19. Student Presentations Cont'd: [pdf]
  20. Automatic Alert Correlation, Final Thoughts: Building attack scenarios from individual alerts, course review, current and future trends in security [pdf]


First homework: Google Doc. It is due on 10/21. Submission instructions will be posted closer to the due date.

Second homework: Google Doc. It is due on the night of 11/5.

Third homework: Google Doc. It is due on the Friday before Thanksgiving break. Note that this assignment requires you to sign up for a presentation before 10/14.

Course Review/Fourth homework: Google Doc. It is due on Friday 12/12 at noon. Early submissions are appreciated.

Recommended Readings

These titles are available for free online through the Stanford library resources.



There will be 4 homework assignments. Students will design and implement data mining algorithms for various security applications taught in class. Each assignment will have a significant programming component, as well as a reading component (mostly research literature) that gives students initial pointers on the problems in the programming component. Assignments will be drawn from the following topics:

  1. Web attack detection
  2. User profiling for authentication and authorization
  3. Network profiling and intrusion detection
  4. Botnet detection
  5. Host-based insider threat detection
  6. Deep packet inspection
  7. Web proxy log analysis
  8. Algorithmic alert correlation