Please note: All workshops will be available strictly on-demand. Registrants will have access to the sessions and materials from June 22, 2020 through December 31, 2020.
The amount of digital data we collectively store doubles every two years and is on track to reach 44 trillion GB by 2020. The velocity of data generated each day rises with it. Combined with an estimated 1 million open cybersecurity jobs, this leaves current methodologies and teams at a severe disadvantage from the start. The industry must move to intelligent automation that helps human experts adapt to evolving threats. In this interactive session, experts in data science, ML, and architecture will guide the audience through specific approaches to cybersecurity use cases.
Part 1 – Context-Aware Network Mapping and Asset Classification
Traditional means of network mapping rely on expert knowledge, well-curated databases of network assets, and active internal scanning. Network maps are frequently out of date and often unable to provide the necessary ground-truth data to IT and security teams. We'll show how to leverage NVIDIA RAPIDS and GPU-accelerated data science to learn a network mapping from passively generated logs. We'll discuss how we take this a step further by applying multiple machine learning analytics to the graph to infer asset ownership, classify assets and services on the network, and provide near real-time updates and alerts based on changes to the network topology. We'll explain how near real-time ingest and processing capabilities allow us to visualize the network quickly and provide context to the security professional in a timely manner.
Cyber Use Case Tutorial: Multiclass Classification on IoT Flow Data with XGBoost
- Learn the basics of cyber network data with respect to consumer IoT devices
- Load network data into a cuDF DataFrame
- Explore network data and features
- Use XGBoost to build a classification model
- Evaluate the model
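The evaluation step for a multiclass model can be sketched in plain Python, without a GPU. This is a minimal illustration rather than the tutorial's own code: the device classes and the prediction lists are hypothetical, and in the session the predictions would presumably come from an XGBoost model trained on flow features held in cuDF.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    counts = confusion_counts(y_true, y_pred)
    totals = Counter(y_true)
    return {label: counts[(label, label)] / totals[label] for label in totals}

# Hypothetical IoT device classes and model output (illustrative only).
y_true = ["camera", "camera", "thermostat", "speaker", "speaker", "speaker"]
y_pred = ["camera", "speaker", "thermostat", "speaker", "speaker", "camera"]

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(f"accuracy: {overall:.3f}")
for label, value in sorted(recall.items()):
    print(f"recall[{label}]: {value:.3f}")
```

Per-class recall is reported alongside overall accuracy because device classes in flow data are often imbalanced, and a single accuracy figure can hide a class the model rarely gets right.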
Part 2 – Using the Data You Collect: Cybersecurity Applications with AI
Network defense and cybersecurity applications traditionally rely on heuristics and signatures to protect networks and detect anomalies. Large companies may generate over 10TB of data daily, spread across different sensors and heterogeneous data types. The difficulty of providing timely ingest, feature engineering, feature exploration, and model generation has made signature-based detection the only option. We'll show how to use NVIDIA RAPIDS and GPU acceleration to overcome these obstacles. We'll walk through data engineering steps involving large amounts of heterogeneous data (in both source and format) and explore how GPUs can accelerate feature exploration and hyperparameter selection. This enables more in-house data scientists and information security experts to use internally collected data to generate predictive models for anomaly detection rather than relying on signature-based detection.
Cyber Use Case Tutorial: Network Mapping using AI Techniques
- Parse raw Windows Event Logs using cuDF
- Load netflow data into a cuDF DataFrame
- Map parsed data to network graph edges using cuDF
- Use cuGraph pagerank
- Build a network graph
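The edge-mapping and ranking steps above can be sketched on the CPU in plain Python. This is a stand-in under stated assumptions, not the tutorial's implementation: the flow records and their field names (`src`, `dst`, `bytes`) are hypothetical, log parsing is skipped, and the hand-written power iteration only illustrates what a library routine such as cuGraph's PageRank would compute on the GPU.

```python
from collections import defaultdict

def flows_to_edges(flows):
    """Collapse flow records into unique directed (src, dst) edges."""
    return sorted({(f["src"], f["dst"]) for f in flows})

def pagerank(edges, alpha=0.85, max_iter=100, tol=1e-6):
    """Power-iteration PageRank over a list of directed edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - alpha) / n + alpha * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = alpha * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical netflow records (illustrative addresses and byte counts).
flows = [
    {"src": "10.0.0.2", "dst": "10.0.0.1", "bytes": 1200},
    {"src": "10.0.0.3", "dst": "10.0.0.1", "bytes": 800},
    {"src": "10.0.0.4", "dst": "10.0.0.1", "bytes": 300},
    {"src": "10.0.0.2", "dst": "10.0.0.1", "bytes": 50},   # repeated pair
    {"src": "10.0.0.1", "dst": "10.0.0.2", "bytes": 90},
]

edges = flows_to_edges(flows)
ranks = pagerank(edges)
top_node = max(ranks, key=ranks.get)
print(f"{len(edges)} edges, highest-ranked node: {top_node}")
```

In this toy graph the node that most other hosts talk to ranks highest, which is the intuition behind using PageRank to surface servers and other central assets in a network map.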
Technical Requirements: Internet connectivity and your own laptop. We will provide the GPU environment and data. Attendees do not need detailed technical knowledge of GPUs, but should be familiar with Jupyter notebooks and Python.
Technical Level: High