Dec 26, 2024  
2021-2022 Course Catalog [ARCHIVED CATALOG]

CVF 1071 - Introduction to Big Data Analytics and Security

Credits: 4
Hours/Week: Lecture 3 Lab 1
Course Description: This course provides an introductory overview of Big Data and related security topics, preparing students to participate effectively as practitioners in Big Data and other analytics projects. Students have the opportunity to search, navigate, tag, build alerts, and create simple reports and dashboards with Splunk. The course begins with an introduction to Big Data and the data analytics lifecycle as a way to address business challenges that leverage Big Data. It then provides grounding in basic analytic methods and an introduction to Big Data analytics technology and tools, including MapReduce, Hadoop, and Splunk. The course uses both open-source technology (Hadoop) and commercial technology (Splunk). It is intended for those new to the Big Data field and to the security threat landscape; no prior programming experience or statistics background is required. An EMCDSA (Big Data industry) certification exam is part of this course.
MnTC Goals
None

Prerequisite(s): Course placement into college-level English and Reading OR completion of ENGL 0950 with a grade of C or higher OR completion of RDNG 0940 with a grade of C or higher and qualifying English Placement Exam OR completion of RDNG 0950 with a grade of C or higher and ENGL 0090 with a grade of C or higher OR completion of ESOL 0051 with a grade of C or higher and ESOL 0052 with a grade of C or higher.
Corequisite(s): None
Recommendation: None

Major Content
  1. Introduction to Big Data Analytics
    1. Big Data Overview
    2. State of the Practice in Analytics
    3. The Data Scientist
    4. Big Data Analytics in Industry Verticals 
  2. Data Analytics Lifecycle
    1. Discovery
    2. Data Preparation
    3. Model Planning
    4. Model Building
    5. Communicating Results
    6. Operationalizing
  3. Review of Basic Data Analytic Methods Using R
    1. Using R to Look at Data - Introduction to R
    2. Analyzing and Exploring the Data
    3. Statistics for Model Building and Evaluation 
  4. Analytics - Theory and Methods
    1. K-Means Clustering
    2. Association Rules
    3. Linear Regression
    4. Logistic Regression
    5. Naïve Bayesian Classifier
    6. Decision Trees
    7. Time Series Analysis
    8. Text Analysis
  5. Analytics - Technologies and Tools
    1. Analytics for Unstructured Data - MapReduce and Hadoop
    2. The Hadoop Ecosystem
  6. Big Data Threat
    1. Mapping Threats to Big Data Assets
    2. Incorrect designs / inadequate planning
    3. Identity theft
    4. Malicious activity / software
    5. Legal
    6. Organization
  7. Gap Analysis
    1. Gaps in data protection
    2. Use of cryptography in applications and back-end services
    3. Gaps in computing and storage models
    4. Gaps in roles (administrators, data scientists, and end users)
  8. The Endgame, or Putting it All Together
    1. Introduction to Splunk and the Search app
    2. Running basic searches with Splunk
    3. Search results
    4. Search job control
    5. Time range of a search
  9. Saving Results and Searches
    1. Search results
    2. Saving and sharing search results
    3. Scheduling searches
  10. Using Fields
    1. Fields
    2. Fields in searches
    3. Fields sidebar
  11. Tags and Event Types
    1. Tags
    2. Creating tags and using tags in a search
    3. Event types and their uses
    4. Creating and using event types in a search
  12. Creating Alerts
    1. Alerts
    2. Creating an alert
    3. Fired alerts
  13. Creating Reports
    1. Reports and charts
    2. Creating dashboards and adding reports
    3. Editing dashboards

Learning Outcomes
At the end of this course, students will be able to:

  1. employ the Data Analytics Lifecycle to address Big Data analytics projects.
  2. explain how to structure data analysis and extract value from Big Data.
  3. describe the landscape of Big Data Analytics by exploring several examples of real world problems.
  4. explain the impact of Big Data on data collection, data analysis, data reporting, data monitoring, and data storage.
  5. apply appropriate Splunk analytic techniques and tools to analyze Big Data.
  6. identify the possible problems that are associated with Big Data.
  7. reframe the possible problems that are associated with Big Data as data science questions.
  8. install and run programs using tools such as R, RStudio, and MapReduce/Hadoop.
  9. build alerts and create simple reports and dashboards with Splunk.
  10. identify the threats affecting Big Data.

Competency 1 (1-6)
None
Competency 2 (7-10)
None

