Training Workshop: Spark and Python for Big Data


Big Data

  1. Introduction to Big Data
  2. Apache Hadoop Architecture
     • HDFS
     • YARN
     • MapReduce


  1. Big Data Definition
  2. Introduction to Apache Spark
  3. Installation and Setup
  4. Spark cluster mode overview
  5. Spark localhost view


Spark RDD

  1. Spark RDD Basics
  2. RDD operations
  3. Spark lazy transformations
  4. Spark fault tolerance

Spark DataFrame

  1. Spark DataFrame Basics
  2. Spark DataFrame Operations
  3. DataFrame Group By
  4. DataFrame Aggregate Functions
  5. Missing Data
  6. Dates and Timestamps

Machine Learning in Spark

  1. Introduction to Machine Learning
  2. Regression Algorithms
  3. Classification Algorithms
  4. Clustering Algorithms
  5. Natural Language Processing