About Hadoop Spark Training in Bangalore
Apache Spark Course Content in Bangalore
1.Dive into Scala
1.What is Scala
2.Setup and configuration of Scala
3.Develop and run basic Scala Programs
4.Scala operations
5.Functions and procedures in Scala
6.Different Scala APIs for common operations
7.Loops and collections: Array, Map, List, Tuple
8.Pattern matching for advanced operations
9.Eclipse with Scala
2.Object Oriented and Functional Programming
1.Introduction to object oriented programming
2.Core OOP concepts
3.Constructor, getter, setter, singleton, overloading and overriding
4.Nested Classes, Visibility Rules
5.Functional Structures
6.Functional programming constructs
7.Call by Name, Call by Value
3.Big Data and need for Spark
1.Introduction to Big Data
2.Challenges with old Big Data solutions
3.Batch vs real-time vs in-memory processing
4.MapReduce and its limitations
5.Apache Storm and its limitations
6.Need for a general-purpose solution: Apache Spark
7.Introduction to Apache Spark
8.Internals of architecture and design principles
9.Apache Spark features and characteristics
10.Spark ecosystem components and their roles
4.Deploy Spark
1.Setup Environment
2.Install and configure prerequisites
3.Installation of Spark in local mode
4.Installation of Spark in standalone mode
5.Installation of Spark in YARN mode
6.Installation and configuration of Spark on a real cluster
7.Best practices for Spark deployment
5.Demystify Apache Spark
1.Work on Spark shell
2.Execute Scala and Java statements in the shell
3.Understand SparkContext and driver
4.Read data from the local file system and HDFS
5.Cache the data in memory for further use
6.Distributed persistence
7.Handle streams with Spark Streaming
8.Testing and troubleshooting
6.Basic Abstraction: RDDs
1.What are Spark RDDs
2.How RDDs make Spark a feature-rich framework
3.Transformations, actions and persistence
4.Lazy operations and fault tolerance
5.Load data and create RDD
6.Persist RDDs in memory or on disk
7.Pair RDD operations on key-value data
8.Spark Hadoop Integration
7.Spark Streaming and MLlib
1.Need for stream analytics
2.Comparison with Storm and S4
3.Real-time data processing using streaming
4.Fault tolerance and checkpointing
5.Stateful Stream Processing
6.DStream and window operations
7.Spark Streaming execution flow
8.Connection to various source systems
9.Performance optimizations in Spark
10.Need for machine learning
11.Introduction to machine learning
12.Various Spark libraries
13.Algorithms for clustering, statistical analytics, classification, etc.
8.Spark SQL and GraphX
1.What is Spark SQL
2.Features and Data flow
3.Spark SQL architecture and components
4.Hive and Spark together
5.DataFrames and loading data
6.Hive Queries through Spark
7.Various DDL and DML operations
8.Performance tuning
9.Introduction to Graph
10.Need for a different graph processing engine
11.Graph handling using Spark
12.Real-life Spark project