Spark Configurations
1. Overview

The entry point of a Spark application is where you define configurations and call the internal APIs to create RDDs, DataFrames and Datasets. In Spark 1.x the entry points are SparkContext, SQLContext and HiveContext; the SQL and Hive contexts are built from the main one, SparkContext. In Spark 2.x, SparkSession is the unified entry point, and all three of the above contexts are available through it.

We can specify the configurations in 3 ways:

1. spark-submit command (via runtime configs) - secondary priority
2. Code-level configs (via SparkConf) - highest priority
3. spark-defaults.conf file - least priority
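A minimal sketch of the Spark 2.x entry point (the app name and master below are illustrative placeholders):

```scala
import org.apache.spark.sql.SparkSession

// Spark 2.x unified entry point.
val spark = SparkSession.builder()
  .appName("my-app")
  .master("local[*]")
  .enableHiveSupport()   // old HiveContext functionality (needs Hive classes on the classpath)
  .getOrCreate()

val sc = spark.sparkContext            // the Spark 1.x SparkContext
val df = spark.range(10).toDF("id")    // DataFrame API (the old SQLContext role)
spark.sql("SHOW DATABASES")            // SQL / Hive queries
```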
2. Via spark-submit command
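A sketch of a spark-submit invocation; the main class, jar name and resource values are illustrative placeholders. Any property can be passed with repeated --conf key=value flags, and common resources have dedicated flags:

```
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 29 \
  --executor-cores 5 \
  --executor-memory 19g \
  --conf spark.sql.shuffle.partitions=200 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --class com.example.MyApp \
  my-app.jar
```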
3. Via SparkConf class

SparkConf is a utility class that keeps all Spark-related configurations handy as key/value pairs. It can be passed to the SparkSession builder to create a Spark session. Its constructors:

SparkConf() - default constructor, equivalent to SparkConf(true)
SparkConf(boolean loadDefaults) - true also loads values from Java system properties (any spark.* properties set on the JVM), false does not.
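A minimal sketch; the property values are illustrative. Values set on SparkConf take the highest priority, overriding spark-submit flags and spark-defaults.conf:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Holds configs as key/value pairs; new SparkConf() == new SparkConf(true).
val conf = new SparkConf()
  .setAppName("my-app")
  .setMaster("local[*]")
  .set("spark.sql.shuffle.partitions", "200")

// Pass the conf to the SparkSession builder.
val spark = SparkSession.builder().config(conf).getOrCreate()
```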
4. Via spark-defaults.conf file

By default, spark-submit reads the spark-defaults.conf file if it is present in the SPARK_HOME/conf directory.
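An illustrative SPARK_HOME/conf/spark-defaults.conf; each line is a property name followed by whitespace and its value, and # starts a comment:

```
# Illustrative values only
spark.master                   yarn
spark.executor.memory          4g
spark.executor.cores           2
spark.serializer               org.apache.spark.serializer.KryoSerializer
spark.sql.shuffle.partitions   200
```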
5. Some Important Configurations

Legend (the categories the configurations fall under): Application, Execution, Memory, Compression, Serialization, Resource Allocation. A sketch touching one property per category follows.
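As an illustrative sketch, one real, representative property per category above, set via SparkConf (the values are placeholders):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.app.name", "my-app")                    // Application
  .set("spark.sql.shuffle.partitions", "200")         // Execution
  .set("spark.executor.memory", "4g")                 // Memory
  .set("spark.io.compression.codec", "lz4")           // Compression
  .set("spark.serializer",
    "org.apache.spark.serializer.KryoSerializer")     // Serialization
  .set("spark.dynamicAllocation.enabled", "true")     // Resource Allocation
```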
6. Calculating Spark Job resource values
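As a sketch, here is the commonly cited rule of thumb applied to an assumed example cluster of 10 nodes with 16 cores and 64 GB RAM each (all numbers are illustrative, not prescriptive):

```scala
// Assumed cluster: 10 nodes x 16 cores x 64 GB (illustrative).
val nodes        = 10
val coresPerNode = 16
val memPerNodeGb = 64

// Leave 1 core and 1 GB per node for the OS / Hadoop daemons.
val usableCores = coresPerNode - 1                 // 15
val usableMemGb = memPerNodeGb - 1                 // 63

// ~5 cores per executor is the widely quoted sweet spot for HDFS throughput.
val executorCores    = 5
val executorsPerNode = usableCores / executorCores // 3

// Reserve roughly 7% of executor memory for off-heap overhead
// (spark.executor.memoryOverhead).
val executorMemoryGb = (usableMemGb / executorsPerNode * 0.93).toInt // 19

// Leave one executor slot cluster-wide for the driver / ApplicationMaster.
val numExecutors = nodes * executorsPerNode - 1    // 29
```

These computed values line up with the flags shown in the spark-submit example in section 2.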