Terminal output when running spark-submit:

Code:
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows
PS C:\Spark\spark-3.5.1-bin-hadoop3> spark-submit --properties-file C:\Spark\spark-3.5.1-bin-hadoop3\conf\spark-defaults.conf 'C:\Users\JainRonit\OneDrive - STCO\Desktop\Personal\Study\Coding\Pyspark\02-Spark-First-Project\HelloSpark.py'
24/07/26 11:13:49 INFO SparkContext: Running Spark version 3.5.1
24/07/26 11:13:49 INFO SparkContext: OS info Windows 11, 10.0, amd64
24/07/26 11:13:49 INFO SparkContext: Java version 11.0.23
24/07/26 11:13:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/07/26 11:13:50 ERROR SparkContext: Error initializing SparkContext.
java.lang.Exception: spark.executor.extraJavaOptions is not allowed to set Spark options (was '-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark'). Set them directly on a SparkConf or in a properties file when using ./bin/spark-submit.
at org.apache.spark.SparkConf.$anonfun$validateSettings$4(SparkConf.scala:525)
at org.apache.spark.SparkConf.$anonfun$validateSettings$4$adapted(SparkConf.scala:521)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:521)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:410)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:834)
24/07/26 11:13:50 INFO SparkContext: SparkContext is stopping with exitCode 0.
24/07/26 11:13:50 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
File "C:\Users\JainRonit\OneDrive - STCO\Desktop\Personal\Study\Coding\Pyspark\02-Spark-First-Project\HelloSpark.py", line 13, in
.getOrCreate()
^^^^^^^^^^^^^
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\sql\session.py", line 497, in getOrCreate
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 515, in getOrCreate
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 203, in __init__
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 296, in _do_init
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 421, in _initialize_context
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\java_gateway.py", line 1587, in __call__
File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.Exception: spark.executor.extraJavaOptions is not allowed to set Spark options (was '-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark'). Set them directly on a SparkConf or in a properties file when using ./bin/spark-submit.
at org.apache.spark.SparkConf.$anonfun$validateSettings$4(SparkConf.scala:525)
at org.apache.spark.SparkConf.$anonfun$validateSettings$4$adapted(SparkConf.scala:521)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:521)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:410)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:834)
24/07/26 11:13:50 INFO ShutdownHookManager: Shutdown hook called
24/07/26 11:13:50 INFO ShutdownHookManager: Deleting directory C:\Users\JainRonit\AppData\Local\Temp\spark-0326d309-090a-4a5f-af13-d7fe347ab38d
My HelloSpark.py:

Code:
from pyspark.sql import *
from pyspark import SparkConf
from lib.logger import Log4j

# conf = SparkConf()
# conf.set("spark.executor.extraJavaOptions", "-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=hello-spark")

if __name__ == "__main__":
    spark = SparkSession.builder \
        .appName("Hello Spark") \
        .master("local[3]") \
        .getOrCreate()

    logger = Log4j(spark)

    logger.info("Starting HelloSpark")
    # your processing code
    logger.info("Finished HelloSpark")

    # spark.stop()
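The commented-out SparkConf lines are what I originally tried. Since the error text says "Set them directly on a SparkConf or in a properties file", here is a minimal sketch of what I think a valid SparkConf-based setup would look like, assuming the -Dspark.yarn.app.container.log.dir token is what trips the validation. Note that app.container.log.dir is just a hypothetical rename I made up; my log4j.properties would have to reference the same name.

Code:
from pyspark.sql import SparkSession
from pyspark import SparkConf

if __name__ == "__main__":
    conf = SparkConf()
    # Assumption: no token here may start with "-Dspark", because
    # SparkConf.validateSettings() throws for any such token in
    # spark.executor.extraJavaOptions. "app.container.log.dir" is a
    # hypothetical rename of "spark.yarn.app.container.log.dir";
    # log4j.properties must be updated to match it.
    conf.set("spark.executor.extraJavaOptions",
             "-Dlog4j.configuration=file:log4j.properties "
             "-Dapp.container.log.dir=app-logs "
             "-Dlogfile.name=HelloSpark")

    spark = SparkSession.builder \
        .config(conf=conf) \
        .appName("Hello Spark") \
        .master("local[3]") \
        .getOrCreate()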
My spark-defaults.conf contains the line:

Code:
spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark
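Looking at the stack trace, the exception comes from SparkConf.validateSettings (SparkConf.scala:525), which seems to reject any token in spark.executor.extraJavaOptions that begins with -Dspark, and my -Dspark.yarn.app.container.log.dir=app-logs token matches that pattern. If that reading is right, a renamed property like the one below should pass validation (again, app.container.log.dir is just a placeholder name I chose, not an official Spark property):

Code:
spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties -Dapp.container.log.dir=app-logs -Dlogfile.name=HelloSpark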
Java version: 11.0.23

I tried to run my code after setting these defaults in spark-defaults.conf, but I hit the error above when executing it.

More details here: https://stackoverflow.com/questions/787 ... rk-options