Code:

import pydeequ
from pyspark.sql import SparkSession

spark = (SparkSession
    .builder
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate())
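For reference, recent PyDeequ releases seem to resolve deequ_maven_coord from the SPARK_VERSION environment variable at import time, so a minimal sketch of setting that up looks like this (the version string is an assumption and should match the installed PySpark):

Code:

import os
os.environ["SPARK_VERSION"] = "3.4"  # assumption: must match your PySpark version

import pydeequ  # deequ_maven_coord is derived from SPARK_VERSION during import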
Beforehand I imported SparkSession (from pyspark.sql import SparkSession, Row), but getOrCreate() fails with the following error:
Code:

---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
Cell In[17], line 5
1 spark = (SparkSession
2 .builder
3 .config("spark.jars.packages", pydeequ.deequ_maven_coord)
4 .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
----> 5 .getOrCreate())
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyspark\sql\session.py:477, in SparkSession.Builder.getOrCreate(self)
475 sparkConf.set(key, value)
476 # This SparkContext may be an existing one.
--> 477 sc = SparkContext.getOrCreate(sparkConf)
478 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
479 # by all sessions.
480 session = SparkSession(sc, options=self._options)
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyspark\context.py:512, in SparkContext.getOrCreate(cls, conf)
510 with SparkContext._lock:
511 if SparkContext._active_spark_context is None:
--> 512 SparkContext(conf=conf or SparkConf())
513 assert SparkContext._active_spark_context is not None
514 return SparkContext._active_spark_context
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pyspark\context.py:200, in SparkContext.__init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls, udf_profiler_cls, memory_profiler_cls)
...
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:467)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:438)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:515)
... 23 more
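The bottom frames (Shell.checkHadoopHomeInner / Shell.checkHadoopHome) point at the well-known Windows problem where HADOOP_HOME is unset and winutils.exe is missing. A minimal sketch of setting it before creating the session, assuming winutils.exe has been placed under C:\hadoop\bin (the path is an assumption):

Code:

import os

# Assumption: winutils.exe for your Hadoop version lives at C:\hadoop\bin\winutils.exe
os.environ["HADOOP_HOME"] = r"C:\hadoop"
os.environ["PATH"] += os.pathsep + r"C:\hadoop\bin"

# Create the SparkSession only after these variables are set.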
I tried to fix it using:
Code:

import findspark
findspark.init()  # locates the Spark installation via SPARK_HOME and adds pyspark to sys.path
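As far as I can tell, findspark only makes the pyspark package importable by locating the Spark installation; it does not set HADOOP_HOME. It can also be pointed at an explicit directory (the path below is a made-up example):

Code:

import findspark

# Hypothetical install location; replace with the actual Spark directory.
findspark.init(r"C:\spark\spark-3.4.1-bin-hadoop3")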
However, even though this produces no errors, it does not resolve the problem. Thanks in advance.
More details here: https://stackoverflow.com/questions/760 ... rk-session