**I am providing all the prerequisites in the VS Code IDE to run the PySpark context.**
I am using JDK 17, PySpark 3.5, and Python 3.11.
%pip install findspark
import findspark
findspark.init()  # automatically sets SPARK_HOME and initializes PySpark
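If findspark resolves the wrong installation, it can also be pointed at the unpacked Spark distribution explicitly. This is only a minimal sketch, assuming Spark lives under C:\spark as shown in the SPARK_HOME output further down:

import findspark

# pass the Spark root explicitly instead of relying on auto-detection
# (C:\spark is taken from the echo %SPARK_HOME% output below; adjust if different)
findspark.init("C:\\spark")

# findspark.find() returns the SPARK_HOME it actually resolved
print(findspark.find())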
I have given the error trace below:
TypeError Traceback (most recent call last)
Cell In[11], line 2
1 from pyspark.sql import SparkSession
----> 2 spark1 = SparkSession.builder.appName('Basics').getOrCreate()
File c:\Users\ajitj\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyspark\sql\session.py:559, in SparkSession.Builder.getOrCreate(self)
556 sc = SparkContext.getOrCreate(sparkConf)
557 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
558 # by all sessions.
--> 559 session = SparkSession(sc, options=self._options)
560 else:
561 module = SparkSession._get_j_spark_session_module(session._jvm)
File c:\Users\ajitj\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyspark\sql\session.py:635, in SparkSession.__init__(self, sparkContext, jsparkSession, options)
631 jSparkSessionModule = SparkSession._get_j_spark_session_module(self._jvm)
633 if jsparkSession is None:
634 if (
--> 635 jSparkSessionClass.getDefaultSession().isDefined()
636 and not jSparkSessionClass.getDefaultSession().get().sparkContext().isStopped()
637 ):
638 jsparkSession = jSparkSessionClass.getDefaultSession().get()
639 jSparkSessionModule.applyModifiableSettings(jsparkSession, options)
TypeError: 'JavaPackage' object is not callable
I expect to get a session and run the following code.
# importing module
import pyspark
# importing SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession
# creating a SparkSession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list of students data
data = [["1", "sravan", "vignan"], ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"], ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"], ["5", "gnanesh", "iit"]]
# specify column names
columns = ['student ID', 'student NAME', 'college']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
print("Actual data in dataframe")
# show dataframe
dataframe.show()
Every time I get stuck here with this code, even after installing Java, PySpark, and Hadoop and setting the Python version. Why is the Spark session not created? There is nothing wrong with the code. Running on Windows 11 with the VS Code IDE.
H:\Codes\Python\Pyspark>echo %JAVA_HOME%
C:\Program Files\Java\jdk-17
H:\Codes\Python\Pyspark>echo %SPARK_HOME%
C:\spark

H:\Codes\Python\Pyspark>python --version
Python 3.11.0
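For completeness, the same values can be checked from inside Python, together with which pyspark package the interpreter actually imports. This is only a diagnostic sketch using the standard library and the pip-installed pyspark; a mismatch between pyspark.__version__ and the Spark distribution under SPARK_HOME is one situation where the "'JavaPackage' object is not callable" error is commonly reported.

import os
import pyspark

# which pyspark installation is imported, and its version (expected 3.5.x here)
print(pyspark.__file__)
print(pyspark.__version__)

# the environment as the Python process sees it
print(os.environ.get("JAVA_HOME"))
print(os.environ.get("SPARK_HOME"))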
More details here: https://stackoverflow.com/questions/797 ... ilder-appn