ERROR SparkContext: Failed to add home/areaapache/software/spark-3.5.2-bin-hadoop3/jars/spark-streaming-kafka-0-10_2.13-3.5.2.jar to Spark environment
File "/home/tuanle/apachearea/software/spark-3.5.2-bin-hadoop3/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o47.load. :
java.lang.NoClassDefFoundError: scala/$less$colon$less
    at org.apache.spark.sql.kafka010.KafkaSourceProvider.org$apache$spark$sql$kafka010$KafkaSourceProvider$$validateStreamOptions(KafkaSourceProvider.scala:338)
    at org.apache.spark.sql.kafka010.KafkaSourceProvider.sourceSchema(KafkaSourceProvider.scala:71)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:233)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:118)
    at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:118)
    at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:36)
    at org.apache.spark.sql.streaming.DataStreamReader.loadInternal(DataStreamReader.scala:169)
    at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:145)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: scala.$less$colon$less
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 20 more
2024-09-26 22:42:24,971 - INFO - Closing down clientserver connection
I don't understand why this happens: I have already downloaded all the required jar files for the matching version and put them in the jars folder, but Spark still cannot find them.
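As far as I can tell, scala/$less$colon$less is the JVM name of Scala 2.13's scala.<:< class, so the _2.13 suffix on the connector jar may simply not match the Scala version of the Spark build (the stock spark-3.5.2-bin-hadoop3 download is built against Scala 2.12, and the Structured Streaming connector is spark-sql-kafka-0-10 rather than spark-streaming-kafka-0-10). Below is a minimal sketch of letting Spark resolve a matching connector itself via spark.jars.packages instead of copying jars by hand; the app name, broker, and topic are placeholders:

from pyspark.sql import SparkSession

# Assumes the stock (Scala 2.12) spark-3.5.2-bin-hadoop3 build; the _2.12
# suffix must match the Scala version Spark was compiled against.
spark = (
    SparkSession.builder
    .appName("kafka-test")  # placeholder app name
    .config("spark.jars.packages",
            "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.2")
    .getOrCreate()
)

# Placeholder broker and topic, only to confirm the connector class loads.
df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "test-topic")
    .load()
)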
For reference, the main loop at the end of my script:

# time, logger, spark, query_cassandra and process_and_save_to_mysql
# are imported/defined earlier in the script.
try:
    while True:
        time.sleep(300)  # wait for 5 minutes between batches
        process_and_save_to_mysql(spark)
except KeyboardInterrupt:
    logger.info("Application interrupted. Stopping streams...")
    query_cassandra.stop()
    spark.stop()
    logger.info("Application stopped successfully")
except Exception as e:
    logger.error(f"An error occurred: {str(e)}")
    query_cassandra.stop()
    spark.stop()
    logger.info("Application stopped due to an error")
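Since query_cassandra appears to be a Structured Streaming query, blocking on it directly is a common alternative to the sleep loop; a sketch reusing the names from the snippet above:

try:
    # Block until the streaming query stops or fails, instead of polling.
    query_cassandra.awaitTermination()
except KeyboardInterrupt:
    logger.info("Application interrupted. Stopping streams...")
    query_cassandra.stop()
    spark.stop()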
I am trying to start the Spark session in a Jupyter Notebook on an EC2 Linux machine through Visual Studio Code. My code looks like this:

import logging
from pyspark.sql import SparkSession
from pyspark.sql.functions import ...
spark = SparkSession.builder.appName("spark_app").getOrCreate()
...
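One quick check that may help pin down a Scala mismatch can be run in the same notebook right after getOrCreate(); note that spark.sparkContext._jvm is PySpark's internal py4j gateway, so this is a debugging aid rather than a public API:

# Prints the Scala version the Spark build was compiled against,
# e.g. "version 2.12.18". The connector jar suffix (_2.12 vs _2.13)
# has to match it.
print(spark.sparkContext._jvm.scala.util.Properties.versionString())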
I run into the above exception when I try to apply a method (ComputeDwt) to an RDD[(Int, ArrayBuffer)] input.
I am even using the extends Serializable option to serialize objects in Spark. Here is the code snippet...
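The snippet itself is elided above, so for illustration only, here is how object serialization is typically switched to Kryo from the PySpark side (this swaps in the spark.serializer config rather than the extends Serializable approach; the app name is a placeholder):

from pyspark.sql import SparkSession

# Illustration, not the original snippet: spark.serializer is a real Spark
# config key; KryoSerializer replaces the default Java serialization.
spark = (
    SparkSession.builder
    .appName("dwt-serialization")  # placeholder app name
    .config("spark.serializer",
            "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)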