I'm trying to train a simple RNN model with Keras on TensorFlow and I'm getting an error that is probably linked to the graphics card, though I'm not sure.
The error is intermittent: the same code often runs without a hitch, and at other times it fails as shown below, and I don't understand why. I've tried varying the number of units in the single RNN layer and the number of batches, but to no avail. I've also tried installing the libraries mentioned in the error messages, without success. Could this be due to the way tensorflow-gpu was installed?
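One way to narrow down the installation question might be to print what TensorFlow itself reports about its build and the GPUs it can see. The sketch below is only a diagnostic under assumptions: the helper name `gpu_install_report` is made up for illustration, and the `find_spec` guard is there only so that the snippet also runs on a machine without TensorFlow.

```python
import importlib.util


def gpu_install_report():
    """Collect basic facts about the TensorFlow build and the visible GPUs."""
    # Guard so the snippet does not crash where TensorFlow is missing.
    if importlib.util.find_spec("tensorflow") is None:
        return {"tensorflow_installed": False}

    import tensorflow as tf

    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        # False here would point at a CPU-only build.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list means TensorFlow cannot see any GPU.
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


for key, value in gpu_install_report().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is true and a GPU is listed, a broken installation seems less likely, which would fit the fact that the same code sometimes runs correctly.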
Code:
learning_rate = 0.001
model = keras.models.Sequential()
model.add(keras.layers.InputLayer(input_shape=(120, 6)))
model.add(keras.layers.SimpleRNN(50))
model.add(keras.layers.Dense(120))
model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate), loss="mse")
model.summary()

epochs = 5
model.fit(dataset_train, epochs=epochs, validation_data=dataset_val)

Epoch 1/5
2024-03-09 10:46:27.854106: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:185] failed to create cublas handle: the library was not initialized
2024-03-09 10:46:27.854155: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:188] Failure to initialize cublas may be due to OOM (cublas needs some free memory when you initialize it, and your deep-learning framework may have preallocated more than its fair share), or may be because this binary was not built with support for the GPU in your machine.
2024-03-09 10:46:27.854177: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at matmul_op_impl.h:817 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
2024-03-09 10:46:27.854223: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:185] failed to create cublas handle: the library was not initialized
2024-03-09 10:46:27.854231: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:188] Failure to initialize cublas may be due to OOM (cublas needs some free memory when you initialize it, and your deep-learning framework may have preallocated more than its fair share), or may be because this binary was not built with support for the GPU in your machine.
2024-03-09 10:46:27.854242: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at matmul_op_impl.h:817 : INTERNAL: Attempting to perform BLAS operation using StreamExecutor without BLAS support
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
Cell In[33], line 3
      1 epochs = 5
----> 3 model.fit(dataset_train,
      4           epochs=epochs,
      5           validation_data=dataset_val)

File ~/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/Tensorflow/tf-gpu/lib/python3.11/site-packages/tensorflow/python/eager/execute.py:53, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     51 try:
     52   ctx.ensure_initialized()
---> 53   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     54                                       inputs, attrs, num_outputs)
     55 except core._NotOkStatusException as e:
     56   if name is not None:

InternalError: Graph execution error:

Detected at node sequential_7/simple_rnn_7/while/simple_rnn_cell/MatMul_1 defined at (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel_launcher.py", line 17, in <module>
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/traitlets/config/application.py", line 1075, in launch_instance
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/kernelapp.py", line 739, in start
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/tornado/platform/asyncio.py", line 205, in start
  File "/usr/lib/python3.11/asyncio/base_events.py", line 607, in run_forever
  File "/usr/lib/python3.11/asyncio/base_events.py", line 1922, in _run_once
  File "/usr/lib/python3.11/asyncio/events.py", line 80, in _run
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 542, in dispatch_queue
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 531, in process_one
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 359, in execute_request
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 775, in execute_request
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 446, in do_execute
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/ipykernel/zmqshell.py", line 549, in run_cell
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3051, in run_cell
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3106, in _run_cell
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3311, in run_cell_async
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3493, in run_ast_nodes
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
  File "/tmp/ipykernel_8025/2690822035.py", line 3, in <module>
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 1807, in fit
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 1401, in train_function
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 1384, in step_function
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 1373, in run_step
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 1150, in train_step
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/training.py", line 590, in __call__
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/base_layer.py", line 1149, in __call__
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/sequential.py", line 398, in call
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/functional.py", line 515, in call
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/functional.py", line 672, in _run_internal_graph
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/layers/rnn/base_rnn.py", line 556, in __call__
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/base_layer.py", line 1149, in __call__
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/layers/rnn/simple_rnn.py", line 411, in call
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/layers/rnn/base_rnn.py", line 722, in call
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/backend.py", line 5168, in rnn
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/backend.py", line 5147, in _step
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/layers/rnn/base_rnn.py", line 717, in step
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 65, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/engine/base_layer.py", line 1149, in __call__
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 96, in error_handler
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/layers/rnn/simple_rnn.py", line 209, in call
  File "/home/otakagle/Tensorflow/tf-gpu/lib/python3.11/site-packages/keras/src/backend.py", line 2463, in dot

Attempting to perform BLAS operation using StreamExecutor without BLAS support
	 [[{{node sequential_7/simple_rnn_7/while/simple_rnn_cell/MatMul_1}}]] [Op:__inference_train_function_10914]
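
The log itself says that the cuBLAS failure "may be due to OOM" because the framework "may have preallocated more than its fair share". One way to test that hypothesis, offered as a sketch and not as a confirmed fix, is to ask TensorFlow to allocate GPU memory on demand before any model is built. The helper name `enable_memory_growth` is made up for illustration, and the `find_spec` guard only keeps the snippet runnable where TensorFlow is absent.

```python
import importlib.util


def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Returns the number of GPUs switched to on-demand allocation.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return 0  # TensorFlow is not installed on this machine

    import tensorflow as tf

    configured = 0
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU was already initialized in this process:
            # restart the kernel and run this before any other TensorFlow code.
            print(f"Could not configure {gpu.name}: {err}")
    return configured


print("GPUs set to on-demand allocation:", enable_memory_growth())
```

This would have to run in a freshly started kernel, before `model.fit`. If the error still appears, it may also be worth checking with `nvidia-smi` whether another process, such as a second notebook kernel, is holding GPU memory at the moments when training fails.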
Source: https://stackoverflow.com/questions/781 ... l-training