I have a small Proxmox Backup Server (a Morefine S500+ mini PC with an AMD R7-5800H) running two WD Red SN700 NVMe drives.
A few days ago one of the drives temporarily failed. A reboot did not help, but after a full power-off and power-on the drive came back online. The temperature never exceeds 45°C, so in theory that should not be the problem.
Does anyone have an idea what the cause might be?
Oct 19 23:24:25 pbs kernel: nvme nvme1: I/O tag 753 (92f1) opcode 0x1 (I/O Cmd) QID 5 timeout, aborting req_op:WRITE(1) size:8192
Oct 19 23:26:30 pbs kernel: nvme nvme1: I/O tag 193 (50c1) opcode 0x1 (I/O Cmd) QID 2 timeout, aborting req_op:WRITE(1) size:4096
Oct 19 23:26:30 pbs kernel: nvme nvme1: I/O tag 606 (325e) opcode 0x1 (I/O Cmd) QID 9 timeout, aborting req_op:WRITE(1) size:4096
Oct 19 23:26:30 pbs kernel: nvme nvme1: I/O tag 753 (92f1) opcode 0x1 (I/O Cmd) QID 5 timeout, reset controller
Oct 19 23:26:30 pbs kernel: nvme nvme1: Device not ready; aborting reset, CSTS=0x1
Oct 19 23:26:30 pbs kernel: nvme nvme1: Abort status: 0x371
Oct 19 23:26:30 pbs kernel: nvme nvme1: Abort status: 0x371
Oct 19 23:26:30 pbs kernel: nvme nvme1: Abort status: 0x371
Oct 19 23:26:30 pbs kernel: INFO: task txg_sync:460 blocked for more than 122 seconds.
Oct 19 23:26:30 pbs kernel: Tainted: P O 6.8.12-2-pve #1
Oct 19 23:26:30 pbs kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 19 23:26:30 pbs kernel: task:txg_sync state:D stack:0 pid:460 tgid:460 ppid:2 flags:0x00004000
Oct 19 23:26:30 pbs kernel: Call Trace:
Oct 19 23:26:30 pbs kernel:
Oct 19 23:26:30 pbs kernel: __schedule+0x401/0x15e0
Oct 19 23:26:30 pbs kernel: ? ttwu_queue_wakelist+0x101/0x110
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? try_to_wake_up+0x248/0x5f0
Oct 19 23:26:30 pbs kernel: schedule+0x33/0x110
Oct 19 23:26:30 pbs kernel: cv_wait_common+0x109/0x140 [spl]
Oct 19 23:26:30 pbs kernel: ? __pfx_autoremove_wake_function+0x10/0x10
Oct 19 23:26:30 pbs kernel: __cv_wait+0x15/0x30 [spl]
Oct 19 23:26:30 pbs kernel: zil_sync+0xdd/0x580 [zfs]
Oct 19 23:26:30 pbs kernel: ? spa_taskq_dispatch_ent+0x66/0xe0 [zfs]
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? zio_issue_async+0x53/0xb0 [zfs]
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? zio_nowait+0xd5/0x1c0 [zfs]
Oct 19 23:26:30 pbs kernel: dmu_objset_sync+0x441/0x600 [zfs]
Oct 19 23:26:30 pbs kernel: dsl_dataset_sync+0x61/0x200 [zfs]
Oct 19 23:26:30 pbs kernel: dsl_pool_sync+0xb2/0x4e0 [zfs]
Oct 19 23:26:30 pbs kernel: spa_sync+0x578/0x1050 [zfs]
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? spa_txg_history_init_io+0x120/0x130 [zfs]
Oct 19 23:26:30 pbs kernel: txg_sync_thread+0x207/0x3a0 [zfs]
Oct 19 23:26:30 pbs kernel: ? __pfx_txg_sync_thread+0x10/0x10 [zfs]
Oct 19 23:26:30 pbs kernel: ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
Oct 19 23:26:30 pbs kernel: thread_generic_wrapper+0x5f/0x70 [spl]
Oct 19 23:26:30 pbs kernel: kthread+0xf2/0x120
Oct 19 23:26:30 pbs kernel: ? __pfx_kthread+0x10/0x10
Oct 19 23:26:30 pbs kernel: ret_from_fork+0x47/0x70
Oct 19 23:26:30 pbs kernel: ? __pfx_kthread+0x10/0x10
Oct 19 23:26:30 pbs kernel: ret_from_fork_asm+0x1b/0x30
Oct 19 23:26:30 pbs kernel:
Oct 19 23:26:30 pbs kernel: INFO: task tokio-runtime-w:366496 blocked for more than 122 seconds.
Oct 19 23:26:30 pbs kernel: Tainted: P O 6.8.12-2-pve #1
Oct 19 23:26:30 pbs kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 19 23:26:30 pbs kernel: task:tokio-runtime-w state:D stack:0 pid:366496 tgid:921 ppid:1 flags:0x00000002
Oct 19 23:26:30 pbs kernel: Call Trace:
Oct 19 23:26:30 pbs kernel:
Oct 19 23:26:30 pbs kernel: __schedule+0x401/0x15e0
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? zio_nowait+0xd5/0x1c0 [zfs]
Oct 19 23:26:30 pbs kernel: schedule+0x33/0x110
Oct 19 23:26:30 pbs kernel: cv_wait_common+0x109/0x140 [spl]
Oct 19 23:26:30 pbs kernel: ? __pfx_autoremove_wake_function+0x10/0x10
Oct 19 23:26:30 pbs kernel: __cv_wait+0x15/0x30 [spl]
Oct 19 23:26:30 pbs kernel: zil_commit_impl+0x326/0x14b0 [zfs]
Oct 19 23:26:30 pbs kernel: zil_commit+0x3d/0x80 [zfs]
Oct 19 23:26:30 pbs kernel: zfs_fsync+0xa5/0x140 [zfs]
Oct 19 23:26:30 pbs kernel: zpl_fsync+0x112/0x1a0 [zfs]
Oct 19 23:26:30 pbs kernel: __x64_sys_fdatasync+0x52/0xa0
Oct 19 23:26:30 pbs kernel: x64_sys_call+0x21d4/0x24b0
Oct 19 23:26:30 pbs kernel: do_syscall_64+0x81/0x170
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? __f_unlock_pos+0x12/0x20
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? ksys_write+0xe6/0x100
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? syscall_exit_to_user_mode+0x89/0x260
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? do_syscall_64+0x8d/0x170
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? syscall_exit_to_user_mode+0x89/0x260
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? do_syscall_64+0x8d/0x170
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? syscall_exit_to_user_mode+0x89/0x260
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? do_syscall_64+0x8d/0x170
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: ? irqentry_exit+0x43/0x50
Oct 19 23:26:30 pbs kernel: ? srso_alias_return_thunk+0x5/0xfbef5
Oct 19 23:26:30 pbs kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
Oct 19 23:26:30 pbs kernel: RIP: 0033:0x7f8d81781bfa
Oct 19 23:26:30 pbs kernel: RSP: 002b:00007f8d791ff6d0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
Oct 19 23:26:30 pbs kernel: RAX: ffffffffffffffda RBX: 00006396898912a0 RCX: 00007f8d81781bfa
Oct 19 23:26:30 pbs kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000000c
Oct 19 23:26:30 pbs kernel: RBP: 00007f8d30267d80 R08: 0000000000000007 R09: 00007f8cbc1057c0
Oct 19 23:26:30 pbs kernel: R10: 0ad5968c6ef1cbc4 R11: 0000000000000293 R12: 000063968985e218
Oct 19 23:26:30 pbs kernel: R13: 0000639687a2f3f0 R14: 0000639689891290 R15: 00007f8d30267db0
Oct 19 23:26:30 pbs kernel:
Oct 19 23:26:30 pbs kernel: nvme nvme1: Device not ready; aborting reset, CSTS=0x3
Oct 19 23:26:30 pbs kernel: nvme nvme1: Disabling device after reset failure: -19
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=2 offset=1924908388352 size=4096 flags=1572992
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=2 offset=343609511936 size=8192 flags=1572992
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=2 offset=1925171638272 size=4096 flags=1572992
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=2 offset=343609520128 size=8192 flags=1572992
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=5 offset=0 size=0 flags=1049728
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=5 offset=0 size=0 flags=1049728
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=5 offset=0 size=0 flags=1049728
Oct 19 23:26:30 pbs kernel: zio pool=rpool vdev=/dev/disk/by-id/nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 error=5 type=5 offset=0 size=0 flags=1049728
Oct 19 23:26:30 pbs zed[372909]: eid=10 class=statechange pool='rpool' vdev=nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 vdev_state=REMOVED
Oct 19 23:26:30 pbs zed[372916]: eid=11 class=removed pool='rpool' vdev=nvme-eui.e8238fa6bf530001001b448b4736aea1-part3 vdev_state=REMOVED
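For anyone reading the log, the two CSTS values it reports (0x1 on the first reset attempt, 0x3 on the second) can be decoded against the Controller Status register layout from the NVMe base specification. Below is a minimal sketch; the helper names `decode_csts` and `csts_values` are my own and purely illustrative, and the decoding only says what the controller reported, not why it got into that state.

```python
import re

# Bit layout of the NVMe Controller Status (CSTS) register,
# as defined in the NVMe base specification.
CSTS_FLAGS = {
    0: "RDY (controller ready)",
    1: "CFS (controller fatal status)",
    4: "NSSRO (NVM subsystem reset occurred)",
    5: "PP (processing paused)",
}

# Bits 3:2 hold the shutdown status (SHST).
SHST_STATES = {
    0: "normal operation",
    1: "shutdown processing occurring",
    2: "shutdown processing complete",
    3: "reserved",
}


def decode_csts(value):
    """Return (list of set flag names, shutdown status) for a raw CSTS value."""
    flags = [name for bit, name in CSTS_FLAGS.items() if value & (1 << bit)]
    shst = (value >> 2) & 0x3
    return flags, SHST_STATES[shst]


def csts_values(log_text):
    """Hypothetical helper: pull every 'CSTS=0x..' value out of kernel log text."""
    return [int(m, 16) for m in re.findall(r"CSTS=(0x[0-9a-fA-F]+)", log_text)]


if __name__ == "__main__":
    sample = (
        "nvme nvme1: Device not ready; aborting reset, CSTS=0x1\n"
        "nvme nvme1: Device not ready; aborting reset, CSTS=0x3\n"
    )
    for raw in csts_values(sample):
        flags, shst = decode_csts(raw)
        print(f"CSTS={raw:#x}: {', '.join(flags) or 'no flags set'}; {shst}")
```

Read this way, the second value has the fatal-status bit set in addition to the ready bit, which matches the kernel then disabling the device after the failed reset.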
More details here: https://stackoverflow.com/questions/791 ... og-content