Why do projected keypoints slightly deviate from ground truth in Pyrender camera simulation? - Цифровое Кемерово


Posted by Anonymous » 12 Mar 2025, 15:31

To test a camera calibration library, I wrote a Python script using Pyrender, in which I set up a pinhole camera to capture a circle grid board. The code is available at https://github.com/tanjoe/camsim. The script saves each captured image together with the ground-truth keypoint positions in a JSON file.
Simulation script:

```python
import os
import datetime
import cv2
import pyrender
import trimesh
import json
import numpy as np
import pyglet
from pyrender.constants import GLTF


def projectWorldToImage(
    camera: pyrender.IntrinsicsCamera, camera_pose: np.ndarray, world_point: np.ndarray
) -> tuple[float, float]:
    # Transform the point to camera coordinates
    point_3d_camera = np.linalg.inv(camera_pose) @ world_point
    point_3d_camera = point_3d_camera[:3]
    X, Y, Z = point_3d_camera

    # Project the 3D point to 2D image coordinates
    x = camera.cx + (camera.fx * X) / -Z
    # Flip the Y-axis to follow OpenCV convention
    y = camera.cy - (camera.fy * Y) / -Z

    return (x, y)


class MyViewer(pyrender.Viewer):
    def __init__(
        self,
        scene: pyrender.Scene,
        camera: pyrender.IntrinsicsCamera,
        interested_points: list[np.ndarray],
        viewport_size: tuple[int, int],
        render_flags=None,
        viewer_flags=None,
        registered_keys=None,
        run_in_thread=False,
        **kwargs,
    ):
        self.camera = camera
        self.interested_points = interested_points
        super().__init__(
            scene,
            viewport_size,
            render_flags,
            viewer_flags,
            registered_keys,
            run_in_thread,
            **kwargs,
        )

    def on_key_press(self, symbol, modifiers):
        if symbol == pyglet.window.key.ENTER:
            timestamp = datetime.datetime.now().strftime("%H%M%S")
            pyglet.image.get_buffer_manager().get_color_buffer().save(
                f"output/{timestamp}.png"
            )

            # Get the current camera pose
            camera_pose = self.scene.get_pose(self.scene.main_camera_node)
            print(f"Camera pose: {camera_pose}")
            np.savetxt(f"output/{timestamp}-camera_pose.txt", camera_pose)

            loc_truth = []
            for point in self.interested_points:
                loc_truth.append(projectWorldToImage(self.camera, camera_pose, point))
            with open(f"output/{timestamp}-loc_truth.json", "w") as truth_file:
                json.dump(loc_truth, truth_file, indent=4)

        return super().on_key_press(symbol, modifiers)


def createBoard(image_path: str) -> pyrender.Mesh:
    # Create a texture from the image
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    sampler = pyrender.Sampler(
        magFilter=GLTF.NEAREST,
        minFilter=None,
        wrapS=GLTF.CLAMP_TO_EDGE,
        wrapT=GLTF.CLAMP_TO_EDGE,
    )
    texture = pyrender.Texture(
        sampler=sampler,
        source=image,
        source_channels="RGB",
        width=image.shape[1],
        height=image.shape[0],
    )

    # Create a material using the texture
    material = pyrender.MetallicRoughnessMaterial(
        baseColorTexture=texture,
        emissiveTexture=texture,
        emissiveFactor=[0.9, 0.9, 0.9],
        doubleSided=False,
        smooth=False,
    )

    # Create a plane mesh with UV coordinates
    # Compute vertices based on image resolution to keep the aspect ratio
    half_w = image.shape[1] / 1000.0 / 2.0
    half_h = image.shape[0] / 1000.0 / 2.0
    vertices = np.array(
        [
            [-half_w, -half_h, 0],  # Bottom-left
            [half_w, -half_h, 0],  # Bottom-right
            [-half_w, half_h, 0],  # Top-left
            [half_w, half_h, 0],  # Top-right
        ],
        dtype=np.float64,
    )

    faces = np.array(
        [
            [0, 1, 2],  # First triangle
            [1, 3, 2],  # Second triangle
        ],
        dtype=np.uint32,
    )

    # Define UV coordinates for texture mapping
    uv_coords = np.array(
        [
            [0, 0],  # Bottom-left
            [1, 0],  # Bottom-right
            [0, 1],  # Top-left
            [1, 1],  # Top-right
        ],
        dtype=np.float64,
    )

    # Create a Trimesh object with vertices, faces, and UV coordinates
    plane = trimesh.Trimesh(
        vertices=vertices,
        faces=faces,
        visual=trimesh.visual.TextureVisuals(uv=uv_coords, image=image),
    )

    # Create a Pyrender mesh from the Trimesh object
    plane_mesh = pyrender.Mesh.from_trimesh(plane, material=material, smooth=False)
    return plane_mesh


def computeCircleCoordinates(json_path: str, board_mesh: pyrender.Mesh) -> np.ndarray:
    coordinates: list[list[float]] = []
    with open(json_path, "r") as content:
        board_info = json.load(content)
    image_size = board_info["image_size"]
    centers = board_info["centers"]

    width_ratio = board_mesh.extents[0] / image_size[0]
    height_ratio = board_mesh.extents[1] / image_size[1]
    start_x = board_mesh.bounds[0][0]
    start_y = board_mesh.bounds[0][1]

    for c in centers:
        # Flip y so that it follows the OpenGL convention (+Y should be upward)
        c[1] = image_size[1] - c[1]
        x = start_x + c[0] * width_ratio
        y = start_y + c[1] * height_ratio
        coordinates.append([x, y, 0])
    return np.array(coordinates)


def main() -> None:
    plane_mesh = createBoard("resource/board.png")
    plane_pose = np.eye(4)
    plane_pose[2, 3] = -10
    plane_node = pyrender.Node(mesh=plane_mesh, matrix=plane_pose)

    centers = computeCircleCoordinates("resource/board.json", plane_mesh)
    transformed_centers = []
    for center in centers:
        center_homo = np.append(center, 1)
        transformed_centers.append(plane_pose @ center_homo)

    # Create a camera at the origin looking down the z-axis
    view_width = 1920
    view_height = 1080
    camera = pyrender.IntrinsicsCamera(
        fx=view_width, fy=view_width, cx=(view_width / 2), cy=(view_height / 2)
    )
    camera_pose = np.eye(4)
    camera_node = pyrender.Node(camera=camera, matrix=camera_pose)

    # Create a scene
    scene = pyrender.Scene(bg_color=[0, 0, 0, 1])
    scene.add_node(plane_node)
    scene.add_node(camera_node)

    # Render the scene using the interactive viewer
    os.makedirs("./output", exist_ok=True)
    MyViewer(
        scene,
        camera=camera,
        interested_points=transformed_centers,
        viewport_size=(view_width, view_height),
    )


if __name__ == "__main__":
    main()
```
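For reference, the projection step in `projectWorldToImage` can be isolated as a small pure-Python function. This is only a sketch of the same formula, assuming an OpenGL-style camera frame (+X right, +Y up, looking down -Z) and an image origin at the top-left corner; the name `project_camera_point` is hypothetical and is not part of the script above or of Pyrender.

```python
def project_camera_point(fx, fy, cx, cy, point_cam):
    """Project a camera-space point to pixel coordinates.

    Assumes an OpenGL-style camera frame (+X right, +Y up, looking down -Z)
    and an image whose origin is the top-left corner with +y pointing down.
    """
    X, Y, Z = point_cam
    if Z >= 0:
        raise ValueError("point must be in front of the camera (Z < 0)")
    depth = -Z  # distance along the viewing direction
    u = cx + fx * X / depth
    v = cy - fy * Y / depth  # flip Y: camera +Y is up, image +y is down
    return u, v


# Same intrinsics as in main(): fx = fy = 1920, principal point at the viewport center.
u, v = project_camera_point(1920.0, 1920.0, 960.0, 540.0, (1.0, 0.5, -10.0))
print(u, v)  # 1152.0 444.0
```

A point on the optical axis, such as `(0.0, 0.0, -10.0)`, lands exactly on the principal point `(960.0, 540.0)`, which is a quick sanity check for the sign conventions.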
To check whether the computed ground-truth values are accurate, I captured several images with the camera facing the board directly, without rotation. I then wrote another script to load a captured image and overlay all the keypoints on it.

```python
import os
import cv2
import json
import matplotlib.pyplot as plt
import typer
from pathlib import Path


def selectImageFromDir(img_dir: Path) -> Path:
    if not img_dir.exists():
        print(f"Error: '{img_dir}' directory not found.")
        raise typer.Exit(code=1)

    image_files = [f for f in os.listdir(img_dir) if f.endswith(".png")]
    if not image_files:
        print(f"Error: No .png images found in the '{img_dir}' directory.")
        raise typer.Exit(code=1)

    print("Available images:")
    for i, file in enumerate(image_files):
        print(f"{i + 1}. {file}")

    while True:
        try:
            choice = int(input("Enter the number of the image to process: ")) - 1
            if 0  # [the rest of the script is cut off in the source]
```
I expect all keypoints to line up almost perfectly with the circle centers in the image. However, there is a small deviation between them. The problem becomes more noticeable when the camera moves closer to the board.
I understand that numerical errors exist when computing the ground-truth values, but since all calculations are done in floating point, I would not expect a deviation as large as the one observed.
Where could I have gone wrong?
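The observation that the deviation grows as the camera approaches the board is what a pinhole model predicts for any fixed lateral offset in board coordinates: the resulting image shift is `fx * offset / depth`. The sketch below illustrates this scaling with a half-texel offset, which is purely an illustrative assumption (for example, a difference in pixel-center conventions between `board.json` and the texture mapping), not a confirmed diagnosis. The helper `pixel_shift` is hypothetical.

```python
def pixel_shift(fx, world_offset, depth):
    """Image-space shift (in pixels) caused by a lateral world-space offset at a given depth."""
    return fx * world_offset / depth


fx = 1920.0                    # focal length in pixels, as in main()
texel_size = 1.0 / 1000.0      # createBoard() maps one board pixel to 0.001 world units
half_texel = texel_size / 2.0  # hypothetical offset: half a board pixel

for depth in (10.0, 5.0, 1.0):
    shift = pixel_shift(fx, half_texel, depth)
    print(f"depth={depth}: shift = {shift:.3f} px")
```

Under these assumptions the shift is roughly 0.1 px at the initial distance of 10 units and approaches 1 px at a distance of 1 unit, so an offset that is invisible from far away becomes noticeable up close. Comparing the measured deviation at several camera distances against this 1/depth trend is one way to tell a constant coordinate offset apart from floating-point noise.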

More details here: https://stackoverflow.com/questions/79493788/why-do-projected-keypoints-slightly-deviate-from-ground-truth-in-pyrender-camera
