I am working on a computer vision project in which I need to process images of Sudoku puzzles with unusual shapes and symbols. These are non-standard Sudoku variants with extra visual elements overlaid on the grid, such as colored lines, arrows, diamonds, Renban groups, or other special shapes. My goal is to detect, classify, and extract these elements programmatically.
For reference, here are some examples of the puzzle types I am dealing with:


The task involves:
Shape detection: Identifying the visual elements (e.g., "L" shapes, arrows, or lines) within a grid cell and differentiating between them.
Symbol classification: Recognizing unique patterns (e.g., distinguishing arrows from diamonds) and associating them with specific rules.
Grid alignment: Ensuring correct identification of symbols within their respective Sudoku grid cells.
Output format: Converting all detected elements into a structured JSON representation for further analysis.
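To make the target output more concrete, here is a minimal sketch of one possible JSON layout. The field names (`kind`, `cells`, `color`) and the labels `"arrow"` and `"renban"` are assumptions chosen for illustration, not a fixed schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional, Tuple


@dataclass
class Element:
    kind: str                      # hypothetical label, e.g. "arrow" or "renban"
    cells: List[Tuple[int, int]]   # ordered (row, col) cells the element covers
    color: Optional[str] = None    # optional attribute such as line color


def to_json(size: int, elements: List[Element]) -> str:
    """Serialize the grid size and the detected elements to a JSON string."""
    payload = {
        "grid": {"rows": size, "cols": size},
        "elements": [asdict(e) for e in elements],
    }
    return json.dumps(payload, indent=2)


doc = to_json(9, [
    Element("arrow", [(0, 0), (0, 1), (1, 2)]),
    Element("renban", [(4, 4), (4, 5)], color="purple"),
])
print(doc)
```

Storing each element as an ordered list of cells keeps direction information (for arrows) without tying the output to pixel coordinates.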
Current approach: I have tried the following:
Classical Computer Vision: Using OpenCV with edge detection (Canny) and contour finding, but the results are inconsistent due to overlapping shapes and variable sizes.
Deep Learning: Attempted to train a CNN on limited labeled data. However, the dataset size is insufficient to achieve reliable accuracy.
What I need help with:
Best practices for detecting multiple overlapping shapes (e.g., arrows and lines in the same cell).
Suggestions for augmenting limited datasets or pre-trained models suitable for these kinds of problems.
Recommendations on combining classical methods with machine learning for better accuracy.
Techniques to handle grid alignment effectively, especially when the images may have slight rotations or distortions.
Approach 1: Shape and arrow detection
Tools: OpenCV (Python)
Key Steps:
Convert the image to grayscale for simpler processing.
Use Gaussian blur to reduce noise and smooth the image.
Apply the Canny edge detection algorithm to identify edges in the image.
Use cv2.findContours to detect shapes and approximate their geometry:
Contours whose polygon approximation has more than 8 vertices are treated as circles and highlighted using cv2.minEnclosingCircle.
Detect straight lines and arrows using Hough Line Transform (cv2.HoughLinesP).
Visualize the results with detected shapes and lines drawn over the image.
Challenges:
Inconsistent detection of overlapping shapes and symbols.
Difficulty in separating similar-looking elements like lines and arrows when they overlap or are part of a group.
Approach 2: Path finding in mazes
Tools: OpenCV, NumPy
Key Steps:
Convert the maze-like image to grayscale and apply binary thresholding to create a black-and-white mask.
Detect contours using cv2.findContours.
Identify starting and ending points by checking for circles in specific regions (e.g., corners).
Use a flood fill algorithm to find a continuous path between the start and end points.
Extract the path coordinates using the contours of the flood-filled region.
Challenges:
Ineffective when paths are embedded in grid cells or obscured by other symbols.
Requires fine-tuning for identifying start and end points in varying puzzle layouts.
Approach 3: Contour approximation and random sampling
Tools: OpenCV, NumPy, random
Key Steps:
Preprocess the image with grayscale conversion, Gaussian blur, and Otsu's thresholding to create a binary mask.
Use cv2.findContours to extract external contours.
Approximate contours using cv2.approxPolyDP to simplify shapes.
Filter contours based on their area and number of vertices to focus on relevant shapes.
Randomly select points from valid contours for additional processing or marking.
Challenges:
Struggles to differentiate between visually similar elements, such as small symbols and noise.
Random sampling introduces inconsistencies in results.
Common problems across all approaches:
Overlapping Elements: Difficulty separating overlapping shapes (e.g., numbers within circles or arrows crossing lines).
Limited Data: Lack of sufficient training data for machine learning approaches to classify shapes reliably.
Shape Complexity: Challenges in handling irregular shapes and distinguishing between subtle differences (e.g., arrows vs. lines).
Grid Alignment: Ensuring correct mapping of detected shapes to their respective grid cells.
More details here: https://stackoverflow.com/questions/792 ... -puzzle-im