OCR in a complex image - Цифровое Кемерово

Posted by Anonymous » 01 Dec 2025, 20:05

I need to extract player statistics from an image: [image]
I tried pytesseract with image preprocessing: convert to grayscale, upscale the image (by a factor of 3 in the code below), and apply an edge filter:

Code:

from PIL import Image, ImageFilter, ImageOps
import pytesseract

# TEXT_FILE, FRAMES_DIR (a pathlib.Path) and tessdata_config are defined
# earlier in the script (not shown).
with open(TEXT_FILE, "w", encoding="utf-8") as f_out:
    for img_file in sorted(FRAMES_DIR.glob("*.png")):
        # Option: light preprocessing
        # img = Image.open(img_file).convert("L")  # grayscale
        img = Image.open(img_file)
        gray_image = ImageOps.grayscale(img)
        # Resize the image to enhance details.
        scale_factor = 3
        resized_image = gray_image.resize(
            (gray_image.width * scale_factor, gray_image.height * scale_factor),
            resample=Image.LANCZOS,
        )
        # Apply the `FIND_EDGES` filter to the resized image. This is an
        # edge-detection filter, not a true threshold, despite the variable name.
        thresholded_image = resized_image.filter(ImageFilter.FIND_EDGES)

        text = pytesseract.image_to_string(
            thresholded_image,
            lang="eng",  # English language model
            # config="--psm 6"  # suited to line-oriented / UI text
            config=tessdata_config,
        ).strip()

        if text:
            f_out.write(f"===== {img_file.name} =====\n")
            f_out.write(text + "\n\n")

print(f"OCR finished. Result in: {TEXT_FILE}")

However, the result was not very good.
Can you give me advice on this task, or suggest another library for extracting text?
Regards
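
One possible direction, offered as a sketch rather than a confirmed fix: `ImageFilter.FIND_EDGES` outlines glyphs instead of producing solid dark-on-light text, so a plain binarization step may suit OCR better. The helper below keeps the grayscale and upscaling steps from the post and swaps the edge filter for contrast stretching plus a fixed global threshold. The function name `preprocess_for_ocr` and the default `threshold=128` are illustrative assumptions, not values from the original question.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(img, scale_factor=3, threshold=128):
    """Grayscale, upscale, stretch contrast, then binarize an image.

    scale_factor and threshold are illustrative defaults, not tuned values.
    """
    gray = ImageOps.grayscale(img)
    resized = gray.resize(
        (gray.width * scale_factor, gray.height * scale_factor),
        resample=Image.LANCZOS,
    )
    # Stretch the histogram so faint text uses the full 0-255 range.
    contrasted = ImageOps.autocontrast(resized)
    # Fixed global threshold: pixels at or above it become white, the rest black.
    return contrasted.point(lambda p: 255 if p >= threshold else 0)


if __name__ == "__main__":
    # Synthetic 20x10 image: dark left half, light right half.
    demo = Image.new("RGB", (20, 10), (30, 30, 30))
    demo.paste((220, 220, 220), (10, 0, 20, 10))
    out = preprocess_for_ocr(demo)
    print(out.size, out.mode)
```

The returned image can be passed to `pytesseract.image_to_string` in place of the edge-filtered one. If the text is light on a dark background, inverting the result with `ImageOps.invert` before OCR may help, since Tesseract generally expects dark text on a light background. Whether this improves the output depends on the actual frames.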

More details here: https://stackoverflow.com/questions/79833476/ocr-in-complex-image
