Here is an example of the LaTeX input I am processing:
Code:
\begin{definition}{}
Given $G=(V,E,\mu)$, and $W\subseteq V$, we define the \emph{clone of $G$ by duplication of $W$}, $Cl_G^W$, as:
$$Cl_G^W=(V \cup W', E \cup E', \mu \cup \{(n', \mu(n))\}_{n \in W} \cup \{(e', \mu(e))\}_{e' \in E'})$$
where $W' = \{n'\ :\ n \in W\}$ are new cloned nodes from $W$, and $E'$ is a set of new edges obtained from incident edges on nodes of $W$ where nodes of $W$ are replaced by copies of $W'$ (edges connecting original nodes with cloned nodes and edges connecting cloned nodes, are cloned).
\end{definition}
Code:
from latex2sympy2 import latex2sympy
import re
import sympy as sp
import logging

class LatexProcessor:
    def _convert_equations_to_text(self, latex_text, processed_text):
        equation_pattern = re.compile(r'\$.*?\$|\$\$.*?\$\$|\\\\\[.*?\\\\\]|\\\\\(.*?\\\\\)', re.DOTALL)
        equations = equation_pattern.findall(latex_text)
        for equation in equations:
            try:
                equation_clean = equation.strip("$").strip("\\[").strip("\\]").strip("\\(").strip("\\)").strip()
                if not equation_clean.strip():
                    raise ValueError("Empty equation")
                equation_clean = equation_clean.replace("\\subseteq", "
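Two details in the extraction step above may already explain part of the trouble, independent of latex2sympy. First, the pattern tries `\$.*?\$` before `\$\$.*?\$\$`, so the opening `$$` of a display formula is matched as an empty inline formula; and in a raw string `\\\\\[` matches two backslashes followed by `[`, not the single backslash of `\[`. Second, `str.strip` takes a set of characters, not a prefix, so `.strip("\\)")` removes trailing `)` and `\` characters: `$G=(V,E,\mu)$` comes out as `G=(V,E,\mu` with the closing parenthesis lost. Below is a minimal sketch of an alternative that lets capture groups remove the delimiters. The names `MATH_PATTERN` and `extract_math` are hypothetical and not part of the original class, and the sketch does not handle escaped dollar signs (`\$`) in running text.

```python
import re

# Hypothetical helper, not part of the original LatexProcessor class.
# Order matters: $$...$$ is tried before $...$, so a display formula is
# never split into empty inline matches.
MATH_PATTERN = re.compile(
    r'\$\$(.+?)\$\$'      # display math: $$ ... $$
    r'|\$(.+?)\$'         # inline math:  $ ... $
    r'|\\\[(.+?)\\\]'     # display math: \[ ... \]
    r'|\\\((.+?)\\\)',    # inline math:  \( ... \)
    re.DOTALL,
)


def extract_math(latex_text):
    """Return the body of every math segment, without its delimiters."""
    bodies = []
    for match in MATH_PATTERN.finditer(latex_text):
        # Exactly one of the four groups participates in each match.
        body = next(g for g in match.groups() if g is not None)
        bodies.append(body.strip())
    return bodies


sample = r"Given $G=(V,E,\mu)$, and $W\subseteq V$, we define $$a+b$$"
for body in extract_math(sample):
    print(body)
```

Because the delimiters are consumed by the pattern and only the group content is kept, no chain of `strip` calls is needed, and parentheses or backslashes at the edges of a formula stay intact. Whether latex2sympy can then parse set-theoretic notation such as `\subseteq` or `\cup` is a separate question that this sketch does not address.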
More details here: [url]https://stackoverflow.com/questions/79319049/issues-processing-advanced-latex-math-expressions-with-latex2sympy[/url]