POLARS VERSION
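Code:
import re
import polars as pl

def cleanse_text(sentence):
    # Replace the right single quotation mark (U+2019) with an ASCII apostrophe.
    RIGHT_QUOTE = r"(\u2019)"
    sentence = re.sub(RIGHT_QUOTE, "'", sentence)
    # Collapse runs of spaces into a single space.
    sentence = re.sub(r" +", " ", sentence)
    return sentence.strip()

# df is an existing Polars DataFrame with a string column "text".
df = df.with_columns(pl.col("text").map_elements(cleanse_text).name.keep())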
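PANDAS VERSION
Code:
import re

def cleanse_text(sentence):
    # Replace the right single quotation mark (U+2019) with an ASCII apostrophe.
    RIGHT_QUOTE = r"(\u2019)"
    sentence = re.sub(RIGHT_QUOTE, "'", sentence)
    # Collapse runs of spaces into a single space.
    sentence = re.sub(r" +", " ", sentence)
    return sentence.strip()

# df is an existing pandas DataFrame with a string column "text".
df["text"] = df["text"].apply(cleanse_text)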
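Both versions call the same pure-Python function once per row, so it may help to see what that function does on its own. Here is a minimal sketch that uses only the standard library; the sample string is made up for illustration and does not come from the original post.

```python
import re


def cleanse_text(sentence: str) -> str:
    # Replace the right single quotation mark (U+2019) with an ASCII apostrophe.
    sentence = re.sub("\u2019", "'", sentence)
    # Collapse runs of spaces into a single space (tabs and newlines are untouched).
    sentence = re.sub(r" +", " ", sentence)
    # Drop leading and trailing whitespace.
    return sentence.strip()


# Hypothetical sample input, not taken from the original post.
sample = "  It\u2019s   a    test  "
result = cleanse_text(sample)
print(repr(result))
```

For this sample the result is the string It's a test: the curly quote becomes a plain apostrophe, the repeated spaces collapse, and the padding at both ends is stripped.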
More details here: https://stackoverflow.com/questions/750 ... ly-in-pand