Splitting a PDF with PyPDF2 in Lambda - Цифровое Кемерово


Posted by Anonymous » 29 Jan 2025, 21:13

I'm probably doing something really silly here, but I have the following Lambda function to split an uploaded PDF into individual pages. When I upload an 8-page PDF, it creates 8 identical copies of the original PDF.

import boto3
from PyPDF2 import PdfReader, PdfWriter

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Retrieve the uploaded file details from the event
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    file_key = event['Records'][0]['s3']['object']['key']
    file_name = file_key.split('/')[-1]  # Extract the original file name

    # Prepare the output directory path
    output_dir = 'PCP/temp/'  # Specify your desired output directory
    output_prefix = file_name.split('.')[0] + '-'  # Prefix for split file names

    # Download the uploaded file to temp storage
    temp_file_path = '/tmp/' + file_name
    s3.download_file(bucket_name, file_key, temp_file_path)

    # Read the uploaded PDF file
    pdf = PdfReader(temp_file_path)

    # Split the PDF into individual pages and save them
    for page_number in range(len(pdf.pages)):
        print(f"Page {page_number}")
        temp_output_path = f"/tmp/{output_prefix}{page_number + 1}.pdf"
        output_page_path = f"{output_dir}{output_prefix}{page_number + 1}.pdf"
        output_pdf = PdfWriter()
        output_pdf.add_page(pdf.pages[page_number])

        with open(temp_output_path, 'wb') as output_file:
            output_pdf.write(output_file)

        # Upload the split page to S3 bucket
        s3.upload_file(temp_file_path, bucket_name, output_page_path)

    return {
        'statusCode': 200,
        'body': 'PDF splitting completed successfully.'
    }
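
The likely cause is in the last line of the loop: `s3.upload_file` is passed `temp_file_path` (the original download) rather than `temp_output_path` (the single-page file just written), so every output key receives the full PDF. Below is a minimal sketch of the fix, assuming the same bucket layout and naming as above. `page_targets` and `split_and_upload` are hypothetical helper names used for illustration and are not part of boto3 or PyPDF2.

```python
def page_targets(file_key, page_count, output_dir="PCP/temp/"):
    """Return a (local_path, s3_key) pair for each page, numbered from 1."""
    file_name = file_key.split("/")[-1]
    prefix = file_name.split(".")[0] + "-"
    return [
        (f"/tmp/{prefix}{n}.pdf", f"{output_dir}{prefix}{n}.pdf")
        for n in range(1, page_count + 1)
    ]


def split_and_upload(s3, bucket_name, file_key):
    """Split the PDF at file_key into single pages and upload each one.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    # Imported here so page_targets can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    source_path = "/tmp/" + file_key.split("/")[-1]
    s3.download_file(bucket_name, file_key, source_path)
    reader = PdfReader(source_path)

    targets = page_targets(file_key, len(reader.pages))
    for page, (local_path, s3_key) in zip(reader.pages, targets):
        writer = PdfWriter()
        writer.add_page(page)
        with open(local_path, "wb") as output_file:
            writer.write(output_file)
        # Upload the single-page file just written, not the original download.
        s3.upload_file(local_path, bucket_name, s3_key)

    return [s3_key for _, s3_key in targets]
```

Keeping the local path and the S3 key together in one pair makes it harder to pass the wrong file to the upload call. In the original handler, the equivalent one-line change would be to upload `temp_output_path` instead of `temp_file_path`.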

Read more here: https://stackoverflow.com/questions/764 ... a-function
