
InitialWalrus t1_iwuetbz wrote

https://pypi.org/project/PyPDF2/ This Python library will let you convert the PDF to a string, assuming the PDF has a readable text layer. If it doesn't (a scanned image, for example), you'll need to look into OCR (optical character recognition).
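For what it's worth, here is a rough sketch of what that could look like. It assumes a reasonably recent PyPDF2 (2.0 or later, where `PdfReader` and `extract_text()` are available). The `looks_scanned` helper and its `min_chars` threshold are my own made-up heuristic for spotting PDFs that probably need OCR, not something the library provides.

```python
def pdf_to_text(path):
    """Return the text of every page in the PDF, joined by newlines."""
    # Imported inside the function so the rest of the module still loads
    # when PyPDF2 is not installed. Assumes PyPDF2 >= 2.0.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(path))
    # Fall back to "" for pages where no text could be extracted
    # (for example, pages that are only a scanned image).
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def looks_scanned(text, min_chars=20):
    """Hypothetical heuristic: almost no extracted text suggests OCR is needed.

    The threshold is an arbitrary assumption; tune it for your documents.
    """
    return len(text.strip()) < min_chars
```

You would call `pdf_to_text` on each file and send anything where `looks_scanned` returns `True` to an OCR tool instead. The string you get back is unstructured, so pulling specific fields out of it is a separate step.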

2

dwightsrus t1_iwuq4um wrote

Thanks for the suggestion. My challenge is that the PDFs are not all structured the same way. I'd love to run a bunch of them through an ML model that spits out the data in the format I need.

1