Description
Hi all, this is my first time posting on GitHub, so apologies if I break any conventions. Feedback is appreciated!
I have some code that converts a PDF using python-pdf2docx and then adds text to the tables using python-docx. When I run the following code on the attached docx files:
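from docx import Document
doc = Document("Output_A.docx")  # assumed filename; same again for Output_B
print(doc.tables[0].cell(0, 1).text)
print(doc.tables[1].cell(0, 1).text)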
I find that for Output_A, the first "table" python-docx recognises isn't a table (not a particular problem), and it then skips over the first two actual tables (a problem).
Output_B seems not to recognise any tables at all.
If I convert the files using Adobe's or Smallpdf's online tools, python-docx interprets the tables correctly. Perhaps this is more of an issue with python-pdf2docx, but I haven't had much luck pursuing that avenue and would appreciate any way to make it work with python-docx. I'd like to turn the code into a little executable so anyone can use it, and the actual python-docx code works great ... if it can detect the tables. Thank you in advance for your help.
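One thing that might help narrow this down: as far as I understand, doc.tables only lists tables that sit at the top level of the document body, so tables nested inside another table's cell or inside a text box would not show up there. Whether python-pdf2docx actually nests tables like this in my files is only a guess. Below is a rough diagnostic sketch using just the standard library; the function name count_tables is made up for illustration, and it assumes the main document part is stored at word/document.xml, which is the usual location.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements such as <w:tbl> and <w:body>
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tables(docx_path):
    """Return (top_level, total) counts of <w:tbl> elements in a .docx file.

    docx_path may be a filename or a binary file-like object.
    Assumes the main part lives at word/document.xml (the usual location).
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    body = root.find(f"{{{W_NS}}}body")
    # Tables that are direct children of <w:body>
    top_level = len(body.findall(f"{{{W_NS}}}tbl"))
    # Tables at any depth, including nested ones and ones inside text boxes
    total = sum(1 for _ in root.iter(f"{{{W_NS}}}tbl"))
    return top_level, total
```

If total comes out larger than top_level for Output_A or Output_B, the missing tables are probably nested somewhere rather than absent, and they would need to be reached by walking the XML instead of relying on doc.tables alone.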