Description
Consider the following two Word tables. The first one has only merged cells, and I can read it correctly.
In the 2nd one I split the top row (after the merge) into two cells, and once I do this the table is read incorrectly.
Here is the code to reproduce:
import docx

document = docx.Document('demo-word.docx')
for table in document.tables:
    print('TABLE---')
    for row in table.rows:
        print('--new-row--')
        for cell in iter_unique_cells(row):
            print(cell.text)
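The snippet calls iter_unique_cells without showing its definition. A minimal sketch of what such a helper might look like, assuming it relies on python-docx repeating the same underlying `<w:tc>` element (the private `_tc` attribute) for every grid column a horizontally merged cell spans. The `fake_row` stand-in objects are hypothetical and only mimic that behaviour so the sketch runs without a .docx file:

```python
from types import SimpleNamespace


def iter_unique_cells(row):
    """Yield each cell of `row` once, skipping repeats caused by horizontal merges."""
    prior_tc = None
    for cell in row.cells:
        this_tc = cell._tc  # underlying <w:tc> element (private attribute)
        if this_tc is prior_tc:
            continue  # same <w:tc> as the previous grid column: a merged cell
        prior_tc = this_tc
        yield cell


# Hypothetical stand-in for a row whose first cell spans two grid columns.
tc_a, tc_b = object(), object()
fake_row = SimpleNamespace(cells=[
    SimpleNamespace(_tc=tc_a, text="A"),
    SimpleNamespace(_tc=tc_a, text="A"),
    SimpleNamespace(_tc=tc_b, text="B"),
])
unique_texts = [cell.text for cell in iter_unique_cells(fake_row)]
print(unique_texts)
```

A helper like this only removes adjacent duplicates within one row. It cannot correct cells that are assigned to the wrong row in the first place, which is what the output of the 2nd table looks like.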
To make the result easier to read, I print the markers "TABLE---" and "--new-row--".
In the 2nd table this results in a "cell defect" that propagates through the table (see below). Does anyone have a workaround for this?
Many thanks
The 1st table reads correctly as:
TABLE---
--new-row--
A
--new-row--
--new-row--
B1
B2
B3
B4
--new-row--
C1
C2
C3
C4
--new-row--
D
--new-row--
E1
E3
--new-row--
F1
F2
F3
F4
F5
The 2nd table reads incorrectly:
TABLE---
--new-row--
A1
A2
--new-row--
B1
--new-row--
B2
B3
B4
C1
C2
--new-row--
C3
C4
D
--new-row--
E1
E3
--new-row--
E3
F1
F2
F3
F4
--new-row--
F5
Looking at the XML structure, does the split create something like a hidden row?
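One way to check this might be to count, for each `<w:tr>`, how many grid columns its cells cover and compare that with the number of `<w:gridCol>` entries in the table grid. The sketch below uses only the standard library and a hand-written sample table (hypothetical, not extracted from demo-word.docx); `summarize_rows` and `SAMPLE_TBL` are illustrative names. A row whose covered count differs from the grid width would be a candidate for the shift seen above, though that is an assumption and not a confirmed cause.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
VAL = f"{{{W}}}val"  # fully qualified name of the w:val attribute

# Hand-written sample: a 4-column grid whose top row holds two cells of span 2.
SAMPLE_TBL = f"""
<w:tbl xmlns:w="{W}">
  <w:tblGrid><w:gridCol/><w:gridCol/><w:gridCol/><w:gridCol/></w:tblGrid>
  <w:tr>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/></w:tcPr><w:p><w:r><w:t>A1</w:t></w:r></w:p></w:tc>
    <w:tc><w:tcPr><w:gridSpan w:val="2"/></w:tcPr><w:p><w:r><w:t>A2</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>B1</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>B2</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>B3</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>B4</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
"""


def summarize_rows(tbl):
    """For each <w:tr>, return (cell texts, number of grid columns accounted for)."""
    summary = []
    for tr in tbl.findall("w:tr", NS):
        texts, covered = [], 0
        # gridBefore/gridAfter declare grid columns skipped at the row's start/end.
        for tag in ("gridBefore", "gridAfter"):
            skipped = tr.find(f"w:trPr/w:{tag}", NS)
            if skipped is not None:
                covered += int(skipped.get(VAL, "0"))
        for tc in tr.findall("w:tc", NS):  # only cells that belong to this row
            span = tc.find("w:tcPr/w:gridSpan", NS)
            covered += int(span.get(VAL)) if span is not None else 1
            texts.append("".join(t.text or "" for t in tc.iter(f"{{{W}}}t")))
        summary.append((texts, covered))
    return summary


tbl = ET.fromstring(SAMPLE_TBL.strip())
grid_cols = len(tbl.findall("w:tblGrid/w:gridCol", NS))
rows = summarize_rows(tbl)
for texts, covered in rows:
    print(texts, "covers", covered, "of", grid_cols, "grid columns")
```

To run this against the real file, the table XML would have to come from `word/document.xml` inside the .docx archive (for example via `zipfile`). Reading cells row by row from the `<w:tr>` elements in this way also avoids depending on how the library maps cells onto the grid.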