
Extracting hyperlinks from table cell #925

Open
@divyeshlad18


Hello Everyone,

I'm trying to extract the text from the table cells of a Word document and populate a pandas DataFrame with it. I was able to do that, mainly with the help of this code:


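from docx import Document

# path_to_your_docx is a placeholder for the path to the .docx file
document = Document(path_to_your_docx)
for table in document.tables:
    for row in table.rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                # print the text of each paragraph in the cell
                print(paragraph.text)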

Thanks to @scanny

However, I get empty text when hyperlinks are encountered in the cell.

Alternatively, I'm able to extract all the hyperlinks from the document using this code:

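from docx.opc.constants import RELATIONSHIP_TYPE as RT

# each external hyperlink is stored as a relationship on the document part
for rel in document.part.rels.values():
    if rel.reltype == RT.HYPERLINK:
        print(rel.target_ref)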

But I would much rather extract them through the cell objects, since that would let me place each hyperlink in the row it belongs to.

Any help is appreciated!

Many Thanks,
Divyesh

Labels: hyperlink (Read and write hyperlinks in paragraph)
