feat: add annotations in MD & HTML serialization #295

Merged
vagenas merged 7 commits into main from add-annotations-in-serialization on May 27, 2025

vagenas
Collaborator

@vagenas vagenas commented May 16, 2025

  • annotations are now included in MD & HTML serialization by default (users can opt out)
    • MD: appended to the text
    • HTML: appended to the figcaption content
    • DocTags is not touched for now, as it apparently already includes specialized annotation handling
    • exposed via the include_annotations param in the respective DoclingDocument methods
  • annotation handling is single-sourced in common.py
    • annotation types currently handled: classification, description, & SMILES
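
To illustrate the behavior described above, here is a minimal, self-contained sketch. It is not the code from this PR: apart from the `include_annotations` parameter name, every class and function below (`Picture`, `annotation_text`, `to_markdown`, `to_html`, and the annotation dataclasses) is a hypothetical stand-in, and the exact output layout is an assumption.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Union


# Hypothetical stand-ins for the three annotation types named in the PR.
@dataclass
class Classification:
    label: str


@dataclass
class Description:
    text: str


@dataclass
class Smiles:
    smi: str


Annotation = Union[Classification, Description, Smiles]


def annotation_text(ann: Annotation) -> str:
    """Single place that turns an annotation into text (hypothetical helper)."""
    if isinstance(ann, Classification):
        return ann.label
    if isinstance(ann, Description):
        return ann.text
    if isinstance(ann, Smiles):
        return ann.smi
    raise TypeError(f"unsupported annotation type: {type(ann).__name__}")


@dataclass
class Picture:
    caption: str
    annotations: List[Annotation] = field(default_factory=list)


def _parts(pic: Picture, include_annotations: bool) -> List[str]:
    # Both serializers share this logic, so annotation handling lives in one spot.
    parts = [pic.caption]
    if include_annotations:
        parts.extend(annotation_text(a) for a in pic.annotations)
    return [p for p in parts if p]


def to_markdown(pic: Picture, include_annotations: bool = True) -> str:
    # MD: annotation text is appended to the item's text.
    return "\n\n".join(_parts(pic, include_annotations))


def to_html(pic: Picture, include_annotations: bool = True) -> str:
    # HTML: annotation text is appended to the figcaption content.
    caption = " ".join(escape(p) for p in _parts(pic, include_annotations))
    return f"<figure><figcaption>{caption}</figcaption></figure>"
```

Annotations appear by default in this sketch, and passing `include_annotations=False` to either function drops them, mirroring the opt-out described in the list.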

Resolves docling-project/docling#1608

vagenas added 2 commits May 16, 2025 13:06
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
@vagenas vagenas requested review from cau-git and dolfim-ibm May 16, 2025 11:36

mergify bot commented May 16, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
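
As a rough illustration of the title rule above, the same pattern can be checked locally with Python's `re` module. The helper name `is_conventional_title` is hypothetical, and this sketch assumes the rule behaves like an ordinary regex search on the PR title.

```python
import re

# Pattern copied from the merge protection rule above.
TITLE_RULE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_RULE.search(title) is not None
```

For instance, this PR's title, `feat: add annotations in MD & HTML serialization`, satisfies the pattern, while a title without a type prefix does not.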

🟢 Require two reviewer for test updates

Wonderful, this rule succeeded.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>

codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
@dolfim-ibm dolfim-ibm force-pushed the add-annotations-in-serialization branch from 16ec453 to a0f295b Compare May 27, 2025 08:29
dolfim-ibm previously approved these changes May 27, 2025
Contributor

@dolfim-ibm dolfim-ibm left a comment

lgtm

PeterStaar-IBM previously approved these changes May 27, 2025
Contributor

@PeterStaar-IBM PeterStaar-IBM left a comment

lgtm!

Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>
@vagenas vagenas dismissed stale reviews from PeterStaar-IBM and dolfim-ibm via 07a4cc6 May 27, 2025 08:39
Contributor

@PeterStaar-IBM PeterStaar-IBM left a comment

lgtm!

@vagenas vagenas merged commit f067c51 into main May 27, 2025
9 checks passed
@vagenas vagenas deleted the add-annotations-in-serialization branch May 27, 2025 08:49
Development

Successfully merging this pull request may close these issues.

Image descriptions into markdown file
3 participants