Skip to content

Be slightly more careful to take the first paragraph #2110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

adamwight
Copy link

The previous algorithm was easily fooled into emitting unbalanced tags, so the logic is tightened up with a more specific regex.

Overall, this probably isn't doing the right thing. Consider making the synopsis more robust by using eg. Floki in production.

Closes #2108 , sort of...

The previous algorithm was easily fooled into emitting unbalanced
tags, so the logic is tightened up with a more specific regex.

Overall, this probably isn't doing the right thing.  Consider making
the synopsis more robust by using eg. Floki in production.

Closes elixir-lang#2108 , sort of...
Regex.run(~r|<p>((?:(?!</p>).)*?):*</p>|, doc)
|> case do
nil -> doc
[_, first_paragraph_text] -> "<p>" <> first_paragraph_text <> "</p>"
end
Copy link
Member

@josevalim josevalim Apr 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are kind of assuming that it starts with <p>, so perhaps I would enforce that instead?

case String.trim_leading(doc) do
  "<p>" <> _ = doc ->
    doc =
      case :binary.split(doc, "</p>") do
        [left, _] -> String.trim_trailing(left, ":") <> "</p>"
        [all] -> all
      end

    Regex.replace(~r|(<[^>]*) id="[^"]*"([^>]*>)|, doc, ~S"\1\2", [])

  _ ->
    nil
end

Alternatively we capture the synopsis over the AST instead of the rendered document. This would also make it easier to remove links and what not.

@josevalim josevalim closed this in 8552c49 Apr 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Bug: unbalanced <li> tag emitted in synopsis
2 participants