Change in behavior caused by Nokogiri 1.13.5 #130

Closed
@CarlosCD

I was trying to strip tags, and anything else that could be used to build HTML tags, from data entered in a Rails application, and had a few tests related to this. It seems that Nokogiri 1.13.5 has changed the behavior of Rails::Html::FullSanitizer (probably because of its libxml2 update to 2.9.14; see the Loofah issue below).

The difference seems to depend on whether certain characters, such as blanks, appear inside <>, close to the brackets. When they do, the sanitizer no longer removes the contents of <...>; it escapes the '<' and '>' characters and keeps what is inside.
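Since the behavior appears to depend on the parser build, it can help to confirm which versions a process actually loads before comparing results. A minimal sketch follows; the method name `parser_versions` is hypothetical, and the libxml2 entry is read defensively because it may be absent on some platforms (for example JRuby).

```ruby
# Report the Nokogiri and libxml2 versions loaded in this process.
# Returns a message instead of raising when Nokogiri is not installed.
def parser_versions
  require "nokogiri"
  # VERSION_INFO is a nested Hash; dig returns nil if the libxml entry is missing.
  libxml = Nokogiri::VERSION_INFO.dig("libxml", "loaded") || "n/a"
  "nokogiri #{Nokogiri::VERSION}, libxml2 #{libxml}"
rescue LoadError
  "nokogiri is not installed"
end

puts parser_versions
```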

Before Nokogiri 1.13.5 (this is 1.13.4):

s1 = 'Hello <world!>'
ActionView::Base.full_sanitizer.sanitize(s1)
# => "Hello "

s2 = 'Hello <... world!>'
ActionView::Base.full_sanitizer.sanitize(s2)
# => "Hello "

s3 = 'Kitty is <-NOT-> bad!'
ActionView::Base.full_sanitizer.sanitize(s3)
# => "Kitty is  bad!"

Nokogiri 1.13.5:

s1 = 'Hello <world!>'
ActionView::Base.full_sanitizer.sanitize(s1)
# => "Hello "

s2 = 'Hello <... world!>'
ActionView::Base.full_sanitizer.sanitize(s2)
# => "Hello &lt;... world!&gt;"

s3 = 'Kitty is <-NOT-> bad!'
ActionView::Base.full_sanitizer.sanitize(s3)
# => "Kitty is &lt;-NOT-&gt; bad!"

This issue in Loofah is probably the same: flavorjones/loofah#230 (closed).

I am not sure whether this is a problem for other people. In our case we wanted to remove any HTML that could later be used to carefully build an XSS attack, so it is not a big deal, but it is surprising.
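One way to see why this is not a big deal for that use case: none of the outputs reported above, old or new, contains a literal angle bracket. A small sketch of that check, using only the strings copied from this report (the helper name `no_raw_brackets?` is hypothetical, and this is a narrow check, not a complete XSS guarantee):

```ruby
# Outputs copied from the report above (Nokogiri 1.13.4 and 1.13.5).
REPORTED_OUTPUTS = [
  "Hello ",
  "Kitty is  bad!",
  "Hello &lt;... world!&gt;",
  "Kitty is &lt;-NOT-&gt; bad!"
].freeze

# True when the string contains no literal '<' or '>' characters.
def no_raw_brackets?(text)
  !text.match?(/[<>]/)
end

puts REPORTED_OUTPUTS.all? { |out| no_raw_brackets?(out) }
```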
