(Possibly) invalid output of tidy -ashtml #767

Open
@hosiet

I'm forwarding some longstanding downstream issues here, one of which is about -ashtml. Previous reports:

The test case is available at https://bugs.debian.org/562004 as the email attachment tidy.crashtest.zip (downloadable from that page); I'm also attaching a copy here:
tidy.crashtest.zip

The error information

Content of test0.html:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.6: http://docutils.sourceforge.net/" />
<title>Test</title>
<link rel="stylesheet" href="/usr/lib/pymodules/python2.5/docutils/writers/html4css1/html4css1.css" type="text/css" />
</head>
<body>
<div class="document" id="test">
<h1 class="title">Test</h1>


<p>Some text</p>
</div>
</body>
</html>

Tidy output:

$ tidy -ashtml -m test0.html
Info: Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
Info: Document content looks like XHTML 1.0 Strict
No warnings or errors were found.

$ tidy -ashtml -m test0.html
Info: Doctype given is "-//W3C//DTD XHTML 1.0 Strict//EN"
Info: Document content looks like HTML 4.01 Strict
No warnings or errors were found.

$ tidy -ashtml -m test0.html
line 2 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: An XML declaration was detected. Did you mean to use input-xml?
Info: Document content looks like HTML5
Tidy found 2 warnings and 0 errors!

$ tidy -ashtml -m test0.html
line 1 column 1 - Warning: An XML declaration was detected. Did you mean to use input-xml?
Info: Document content looks like HTML5
Tidy found 1 warning and 0 errors!

$ tidy -ashtml -m test0.html'
> ^C
$ tidy -ashtml -m test0.html
line 1 column 1 - Warning: An XML declaration was detected. Did you mean to use input-xml?
Info: Document content looks like HTML5
Tidy found 1 warning and 0 errors!

$ tidy -ashtml -m test0.html
line 1 column 1 - Warning: An XML declaration was detected. Did you mean to use input-xml?
Info: Document content looks like HTML5
Tidy found 1 warning and 0 errors!

The problem is that when converting an XHTML document with an <?xml ...?> declaration to HTML, the <?xml ...?> line is never stripped. In addition, on the third invocation Tidy reports that the DOCTYPE is missing. Is that expected behavior, or is it a bug?
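
To make the drift between invocations easier to see, here is a minimal Python sketch that inspects the prolog of test0.html after each run. It is only an illustration, not part of Tidy: the names `inspect_prolog` and `run_rounds` are hypothetical, the regular expressions are a rough approximation rather than a real parser, and it assumes a `tidy` binary is on the PATH.

```python
import re
import subprocess

# Rough patterns for the two constructs discussed above (not a full parser).
XML_DECL = re.compile(r"^\s*<\?xml\b[^>]*\?>", re.IGNORECASE)
DOCTYPE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)


def inspect_prolog(text):
    """Return (has_xml_declaration, doctype_or_None) for an HTML/XHTML string."""
    has_xml_decl = XML_DECL.search(text) is not None
    match = DOCTYPE.search(text)
    return has_xml_decl, (match.group(0) if match else None)


def run_rounds(path, rounds=3):
    """Run `tidy -ashtml -m` repeatedly on `path` and report the prolog each time.

    Hypothetical helper: assumes `tidy` is installed. Tidy exits non-zero when
    it emits warnings, so the return code is deliberately not checked.
    """
    results = []
    for _ in range(rounds):
        subprocess.run(["tidy", "-ashtml", "-m", path], capture_output=True)
        with open(path, encoding="utf-8") as handle:
            results.append(inspect_prolog(handle.read()))
    return results
```

Calling `run_rounds("test0.html")` on a fresh copy of the test file gives one `(has_xml_declaration, doctype)` pair per invocation, which shows at which round the DOCTYPE changes or disappears and whether the XML declaration is ever removed.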
