Strange and inconsistent parsing of lists with headers and multiple lines #1433

Closed
@Andre601

The markdown library shows strange parsing behaviour on lists whose entries contain a header followed by multiple lines.

Problem

When a list entry contains a header followed by text indented by 4 spaces, the first entry renders fine.
However, in every subsequent entry the text after the header is rendered as a code block, because lines indented by 4 spaces are turned into code blocks.

Strangely, this behaviour only occurs when there is a blank line between two list entries. If the entries follow each other directly, the result is as expected (the text renders as a normal paragraph).
Even stranger is the behaviour with additional paragraphs. If a list entry has more than one paragraph, meaning there is a blank line between the first and second block of text after the header, the second paragraph renders fine while the first still shows this odd behaviour.

See my tests below for possible results.

There are workarounds for this.

The first is to leave no blank lines between the list entries. This is the easiest way to stay consistent, but may make the raw content harder to read.

The second is to indent the first paragraph by 2 spaces instead of 4. This may cause visual inconsistencies in the raw text if an entry has more than one paragraph, since the second paragraph still needs a 4-space indent or it won't be included in the list entry.
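The first workaround could be applied automatically with something like the sketch below. The helper name tighten_list is hypothetical, and the regular expression assumes the text consists only of a top-level list using -, * or + markers. It would also remove a blank line between an ordinary paragraph and the list that follows it, so treat it as an illustration rather than a general fix.

```python
import re


def tighten_list(text: str) -> str:
    """Remove blank lines that sit directly before a top-level list marker.

    Hypothetical helper: blank lines inside an entry (between its
    paragraphs) are left alone because they are not followed by a marker.
    """
    return re.sub(r"\n(?:[ \t]*\n)+(?=[-*+] )", "\n", text)


# The Test 1 input, with blank lines between the entries.
LOOSE = (
    "- ### List 1\n"
    "    Entry 1.1\n"
    "\n"
    "- ### List 2\n"
    "    Entry 2.1\n"
    "\n"
    "- ### List 3\n"
    "    Entry 3.1\n"
)

# Prints the Test 3 input: the same entries with no gaps between them.
print(tighten_list(LOOSE))
```

Note that Test 3 only covers single-paragraph entries, so whether this also helps entries with several paragraphs is not shown here.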


Tests

All tests were run with python -m markdown test.txt, with test.txt containing the markdown content shown below.
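For anyone who prefers to reproduce this from Python rather than the command line, a minimal sketch using the library's markdown.markdown() function with the Test 1 input might look like this. It assumes the Python-Markdown package is installed. The output depends on the installed version, so no expected result is given.

```python
import markdown  # third-party: the Python-Markdown package

# Test 1 input: three entries, each a header plus 4-space indented text,
# separated by blank lines.
SOURCE = """\
- ### List 1
    Entry 1.1

- ### List 2
    Entry 2.1

- ### List 3
    Entry 3.1
"""

# No extensions are passed here, since the command in this report uses none.
html = markdown.markdown(SOURCE)
print(html)
```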

Test 1

Baseline test showing the issue.

Input:

- ### List 1
    Entry 1.1

- ### List 2
    Entry 2.1

- ### List 3
    Entry 3.1

Output:

<ul>
<li>
<h3>List 1</h3>
<p>Entry 1.1</p>
</li>
<li>
<h3>List 2</h3>
<pre><code>Entry 2.1
</code></pre>
</li>
<li>
<h3>List 3</h3>
<pre><code>Entry 3.1
</code></pre>
</li>
</ul>
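To check output like the above without reading it by eye, a small standard-library helper can list which entries ended up with a code block. The function name entries_rendered_as_code is made up for this sketch, and the regular expressions assume flat, non-nested lists with h3 headers, as in these tests.

```python
import re


def entries_rendered_as_code(html: str) -> list[str]:
    """Return the header text of every list item that contains a code block."""
    flagged = []
    # Assumes a flat list: each <li> ... </li> pair is one entry.
    for item in re.findall(r"<li>(.*?)</li>", html, flags=re.DOTALL):
        heading = re.search(r"<h3>(.*?)</h3>", item)
        if heading and "<pre><code>" in item:
            flagged.append(heading.group(1))
    return flagged


# Output of Test 1, copied from this report.
TEST_1_OUTPUT = """<ul>
<li>
<h3>List 1</h3>
<p>Entry 1.1</p>
</li>
<li>
<h3>List 2</h3>
<pre><code>Entry 2.1
</code></pre>
</li>
<li>
<h3>List 3</h3>
<pre><code>Entry 3.1
</code></pre>
</li>
</ul>"""

print(entries_rendered_as_code(TEST_1_OUTPUT))  # ['List 2', 'List 3']
```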

Test 2

Test with multiple paragraphs.

Input:

- ### List 1
    Entry 1.1
    
    Entry 1.2

- ### List 2
    Entry 2.1
    
    Entry 2.2

- ### List 3
    Entry 3.1
    
    Entry 3.2

Output:

<ul>
<li>
<h3>List 1</h3>
<p>Entry 1.1</p>
<p>Entry 1.2</p>
</li>
<li>
<h3>List 2</h3>
<pre><code>Entry 2.1
</code></pre>
<p>Entry 2.2</p>
</li>
<li>
<h3>List 3</h3>
<pre><code>Entry 3.1
</code></pre>
<p>Entry 3.2</p>
</li>
</ul>

Test 3

Test 1, but with the blank lines between entries removed.

Input:

- ### List 1
    Entry 1.1
- ### List 2
    Entry 2.1
- ### List 3
    Entry 3.1

Output:

<ul>
<li>
<h3>List 1</h3>
<p>Entry 1.1</p>
</li>
<li>
<h3>List 2</h3>
<p>Entry 2.1</p>
</li>
<li>
<h3>List 3</h3>
<p>Entry 3.1</p>
</li>
</ul>

Test 4

Test 2, but with the first paragraph of each entry indented by 2 spaces instead of 4 (note the render issue with Entry 1.1, which is not wrapped in a paragraph).

Input:

- ### List 1
  Entry 1.1
    
    Entry 1.2

- ### List 2
  Entry 2.1
    
    Entry 2.2

- ### List 3
  Entry 3.1
    
    Entry 3.2

Output:

<ul>
<li>
<h3>List 1</h3>
  Entry 1.1<p>Entry 1.2</p>
</li>
<li>
<h3>List 2</h3>
<p>Entry 2.1</p>
<p>Entry 2.2</p>
</li>
<li>
<h3>List 3</h3>
<p>Entry 3.1</p>
<p>Entry 3.2</p>
</li>
</ul>

Test 5

Test 4, but the first entry keeps its 4-space indent.

Input:

- ### List 1
    Entry 1.1
    
    Entry 1.2

- ### List 2
  Entry 2.1
    
    Entry 2.2

- ### List 3
  Entry 3.1
    
    Entry 3.2

Output:

<ul>
<li>
<h3>List 1</h3>
<p>Entry 1.1</p>
<p>Entry 1.2</p>
</li>
<li>
<h3>List 2</h3>
<p>Entry 2.1</p>
<p>Entry 2.2</p>
</li>
<li>
<h3>List 3</h3>
<p>Entry 3.1</p>
<p>Entry 3.2</p>
</li>
</ul>

Labels: bug (Bug report), confirmed (Confirmed bug report or approved feature request), core (Related to the core parser code)