Some edge cases in `email.utils.parsedate_to_datetime` seem to differ from RFC2822 spec #126845

# Bug report

### Bug description:

While tinkering around with `email.utils.parsedate_to_datetime`, I found some behavior that may be worth adjusting.

## 1. Low-number years aren't handled according to spec

> The year is any numeric year 1900 or later. [section 3.3]
>
> [section 4.3] The syntax for the obsolete date format allows a 2 digit year.
> [..]
> Where a two or three digit year occurs in a date, the year is to be
> interpreted as follows: If a two digit year is encountered whose
> value is between 00 and 49, the year is interpreted by adding 2000,
> ending up with a value between 2000 and 2049.  If a two digit year is
> encountered with a value between 50 and 99, or any three digit year
> is encountered, the year is interpreted by adding 1900.

```python
>>> parsedate_to_datetime("Sat, 15 Aug 0001 23:12:09 +0500")
datetime.datetime(2001, 8, 15, 23, 12, 9, ...)
```

expected: either year 1, or a parsing failure. Neither the new nor the obsolete format interprets 4-digit years this way: the "add 2000" rule applies only to two-digit years.
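To make the quoted rule concrete, here is a minimal sketch of how I read it. The function name `interpret_rfc2822_year` is hypothetical (it is not part of `email.utils`), and rejecting 4-digit years below 1900 is only one possible strict reading of section 3.3.

```python
def interpret_rfc2822_year(year_text: str) -> int:
    """Hypothetical helper: apply the RFC 2822 year rules to the raw year field."""
    # Only ASCII digits are accepted here; str.isdigit() alone would also
    # accept other Unicode digits.
    if not (year_text.isascii() and year_text.isdigit()):
        raise ValueError(f"year must consist of ASCII digits: {year_text!r}")

    year = int(year_text)
    digits = len(year_text)

    if digits == 2:
        # Obsolete form: 00-49 -> 2000-2049, 50-99 -> 1950-1999.
        return year + 2000 if year < 50 else year + 1900
    if digits == 3:
        # Obsolete form: any three-digit year gets 1900 added.
        return year + 1900
    if digits >= 4:
        # Assumption: treat "any numeric year 1900 or later" as a hard limit.
        if year < 1900:
            raise ValueError(f"year before 1900: {year_text!r}")
        return year
    raise ValueError(f"a single-digit year is not valid: {year_text!r}")
```

Under this reading, `"01"` gives 2001 and `"101"` also gives 2001, while `"0001"` is rejected rather than silently becoming 2001.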

## 2. offset minutes larger than 59 don't lead to parsing failure

```python
>>> parsedate_to_datetime('Sat, 15 Aug 0001 23:12:09 +0590')
datetime.datetime(2001, 8, 15, 23, 12, 9, tzinfo=datetime.timezone(datetime.timedelta(seconds=23400)))
```

expected: parse failure. Instead, the "90 minutes" component is accepted without issue (`+0590` is treated the same as `+0630`). The spec is not explicit about this, although it does say _"A date-time specification MUST be semantically valid"_. Note that a value of "90" in the minutes field of the time component _does_ give the appropriate parsing failure.

Note: `datetime.fromisoformat()` has the same behavior. Here too, I can't determine whether ISO 8601 explicitly disallows it. RFC 3339 _is_ clear in _disallowing_ it.
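For illustration, a stricter check on the numeric zone could look like the sketch below. `parse_numeric_zone` is a hypothetical name of mine, not an existing function, and it only covers the `+hhmm` / `-hhmm` form, not the obsolete named zones.

```python
import re
from datetime import timedelta

# [0-9] matches ASCII digits only (unlike \d on str patterns).
_NUMERIC_ZONE = re.compile(r"([+-])([0-9]{2})([0-9]{2})")


def parse_numeric_zone(zone: str) -> timedelta:
    """Hypothetical helper: turn '+hhmm' / '-hhmm' into a UTC offset."""
    match = _NUMERIC_ZONE.fullmatch(zone)
    if match is None:
        raise ValueError(f"not a numeric zone: {zone!r}")

    sign, hours_text, minutes_text = match.groups()
    minutes = int(minutes_text)
    # The strict reading argued for above: the minutes part must be 00-59.
    if minutes > 59:
        raise ValueError(f"offset minutes out of range: {zone!r}")

    offset = timedelta(hours=int(hours_text), minutes=minutes)
    return -offset if sign == "-" else offset
```

With this, `parse_numeric_zone("+0530")` gives an offset of 5 hours 30 minutes, while `parse_numeric_zone("+0590")` raises `ValueError` instead of quietly becoming 6 hours 30 minutes.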

## 3. Invalid day-of-week doesn't lead to parsing failure

```python
>>> parsedate_to_datetime('Sun, 15 Aug 0001 23:12:09 +0520')  # 15 Aug is a Wednesday in both year 1 and 2001, not a Sunday
```

expected: parsing failure

> A date-time specification MUST be semantically valid.  That is, the
> day-of-the-week (if included) MUST be the day implied by the date,
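A check along these lines would be enough to catch the mismatch. This is only a sketch: `day_of_week_matches` is a hypothetical helper, and it assumes the date itself has already been parsed into a `datetime.date`.

```python
from datetime import date

# Index matches date.weekday(): Monday is 0, Sunday is 6.
_DAY_NAMES = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def day_of_week_matches(day_name: str, parsed: date) -> bool:
    """Hypothetical helper: is the stated day name the one implied by the date?"""
    # Compare case-insensitively against the three-letter English names.
    return _DAY_NAMES[parsed.weekday()] == day_name.lower()
```

A parser following the quoted sentence strictly could raise when this returns `False`, for example for `day_of_week_matches("Sun", date(2001, 8, 15))`.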

## 4. Non-ASCII digits don't lead to parsing failure

If I'm reading the RFC correctly, only ASCII characters are valid.

```python
>>> parsedate_to_datetime('Sat, 15 Aug 01 𝟚𝟛:𝟝𝟛:𝟛𝟛 +0500')  # note the fancy numbers
datetime.datetime(2001, 8, 15, 23, 53, 33, ...)
```

expected: parsing failure
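My guess at why this happens: `int()` and `str.isdigit()` both accept any Unicode decimal digit, not just `0`-`9`. A small sketch of an ASCII-only check (the name `is_ascii_digits` is hypothetical):

```python
def is_ascii_digits(text: str) -> bool:
    """Hypothetical helper: True only for a non-empty run of the digits 0-9."""
    # str.isdigit() alone is not enough, since it is also True for other
    # Unicode digits; str.isascii() (Python 3.7+) narrows it down.
    return text.isascii() and text.isdigit()


fancy = "𝟚𝟛"  # MATHEMATICAL DOUBLE-STRUCK DIGIT TWO and THREE
print(fancy.isdigit())         # True
print(int(fancy))              # 23
print(is_ascii_digits(fancy))  # False
print(is_ascii_digits("23"))   # True
```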

## 5. Handling of the `-0000` case may be inconsistent with the drive to eliminate the practice of "naive UTC" datetimes

Lately, the `datetime` module appears to discourage the usage of naive datetimes to mean UTC, as evidenced by the deprecation of `utcnow()` and other methods.

However, `parsedate_to_datetime` will return a _naive_ datetime in the `-0000` case.

```python
>>> parsedate_to_datetime("Sat, 15 Aug 01 23:53:33 -0000")
datetime.datetime(2001, 8, 15, 23, 53, 33)
```

expected: `tzinfo=UTC`

The spec says:

> "-0000" also indicates Universal Time, it is
> used to indicate that the time was generated on a system that may be
> in a local time zone other than Universal Time and therefore
> indicates that the date-time contains no information about the local
> time zone.

The spec is again a bit fuzzy, but my reading is that `-0000` means "UTC, with no local offset known". In contrast, `+0000` means "local UTC offset known to be 0". My impression is that only _omission_ of the offset should result in a naive datetime. What do you think?
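In the meantime, the behavior I'd expect can be approximated with a thin wrapper. This is just a sketch: `parse_treating_minus_zero_as_utc` is a hypothetical name, and the `endswith` test is a simplification that does not handle trailing comments after the zone.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_treating_minus_zero_as_utc(value: str) -> datetime:
    """Hypothetical wrapper: map the naive '-0000' result to an aware UTC datetime."""
    parsed = parsedate_to_datetime(value)
    # Simplification: assume the zone is the last token of the string.
    if parsed.tzinfo is None and value.rstrip().endswith("-0000"):
        return parsed.replace(tzinfo=timezone.utc)
    return parsed
```

For `"Wed, 15 Aug 2001 23:53:33 -0000"` this returns the same wall-clock time with `tzinfo=timezone.utc`, and other offsets pass through unchanged.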


### CPython versions tested on:

3.13

### Operating systems tested on:

macOS

edit: typo


### Linked PRs
* gh-134311
* gh-134350
* gh-134438
