socket.getfqdn() UnicodeDecodeError depending on LANG variable #93251

Open
@cpina

Bug report

This code:

import locale
import socket

locale.setlocale(locale.LC_ALL, '')

socket.getfqdn()

raises an exception when run like this:

LANG=ru_RU.CP1251 /opt/Python-3.9.2/bin/python3 bug.py

Note the LANG value. I haven't checked which other LANG values work and which fail.

⚠️ To exercise the problematic code path (see the comments for details), the hostname must not be resolvable: not listed in /etc/hosts, and not resolvable via DNS or any other method configured in the hosts setting of /etc/nsswitch.conf. To reproduce the problem on Linux, the hostname can be changed with sudo hostname something-that-does-not-exist.

Traceback (most recent call last):
  File "/root/t/prova.py", line 7, in <module>
    socket.getfqdn()
  File "/opt/Python-3.9.2/lib/python3.9/socket.py", line 791, in getfqdn
    hostname, aliases, ipaddrs = gethostbyaddr(name)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 0: invalid continuation byte
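
The byte in the traceback is consistent with a CP1251-encoded Russian error string being decoded as UTF-8. The following is a minimal sketch of that mismatch, not the actual code path inside `socket`: the message text is a hypothetical stand-in (the real resolver message is not shown in this report), and it is assumed here only that the message starts with the Cyrillic letter "Н", which CP1251 encodes as 0xcd.

```python
# Hypothetical localized resolver message; the real text is an assumption.
message = "Неизвестный узел"

# Bytes as a ru_RU.CP1251 C library would produce them.
raw = message.encode("cp1251")
print(hex(raw[0]))  # first byte of the CP1251-encoded message

# Decoding those bytes as UTF-8 fails on the very first byte.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)
else:
    decode_error = None

# Decoding with the encoding that matches the locale round-trips cleanly.
print(raw.decode("cp1251"))
```

If this assumption holds, the failure comes from decoding a locale-encoded error message with a fixed UTF-8 codec rather than from the hostname itself.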

Your environment

Tested on Debian 11 (bullseye) with the following Python interpreters:

  • Packaged Python 3.9.2
  • Compiled from source Python 3.9.2
  • Compiled from source Python 3.9.13
  • Compiled from source Python 3.10.4

I've encountered this bug on two independent Debian installations (with different locale settings) and on a CI system (also Debian-based, but with unrelated settings).

Only tested on x64 systems.
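
On affected interpreters, callers might guard the call until the decoding is fixed. The wrapper below is only a sketch of one possible workaround, not a proposed fix for CPython: `safe_getfqdn` and its `resolver` parameter are hypothetical names introduced here, and falling back to the unresolved hostname is an assumption modeled on what `socket.getfqdn()` returns when resolution fails normally.

```python
import socket


def safe_getfqdn(name="", resolver=socket.getfqdn):
    """Return the FQDN, falling back to the plain hostname on decode errors.

    `resolver` is injectable only so the fallback can be exercised in tests.
    """
    try:
        return resolver(name)
    except UnicodeDecodeError:
        # Assumption: treat the undecodable error message as a failed lookup
        # and return the name unchanged, or the local hostname if none given.
        return name or socket.gethostname()
```

A call such as `safe_getfqdn()` would then return the local hostname instead of raising when the lookup fails under a non-UTF-8 locale.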

    Labels

    3.10 (only security fixes), 3.9 (only security fixes), topic-unicode, type-bug (An unexpected behavior, bug, or error)
