Data race in PyUnicode_AsUTF8AndSize under free-threading #128013

Closed
@hawkinsp

Bug report

Bug description:

Repro: build this C extension (race.so) against a free-threaded CPython that was itself built with thread-sanitizer enabled:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

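static PyObject *ConvertStr(PyObject *self, PyObject *arg) {
  Py_ssize_t size;
  /* Lazily fills and caches the UTF-8 form on the str object. */
  const char *str = PyUnicode_AsUTF8AndSize(arg, &size);
  if (str == NULL) {
    return NULL;  /* Propagate the exception set by CPython. */
  }
  Py_RETURN_NONE;
}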

static PyMethodDef race_methods[] = {
    {"convert_str", ConvertStr, METH_O, "Converts a string to utf8",},
    {NULL, NULL, 0, NULL}};

static struct PyModuleDef race_module = {
    PyModuleDef_HEAD_INIT, "race",
    NULL, -1,
    race_methods};

#define EXPORT_SYMBOL __attribute__ ((visibility("default")))

EXPORT_SYMBOL PyMODINIT_FUNC PyInit_race(void) {
  PyObject *module = PyModule_Create(&race_module);
  if (module == NULL) {
    return NULL;
  }
  PyUnstable_Module_SetGIL(module, Py_MOD_GIL_NOT_USED);
  return module;
}

and run this Python script:

import concurrent.futures
import threading

import race

num_threads = 8

b = threading.Barrier(num_threads)
def closure():
  b.wait()
  print("start")
  for _ in range(10000):
    race.convert_str("😊")

with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
  for _ in range(num_threads):
    executor.submit(closure)
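
One caveat with the driver above: `executor.submit` stores any exception raised in a worker on the returned future, so a failure inside `closure` would go unnoticed unless the future's result is read. A minimal sketch of a driver that surfaces such failures follows. `run_stress` and its parameters are hypothetical names introduced here, and `fn` stands in for `race.convert_str`.

```python
import concurrent.futures
import threading


def run_stress(fn, arg, num_threads=8, iterations=10000):
    """Call fn(arg) from several threads and re-raise any worker exception.

    Hypothetical helper for illustration; fn stands in for race.convert_str.
    """
    barrier = threading.Barrier(num_threads)

    def worker():
        barrier.wait()  # release all threads together to maximize contention
        for _ in range(iterations):
            fn(arg)
        return iterations

    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = [executor.submit(worker) for _ in range(num_threads)]
        # Future.result() re-raises whatever the worker raised.
        return sum(f.result() for f in futures)
```

With the extension built, this would be invoked as `run_stress(race.convert_str, "😊")`. It does not change whether the race is reported; it only makes Python-level errors in the workers visible.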

I also built the extension module itself with -fsanitize=thread (clang-18 -fsanitize=thread t.c -shared -o race.so -I ~/p/cpython-tsan/include/python3.13t/), although I doubt that matters much.

After running it a few times on my machine, I received the following thread-sanitizer report:

WARNING: ThreadSanitizer: data race (pid=2939235)
  Write of size 8 at 0x7f4b601ebd98 by thread T3:
    #0 unicode_fill_utf8 /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:5445:37 (python3.13+0x323820) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #1 PyUnicode_AsUTF8AndSize /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:4066:13 (python3.13+0x323820)
    #2 ConvertStr t.c (race.so+0x1205) (BuildId: 2ca767157d7177c993bad36fb4e26c7315893616)
    #3 cfunction_vectorcall_O /usr/local/google/home/phawkins/p/cpython/Objects/methodobject.c:512:24 (python3.13+0x28a4b5) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #4 _PyObject_VectorcallTstate /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_call.h:168:11 (python3.13+0x1eafaa) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #5 PyObject_Vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/call.c:327:12 (python3.13+0x1eafaa)
    #6 _PyEval_EvalFrameDefault /usr/local/google/home/phawkins/p/cpython/Python/generated_cases.c.h:813:23 (python3.13+0x3e24fb) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #7 _PyEval_EvalFrame /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_ceval.h:119:16 (python3.13+0x3de62a) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #8 _PyEval_Vector /usr/local/google/home/phawkins/p/cpython/Python/ceval.c:1811:12 (python3.13+0x3de62a)
    #9 _PyFunction_Vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/call.c (python3.13+0x1eb61f) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #10 _PyObject_VectorcallTstate /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_call.h:168:11 (python3.13+0x1ef5ef) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #11 method_vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/classobject.c:70:20 (python3.13+0x1ef5ef)
    #12 _PyVectorcall_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:273:16 (python3.13+0x1eb293) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #13 _PyObject_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:348:16 (python3.13+0x1eb293)
    #14 PyObject_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:373:12 (python3.13+0x1eb315) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #15 thread_run /usr/local/google/home/phawkins/p/cpython/./Modules/_threadmodule.c:337:21 (python3.13+0x564292) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #16 pythread_wrapper /usr/local/google/home/phawkins/p/cpython/Python/thread_pthread.h:243:5 (python3.13+0x4bd637) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)

  Previous read of size 8 at 0x7f4b601ebd98 by thread T7:
    #0 PyUnicode_AsUTF8AndSize /usr/local/google/home/phawkins/p/cpython/Objects/unicodeobject.c:4075:18 (python3.13+0x3236cc) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #1 ConvertStr t.c (race.so+0x1205) (BuildId: 2ca767157d7177c993bad36fb4e26c7315893616)
    #2 cfunction_vectorcall_O /usr/local/google/home/phawkins/p/cpython/Objects/methodobject.c:512:24 (python3.13+0x28a4b5) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #3 _PyObject_VectorcallTstate /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_call.h:168:11 (python3.13+0x1eafaa) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #4 PyObject_Vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/call.c:327:12 (python3.13+0x1eafaa)
    #5 _PyEval_EvalFrameDefault /usr/local/google/home/phawkins/p/cpython/Python/generated_cases.c.h:813:23 (python3.13+0x3e24fb) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #6 _PyEval_EvalFrame /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_ceval.h:119:16 (python3.13+0x3de62a) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #7 _PyEval_Vector /usr/local/google/home/phawkins/p/cpython/Python/ceval.c:1811:12 (python3.13+0x3de62a)
    #8 _PyFunction_Vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/call.c (python3.13+0x1eb61f) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #9 _PyObject_VectorcallTstate /usr/local/google/home/phawkins/p/cpython/./Include/internal/pycore_call.h:168:11 (python3.13+0x1ef5ef) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #10 method_vectorcall /usr/local/google/home/phawkins/p/cpython/Objects/classobject.c:70:20 (python3.13+0x1ef5ef)
    #11 _PyVectorcall_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:273:16 (python3.13+0x1eb293) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #12 _PyObject_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:348:16 (python3.13+0x1eb293)
    #13 PyObject_Call /usr/local/google/home/phawkins/p/cpython/Objects/call.c:373:12 (python3.13+0x1eb315) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #14 thread_run /usr/local/google/home/phawkins/p/cpython/./Modules/_threadmodule.c:337:21 (python3.13+0x564292) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)
    #15 pythread_wrapper /usr/local/google/home/phawkins/p/cpython/Python/thread_pthread.h:243:5 (python3.13+0x4bd637) (BuildId: 9c1c16fb1bb8a435fa6fa4c6944da5d41f654e96)

I'd guess that this CPython code really needs to hold a mutex:

    if (PyUnicode_UTF8(unicode) == NULL) {
        if (unicode_fill_utf8(unicode) == -1) {
            if (psize) {
                *psize = -1;
            }
            return NULL;
        }
    }
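
The pattern in that snippet is a lazily filled cache: the first caller computes the UTF-8 bytes and stores them on the object, and later callers read the stored value. As a rough Python model of what "hold a mutex" could mean here, the sketch below fills the cache under a lock and re-checks after acquiring it. `LazyUtf8`, `as_utf8`, `fill_count`, and `hammer` are hypothetical names, and this is not CPython's implementation or its eventual fix. In C, the unlocked fast-path read would additionally need to be an atomic load, which a Python model cannot express.

```python
import threading


class LazyUtf8:
    """Toy model of a str object that caches its UTF-8 encoding lazily."""

    def __init__(self, text):
        self._text = text
        self._utf8 = None  # plays the role of the cached UTF-8 buffer
        self._lock = threading.Lock()
        self.fill_count = 0  # number of times the cache was populated

    def as_utf8(self):
        # Fast path: the cache is already populated.
        cached = self._utf8
        if cached is not None:
            return cached
        # Slow path: take the lock and re-check so only one thread fills.
        with self._lock:
            if self._utf8 is None:
                self.fill_count += 1
                self._utf8 = self._text.encode("utf-8")
            return self._utf8


def hammer(obj, num_threads=8, iterations=1000):
    """Call obj.as_utf8() from several threads at once and collect results."""
    barrier = threading.Barrier(num_threads)
    results = []

    def worker():
        barrier.wait()
        for _ in range(iterations):
            results.append(obj.as_utf8())

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The re-check inside the lock is what guarantees a single fill: a thread that lost the race to acquire the lock sees the populated cache and returns it instead of writing again.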

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Labels

3.13: bugs and security fixes
3.14: bugs and security fixes
interpreter-core: (Objects, Python, Grammar, and Parser dirs)
topic-free-threading
type-bug: An unexpected behavior, bug, or error
