
ggml : fix race-condition in ggml-rpc #13600


Open
wants to merge 2 commits into master

Conversation


@gkpln3 gkpln3 commented May 17, 2025

No description provided.

@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on May 17, 2025
@gkpln3 changed the title from "ggml : Fixed race-condition that leads to crashes when using ggml-rpc" to "ggml : fix race-condition that leads to crashes when using ggml-rpc" on May 17, 2025
@gkpln3 changed the title from "ggml : fix race-condition that leads to crashes when using ggml-rpc" to "ggml : fix race-condition in ggml-rpc" on May 17, 2025
@rgerganov (Collaborator)

The RPC backend is always called in a single thread. What kind of crashes do you observe?

@gkpln3 (Author) commented May 17, 2025

I'm adding RPC support to Ollama, which depends on this feature. In Ollama, it's called from multiple Go threads, leading to race conditions and crashes. I assumed ggml-rpc.cpp was intended to be thread-safe, given the use of mutexes elsewhere in the file.

@rgerganov (Collaborator)

While we use std::mutex in a few places, the RPC backend is not considered thread-safe. I think ggml backends in general are not required to be thread-safe by design; @slaren, please correct me if I am wrong.

I'd suggest adding proper synchronization in the Go code.

@slaren (Member) commented May 17, 2025

Most ggml-backend objects are not intended to be thread-safe; however, it should still be possible to use different objects from different threads simultaneously. The RPC client reuses the same socket for multiple objects, so the synchronization here seems necessary.
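
For illustration, a minimal sketch of the hazard described above, mirroring the wire format documented in ggml-rpc.cpp (| rpc_cmd (1 byte) | request_size (8 bytes) | request_data |). The write_raw helper is hypothetical, standing in for the actual send() loop on the socket fd:

#include <cstdint>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the low-level send loop; a real implementation
// would call send() on the fd until all bytes are written.
static bool write_raw(int fd, const void * data, size_t size) {
    (void) fd; (void) data; (void) size;
    return true;
}

// Unsafe: two threads calling this on the same fd can interleave their three
// writes, so the server may see a corrupted frame (cmd from thread A, size
// from A, data from B), which is the kind of crash reported in this PR.
static bool send_frame_unsafe(int fd, uint8_t cmd, const void * input, uint64_t input_size) {
    return write_raw(fd, &cmd, 1)
        && write_raw(fd, &input_size, sizeof(input_size))
        && write_raw(fd, input, (size_t) input_size);
}

// Safe: a mutex associated with the socket makes the whole frame atomic with
// respect to other senders on the same connection.
static bool send_frame_locked(std::mutex & sock_mutex, int fd, uint8_t cmd,
                              const void * input, uint64_t input_size) {
    std::lock_guard<std::mutex> lock(sock_mutex);
    return send_frame_unsafe(fd, cmd, input, input_size);
}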

@gkpln3 (Author) commented May 25, 2025

Speaking specifically about Ollama for a moment: adding synchronization at the Go layer is sub-optimal, since we do benefit from parallelization on the other backends; RPC is the only one that doesn't support it well (without this fix).

Comment on lines 389 to 390
static std::mutex send_rpc_cmd_mutex;
std::lock_guard<std::mutex> lock(send_rpc_cmd_mutex);
Member

Instead of a global mutex, it's probably better to associate it with the specific instance of the socket. Consider moving the mutex to struct socket_t.
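
To make the trade-off concrete, a short sketch contrasting the two placements (simplified; the actual struct and call site are shown in the diff hunks below, and the function names here are illustrative, not the real ggml-rpc signatures):

#include <memory>
#include <mutex>

// Before: a function-local static mutex serializes sends across *all*
// sockets, even connections to unrelated RPC servers.
static bool send_global_locked(int fd) {
    static std::mutex send_rpc_cmd_mutex;
    std::lock_guard<std::mutex> lock(send_rpc_cmd_mutex);
    // ... write the request frame to fd ...
    (void) fd;
    return true;
}

// After: each socket owns its mutex, so independent connections do not
// contend; only threads sharing one connection are serialized.
struct socket_t {
    int fd;
    std::mutex send_rpc_cmd_mutex;
};

static bool send_per_socket_locked(const std::shared_ptr<socket_t> & sock) {
    std::lock_guard<std::mutex> lock(sock->send_rpc_cmd_mutex);
    // ... write the request frame to sock->fd ...
    return true;
}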

Author

Good idea, let me try that.

Author

Looks good, I've updated the PR.

@@ -51,6 +51,7 @@ struct socket_t {
close(this->fd);
#endif
}
std::mutex send_rpc_cmd_mutex;
Collaborator

I think it makes sense to use this mutex for both sending and receiving data, so it would be better to rename it to sock_mutex.

@gkpln3 (Author) May 25, 2025

Would it make sense for multiple threads to read from the same socket? That sounds like it would introduce new issues.

@rgerganov (Collaborator) May 25, 2025

My assumption was that if third-party clients need multi-threaded send, they would also need multi-threaded receive. If this is not the case, you can leave it as is; just rename it to send_mutex.

@@ -386,6 +387,7 @@ static bool parse_endpoint(const std::string & endpoint, std::string & host, int
// RPC request : | rpc_cmd (1 byte) | request_size (8 bytes) | request_data (request_size bytes) |
// No response
static bool send_rpc_cmd(const std::shared_ptr<socket_t> & sock, enum rpc_cmd cmd, const void * input, size_t input_size) {
std::lock_guard<std::mutex> lock(sock->send_rpc_cmd_mutex);
Collaborator

You should also acquire the lock in the overloaded send_rpc_cmd function below, to prevent multiple threads from reading from the socket at the same time.

You may also need to use std::recursive_mutex for this to work.
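
For illustration, a sketch of why std::recursive_mutex would be needed, assuming the response-reading overload of send_rpc_cmd delegates to the send-only overload (simplified signatures, not the literal ggml-rpc code):

#include <cstddef>
#include <memory>
#include <mutex>

struct socket_t {
    int fd;
    // recursive_mutex because the second overload below re-enters the lock
    // via the first one; a plain std::mutex would deadlock on that path.
    std::recursive_mutex sock_mutex;
};

// Send-only overload: locks so that direct callers are serialized too.
static bool send_rpc_cmd(const std::shared_ptr<socket_t> & sock,
                         const void * input, size_t input_size) {
    std::lock_guard<std::recursive_mutex> lock(sock->sock_mutex);
    // ... write | cmd | request_size | request_data | to sock->fd ...
    (void) input; (void) input_size;
    return true;
}

// Send-and-receive overload: holds the lock across both the send and the
// receive, so no other thread can interleave a request between them.
static bool send_rpc_cmd(const std::shared_ptr<socket_t> & sock,
                         const void * input, size_t input_size,
                         void * output, size_t output_size) {
    std::lock_guard<std::recursive_mutex> lock(sock->sock_mutex);
    if (!send_rpc_cmd(sock, input, input_size)) { // re-acquires sock_mutex
        return false;
    }
    // ... read the response from sock->fd into output ...
    (void) output; (void) output_size;
    return true;
}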

Labels
ggml (changes relating to the ggml tensor library for machine learning)
4 participants