server: feature Add Admin key parameter for slots/health/metrics #5837

robeyh · 2024-03-02T20:09:58Z

The API key solely protects the completion/infill endpoints. Adding an admin key that protects the slots/metrics endpoints is important when these endpoints are turned on. Without this, anyone can hit the slots endpoint and snoop on the prompts that other users are submitting.

This is quick, mostly replicating the api_key feature. It shifts to a generalized validate_key function that takes in a vector of strings to check against. The new validation function checks the headers, but then also the query params for a key (named key in all cases).
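For readers following along, the generalized check described above could look roughly like the sketch below. It is not the code from this PR: `request_view` is a hypothetical stand-in for the HTTP request type, and treating an empty key list as "no validation" is an assumption made to mirror how an optional key usually behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an HTTP request: only the parts the check reads.
struct request_view {
    std::map<std::string, std::string> headers; // e.g. {"Authorization", "Bearer abc"}
    std::map<std::string, std::string> params;  // parsed query string
};

// Returns true when no keys are configured (assumption: endpoint left open),
// or when the request presents one of the accepted keys.
bool validate_key(const request_view & req, const std::vector<std::string> & keys) {
    if (keys.empty()) {
        return true;
    }
    auto accepted = [&](const std::string & candidate) {
        return std::find(keys.begin(), keys.end(), candidate) != keys.end();
    };

    // 1. Authorization header of the form "Bearer <key>"
    const std::string prefix = "Bearer ";
    auto hdr = req.headers.find("Authorization");
    if (hdr != req.headers.end() &&
        hdr->second.compare(0, prefix.size(), prefix) == 0 &&
        accepted(hdr->second.substr(prefix.size()))) {
        return true;
    }

    // 2. Query parameter named "key"
    auto prm = req.params.find("key");
    return prm != req.params.end() && accepted(prm->second);
}

// Usage: the same function serves both roles, only the key list differs.
void demo_validate_key() {
    const std::vector<std::string> admin_keys = {"admin-secret"};

    request_view via_header;
    via_header.headers["Authorization"] = "Bearer admin-secret";

    request_view via_query;
    via_query.params["key"] = "admin-secret";

    request_view wrong_key;
    wrong_key.params["key"] = "nope";

    assert(validate_key(via_header, admin_keys));
    assert(validate_key(via_query, admin_keys));
    assert(!validate_key(wrong_key, admin_keys));
}
```

Passing the accepted keys as a vector is what lets one function cover both the API-key and the admin-key endpoints.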

Artefact2 · 2024-03-02T20:15:59Z

Can we update the readme so that it indicates which endpoints require which kind of key?

phymbert

Recall that slots data can be turned off in health endpoints using --slots-endpoint-disable.

OK to add admin keys on health but please implement a test scenario in security.feature.

phymbert · 2024-03-02T20:46:34Z

examples/server/server.cpp

@@ -2842,7 +2888,10 @@ int main(int argc, char **argv)
    }

    if (sparams.metrics_endpoint) {
-        svr.Get("/metrics", [&](const httplib::Request&, httplib::Response& res) {
+        svr.Get("/metrics", [&](const httplib::Request& req, httplib::Response& res) {


/metrics must not be protected. It does not contain data and it targets prometheus which does not support authentication.

This is the purpose of the key query param.

Could you link to a security protocol that this query param implements?

Not sure what you mean. This is just passing in the same api/admin key in through ?key= instead of the authorization header for ease in configuration. It's the same as the authorization header otherwise.

Kindly, I meant: where have you seen a protocol in which a secret can be passed as a query param? I am wondering if it is a security issue.
For example, this has been deprecated in the OAuth2 implicit grant flow.

Any form of authentication via a URL Query Param is considered bad practice and is inherently unsafe, even when using TLS/SSL.

There's always a "gotcha" somewhere that exposes the authentication method. This is why it's usually done with a bearer token or jwt, packaged as a header or body payload, and then passed via an encrypted tunnel.

You can find this all on OWASP and within the RFC Specs. This has to do with underlying mechanics of how GET and POST requests work.

For a real authy method, you would use POST and not expose the Auth tokens via Query Params.

Edit: It took me a bit to find it. They updated it since I last read it.

TLDR; Query parameters enable injection attacks.

Access Control

Query Parameterization

HTTP Authentication

Parameter Pollution

There's way more in-depth stuff that exploits query parameters. It's beginner security stuff.

The question was the opposite. We all know that this is not security here. That's why this PR is open... There is no need for beginners' explanations

phymbert · 2024-03-02T20:52:25Z

All these security features are quite useless: users are sharing keys, and an API key must work with an IDP/API gateway. IMHO the server must not be exposed directly to users in a real multi-user setup but be protected behind a reverse proxy.

robeyh · 2024-03-02T21:07:51Z

Agreed that this is likely the wrong long-term approach.

In my case, it is running behind a naive reverse proxy with no request blocking/filtering. Given the presence of the api keys in server.cpp, this seemed like the most direct approach to allow two roles so that slots inspection could be preserved and protected.

phymbert · 2024-03-02T21:15:47Z

examples/server/server.cpp

+        }
+
+        // Check for API key in the params
+        auto auth_param = req.get_param_value("key");


Query params appear in the clear in all HTTP traffic scanners. I feel this hack is a security breach.
Monitoring endpoints must simply not be exposed to the world, or at least slots data can simply be disabled. They are here for debugging purposes during development only.
@ggerganov I would prefer we invest in a real security protocol like OAuth2/OpenID rather than adding this.

Right, but authorization headers are also passed in cleartext without https. Given the nature of the api keys as they are at the moment this is no more or less secure than the current solution.

Long term, specifying keys on the command line and passing them directly is not the correct solution.

This is just a quick way to get some level of security on those endpoints in the server as it exists now until a proper (jwt would make sense to me) solution is in place.

I am speaking about logging. The Authorization header is never logged. Query params are, for example:

https://github.com/ggerganov/llama.cpp/blob/9731134296af3a6839cd682e51d9c2109a871de5/examples/server/server.cpp#L2691-L2705

But every HTTP monitoring tool will do the same.
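To make the logging concern concrete, here is a minimal sketch of an access-log formatter. `access_log_line` is hypothetical and is not the logger linked above. It only assumes what the comment describes: the log records the request target (path plus query string) and not the headers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical access-log formatter: records the method and the full request
// target (path plus query string). Headers are not part of the log line.
std::string access_log_line(const std::string & method,
                            const std::string & path,
                            const std::map<std::string, std::string> & params) {
    std::string target = path;
    char sep = '?';
    for (const auto & kv : params) {
        target += sep;
        target += kv.first + "=" + kv.second;
        sep = '&';
    }
    return method + " " + target;
}

// Usage: the same secret leaks into the log only when sent as ?key=.
void demo_access_log() {
    std::map<std::string, std::string> with_key;
    with_key["key"] = "admin-secret";
    const std::string via_query = access_log_line("GET", "/slots", with_key);

    // Key sent in the Authorization header instead: no query params at all.
    std::map<std::string, std::string> no_params;
    const std::string via_header = access_log_line("GET", "/slots", no_params);

    assert(via_query.find("admin-secret") != std::string::npos);
    assert(via_header.find("admin-secret") == std::string::npos);
}
```

Under that assumption, every component that logs request URLs (the server itself, a reverse proxy, a monitoring tool) ends up storing the key whenever it travels in the query string.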

Note: JWT is not a security protocol, just a transparent token often used in oauth2.

ggerganov · 2024-03-03T08:32:24Z

@robeyh Thank you for the contribution. However, I recommend that we focus on this sort of functionality at a later stage. server needs to first become more robust and we need to fix all inference and performance related problems that we currently have, since its primary purpose is to demonstrate usage of the llama library. Security mechanisms are not a high concern atm

In hindsight, the API key stuff was also not a great idea to add in the first place, but I thought it could serve as a basic demonstration. The problem is that people start to use server as a full-fledged application and have higher expectations for its functionality. Security features like this one and the CORS headers are distracting the maintainers from the more pressing issues that need to be resolved first

Hopefully, we will soon get to a state where we can implement this kind of feature, since they are important for real-world applications, but in the short term I would consider these items low priority. Hope this makes sense

@ggerganov I would prefer we invest in a real security protocol like OAuth2/OpenID rather than adding this.

@phymbert Sure. We can add the "demo" label to this PR to serve as an example and implement this feature in the future

robeyh · 2024-03-03T15:35:17Z

Makes perfect sense to me. I just threw this together as it solved a problem I was having in prototyping. Wanted to offer it if it was useful to others to be able to isolate the slots endpoint.

I imagine that future development will be a more robust solution, but feel free to ping me if you'd like me to clean this code up at a later date.

teleprint-me · 2024-03-11T21:59:36Z

examples/server/server.cpp

@@ -2060,6 +2061,7 @@ static void server_print_usage(const char *argv0, const gpt_params &params,
    printf("  --host                    ip address to listen (default  (default: %s)\n", sparams.hostname.c_str());
    printf("  --port PORT               port to listen (default  (default: %d)\n", sparams.port);
    printf("  --path PUBLIC_PATH        path from which to serve static files (default %s)\n", sparams.public_path.c_str());
+    printf("  --admin-key ADMIN_KEY     optional admin key to enhance server security. If set, requests to admin endpoints must include this key.\n");


OWASP recommends "denying by default". It's a PITA, but the flag should be the inverse. If you want to toggle security measures, they should be on by default and the flag should be set to disable them for whatever purposes, e.g. testing, local usage, etc. Not sure how it should be handled in llama.cpp or local usage, but this matters in production.
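The inverted flag suggested here can be sketched as follows. All names are hypothetical (`admin_auth_config`, `auth_disabled`, `admin_request_allowed`) and none of them come from server.cpp. The point is only that an unconfigured server rejects admin requests instead of accepting them.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical settings; neither field name comes from server.cpp.
struct admin_auth_config {
    std::vector<std::string> admin_keys; // accepted admin keys
    bool auth_disabled = false;          // explicit opt-out, e.g. for local testing
};

// Deny by default: a request passes only if auth was explicitly disabled or it
// presents a configured key. An empty key list therefore denies every request.
bool admin_request_allowed(const admin_auth_config & cfg, const std::string & presented_key) {
    if (cfg.auth_disabled) {
        return true;
    }
    return std::find(cfg.admin_keys.begin(), cfg.admin_keys.end(), presented_key)
           != cfg.admin_keys.end();
}

// Usage: contrast an unconfigured server with an explicit opt-out.
void demo_deny_by_default() {
    admin_auth_config unconfigured;              // no keys, no opt-out
    assert(!admin_request_allowed(unconfigured, ""));

    admin_auth_config local_testing;
    local_testing.auth_disabled = true;          // operator chose to turn checks off
    assert(admin_request_allowed(local_testing, ""));

    admin_auth_config production;
    production.admin_keys.push_back("admin-secret");
    assert(admin_request_allowed(production, "admin-secret"));
    assert(!admin_request_allowed(production, "guess"));
}
```

The trade-off is usability: with this policy, a plain local launch has to pass the opt-out flag before the admin endpoints respond at all.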

Add Admin key param and generalize key check

ebc1dec

robeyh changed the title ~~server: feature Add Admin key parameter for slots/health/metric~~ server: feature Add Admin key parameter for slots/health/metrics Mar 2, 2024

phymbert requested changes Mar 2, 2024

robeyh added 2 commits March 2, 2024 12:35

Fixed spacing/removed errant tab

5507220

Added admin-key param, and added endpoints to api-key description.

17dfcde

phymbert reviewed Mar 2, 2024

teleprint-me reviewed Mar 11, 2024

ggerganov added the "demo" label (Demonstrate some concept or idea, not intended to be merged) Mar 12, 2024

mofosyne added the "Review Complexity : Medium" (Generally require more time to grok but manageable by beginner to medium expertise level), "server/api", and "server" labels May 10, 2024
