Skip to content

server: feature Add Admin key parameter for slots/health/metrics #5837

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

robeyh
Copy link
Contributor

@robeyh robeyh commented Mar 2, 2024

The API key solely protects the completion/infill endpoints. Adding an admin key that protects the slots/metrics endpoints is important when these endpoints are turned on. Without this, anyone can hit the slots endpoint and snoop on the prompts that other users are submitting.

This is quick, mostly replicating the api_key feature. It shifts to a generalized validate_key function that takes in a vector of strings to check against. The new validation function checks the headers, but then also the query params for a key (named key in all cases).

@robeyh robeyh changed the title server: feature Add Admin key parameter for slots/health/metric server: feature Add Admin key parameter for slots/health/metrics Mar 2, 2024
@Artefact2
Copy link
Collaborator

Can we update the readme so that it indicates which endpoints require which kind of key?

Copy link
Collaborator

@phymbert phymbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recall that slots data can be turned off in health endoints using --slots-endpoint-disable.

OK to add admin keys on health but please implement a test scenario in security.feature.

@@ -2842,7 +2888,10 @@ int main(int argc, char **argv)
}

if (sparams.metrics_endpoint) {
svr.Get("/metrics", [&](const httplib::Request&, httplib::Response& res) {
svr.Get("/metrics", [&](const httplib::Request& req, httplib::Response& res) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/metrics must not be protected. It does not contain data and it targets prometheus which does not support authentication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the purpose of the key query param.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you link to a security protocol this query param implements ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean. This is just passing in the same api/admin key in through ?key= instead of the authorization header for ease in configuration. It's the same as the authorization header otherwise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kindly, I meant where have you seen we can pass secret as query param in a protocol ? I am wondering if it is a security issue.
For example, in oauth2 implicit grant flow, this has been deprecated.

Copy link
Contributor

@teleprint-me teleprint-me Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any form of authentication via a URL Query Param is considered bad practice and is inherently unsafe, even when using TLS/SSL.

There's always a "gotcha" somewhere that exposes the authentication method. This is why it's usually done with a bearer token or jwt, packaged as a header or body payload, and then passed via an encrypted tunnel.

You can find this all on OWASP and within the RFC Specs. This has to do with underlying mechanics of how GET and POST requests work.

For a real authy method, you would use POST and not expose the Auth tokens via Query Params.

Edit: It took me a bit to find it. They updated it since I last read it.

TLDR; Query parameters enable injection attacks.

There's way more in-depth stuff that exploits query parameters. It's beginner security stuff.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question was the opposite. We all know that this is not security here. That's why this PR is open... There is no need for beginners' explanations

@phymbert
Copy link
Collaborator

phymbert commented Mar 2, 2024

All these security features are quite useless: users are sharing keys, API Key must work with an IDP/API Gateway. IMHO the server must not be exposed directly to users in a real multi users setup but be protected behind a reverse proxy.

@robeyh
Copy link
Contributor Author

robeyh commented Mar 2, 2024

Agreed that this is likely the wrong long-term approach.

In my case, it is running behind a naive reverse proxy with no request blocking/filtering. Given the presence of the api keys in server.cpp, this seemed like the most direct approach to allow two roles so that slots inspection could be preserved and protected.

}

// Check for API key in the params
auto auth_param = req.get_param_value("key");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query params are in clear on all http traffic scanners. I feel this hack is a security breach.
Monitoring endpoints must simply not be exposed to the world, or at least slots data simply can be disabled. They are here for debug purpose during development only.
@ggerganov I would prefer we invest on a real security protocol like Oauth2/openid than adding this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but authorization headers are also passed in cleartext without https. Given the nature of the api keys as they are at the moment this is no more or less secure than the current solution.

Long term, specifying keys on the command line and passing them directly is not the correct solution.

This is just a quick way to get some level of security on those endpoints in the server as it exists now until a proper (jwt would make sense to me) solution is in place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am speaking about logging. Authorization header is never logged. Query params are, example:

https://github.com/ggerganov/llama.cpp/blob/9731134296af3a6839cd682e51d9c2109a871de5/examples/server/server.cpp#L2691-L2705

But all http monitoring tool will do the same.

Note: JWT is not a security protocol, just a transparent token often used in oauth2.

@ggerganov
Copy link
Member

@robeyh Thank you for the contribution. However, I recommend that we focus on this sort of functionality at a later stage. server needs to first become more robust and we need to fix all inference and performance related problems that we currently have, since it's primary purpose is to demonstrate usage of the llama library. Security mechanism are not a high concern atm

In hindsight, the API key stuff was also not a great idea to add in the first place, but I thought it could serve as a basic demonstration. The problem is that people start to use server as a full-fledged application and are having higher expectations for it's functionality. Security features like this one and the CORS headers are distracting the maintainers from the more pressing issues that need to be resolved first

Hopefully, we will soon get to a state where we can implement such kind of features since they are important for real-world applications, but in the short-term, I would consider these items with low priority. Hope this makes sense

@ggerganov I would prefer we invest on a real security protocol like Oauth2/openid than adding this.

@phymbert Sure. We can add the "demo" label to this PR to serve as an example and implement this feature in the future

@robeyh
Copy link
Contributor Author

robeyh commented Mar 3, 2024

Makes perfect sense to me. I just threw this together as it solved a problem I was having in prototyping. Wanted to offer it if it was useful to others to be able to isolate the slots endpoint.

I imagine that future development will be a more robust solution, but feel free to ping me if you'd like me to clean this code up at a later date.

@@ -2060,6 +2061,7 @@ static void server_print_usage(const char *argv0, const gpt_params &params,
printf(" --host ip address to listen (default (default: %s)\n", sparams.hostname.c_str());
printf(" --port PORT port to listen (default (default: %d)\n", sparams.port);
printf(" --path PUBLIC_PATH path from which to serve static files (default %s)\n", sparams.public_path.c_str());
printf(" --admin-key ADMIN_KEY optional admin key to enhance server security. If set, requests to admin endpoints must include this key.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OWASP recommends "denying by default". It's a PITA, but the flag should be the inverse. If you want to toggle security measures, they should be on by default and the flag should be set to disable them for whatever purposes, e.g. testing, local usage, etc. Not sure how it should be handled in llama.cpp or local usage, but this matters in production.

@ggerganov ggerganov added the demo Demonstrate some concept or idea, not intended to be merged label Mar 12, 2024
@mofosyne mofosyne added Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level server/api server labels May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
demo Demonstrate some concept or idea, not intended to be merged Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level server/api server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants