-
Notifications
You must be signed in to change notification settings - Fork 11.9k
server: feature Add Admin key parameter for slots/health/metrics #5837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Can we update the readme so that it indicates which endpoints require which kind of key? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recall that slots data can be turned off in health endoints using --slots-endpoint-disable
.
OK to add admin keys on health but please implement a test scenario in security.feature.
@@ -2842,7 +2888,10 @@ int main(int argc, char **argv) | |||
} | |||
|
|||
if (sparams.metrics_endpoint) { | |||
svr.Get("/metrics", [&](const httplib::Request&, httplib::Response& res) { | |||
svr.Get("/metrics", [&](const httplib::Request& req, httplib::Response& res) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/metrics must not be protected. It does not contain data and it targets prometheus which does not support authentication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the purpose of the key
query param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you link to a security protocol this query param implements ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean. This is just passing in the same api/admin key in through ?key=
instead of the authorization header for ease in configuration. It's the same as the authorization header otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kindly, I meant where have you seen we can pass secret as query param in a protocol ? I am wondering if it is a security issue.
For example, in oauth2 implicit grant flow, this has been deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any form of authentication via a URL Query Param is considered bad practice and is inherently unsafe, even when using TLS/SSL.
There's always a "gotcha" somewhere that exposes the authentication method. This is why it's usually done with a bearer token or jwt, packaged as a header or body payload, and then passed via an encrypted tunnel.
You can find this all on OWASP and within the RFC Specs. This has to do with underlying mechanics of how GET and POST requests work.
For a real authy method, you would use POST and not expose the Auth tokens via Query Params.
Edit: It took me a bit to find it. They updated it since I last read it.
TLDR; Query parameters enable injection attacks.
There's way more in-depth stuff that exploits query parameters. It's beginner security stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question was the opposite. We all know that this is not security here. That's why this PR is open... There is no need for beginners' explanations
All these security features are quite useless: users are sharing keys, API Key must work with an IDP/API Gateway. IMHO the server must not be exposed directly to users in a real multi users setup but be protected behind a reverse proxy. |
Agreed that this is likely the wrong long-term approach. In my case, it is running behind a naive reverse proxy with no request blocking/filtering. Given the presence of the api keys in server.cpp, this seemed like the most direct approach to allow two roles so that slots inspection could be preserved and protected. |
} | ||
|
||
// Check for API key in the params | ||
auto auth_param = req.get_param_value("key"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query params are in clear on all http traffic scanners. I feel this hack is a security breach.
Monitoring endpoints must simply not be exposed to the world, or at least slots data simply can be disabled. They are here for debug purpose during development only.
@ggerganov I would prefer we invest on a real security protocol like Oauth2/openid than adding this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but authorization headers are also passed in cleartext without https. Given the nature of the api keys as they are at the moment this is no more or less secure than the current solution.
Long term, specifying keys on the command line and passing them directly is not the correct solution.
This is just a quick way to get some level of security on those endpoints in the server as it exists now until a proper (jwt would make sense to me) solution is in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am speaking about logging. Authorization header is never logged. Query params are, example:
But all http monitoring tool will do the same.
Note: JWT is not a security protocol, just a transparent token often used in oauth2.
@robeyh Thank you for the contribution. However, I recommend that we focus on this sort of functionality at a later stage. In hindsight, the API key stuff was also not a great idea to add in the first place, but I thought it could serve as a basic demonstration. The problem is that people start to use Hopefully, we will soon get to a state where we can implement such kind of features since they are important for real-world applications, but in the short-term, I would consider these items with low priority. Hope this makes sense
@phymbert Sure. We can add the "demo" label to this PR to serve as an example and implement this feature in the future |
Makes perfect sense to me. I just threw this together as it solved a problem I was having in prototyping. Wanted to offer it if it was useful to others to be able to isolate the slots endpoint. I imagine that future development will be a more robust solution, but feel free to ping me if you'd like me to clean this code up at a later date. |
@@ -2060,6 +2061,7 @@ static void server_print_usage(const char *argv0, const gpt_params ¶ms, | |||
printf(" --host ip address to listen (default (default: %s)\n", sparams.hostname.c_str()); | |||
printf(" --port PORT port to listen (default (default: %d)\n", sparams.port); | |||
printf(" --path PUBLIC_PATH path from which to serve static files (default %s)\n", sparams.public_path.c_str()); | |||
printf(" --admin-key ADMIN_KEY optional admin key to enhance server security. If set, requests to admin endpoints must include this key.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OWASP recommends "denying by default". It's a PITA, but the flag should be the inverse. If you want to toggle security measures, they should be on by default and the flag should be set to disable them for whatever purposes, e.g. testing, local usage, etc. Not sure how it should be handled in llama.cpp or local usage, but this matters in production.
The API key solely protects the completion/infill endpoints. Adding an admin key that protects the slots/metrics endpoints is important when these endpoints are turned on. Without this, anyone can hit the slots endpoint and snoop on the prompts that other users are submitting.
This is quick, mostly replicating the api_key feature. It shifts to a generalized
validate_key
function that takes in a vector of strings to check against. The new validation function checks the headers, but then also the query params for a key (namedkey
in all cases).