Adds Health gRPC Server and Refactors Main() #148
/lgtm
pkg/ext-proc/backend/datastore.go
Outdated
	}
	ready = true
	return false
})
At startup, I think we want to ensure that the extension did a sync with the API server and fetched the models, but not make readiness conditional on at least one model being defined.
The health probe now uses a client to check the API server for the configured InferencePool and that at least one InferenceModel exists in the same namespace. Should this probe also check that at least one InferenceModel references the configured InferencePool?
I don't think that the health check needs to block on at least one InferenceModel. On the other hand, since the extension is currently 1:1 with an InferencePool, I think it makes sense to ensure that the extension successfully initialized the assigned InferencePool.
New changes are detected. LGTM label has been removed.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: danehans The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
- Introduced a health gRPC server to handle liveness and readiness probes.
- Refactored main() to manage server goroutines.
- Added graceful shutdown for servers and controller manager.
- Improved logging consistency and ensured.
- Validates CLI flags.

Signed-off-by: Daneyon Hansen <[email protected]>
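For readers unfamiliar with the "Validates CLI flags" item, a minimal sketch of what such validation could look like is below. The flag names (poolName, grpcPort, grpcHealthPort), their defaults, and the specific rules are assumptions made for illustration; they are not taken from this PR.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// options groups the flag values. All field and flag names here are
// hypothetical and chosen only for this sketch.
type options struct {
	poolName       string
	grpcPort       int
	grpcHealthPort int
}

// validPort reports whether p is a usable TCP port number.
func validPort(p int) bool { return p >= 1 && p <= 65535 }

// validate returns the first problem found, or nil if the options are usable.
func validate(o options) error {
	if o.poolName == "" {
		return errors.New("poolName must be set")
	}
	if !validPort(o.grpcPort) {
		return fmt.Errorf("grpcPort %d is outside 1-65535", o.grpcPort)
	}
	if !validPort(o.grpcHealthPort) {
		return fmt.Errorf("grpcHealthPort %d is outside 1-65535", o.grpcHealthPort)
	}
	if o.grpcPort == o.grpcHealthPort {
		return fmt.Errorf("grpcPort and grpcHealthPort must differ, both are %d", o.grpcPort)
	}
	return nil
}

func main() {
	var o options
	// Defaults are placeholders so the sketch runs without arguments.
	flag.StringVar(&o.poolName, "poolName", "example-pool", "name of the InferencePool (hypothetical flag)")
	flag.IntVar(&o.grpcPort, "grpcPort", 9002, "ext-proc gRPC port (hypothetical flag)")
	flag.IntVar(&o.grpcHealthPort, "grpcHealthPort", 9003, "health gRPC port (hypothetical flag)")
	flag.Parse()

	// Fail fast, before any server goroutine is started.
	if err := validate(o); err != nil {
		fmt.Fprintln(os.Stderr, "invalid flags:", err)
		os.Exit(1)
	}
	fmt.Println("flags OK")
}
```

Validating before any goroutine starts keeps a misconfiguration from surfacing later as a partially started process.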
// Ensure at least 1 InferenceModel
if len(modelList.Items) == 0 {
	return fmt.Errorf("no InferenceModels exist in namespace %s", *poolNamespace)
}
I am not sure this is necessary.
	*targetPodHeader,
)

// Wait for first error from any goroutine
or the controller manager returning gracefully
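To illustrate the point, here is a rough sketch of waiting for the first goroutine to finish, whether it returns an error or nil (a graceful return). It uses only the standard library. The runAll helper and the two example tasks are hypothetical stand-ins, not the code in this PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// runAll starts every task in its own goroutine and blocks until the first
// one returns. A nil result models a graceful return (for example, a manager
// that stopped cleanly). It then cancels the shared context and waits for
// the remaining tasks, so tasks are expected to honor ctx.Done().
func runAll(ctx context.Context, tasks ...func(context.Context) error) error {
	if len(tasks) == 0 {
		return nil
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	// Buffered so no goroutine blocks on send.
	results := make(chan error, len(tasks))
	for _, task := range tasks {
		go func(run func(context.Context) error) {
			results <- run(ctx)
		}(task)
	}

	first := <-results // first error OR first graceful return
	cancel()           // ask everything else to stop
	for i := 1; i < len(tasks); i++ {
		<-results // wait for the rest to exit
	}
	return first
}

// demo runs one task that fails immediately and one that runs until it is
// cancelled, and returns what runAll reports.
func demo() error {
	failing := func(ctx context.Context) error {
		return errors.New("ext-proc server failed")
	}
	untilCancelled := func(ctx context.Context) error {
		<-ctx.Done() // stands in for a long-running server
		return nil
	}
	return runAll(context.Background(), failing, untilCancelled)
}

func main() {
	fmt.Println("first result:", demo()) // prints "first result: ext-proc server failed"
}
```

Treating a nil return the same as an error means the process also shuts down the other servers when one component stops on its own.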
	client.Client
}

func (s *healthServer) Check(ctx context.Context, in *healthPb.HealthCheckRequest) (*healthPb.HealthCheckResponse, error) {
I think this can check that the datastore populated the inference pool instead of actually doing a pull from the server?
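A minimal sketch of that suggestion follows: the health check reads in-memory state instead of calling the API server. To stay standard-library only, the gRPC health types are replaced with a local servingStatus stand-in. The datastore shape and the setPool and hasPool methods are assumptions for illustration, not the actual types in pkg/ext-proc/backend.

```go
package main

import (
	"fmt"
	"sync"
)

// servingStatus is a local stand-in for the gRPC health-checking status
// enum, used so this sketch has no external dependencies.
type servingStatus string

const (
	serving    servingStatus = "SERVING"
	notServing servingStatus = "NOT_SERVING"
)

// datastore is a hypothetical, simplified cache that a controller would
// fill once it has reconciled the assigned InferencePool.
type datastore struct {
	mu       sync.RWMutex
	poolName string // empty until the pool has been reconciled
}

// setPool records that the pool has been reconciled.
func (d *datastore) setPool(name string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.poolName = name
}

// hasPool reports whether the datastore has been populated with a pool.
func (d *datastore) hasPool() bool {
	d.mu.RLock()
	defer d.mu.RUnlock()
	return d.poolName != ""
}

// healthServer answers probes from the datastore alone, with no API call.
type healthServer struct {
	ds *datastore
}

// check reports serving only after the pool has been initialized.
func (s *healthServer) check() servingStatus {
	if s.ds.hasPool() {
		return serving
	}
	return notServing
}

func main() {
	ds := &datastore{}
	hs := &healthServer{ds: ds}
	fmt.Println(hs.check()) // prints "NOT_SERVING"
	ds.setPool("example-pool")
	fmt.Println(hs.check()) // prints "SERVING"
}
```

Reading from the datastore keeps probes cheap and avoids tying probe results to API server availability, at the cost of depending on the controller to keep the cached state current.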
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Adds a health gRPC server and refactors main() for better lifecycle management:
Fixes #96
Fixes #175