-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InferencePool config proposal for API review #162
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This is intended to be post-v0.1, correct? |
api/v1alpha1/inferencepool_types.go
Outdated
} | ||
|
||
// ExtensionRef is a reference to the extension deployment. | ||
type ExtensionRef struct { | ||
// A selector for the pods that run the deployment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Need to support extensions possibly running outside the cluster
- Need to support "localhost" to run extensions as a sidecar
api/v1alpha1/inferencepool_types.go
Outdated
// A selector for the pods that run the deployment. | ||
// | ||
// +kubebuilder:validation:Required | ||
Selector map[string]string `json:"selector"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assuming sidecar here, is sidecar in the gateway pod, if it looks like a backendRef
, you could route back to the Gateway Service and have the sidecar listen on a different port, and other implementations like Envoy Gateway could use the custom Backend
Resource to backendRef
to route via UDS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, sidecar on the gateway pod, that's a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so we should be good with a service reference here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahg-g something that looks like https://github.com/kubernetes-sigs/gateway-api/blob/50cf145b0dd1849ebd03c5866a92797862f23efe/apis/v1/object_reference_types.go#L96 should solve the Service
case and also allow for other custom backend types
@ahg-g: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Proposal at https://docs.google.com/document/d/1RKVFAgerUoIMhClU9gRduXoF8Hq2tzzVOHdPvkQPKgs/edit?tab=t.0
/hold