xds: introduce generic xds clients xDS and LRS Client API signatures #8042

Open: wants to merge 14 commits into base: master

Contributor

@purnesh42H purnesh42H commented Jan 27, 2025

This is the next step toward generic xDS clients that are usable outside of gRPC, building on #8024. This change adds the user-facing API signatures (without implementation) for the xDS and LRS clients that communicate with an xDS management server.

POC
Internal Design

RELEASE NOTES: None

@purnesh42H purnesh42H changed the title Generic xds client 2 interface xds: introduce xDS and LRS Client API signatures Jan 27, 2025
@purnesh42H purnesh42H requested review from dfawley and easwars January 27, 2025 18:50
@purnesh42H purnesh42H changed the title xds: introduce xDS and LRS Client API signatures xds: introduce generic xds client xDS and LRS Client API signatures Jan 27, 2025

codecov bot commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 0% with 16 lines in your changes missing coverage. Please review.

Project coverage is 82.01%. Comparing base (e95a4b7) to head (e0a5652).

Files with missing lines                        Patch %   Lines
xds/internal/clients/xdsclient/client.go        0.00%     8 Missing ⚠️
xds/internal/clients/lrsclient/client.go        0.00%     4 Missing ⚠️
xds/internal/clients/lrsclient/load_store.go    0.00%     4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8042      +/-   ##
==========================================
- Coverage   82.15%   82.01%   -0.14%     
==========================================
  Files         387      390       +3     
  Lines       39067    39081      +14     
==========================================
- Hits        32094    32051      -43     
- Misses       5643     5688      +45     
- Partials     1330     1342      +12     
Files with missing lines                        Coverage Δ
xds/internal/clients/config.go                  97.61% <ø> (+4.43%) ⬆️
xds/internal/clients/lrsclient/client.go        0.00% <0.00%> (ø)
xds/internal/clients/lrsclient/load_store.go    0.00% <0.00%> (ø)
xds/internal/clients/xdsclient/client.go        0.00% <0.00%> (ø)

... and 14 files with indirect coverage changes

@purnesh42H purnesh42H changed the title xds: introduce generic xds client xDS and LRS Client API signatures xds: introduce generic xds clients xDS and LRS Client API signatures Jan 27, 2025
@purnesh42H purnesh42H added Area: xDS Includes everything xDS related, including LB policies used with xDS. Type: Feature New features or improvements in behavior labels Jan 27, 2025
@purnesh42H purnesh42H added this to the 1.71 Release milestone Jan 27, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch 3 times, most recently from 042b236 to f8f7af5 Compare January 30, 2025 13:56
v3statuspb "github.com/envoyproxy/go-control-plane/envoy/service/status/v3"
)

// XDSClient is a full fledged client which queries a set of discovery APIs
Member

"full fledged" -- please remove words that don't help with understanding what it does. It's just "a client which" ...

Contributor Author

Done

*
*/

// Package xdsclient provides implementation of the xDS client for enabling
Member

"an implementation"

Contributor Author

Done

*/

// Package xdsclient provides implementation of the xDS client for enabling
// applications to communicate with xDS management servers.
Member

"enabling applications to" is also unnecessarily wordy.

Compare to the http package's documentation:

// Package http provides HTTP client and server implementations.

Contributor Author

Done

// applications to communicate with xDS management servers.
//
// It allows applications to:
// - Create xDS client instance with in-memory configurations.
Member

*instanceS

Contributor Author

Done

// It allows applications to:
// - Create xDS client instance with in-memory configurations.
// - Register watches for named resources.
// - Receive resources via the ADS (Aggregated Discovery Service) stream.
Member

the->an

Contributor Author

Done

// During a race (e.g. an xDS response is received while the user is calling
// cancel()), there's a small window where the callback can be called after
// the watcher is canceled. Callers need to handle this case.
func (c *XDSClient) WatchResource(_ string, _ string, _ ResourceWatcher) (cancel func()) {
Member

Better to fill in the names of the parameters otherwise this is not so helpful. Is vet hard-failing? Would changing to panic("not implemented") in the body of the function help?

Contributor Author

@purnesh42H purnesh42H Feb 3, 2025

Putting //revive:disable:unused-parameter at the top of the file seems to do the trick, though I have changed all method bodies to `panic("unimplemented")`.
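As a minimal sketch of the stub style discussed in this thread: the parameter names (`typeURL`, `resourceName`, `watcher`) and their order are assumptions, since the diff only shows blank identifiers, and `ResourceWatcher` is reduced to an empty placeholder interface. The `callStub` helper is hypothetical and exists only to show what a caller observes.

```go
package main

import "fmt"

// ResourceWatcher is a placeholder for the watcher interface under review.
type ResourceWatcher interface{}

// XDSClient is a stand-in for the client type whose methods are stubbed out.
type XDSClient struct{}

// WatchResource names every parameter so the signature documents itself, and
// panics until the implementation lands. The names are assumptions.
func (c *XDSClient) WatchResource(typeURL, resourceName string, watcher ResourceWatcher) (cancel func()) {
	panic("unimplemented")
}

// callStub reports the value the stub panics with, or nil if it returns.
func callStub() (recovered any) {
	defer func() { recovered = recover() }()
	(&XDSClient{}).WatchResource("type.googleapis.com/envoy.config.listener.v3.Listener", "listener-a", nil)
	return nil
}

func main() {
	fmt.Println(callStub())
}
```

Naming the parameters keeps the godoc readable, while the panic makes an accidental call to an unimplemented method fail loudly instead of returning a nil cancel function.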

}

// WatchResource uses xDS to discover the resource associated with the provided
// resource name. The resource type url look up the resource type
Member

"The resource type url look up the" -- this isn't parsing for me.

Also please capitalize URL.

Contributor Author

Modified the language. Let me know if it looks better.

Comment on lines 55 to 57
// During a race (e.g. an xDS response is received while the user is calling
// cancel()), there's a small window where the callback can be called after
// the watcher is canceled. Callers need to handle this case.
Member

This isn't even worth mentioning. It should be obvious that before this function returns, it may not have finished doing what it was called to do.

Contributor Author

oh ok. removed

@@ -0,0 +1,207 @@
/*
*
* Copyright 2024 gRPC authors.
Member

Let's make this 2025 in all these new files?

Contributor Author

Done everywhere.

*
*/

// Package lrsclient provides implementation of the LRS client for enabling
Member

*an implementation.

Please take another pass of the whole PR and apply the previous comments throughout.

Contributor Author

Yes, I did another pass, applied the comments called out above, and improved details in some places. PTAL.

@dfawley dfawley assigned purnesh42H and unassigned easwars and dfawley Jan 30, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from f8f7af5 to 415b2ca Compare January 31, 2025 16:14
@purnesh42H purnesh42H requested a review from dfawley February 3, 2025 07:56
@purnesh42H purnesh42H assigned dfawley and unassigned purnesh42H Feb 3, 2025
@easwars easwars self-assigned this Feb 3, 2025
Contributor Author

purnesh42H commented Feb 4, 2025

@dfawley I did the godoc review and made the following changes (see the last commit):

  • Removed the helper methods on structs that are not being used, to avoid exporting them. We can add them later during implementation.
  • Removed square brackets from docstrings, because the actual params and struct fields already link to the respective object definitions.

}

// NewConfig returns a new xDS client config with provided parameters.
func NewConfig(servers []clients.ServerConfig, authorities map[string]clients.Authority, node clients.Node, transport clients.TransportBuilder, resourceTypes map[string]ResourceType) Config {
Member

It's unusual to have a constructor and a struct with exported fields.

Generally it's unusual to have a constructor for a config struct. I hope we won't need this at all. Let's remove it for now.

Contributor Author

Removed for now. The only use case I see is e2e tests, but we can always do all of that while creating a new client or from the test itself.
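A rough sketch of what this looks like in practice: with exported fields, a config can be built with a struct literal, and any convenience constructor can live next to the tests. The types below are simplified stand-ins, and their field sets (`ServerURI`, `ID`) as well as the helper `newTestConfig` are assumptions made for illustration, not the package's API.

```go
package main

import "fmt"

// ServerConfig, Node and Config are simplified stand-ins for the types under
// review; the field sets here are assumptions for illustration only.
type ServerConfig struct{ ServerURI string }

type Node struct{ ID string }

type Config struct {
	Servers []ServerConfig
	Node    Node
}

// newTestConfig is a hypothetical test-local helper; it lives next to the
// tests rather than in the package's exported API.
func newTestConfig(uri string) Config {
	return Config{
		Servers: []ServerConfig{{ServerURI: uri}},
		Node:    Node{ID: "test-node"},
	}
}

func main() {
	cfg := newTestConfig("xds.example.com:443")
	fmt.Println(cfg.Servers[0].ServerURI, cfg.Node.ID)
}
```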

// be used only for old-style names without an authority.
Servers []clients.ServerConfig

// Authorities is a map of authority names to authority configurations.
Member

What is an authority? How would the user find out? Is there a doc we can link?

Contributor Author

@purnesh42H purnesh42H Feb 7, 2025

I have added the link to Envoy's authority proto in the common config, where the Authority struct is defined. It can be reached from here in godoc.

Comment on lines 51 to 52
// TransportBuilder is the implementation to create a communication channel
// to an xDS management server.
Member

Suggested change
- // TransportBuilder is the implementation to create a communication channel
- // to an xDS management server.
+ // TransportBuilder is used to connect to the management server.

Contributor Author

Done

Comment on lines 62 to 63
// Below values will have default values but can be overridden for testing

Member

Suggested change
- // Below values will have default values but can be overridden for testing

Contributor Author

Removed the unexported fields intended for testing for now

Comment on lines 21 to 26
// OnCallbackProcessed is a function to be invoked by resource watcher
// implementations upon completing the processing of a callback from the xDS
// client. Failure to invoke this callback prevents the xDS client from reading
// further messages from the xDS server.
type OnCallbackProcessed func()

Member

@easwars WDYT about deleting this as a type? The ResourceWatcher interface can just have a func() and name the parameter, and it can be documented on ResourceWatcher instead. Otherwise these types end up split in the godoc view and we have to write a lot to connect them and it adds to the total complexity of the package.

Contributor

I don't mind getting rid of this type.

Contributor Author

Done. Added documentation on the ResourceWatcher interface
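To illustrate the change discussed in this thread, here is a minimal sketch of a watcher interface that takes a plain `done func()` parameter instead of a named callback type, together with a toy delivery loop that waits for `done` before reading the next message. The update type is simplified to a string, and `recordingWatcher` and `deliver` are hypothetical names; this only mimics the documented flow control and is not the client's implementation.

```go
package main

import "fmt"

// ResourceWatcher receives updates together with a done callback, replacing a
// separate named callback type. The update type is simplified to a string.
type ResourceWatcher interface {
	// ResourceChanged delivers an update. The watcher must call done once it
	// has finished processing; until then no further messages are read.
	ResourceChanged(update string, done func())
}

// recordingWatcher stores updates and acknowledges each one immediately.
type recordingWatcher struct{ seen []string }

func (w *recordingWatcher) ResourceChanged(update string, done func()) {
	w.seen = append(w.seen, update)
	done()
}

// deliver hands each message to the watcher and waits for done before moving
// on to the next one, mimicking the flow control described above.
func deliver(messages []string, w ResourceWatcher) int {
	delivered := 0
	for _, m := range messages {
		processed := make(chan struct{})
		w.ResourceChanged(m, func() { close(processed) })
		<-processed // blocks here if the watcher never calls done
		delivered++
	}
	return delivered
}

func main() {
	w := &recordingWatcher{}
	fmt.Println(deliver([]string{"v1", "v2"}, w), w.seen)
}
```

Documenting the `done` parameter on the interface method keeps the contract in one place in godoc, which is the motivation given for dropping the separate type.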

Comment on lines 38 to 39
// Servers specifies a list of xDS servers to connect to. This field should
// be used only for old-style names without an authority.
Member

Let's leave "old" and "style" out of the docstrings and describe what it is or what it is used for in specific terms instead.

Contributor Author

I have modified the documentation. I have also added a note, both in Authority and here, that the order of servers in the list determines precedence. Let me know if that's okay. I remember there was a question recently about how the fallback servers are chosen. Should we mention the gRFC as well?

Comment on lines 31 to 32
// TransportBuilder is the implementation to create a communication channel
// to an xDS management server.
Member

Suggested change
- // TransportBuilder is the implementation to create a communication channel
- // to an xDS management server.
+ // TransportBuilder is used to connect to the LRS server.

Contributor Author

Done

// Config provides parameters for configuring the LRS client.
type Config struct {
// Node is the identity of the client application reporting load to the
// xDS management server.
Member

Suggested change
- // xDS management server.
+ // LRS server.

Contributor Author

Done

TypeURL() string

// TypeName identifies resources in a transport protocol agnostic way. This
// can be used for logging/debugging purposes, as well in cases where the
Member

Suggested change
- // can be used for logging/debugging purposes, as well in cases where the
+ // can be used for logging/debugging purposes, as well as in cases where the

Contributor Author

Done

Comment on lines 37 to 38
// resource type name is to be uniquely identified but the actual
// functionality provided by the resource type is not required.
Member

@dfawley dfawley Feb 6, 2025

I'm not understanding this comment.

But..should a ResourceType actually be a struct and not an interface? It has 3 things that are really just settings.

Maybe

type ResourceType struct {
	Name string
	TypeURL string
	// Incremental specifies that this resource is incremental, not state of the world.
	// (link to https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#variants-of-the-xds-transport-protocol)
	Incremental bool
	Decoder Decoder
}

type Decoder interface {
	Decode(resource any, options DecodeOptions) (*DecodeResult, error)
}

@easwars

Contributor

Internally, our resource type implementations do have a struct which provides these three settings and really the decode functionality is the only moving piece.

What advantages do you see with a struct over the interface?

I know AllResourcesRequiredInSotW is not a great name, but Incremental might be confusing as well since we only support the SotW for all resources.

Member

> What advantages do you see with a struct over the interface?

It's not very idiomatic to accept an interface with a bunch of methods that return constant setting values... I know I've never seen anything quite like that before, but maybe you have.

> Incremental might be confusing as well since we only support the SotW for all resources

Sorry, I think I misinterpreted the setting. This is also the name that C++ uses, so we can leave that alone.

Contributor Author

Changed to the struct with Decoder interface
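A hedged sketch of the shape this thread converges on: settings become plain struct fields and only decoding stays behind an interface. The `Decode` signature is taken from the suggestion above; the field name `AllResourcesRequiredInSotW` follows the later part of the discussion, and the contents of `DecodeOptions` and `DecodeResult`, the toy `stringDecoder`, and the example type URL are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// DecodeOptions and DecodeResult are placeholders; their fields are assumptions.
type DecodeOptions struct{}

type DecodeResult struct{ Name string }

// Decoder is the only behavioral piece of a resource type.
type Decoder interface {
	Decode(resource any, options DecodeOptions) (*DecodeResult, error)
}

// ResourceType carries plain settings as fields plus a Decoder.
type ResourceType struct {
	Name                       string
	TypeURL                    string
	AllResourcesRequiredInSotW bool
	Decoder                    Decoder
}

// stringDecoder is a toy Decoder that accepts only non-empty strings.
type stringDecoder struct{}

func (stringDecoder) Decode(resource any, _ DecodeOptions) (*DecodeResult, error) {
	s, ok := resource.(string)
	if !ok || s == "" {
		return nil, errors.New("resource is not a non-empty string")
	}
	return &DecodeResult{Name: s}, nil
}

func main() {
	rt := ResourceType{
		Name:    "ExampleResource",
		TypeURL: "type.example.com/ExampleResource",
		Decoder: stringDecoder{},
	}
	res, err := rt.Decoder.Decode("listener-a", DecodeOptions{})
	fmt.Println(res.Name, err)
}
```

Compared with an interface whose methods return constants, the struct form lets callers read and set the settings directly and keeps the interface surface down to the one method that varies per resource type.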

@dfawley dfawley assigned purnesh42H and unassigned dfawley Feb 6, 2025
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from 20e26f8 to c3e96f7 Compare February 7, 2025 10:14
@purnesh42H purnesh42H removed their assignment Feb 7, 2025
}

// Authority contains configuration for an xDS control plane authority.
// [authority]: https://www.envoyproxy.io/docs/envoy/latest/xds/core/v3/authority.proto
Contributor

I'm not sure if we should link to this type. I'm not even sure if our authority corresponds to this type. I have never seen this proto before.

Contributor Author

Linked boo. Yeah, the authority proto doesn't have much info. I have linked the LRS service proto in the LRS client documentation, though.

type Authority struct {
// XDSServers contains the list of server configurations for this authority.
// xDS client use the first available server from the list. To ensure high
// availability, list the most reliable server first.
Contributor

The order of the servers does not affect "high availability" in any way. The order of servers reflects the order of preference of the data returned by those servers. See: https://github.com/grpc/proposal/blob/master/A71-xds-fallback.md#reservations-about-using-the-fallback-server-data

Also, maybe this file could be useful in terms of what to include in our docstrings: https://github.com/grpc/grpc/blob/master/doc/grpc_xds_bootstrap_format.md

Contributor Author

Changed to mention that. Thanks for correcting.
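As a simplified illustration of "order of preference" (and explicitly not the fallback algorithm specified in gRFC A71), the sketch below walks the list in order and returns the first server that a caller-supplied check reports as reachable. The function `pickServer`, the `reachable` callback, and the host names are hypothetical.

```go
package main

import "fmt"

// pickServer returns the first server, in list order, that is currently
// reachable. List order encodes preference for that server's data, not
// reliability.
func pickServer(servers []string, reachable func(string) bool) (string, bool) {
	for _, s := range servers {
		if reachable(s) {
			return s, true
		}
	}
	return "", false
}

func main() {
	servers := []string{"primary.example.com", "fallback.example.com"}
	down := map[string]bool{"primary.example.com": true}
	s, ok := pickServer(servers, func(name string) bool { return !down[name] })
	fmt.Println(s, ok)
}
```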

type LRSClient struct {
}

// ReportLoad creates a new load reporting stream for the client.
Contributor

We are saying that it creates a new load reporting stream, but it returns a LoadStore. How are they connected?

Contributor Author

@purnesh42H purnesh42H Feb 10, 2025

I had described the connection on the LoadStore struct. I moved that explanation here, noting that ReportLoad returns a LoadStore.

"google.golang.org/grpc/xds/internal/clients"
)

// A Config structure is used to configure an LRS client. After one has been
Contributor

Go comments begin with the name of the symbol they document. And please skip "structure".

What is LRS function here? This comment is quite confusing as it stands now.

Contributor Author

This format comes from Doug's suggestion to follow the same style as TLS: #8042 (comment)


// LoadStore keeps the loads for multiple clusters and services to be reported
// via LRS. It contains loads to report to one LRS server. It creates
// multiple stores for multiple servers.
Contributor

These two sentences are contradictory:

  • It contains loads to report to one LRS server.
  • It creates multiple stores for multiple servers

Contributor Author

It was a mistake. It means the LoadStore struct keeps the load of only one server. For multiple servers, the caller should create multiple stores. Modified the docstring.
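A small sketch of the corrected meaning, assuming a heavily simplified `LoadStore`: each store holds load for exactly one LRS server, and a caller that reports to several servers keeps one store per server. The counter layout, the `CallStarted` method, and the server names are assumptions for illustration only.

```go
package main

import "fmt"

// LoadStore accumulates load for a single LRS server. The counter layout is an
// assumption made for this sketch.
type LoadStore struct {
	callsStarted map[string]uint64 // keyed by cluster name
}

func newLoadStore() *LoadStore { return &LoadStore{callsStarted: make(map[string]uint64)} }

// CallStarted records one started call for the given cluster.
func (s *LoadStore) CallStarted(cluster string) { s.callsStarted[cluster]++ }

func main() {
	// One store per LRS server: the caller owns the mapping.
	stores := map[string]*LoadStore{
		"lrs-a.example.com": newLoadStore(),
		"lrs-b.example.com": newLoadStore(),
	}
	stores["lrs-a.example.com"].CallStarted("cluster-1")
	stores["lrs-a.example.com"].CallStarted("cluster-1")
	stores["lrs-b.example.com"].CallStarted("cluster-1")
	fmt.Println(stores["lrs-a.example.com"].callsStarted["cluster-1"],
		stores["lrs-b.example.com"].callsStarted["cluster-1"])
}
```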

type Config struct {
// Servers specifies a list of xDS management servers to connect to,
// including fallbacks. xDS client use the first available server from the
// list. To ensure high availability, list the most reliable server first.
Contributor

"including fallbacks" doesn't provide much information. We need to be clear that this is an ordered list.

Same here about "To ensure high availability, list the most reliable server first."

Contributor Author

Removed "fallbacks" and modified as suggested in the comment above.

// AmbientError indicates an error occurred while trying to fetch or decode
// the associated resource. The previous version of the resource should still
// be considered valid.
AmbientError(err error, done func())
Contributor

Both methods need to be consistent with the type and name for the callback.

Contributor Author

Corrected.

ResourceChanged(ResourceDataOrError, OnCallbackProcessed)

// AmbientError indicates an error occurred while trying to fetch or decode
// the associated resource. The previous version of the resource should still
Contributor

This should also indicate that the watcher may use this error message for better debuggability as mentioned in A88.

Contributor Author

// AmbientError indicates an error occurred after a resource has been
// received that should not modify the use of that resource but may be
// useful information about the ambient state of the XdsClient

I used this wording, which is the language from the gRFC.
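To make the distinction concrete, here is a minimal sketch of a watcher that keeps using the last good resource when an ambient error arrives and records the error only for debugging. Both methods take the same `done func()` parameter, as requested earlier in the review. The update type is simplified to a string, `cachingWatcher` and `errStream` are hypothetical, and clearing the stored error on a new update is a design choice of this sketch rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// errStream is a sample ambient error used by this sketch.
var errStream = errors.New("stream broken")

// cachingWatcher keeps the last good resource and the last ambient error.
type cachingWatcher struct {
	current string
	lastErr error
}

// ResourceChanged replaces the cached resource and clears any stale error
// (the clearing is an assumption of this sketch).
func (w *cachingWatcher) ResourceChanged(update string, done func()) {
	defer done()
	w.current = update
	w.lastErr = nil
}

// AmbientError records the error for debugging but leaves the cached resource
// untouched, since the previously received resource remains in use.
func (w *cachingWatcher) AmbientError(err error, done func()) {
	defer done()
	w.lastErr = err
}

func main() {
	w := &cachingWatcher{}
	noop := func() {}
	w.ResourceChanged("route-config-v1", noop)
	w.AmbientError(errStream, noop)
	fmt.Println(w.current, w.lastErr)
}
```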

@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from d50c656 to def35ab Compare February 10, 2025 17:40
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from def35ab to fef4e3d Compare February 10, 2025 17:55
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch 3 times, most recently from a5aefef to e134ea6 Compare February 11, 2025 11:26
@purnesh42H purnesh42H force-pushed the generic-xds-client-2-interface branch from e134ea6 to e0a5652 Compare February 11, 2025 11:29
5 participants