feat: Add implementation of new AvailableComponents message #340

mrsillydog · 2025-01-10T17:04:08Z

Implements the specification added to opamp-spec in open-telemetry/opamp-spec#201

codecov · 2025-01-10T17:39:52Z

Codecov Report

Attention: Patch coverage is 95.38462% with 3 lines in your changes missing coverage. Please review.

Project coverage is 78.29%. Comparing base (e6fac32) to head (53d8cd0).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
client/internal/clientcommon.go	92.30%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #340      +/-   ##
==========================================
+ Coverage   77.81%   78.29%   +0.47%     
==========================================
  Files          25       25              
  Lines        2321     2386      +65     
==========================================
+ Hits         1806     1868      +62     
- Misses        408      410       +2     
- Partials      107      108       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

tigrannajaryan · 2025-01-13T15:47:51Z

client/client.go

+	// If components.Hash is nil or an empty []byte, errNoAvailableComponentHash will be returned.
+	// This method is subject to agent status compression - if components is not
+	// different from the cached agent state, this method is a no-op.
+	SetAvailableComponents(components *protobufs.AvailableComponents) error


Should we add to the comment that ReportsAvailableComponents capability must be set in StartSettings otherwise this call with fail?

tigrannajaryan · 2025-01-13T15:49:34Z

client/clientimpl_test.go

+				prepareClient(t, &settings, client)
+
+				// Client --->
+				assert.NoError(t, client.Start(context.Background(), settings))


I am probably misunderstanding something but shouldn't this Start() fail for test case "apply AvailableComponents without required capability" because that test case does not set ReportsAvailableComponents capability? Why isn't it failing?

It's not failing because AvailableComponents is ignored during PrepareStart if the capability is not set - think this comment is along a similar vein to the one below.

tigrannajaryan · 2025-01-13T15:52:38Z

client/httpclient_test.go

+				})
+			} else {
+				// Verify that no second status report is delivered (polling is too infrequent for this to happen in 3 seconds)
+				require.Never(t, func() bool {


Does this block the test for 3 seconds? If it does, can we find a way to avoid waiting for so long? We don't want test durations to increase, especially if they are simply idling.

Reduced all blocking calls to assert/require.Never() to 50ms, including one other test besides the ones I added.

tigrannajaryan · 2025-01-13T15:55:54Z

client/internal/clientcommon.go

@@ -88,6 +89,16 @@ func (c *ClientCommon) PrepareStart(
 		return ErrHealthMissing
 	}

+	if c.Capabilities&protobufs.AgentCapabilities_AgentCapabilities_ReportsAvailableComponents != 0 {


Do we also need to check for the other related case, where ReportsAvailableComponents is 0 but AvailableComponents is not nil? Is that considered a valid situation?

I would consider that a valid situation - since the only real way to affect ReportsAvailableComponents and AvailableComponents before Start() is via the StartSettings, I think this case is covered by the StartSettings.AvailableComponents documentation:

// Defines the available components of the Agent. // Required if the ReportsAvailableComponents capability is set. // If the ReportsAvailableComponents capability is not set, this option has no effect.

I went with this option since there seemed to be precedent based on StartSettings.HeartbeatInterval -

// If the ReportsHeartbeat capability is disabled, this option has no effect. HeartbeatInterval *time.Duration

tigrannajaryan · 2025-01-13T15:59:41Z

client/client.go

@@ -134,4 +134,14 @@ type OpAMPClient interface {
 	// If no error is returned, the channel returned will be closed after the specified
 	// message is sent.
 	SendCustomMessage(message *protobufs.CustomMessage) (messageSendingChannel chan struct{}, err error)
+
+	// SetAvailableComponents modifies the set of components that are available for configuration


It looks like there is 2 ways to define AvailableComponents: 1) via StartSettings and 2) by calling SetAvailableComponents. Can we eliminate one of these and keep only one? If we need to allow re-defining AvailableComponents after Start (do we?) can we change SetAvailableComponents() to be allowed to be called before or after Start() and eliminate StartSettings.AvailableComponents?

See for example how SetAgentDescription() can be called (actually must be) before Start() and can be called after Start() too.

This was actually my original plan, but I encountered a problem which made me adjust my approach.

Basically, the defining difference between this and SetAgentDescription() is that SetAgentDescription is not contingent on an AgentCapability being enabled, whereas SetAvailableComponents is. Before Start(), our OpAmpClient has no idea what its agent capabilities are, since those are defined for the client during PrepareStart(), which means that any method which is dependent on those capabilities must be called after Start() -> PrepareStart().

I thought adding the starting AvailableComponents to the StartSettings was the simplest route around this issue, rather than attempting to rework the larger flow of the client. If that's not your preference, I'm happy to take another attempt at it using only SetAvailableComponents.

A couple other alternatives I could think of:

getting rid of the SetAvailableComponents method entirely and just utilizing the StartSettings option - I don't love this option, since it makes it impossible for the client to dynamically change the AvailableComponents after Start(), but maybe that's not a concern given that AvailableComponents is unlikely to change for a given agent after Start()?
Edit: see you actually already described this option: If we need to allow re-defining AvailableComponents after Start (do we?)

adding some logic to SetAvailableComponents that could somehow (not clear yet) detect whether Start() has been called already and behave differently based on that (i.e. if Start() has been called, care about the Capability, if it hasn't, don't). I do not love this for multiple reasons, but it is an option.

TLDR: There may be value in allowing SetAvailableComponents to continue existing when a proxy OpAMP server is involved.

Discovered a slight problem with alternative 1: my planned implementation (which could change, to be fair) using the supervisor in the OpenTelemetry collector requires SetAvailableComponents. Here's the planned implementation, with new additions in bold:

Supervisor starts, starts bootstrap OpAMP server to get agent description and the initial AvailableComponents.

Collector starts, sends bootstrap information, but because of the OpAMP client implementation here, only sends the hash, and not the full component list.

Supervisor receives the bootstrap information, stores it. This only includes the AvailableComponents hash.

Supervisor starts up its own OpAMP client, sends the stored AvailableComponents information to the external OpAMP Server - again, this only includes the hash.

Supervisor starts up its own OpAMP server.

Everything is good - external OpAMP server has the component hash, so does the supervisor OpAMP server, but only the collector client has the full component list.

External OpAMP server decides it needs the full component list. Sends a ServerToAgent message to the supervisor client with the ReportAvailableComponents flag set.

Supervisor OpAMP client proxies the server request flags along to the collector OpAMP client via the supervisor OpAMP server.

Collector OpAMP client receives the message with its flags, returns the full component list to the supervisor OpAMP server.

Supervisor uses its OpAMP server to receive the component list, stores the component list, uses its OpAMP client to send it to the external OpAMP server by calling client.SetAvailableComponents(). The value here is that the Supervisor client didn't have complete knowledge of the available components from the start, just the hash - so having a way to modify the client's known list of available components after Start() does have value.

An alternative approach would be to more heavily modify the supervisor bootstrap process to not only retrieve the initial hash from the collector client, but also request the full list of available components after receiving the initial hash. The upside would be that we would not have to proxy any ServerToAgent flags back to the collector client - the downside would be that the original source of truth, the collector, would be cut off from these messages. It would also complicate the bootstrap process.

I'm going to go with the "modify the supervisor bootstrap process" route and eliminate SetAvailableComponents, I've discovered some issues with the process during end-to-end testing that make the existing workflow less than ideal.

Can you expand on alternate 2 and why you don't like it? It seems that would allow to redefine available components dynamically and would eliminate the problem you have with bootstrapping.

Sure, I think a couple aspects of it just don't feel very clean to me.

It seems like it's overloading that single method with two responsibilities

It seems like a divergence from the way that the other Set____ interface methods are implemented, which are agnostic to whether Start() has been called or not.

One solution that I do think solves both these issues in a way that I feel better about is including two new methods in the OpAMPClient interface, SetAvailableComponents and InitializeAvailableComponents - one which would need to be called before Start() if the ReportAvailableComponents capability was defined (but would only set up client state, nothing more), one which could only be called after Start(), which would allow for dynamic available component redefinition. With that being said, I'm not sure that my implementation within the supervisor and OpAMP extension would change substantially - if a case emerges where the available components for a collector can change dynamically, then it will be useful to have the SetAvailableComponents method already defined, but until that case emerges, I think it would likely be unused.

With all that being said, I'm not opposed to being overridden here if getting rid of StartSettings.AvailableComponents and adding this code to (c *ClientCommon) SetAvailableComponents seems like the strongly better option to you.

if !c.isStarted { return c.ClientSyncedState.SetAvailableComponents(components) }

I've reconsidered. Think I'm going to go with the approach.

mrsillydog · 2025-01-15T15:58:23Z

Comfortable moving this into Ready for Review - have successfully manually tested this implementation end-to-end with a custom build of the OpAMP Supervisor and OpAMP extension in open-telemetry/opentelemetry-collector-contrib.

This reverts commit 9e2acf3.

… StartSettings

mrsillydog added 2 commits January 10, 2025 12:01

Add implementation of new AvailableComponents message

f6b5058

Remove pointer from Flags field

843072b

mrsillydog added 2 commits January 10, 2025 16:37

Add unit tests, fix issues found by tests, document

26a7996

Adjust SetAvailableComponents and add tests

b189b21

tigrannajaryan reviewed Jan 13, 2025

View reviewed changes

mrsillydog added 5 commits January 13, 2025 11:56

Update documentation and reduce testing wait interval

97a27da

Fix Flags field typing

cd6ccd7

Set flags on MessageData before OnMessage callback

9a8dabd

Improve code coverage

7893479

Remove SetAvailableComponents method from interface

9e2acf3

mrsillydog marked this pull request as ready for review January 15, 2025 15:56

mrsillydog requested a review from a team as a code owner January 15, 2025 15:56

mrsillydog added 2 commits January 22, 2025 09:42

Revert "Remove SetAvailableComponents method from interface"

be86a6f

This reverts commit 9e2acf3.

Initialize AvailableComponents via SetAvailableComponents rather than…

53d8cd0

… StartSettings

mrsillydog requested a review from tigrannajaryan January 22, 2025 15:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add implementation of new AvailableComponents message #340

feat: Add implementation of new AvailableComponents message #340

mrsillydog commented Jan 10, 2025

codecov bot commented Jan 10, 2025 •

edited

Loading

tigrannajaryan Jan 13, 2025

tigrannajaryan Jan 13, 2025

mrsillydog Jan 13, 2025

tigrannajaryan Jan 13, 2025

mrsillydog Jan 13, 2025

tigrannajaryan Jan 13, 2025

mrsillydog Jan 13, 2025

mrsillydog Jan 13, 2025

tigrannajaryan Jan 13, 2025

mrsillydog Jan 13, 2025

mrsillydog Jan 13, 2025 •

edited

Loading

mrsillydog Jan 13, 2025 •

edited

Loading

mrsillydog Jan 14, 2025

tigrannajaryan Jan 16, 2025

mrsillydog Jan 16, 2025 •

edited

Loading

mrsillydog Jan 22, 2025

mrsillydog commented Jan 15, 2025

feat: Add implementation of new AvailableComponents message #340

Are you sure you want to change the base?

feat: Add implementation of new AvailableComponents message #340

Conversation

mrsillydog commented Jan 10, 2025

codecov bot commented Jan 10, 2025 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrsillydog Jan 13, 2025 • edited Loading

Choose a reason for hiding this comment

mrsillydog Jan 13, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrsillydog Jan 16, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrsillydog commented Jan 15, 2025

codecov bot commented Jan 10, 2025 •

edited

Loading

mrsillydog Jan 13, 2025 •

edited

Loading

mrsillydog Jan 13, 2025 •

edited

Loading

mrsillydog Jan 16, 2025 •

edited

Loading