-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[elasticsearchexporter] Direct serialization without objmodel in OTel mode #37032
base: main
Are you sure you want to change the base?
[elasticsearchexporter] Direct serialization without objmodel in OTel mode #37032
Conversation
Benchmark results: TL;DR: the throughput is 2x for metrics and 3x for logs and traces. The allocated bytes/op are reduced 82% for metrics and 95% for logs and traces.
|
scopeMetrics := scopeMetrics.At(j) | ||
scope := scopeMetrics.Scope() | ||
groupedDataPointsByIndex := make(map[string]map[uint32][]dataPoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewer: I made it so that documents from different scopes are never merged. This simplified the serialization logic and also fixes a subtle bug in the current implementation where we're only hashing the scope attributes but not the scope name. This leads to grouping of potentially different scopes to the same document. I guess as a consequence, we should also add the scope name as a dimension in the mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think by moving this here, rather than outside of the scopeMetrics
loop, we're assuming that there will never be two identical scopes within a resource. Is that a safe assumption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it's no worse than the existing assumption that resourceMetrics is free of duplicate resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will make it safe is that the only consequence of being wrong in the assumption is leaving some storage savings on the table. In other words, we should prioritize elastic/elasticsearch#99123, which turns out to be more of an issue than anticipated in various contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't duplicates resources/scopes lead to duplicate _tsid & doc rejections? Definitely agree on prioritising that issue though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would, until we fix the referenced issue.
bytes.Buffer.Write is guaranteed to not return an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The amount of handwritten serialisation makes me a little uncomfortable, but we can perhaps improve that with code generation later.
I've introduced pooling for the buffer holding the serialized events in 5e523c5. This reduced allocation by another 60% and increased throughput as well. I suppose most of the remaining allocations are from creating the pdata model itself, something that we can't easily optimize and not allocations that are directly caused by the ES exporter itself. I've updated the benchmark results in #37032 (comment). |
exporter/elasticsearchexporter has more than one function: "NewBufferPool,NewFactory"
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #37032 +/- ##
==========================================
- Coverage 79.60% 79.60% -0.01%
==========================================
Files 2252 2254 +2
Lines 211920 212032 +112
==========================================
+ Hits 168704 168781 +77
- Misses 37549 37576 +27
- Partials 5667 5675 +8 ☔ View full report in Codecov by Sentry. |
Directly serializes pdata to JSON in OTel mode
objmodel.Document
needs to be created first