spanmetrics cannot work with numerical default value in dimensions since version v0.109.0 #36786

Closed
timbeemster opened this issue Dec 11, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@timbeemster

Component(s)

No response

What happened?

Description

In our setup we use the VERSION environment variable to set a dimension on metrics, so we can query our metrics based on the version of the application.
The version is sourced in the environment variables and consists of a combination of numbers and letters.
We noticed that since v0.109.0 the spanmetrics connector no longer accepts a default value that consists only of digits.
Similarly, if the value contains only digits and the letter e, it is decoded as a float.

Perhaps something in the decoding of the yaml config changed?
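
To illustrate why a value such as 123e4 is affected while a typical alphanumeric hash is not, here is a minimal Go sketch. It is not the collector's code: `asFloat` is a hypothetical helper, and using `strconv.ParseFloat` as the "is this a number?" test is an assumption that only approximates how a YAML-style scalar gets typed.

```go
package main

import (
	"fmt"
	"strconv"
)

// asFloat is a hypothetical helper (not part of the collector). It reports
// whether raw can be read as a floating-point number and, if so, how Go
// prints the parsed value by default.
func asFloat(raw string) (string, bool) {
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return "", false
	}
	return fmt.Sprint(f), true
}

func main() {
	// Sample values: digits only, digits plus "e", a mixed hash, a prefixed value.
	for _, raw := range []string{"123456", "123e4", "a1b2c3d", "v-123e4"} {
		if shown, ok := asFloat(raw); ok {
			fmt.Printf("%q reads as a number: %s\n", raw, shown)
		} else {
			fmt.Printf("%q stays a string\n", raw)
		}
	}
}
```

In this sketch 123e4 comes back as 1.23e+06, the same rendering that appears in the log output below, while values containing other letters fail to parse and stay strings.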

Steps to Reproduce

I've created a minimal setup in a public repo: https://github.com/timbeemster/otelcol-custom/blob/main/README
Build the docker image and run the setup.

Expected Result

A running otel collector

Actual Result

Error message on startup (see log output section)

Let me know if I can assist in debugging the problem.

Collector version

v0.109.0

Environment information

Environment

OS: Golang base image for building + Debian12 (distroless variant for running)
Compiler(if manually compiled): golang:1.23.2

(see: https://github.com/timbeemster/otelcol-custom/blob/main/Dockerfile )

OpenTelemetry Collector configuration

builder config: https://github.com/timbeemster/otelcol-custom/blob/main/otel-builder-config.yml
collector config: https://github.com/timbeemster/otelcol-custom/blob/main/otel-collector-config.yml

Log output

Error: failed to get config: cannot unmarshal the configuration: decoding failed due to the following error(s):

error decoding 'connectors': error reading configuration for "spanmetrics": decoding failed due to the following error(s):

'dimensions[0].default' expected type 'string', got unconvertible type 'float64', value: '1.23e+06'
2024/12/11 15:45:40 collector server run finished with error: failed to get config: cannot unmarshal the configuration: decoding failed due to the following error(s):

error decoding 'connectors': error reading configuration for "spanmetrics": decoding failed due to the following error(s):

'dimensions[0].default' expected type 'string', got unconvertible type 'float64', value: '1.23e+06'

Additional context

No response

@timbeemster timbeemster added bug Something isn't working needs triage New item requiring triage labels Dec 11, 2024
@timbeemster timbeemster changed the title Default spanmetrics cannot work with numerical default value in dimensions since version v0.109.0 Dec 11, 2024
@bacherfl
Contributor

Hi @timbeemster! If you enclose the environment variable for the version in quotes, it should be parsed properly:

connectors:
  spanmetrics:
    dimensions:
      - name: gitHash
        default: "${VERSION}"

@timbeemster
Author

Hi @bacherfl,
Thanks for reaching out. Unfortunately, quoting the variable does not fix the problem.
I had already tried it earlier, hoping it would help.
However, the output is the same.

Patch on the shared public repo to reproduce the problem:

diff --git a/otel-collector-config.yml b/otel-collector-config.yml
index df1789f..b29c00e 100644
--- a/otel-collector-config.yml
+++ b/otel-collector-config.yml
@@ -7,7 +7,7 @@ connectors:
   spanmetrics:
     dimensions:
       - name: gitHash
-        default: ${VERSION}
+        default: "${VERSION}"
 
 exporters:
   otlp:

rebuild docker image + run:
docker run -e VERSION=123e4 <hash>

output:

Error: failed to get config: cannot unmarshal the configuration: decoding failed due to the following error(s):

error decoding 'connectors': error reading configuration for "spanmetrics": decoding failed due to the following error(s):

'dimensions[0].default' expected type 'string', got unconvertible type 'float64', value: '1.23e+06'
2024/12/12 09:53:12 collector server run finished with error: failed to get config: cannot unmarshal the configuration: decoding failed due to the following error(s):

error decoding 'connectors': error reading configuration for "spanmetrics": decoding failed due to the following error(s):

'dimensions[0].default' expected type 'string', got unconvertible type 'float64', value: '1.23e+06'

@bacherfl
Contributor

OK, now I can reproduce this as well. Earlier I was adding the value directly into the config map, but the issue only appears when the value comes from an env var. I will check whether there have been any recent changes in that area.

@bacherfl
Contributor

It seems that v0.109.0 removed the feature flag that controlled whether the mapstructure library used for decoding sets the WeaklyTypedInput option, so this option is now always false. More background, including the rationale behind this decision, can be found in this issue: open-telemetry/opentelemetry-collector#10552
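
As a rough model of the difference between strict and weakly typed decoding, consider the Go sketch below. It does not use mapstructure itself: `decodeToString` is a hypothetical function, its error text only imitates the message in the logs above, and the way the weak mode formats a float is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeToString is a hypothetical, simplified model of decoding a
// configuration value into a string field. It is not the mapstructure API.
func decodeToString(v any, weaklyTyped bool) (string, error) {
	switch val := v.(type) {
	case string:
		return val, nil
	case float64:
		if weaklyTyped {
			// Assumption: weak mode converts the number to plain decimal text.
			return strconv.FormatFloat(val, 'f', -1, 64), nil
		}
	}
	return "", fmt.Errorf("expected type 'string', got unconvertible type '%T'", v)
}

func main() {
	parsed := 1.23e6 // what "123e4" becomes once it has been typed as a float

	_, err := decodeToString(parsed, false)
	fmt.Println("strict:", err)

	s, _ := decodeToString(parsed, true)
	fmt.Println("weak:  ", s)
}
```

In this model the strict mode rejects the float outright, and the weak mode accepts it but yields 1230000 rather than the original text 123e4, so the value has to stay a string before it reaches the decoder.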

@timbeemster
Author

Hi @bacherfl, thanks for your investigation.
The related issue seems relevant indeed.
However, I still cannot arrive at a working config.
I tried:
default: !!str "${VERSION}" but it did not work.

I also don't see any comments on that issue with a working solution or workaround.

@bacherfl
Contributor

bacherfl commented Dec 12, 2024

Would adding a prefix to the commit hash be a possible solution? For example:

connectors:
  spanmetrics:
    dimensions:
      - name: gitHash
        default: "v-${env:VERSION}"

This is the only workaround I can think of at the moment

@bacherfl
Contributor

bacherfl commented Dec 12, 2024

Also, can you open an issue for the confmap component in the core repository about this? That is where this should be fixed, since the original string value gets converted into a float somewhere in this component (I'm currently trying to find out where exactly).

@timbeemster
Author

Hi @bacherfl,

I've opened a bug report in the Core project.
Unfortunately, prepending a v is not possible, as the collector is not the only component in our landscape that relies on that version number.

Thanks for helping out thus far!

@bacherfl
Contributor

Thanks for creating the issue @timbeemster!

One more idea regarding the prefix on the version env var, until the issue is fixed in core: as a workaround, you could use the transform processor to remove the prefix again, so that the format of the gitHash property stays compatible with the other components in your environment. For example:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

connectors:
  spanmetrics:
    dimensions:
      - name: gitHash
        default: "sha256:${env:VERSION}"

exporters:
  debug:
    verbosity: detailed

processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - replace_pattern(attributes["gitHash"], "sha256:([0-9a-zA-Z]+)", "$1")

extensions:

service:
  pipelines:
    traces/spanmetrics:
      receivers: [ otlp ]
      exporters: [ spanmetrics ]
    metrics:
      receivers: [ otlp, spanmetrics ]
      processors: [ transform ]
      exporters: [ debug ]

@bacherfl bacherfl removed the needs triage New item requiring triage label Dec 17, 2024
@bacherfl
Contributor

Closing this one as the issue for the confmap component has been created in core: open-telemetry/opentelemetry-collector#11879
