Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: add support for parsing env variables in configuration #6215

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
- Add a `LogProcessor` to `go.opentelemetry.io/contrib/processors/baggagecopy` to copy baggage members to log records. (#6277)
- Use `baggagecopy.NewLogProcessor` when configuring a Log Provider.
- `NewLogProcessor` accepts a `Filter` function type that selects which baggage members are added to the log record.
- Parsing via ParseYAML in `go.opentelemetry.io/contrib/config` now support environment variables in the format `${VAR_NAME}`. (#6215)

### Changed

Expand Down
15 changes: 15 additions & 0 deletions config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ package config // import "go.opentelemetry.io/contrib/config"
import (
"context"
"errors"
"regexp"

"gopkg.in/yaml.v3"

Expand Down Expand Up @@ -139,6 +140,20 @@ func WithOpenTelemetryConfiguration(cfg OpenTelemetryConfiguration) Configuratio

// ParseYAML parses a YAML configuration file into an OpenTelemetryConfiguration.
func ParseYAML(file []byte) (*OpenTelemetryConfiguration, error) {
re := regexp.MustCompile(`\$\{([a-zA-Z_][a-zA-Z0-9_]*[-]?.*)\}`)

replaceEnvVars := func(input []byte) []byte {
return re.ReplaceAllFunc(input, func(s []byte) []byte {
match := re.FindSubmatch(s)
if len(match) < 2 {
return s
}
return replaceEnvVar(string(match[1]))
})
}

file = replaceEnvVars(file)
Comment on lines +143 to +155
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env var parsing is not compliant with the specification: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/data-model.md#environment-variable-substitution

Environment variable substitution MUST only apply to scalar values. Mapping keys are not candidates for substitution.

and

It MUST NOT be possible to inject YAML structures by environment variables.

Are these handled anywhere? Can we add unit tests for unallowed substitutions?

Copy link
Member

@pellared pellared Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling that implementing it in a safe manner would require implementing e.g. https://pkg.go.dev/gopkg.in/yaml.v3#Unmarshaler for each scalar type (and using these scalar types instead of the regular ones).

Such parsing would need to couple the package to some concrete YAML library (I am not against it).

Copy link
Member

@pellared pellared Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative could be to implement https://pkg.go.dev/encoding/json#Unmarshaler and use https://github.com/kubernetes-sigs/yaml. Maybe it would be simpler. We could then also provide a ParseJSON function easier if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would using something like this work here? https://github.com/open-telemetry/opentelemetry-collector/blob/28d0d57c358d850166c7a0d14eaad3d162e2ce56/confmap/provider.go#L234

Maybe. You would have to elaborate as I do not know how would like to use it (or maybe simply try). However, notice that probably we would not want to support []any and map[string]any.


var cfg OpenTelemetryConfiguration
err := yaml.Unmarshal(file, &cfg)
if err != nil {
Expand Down
34 changes: 34 additions & 0 deletions config/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -438,6 +438,40 @@ func TestParseYAML(t *testing.T) {
}
}

func TestParseYAMLWithEnvironmentVariables(t *testing.T) {
tests := []struct {
name string
input string
wantErr error
wantType interface{}
}{
{
name: "valid v0.2 config with env vars",
input: "v0.2-env-var.yaml",
wantType: &v02OpenTelemetryConfig,
},
}

t.Setenv("OTEL_SDK_DISABLED", "false")
t.Setenv("OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT", "4096")
t.Setenv("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf")

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
b, err := os.ReadFile(filepath.Join("testdata", tt.input))
require.NoError(t, err)

got, err := ParseYAML(b)
if tt.wantErr != nil {
require.Equal(t, tt.wantErr.Error(), err.Error())
} else {
require.NoError(t, err)
assert.Equal(t, tt.wantType, got)
}
})
}
}

func TestSerializeJSON(t *testing.T) {
tests := []struct {
name string
Expand Down
38 changes: 38 additions & 0 deletions config/envprovider.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package config // import "go.opentelemetry.io/contrib/config"

import (
"os"
"regexp"
"strings"
)

const ValidationPattern = `^[a-zA-Z_][a-zA-Z0-9_]*$`

var validationRegexp = regexp.MustCompile(ValidationPattern)

func replaceEnvVar(uri string) []byte {
envVarName, defaultValuePtr := parseEnvVarURI(uri)
if !validationRegexp.MatchString(envVarName) {
return nil
}

val, exists := os.LookupEnv(envVarName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an allow list of env vars we can retrieve here, to avoid possible security problems?
For example, do we want to allow folks to inject PATH into the config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it would be up to the user if they set the configuration to include the whichever environment variable they choose to include.

It's true that someone could leak sensitive information through this mechanism, but it seems too restrictive to have an allow list. Either ways maybe this could be discussed further in the spec? Currently there are no such restrctions defined https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/data-model.md#environment-variable-substitution

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-environment-variables.md#parsing-empty-value

The SDK MUST interpret an empty value of an environment variable the same way as when the variable is unset.

Reasons are here: open-telemetry/opentelemetry-specification#2045. Maybe I should add the reasons to spec as well, but it was I think my first contribution to the specification.

if !exists {
if defaultValuePtr != nil {
val = *defaultValuePtr
}
}
return []byte(val)
}

func parseEnvVarURI(uri string) (string, *string) {
const defaultSuffix = ":-"
if strings.Contains(uri, defaultSuffix) {
parts := strings.SplitN(uri, defaultSuffix, 2)
return parts[0], &parts[1]
}
return uri, nil
}
Loading
Loading