Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Automatically quote vars that start with asterisks or ampersands #126919

Closed
wants to merge 1 commit into from

Conversation

joshdover
Copy link
Contributor

@joshdover joshdover commented Mar 4, 2022

Summary

Fixes #121934

We ran into this problem about a year ago in #91401. Initially that was fixed by escaping all special YAML characters in #91418. However, this created another issue and this was then reverted and replaced with a validation error in #93585.

While #93585 has a good explanation of why this problem presents itself when converting between YAML to JSON back to YAML, I don't agree with the conclusion that putting the onus on package developers and users makes sense. We should be able to handle at least the most common scenarios (like a string that starts with a *).

It appears the initial implementation of this in #91418 was too heavy handed and could be re-implemented in a safer way. Instead of quoting any strings that contain any special characters at any position, this PR will only quote strings that start with * or &.

How to test

Start Kibana with this configuration (needs to be the Cloud policy w/o the APM policy added):

xpack.apm.agent.migrations.enabled: true

# Cloud Agent policy
xpack.fleet.packages:
  - name: fleet_server
    version: latest
xpack.fleet.outputs:
  - name: 'Elastic Cloud internal output'
    type: 'elasticsearch'
    id: 'es-containerhost'
    hosts: ["http://localhost:9200"]
xpack.fleet.agentPolicies:
  # Cloud Agent policy
  - name: Elastic Cloud agent policy
    description: Default agent policy for agents hosted on Elastic Cloud
    id: policy-elastic-agent-on-cloud

    # set the internal containerhost URL to ES
    data_output_id: es-containerhost
    monitoring_output_id: es-containerhost

    is_default: false
    is_managed: true
    is_default_fleet_server: false

    namespace: default
    monitoring_enabled: []
    # If a user scales up / down the Elastic Agent in Cloud, the old
    # instances will still be shown as offline for 24h
    # The reason for having a value such high is to prevent fleet-servers to self unenroll to soon
    # in case checkin cannot happen for a longer time period.
    unenroll_timeout: 86400 # 1 day TTL

    package_policies:
      - name: Fleet Server
        id: elastic-cloud-fleet-server
        package:
          name: fleet_server
        inputs:
          - type: fleet-server
            keep_enabled: true

            vars:
              - name: host
                value: 0.0.0.0
                frozen: true
              - name: port
                value: 8220
                frozen: true
              - name: custom
                value: |
                  server.runtime:
                    gc_percent: 20          # Force the GC to execute more frequently: see https://golang.org/pkg/runtime/debug/#SetGCPercent

Run this curl command to setup the correct APM state:

curl --request POST \
  --url http://localhost:5601/api/apm/fleet/apm_server_schema \
  --user elastic:changeme \
  --header 'Content-Type: application/json' \
  --header 'kbn-xsrf: true' \
  --data '{
	"schema": {
		"apm-server.host": "0.0.0.0:8200",
		"apm-server.secret_token": "asdfkjhasdf",
		"apm-server.api_key.enabled": true,
		"apm-server.read_timeout": 3600,
		"apm-server.register.ingest.pipeline.enabled": true,
		"apm-server.rum.enabled": true,
		"apm-server.rum.allow_origins": [
			"*"
		],
		"apm-server.rum.rate_limit": 10,
		"apm-server.shutdown_timeout": "30s",
		"logging.level": "error",
		"logging.metrics.enabled": false,
		"queue.mem.events": 2000,
		"queue.mem.flush.min_events": 267,
		"queue.mem.flush.timeout": "1s",
		"setup.template.settings.index.auto_expand_replicas": "0-1",
		"setup.template.settings.index.number_of_replicas": 1,
		"setup.template.settings.index.number_of_shards": 1
	}
}'
  • Go to APM/Settings/Schema/Switch to Elastic Agent
  • After doing the switch (without any errors) go to the Elastic APM integration policy that was created and verify that RUM allowed origins has a single * setting.

Checklist

Delete any items that are not applicable to this PR.

@joshdover joshdover requested a review from a team as a code owner March 4, 2022 15:17
@botelastic botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Mar 4, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

@@ -330,7 +330,7 @@ export interface RegistryDataStreamPrivileges {
indices?: string[];
}

export type RegistryVarType = 'integer' | 'bool' | 'password' | 'text' | 'yaml' | 'string';
export type RegistryVarType = 'integer' | 'bool' | 'password' | 'text' | 'yaml';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this because it looks like it was intended to be removed in https://github.com/elastic/kibana/pull/94174/files but wasn't (I think this was an accident)

cc @jen-huang

@nchaulet
Copy link
Member

nchaulet commented Mar 4, 2022

On the Fleet side we do not really know how variable are used and some packaged are using string concatenation (like Kafka here for example https://epr.elastic.co/package/kafka/1.1.0/data_stream/log/agent/stream/log.yml.hbs ) It's why we choose to put the responsibility here on the package developper side.

paths:
{{#each paths as |path i|}}
 - {{../kafka_home}}{{path}}
{{/each}}

That PR could broke string concatenation.

@kibana-ci
Copy link
Collaborator

kibana-ci commented Mar 4, 2022

💔 Build Failed

Failed CI Steps

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
fleet 110.3KB 109.7KB -543.0B
Unknown metric groups

ESLint disabled in files

id before after diff
apm 15 14 -1
securitySolution 67 66 -1
uptime 7 6 -1
total -3

ESLint disabled line counts

id before after diff
apm 85 82 -3
uptime 48 42 -6
total -9

References to deprecated APIs

id before after diff
fleet 5 4 -1

Total ESLint disabled count

id before after diff
apm 100 96 -4
securitySolution 514 513 -1
uptime 55 48 -7
total -12

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@joshdover joshdover closed this Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:fix Team:Fleet Team label for Observability Data Collection Fleet team v7.17.2 v8.1.1 v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

APM switch to Elastic Agent fails with cryptic error
4 participants