This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Support for more variables and variable expansion in app definition #1828

Closed
F21 opened this issue Jul 19, 2015 · 19 comments

Comments

@F21

F21 commented Jul 19, 2015

Posted this in the google group and would love to see it as an enhancement: https://groups.google.com/forum/#!topic/marathon-framework/lVNyLTCAipU

I would love to see:

  • Support for more variables such as $ip
  • Ability to have these variables expanded in the app's definition.

This would allow people to satisfy a lot of requirements of docker containers in the wild and reduce the need to build their own fork to get it to work with marathon.

@F21 F21 changed the title Support for more variables and expansion Support for more variables and variable expansion in app definition Jul 19, 2015
@gkleiman
Contributor

Thanks for the feature request, I'm adding it to our backlog.

@gkleiman gkleiman added this to the Backlog milestone Jul 23, 2015
@sepiroth887

+1

@kolloch
Contributor

kolloch commented Jul 27, 2015

Hi @F21, thanks for your input! Please try to be as specific as possible, look for duplicates and try to break up potentially unrelated things into multiple issues.

"Support for more variables such as $ip" -- You mean along the lines of https://mesosphere.github.io/marathon/docs/task-environment-vars.html? So probably better call them "$MARATHON_IP" or something similar? How should these variables be defined? Please note that Marathon has to be able to calculate these variables when passing the TaskInfo to Mesos. If the IP is assigned randomly by docker, Marathon cannot know it in advance.

"Ability to have these variables expanded in the app's definition." -- Do you mean that Marathon should expand them? Marathon can only expand them if it knows the values of the referenced variables when it launches the task. Thus it can only possibly substitute other variables passed in the "env" configuration or variables that it determines automatically when launching (see https://mesosphere.github.io/marathon/docs/task-environment-vars.html). If we do this, we also have to be careful to come up with a good syntax for quoting values (to indicate that we do not want substitution) and dealing with reference cycles.

For backward compatibility, we also might want to only substitute variables if another configuration option is set or a global configuration option of Marathon is set. Otherwise old AppDefinitions might start to behave differently.
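
To make the quoting, cycle, and opt-in concerns above concrete, here is a minimal sketch. It is not Marathon code: the function name `expand_env`, the `$$` escape, the `enabled` switch, and the `known` mapping of launch-time values are all assumptions made for illustration.

```python
import re

# Hypothetical syntax: $NAME or ${NAME} is a reference, $$ is a literal "$".
_REF = re.compile(r"\$(?:(\$)|(\w+)|\{(\w+)\})")


def expand_env(env, known, enabled=True):
    """Expand references in `env` using other `env` entries and `known`
    (values the scheduler could compute at launch, e.g. HOST or PORT0)."""
    if not enabled:
        # Opt-in switch: old app definitions keep their literal values.
        return dict(env)

    resolved = {}

    def resolve(name, stack):
        if name in resolved:
            return resolved[name]
        if name in stack:
            raise ValueError("reference cycle: " + " -> ".join(stack + [name]))
        if name in env:
            value = _REF.sub(lambda m: substitute(m, stack + [name]), env[name])
            resolved[name] = value
            return value
        if name in known:
            return known[name]
        raise KeyError(name)  # value is not knowable at launch time

    def substitute(match, stack):
        if match.group(1):  # "$$" is the escape for a literal "$"
            return "$"
        return resolve(match.group(2) or match.group(3), stack)

    for name in env:
        resolve(name, [])
    return resolved
```

With this sketch, `{"MON_IP": "$HOST"}` would resolve only if `HOST` is in `known`, `{"A": "$B", "B": "$A"}` is rejected as a cycle, and `enabled=False` returns the definition untouched.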

(This seems to be related to #1401)

@sepiroth887

I can think of at least one I really wish Marathon had: counter variables! :)

I.e. pass in something like "Foo" : "bar_idx" to env,
and Marathon would track the current value per task started/stopped/restarted.

It could help with naming containers or for stateful services with data replication mechanisms.

Though it kind of breaks the dynamic nature of Mesos/Marathon, there are tools around which still rely on unique naming of cluster members and require some script or mechanism to handle it outside the Mesos/Marathon framework.
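
For illustration only, one possible reading of "counter variables" is a per-app allocator that hands out the lowest free index when a task starts and reclaims it when the task stops. The class name `IndexAllocator` and its methods are hypothetical and do not correspond to anything in Marathon.

```python
import heapq


class IndexAllocator:
    """Hypothetical per-app counter: lowest free index on start, reuse on stop."""

    def __init__(self):
        self._free = []      # min-heap of indices released by stopped tasks
        self._next = 0       # next never-used index
        self._by_task = {}   # task id -> index currently held

    def start(self, task_id):
        if self._free:
            index = heapq.heappop(self._free)  # reuse the smallest free index
        else:
            index = self._next
            self._next += 1
        self._by_task[task_id] = index
        return index

    def stop(self, task_id):
        heapq.heappush(self._free, self._by_task.pop(task_id))
```

A restarted task would get a fresh task ID and pick up the index its predecessor released, which is what a name like `bar_0`, `bar_1` would need to stay stable.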

@kolloch
Contributor

kolloch commented Jul 27, 2015

@sepiroth887 that's very similar to #1242.

Maybe we should reopen this.

@sepiroth887

That'd be nice :)

I also thought of another thing though:

What about a built-in variable store?
E.g. a REST API to set/update/delete KV pairs.
Those could be prefixed to avoid clashes and then be referenced in app payloads.

It would allow other tools to keep them up to date by whatever logic they require, and they could be audited/viewed quite easily that way too.

@kolloch
Contributor

kolloch commented Jul 27, 2015

@sepiroth887 Even though it is not trivial to implement, I like this related idea: #1863

@sepiroth887

That'd do just fine :)

@F21
Author

F21 commented Jul 27, 2015

@kolloch Thanks for taking a look at this!

By $IP, I meant the IP of the node the task executes on. $HOST is useful, but sometimes things like Ceph require an IP address rather than the hostname.

Being able to expand variables Marathon knows in places such as env, and perhaps other parts of the app config, would be extremely useful and simplify a lot of things.

@BenWhitehead
Contributor

follow

@kolloch
Contributor

kolloch commented Jul 28, 2015

Hi @F21, as far as I know, Marathon cannot know the IP of the host your task is launched on via the Scheduler API of Mesos.

For variable substitution, it would be helpful if you came up with specific examples of syntax & use cases.

@F21
Author

F21 commented Jul 28, 2015

@kolloch:

One use-case would be to get Marathon variables expanded in the env definition:

{
   "id":"ceph",
   "cpus":0.3,
   "mem":256,
   "instances":1,
   "maxLaunchDelaySeconds":36000,
   "container":{
      "docker":{
         "image":"ceph/demo:latest",
         "network":"HOST"
      }
   },
   "env":{
      "CEPH_NETWORK":"192.168.33.0/24",
      "MON_IP": "$HOST"
   }
}

@bwb

bwb commented Aug 15, 2015

I'd like to expand variables in the volumes portion of app definitions.

{
  "id": "app",
  "cpus": 0.25,
  "mem": 256,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "private-registry.example.com/ns/app:version"
    },
    "volumes": [
      {
        "hostPath": "/mnt/blobs/$MESOS_TASK_ID",
        "containerPath": "/var/log/app",
        "mode": "RW"
      }
    ]
  }
}
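
As a rough sketch of what expansion outside `env` could look like, the helper below walks a parsed app definition and substitutes into every string it finds, including `hostPath`. It assumes the value is already known to whoever does the substitution; the function name `expand_strings` and the sample task ID in the usage note are made up. Unknown references are left untouched.

```python
from string import Template


def expand_strings(node, variables):
    """Recursively substitute $NAME / ${NAME} in every string of a JSON-like value."""
    if isinstance(node, str):
        # safe_substitute leaves unknown references as-is instead of raising.
        return Template(node).safe_substitute(variables)
    if isinstance(node, list):
        return [expand_strings(item, variables) for item in node]
    if isinstance(node, dict):
        return {key: expand_strings(value, variables) for key, value in node.items()}
    return node  # numbers, booleans and null pass through unchanged
```

Calling `expand_strings(app, {"MESOS_TASK_ID": "app.1234"})` on the definition above would turn the `hostPath` into `/mnt/blobs/app.1234` and leave everything else as it was.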

@flungo

flungo commented Sep 3, 2015

The use case for me would be to pass environment variables to the docker image without having to fork it and adapt it to work. For example, I need to specify where some configuration folder is (which is in my $MESOS_SANDBOX) via an environment variable and so would like to be able to add the following to my marathon config:

{
  ...
  "env": {
    "CONFIG_FOLDER": "$MESOS_SANDBOX/config"
  },
  ...
}

@dylanmei

Throwing out another example where the expansion in env is desirable.

This doesn't work:

{
  "id": "/apache/zeppelin",
  "cmd": "./zeppelin-0.6.0-incubating-SNAPSHOT/bin/zeppelin.sh",
  "env": {
    "ZEPPELIN_PORT": "$PORT0",
    "ZEPPELIN_JAVA_OPTS": "-Dspark.executor.uri=hdfs://bin/spark-1.4.1-bin-hadoop2.4.tgz -Dspark.driver.port=$PORT1 -Dspark.fileserver.port=$PORT2 -Dspark.executor.memory=8g -Dspark.cores.max=8 -Dspark.mesos.coarse=true"
  },
  "ports": [0, 0, 0],
  ...
}

Notice I need all three ports in different places. This is the workaround, in JSON 😞

{
  "id": "/apache/zeppelin",
  "cmd": "ZEPPELIN_PORT=$PORT0 ZEPPELIN_JAVA_OPTS='-Dspark.executor.uri=hdfs://bin/spark-1.4.1-bin-hadoop2.4.tgz -Dspark.driver.port='$PORT1' -Dspark.fileserver.port='$PORT2' -Dspark.executor.memory=8g -Dspark.cores.max=8 -Dspark.mesos.coarse=true' ./zeppelin-0.6.0-incubating-SNAPSHOT/bin/zeppelin.sh",
  "ports": [0, 0, 0],
  ...
}

@BenWhitehead
Contributor

This is a very very challenging thing for Marathon to even attempt to implement.

In the example provided by @flungo it is quite literally impossible for Marathon to resolve the value of CONFIG_FOLDER. The environment variable MESOS_SANDBOX is something that is only available in the environment when the mesos docker containerizer is used. This information is never made available to Marathon in any way.

The real solution to this problem probably involves an upstream change to Mesos. One approach could be to create a file containing all of the environment variables that are declared on the ExecutorInfo and the CommandInfo and have Mesos source that file before running the specified command. This approach would make sure that it's the shell doing the variable expansion in the same environment that the command is going to be executed in and keeps the responsibility of variable resolution/expansion from having to be implemented by Mesos or any framework.
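
The file-based approach could look roughly like the sketch below. Nothing here is an existing Mesos feature: `render_env_file` and `wrap_command` are hypothetical helpers showing how declared variables might be written as `export` lines and sourced before the command, so that the shell in the task's own environment does the expansion.

```python
import shlex


def render_env_file(env):
    """Render `env` as shell `export` lines for a file that is sourced later."""
    lines = []
    for name, value in env.items():
        # Escape what stays special inside double quotes, but NOT "$":
        # the shell should expand $VAR references when the file is sourced.
        # Note that this also lets $(command) run at source time.
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("`", "\\`")
        )
        lines.append(f'export {name}="{escaped}"')
    return "\n".join(lines) + "\n"


def wrap_command(env_file, cmd):
    """Build a shell command line that sources `env_file`, then runs `cmd`."""
    return f". {shlex.quote(env_file)} && {cmd}"
```

For `{"CONFIG_FOLDER": "$MESOS_SANDBOX/config"}` this renders `export CONFIG_FOLDER="$MESOS_SANDBOX/config"`, and the reference is resolved only when the file is sourced inside the task environment. Lines are emitted in declaration order, so a variable that refers to one declared later would see it as unset.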

@flungo

flungo commented Sep 15, 2015

@BenWhitehead I have found that MESOS_SANDBOX is always /mnt/mesos/sandbox so I have been explicitly declaring this, for the time being.

I think the best way to do this may be to have it expanded by the shell (possibly through a generated file that is given to the container and sourced before the entrypoint/cmd). This way it is guaranteed that every environment variable that is available at runtime for the container can be expanded into the given environment variables.

Not sure if this would be useful or risky but this could also lead to being able to expand all shell syntax such as $(command) to use the output of a command on the container to set an environment variable.

@BenWhitehead
Contributor

@flungo I think we're mostly in agreement.

I feel the shell is the only appropriate location for this expansion to take place.

Part of my point is that the MESOS_SANDBOX env variable only exists when running your task in a docker image. If the mesos containerizer is used, that variable doesn't even exist. I suspect there are a number of subtleties that will creep up if/when the work to do variable expansion is started (what happens if the shell tries to expand a variable that doesn't have a value? or doesn't yet have a value?).
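
The "variable without a value" question comes down to picking a policy. The sketch below uses Python's `string.Template` as a stand-in for the shell to show three options: fail (similar to `set -u` in a POSIX shell), expand to the empty string (the shell's default behaviour), or leave the reference untouched. The function `expand` and the policy names are made up for illustration.

```python
from collections import defaultdict
from string import Template


def expand(value, variables, on_missing="error"):
    """Expand $NAME / ${NAME} in `value` under a chosen missing-variable policy."""
    template = Template(value)
    if on_missing == "error":
        # Raises KeyError for an unset variable, comparable to `set -u`.
        return template.substitute(variables)
    if on_missing == "empty":
        # Unset variables become "", like default shell expansion.
        return template.substitute(defaultdict(str, variables))
    if on_missing == "keep":
        # Unset references are left in place for a later stage to resolve.
        return template.safe_substitute(variables)
    raise ValueError(f"unknown policy: {on_missing}")
```

With no `MESOS_SANDBOX` defined, `"$MESOS_SANDBOX/config"` raises under `"error"`, becomes `/config` under `"empty"`, and stays `$MESOS_SANDBOX/config` under `"keep"`. The `"empty"` case is the one most likely to fail silently.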

@meichstedt
Contributor

Note: This issue has been migrated to https://jira.mesosphere.com/browse/MARATHON-3002. For more information see https://groups.google.com/forum/#!topic/marathon-framework/khtvf-ifnp8.

@d2iq-archive d2iq-archive locked and limited conversation to collaborators Mar 27, 2017