Improve docs (#722)

* Improve docs * Review * Review
vmware-archive · Apr 30, 2018 · c591c23
1 parent 44c2cba
commit c591c23
Showing 6 changed files with 472 additions and 279 deletions.
diff --git a/docs/autoscaling.md b/docs/autoscaling.md
@@ -57,11 +57,11 @@ To autoscale based on CPU usage, it is *required* that your function has been de
 
 To do this, use the `--cpu` parameter when deploying your function. Please see the [Meaning of CPU](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu) for the format of the value that should be passed. 
 
-## Autoscaling with custom metrics on k8s 1.7
+## Autoscaling with custom metrics
 
-This walkthrough will go over the step-by-step of setting up the prometheus-based custom API server on your cluster and configuring autoscaler (HPA) to use application metrics sourced from prometheus instance.
+It is possible to use custom metrics (like queries per second) to scale your functions. We are [looking for help](https://github.com/kubeless/kubeless/issues/647) in order to document the required steps to do so with the different Kubernetes providers for newer versions of Kubernetes. If you want to contribute to this guide PRs are more than welcome :).
 
-This walkthrough is done in [kubeadm-dind-cluster v1.7](https://github.com/Mirantis/kubeadm-dind-cluster)
+**Warning:** This walkthrough was done in [kubeadm-dind-cluster v1.7](https://github.com/Mirantis/kubeadm-dind-cluster); it may not work on other versions or platforms.
 
 ### Cluster configuration
 

diff --git a/docs/debug-functions.md b/docs/debug-functions.md
@@ -0,0 +1,147 @@
+# Debug Kubeless Functions
+
+In this document we show how you can debug your function in order to spot possible errors. A deployment can fail for several reasons. To debug a function successfully, it is important to understand the process of deploying a Kubeless function. In this guide we assume that you are using the `kubeless` CLI tool to deploy your functions. If that is the case, this is the process to run a function:
+
+ 1. The `kubeless` CLI reads the parameters you give it and produces a [Function](/docs/advanced-function-deployment) object that it submits to the Kubernetes API server.
+ 2. The Kubeless Function Controller detects that a new `Function` has been created and reads its content. From the function content it generates: a `ConfigMap` with the function code and its dependencies, a `Service` to make the function reachable through HTTP and a `Deployment` with the base image and all the required steps to install and run your functions. It is important to know this order because if the controller fails to deploy the `ConfigMap` or the `Service` it will never create the `Deployment`. A failure in any step will abort the process.
+ 3. Once the `Deployment` has been created, a `Pod` should be generated with your function. When a Pod starts, it dynamically reads the content of your function (in the case of interpreted languages).
+
+After all the above you are ready to call your function. Let's see some common mistakes and how to fix them.
+
+## "kubeless function deploy" fails
+
+The first failure that can appear is an error in the parameters that we give to the `kubeless function deploy` command. Fortunately, these errors are pretty easy to debug:
+
+```console
+$ kubeless function deploy --runtime node8 \
+  --from-file hello.js \
+  --handler todos.create \
+  --dependencies package.json \
+  hello
+FATA[0000] Invalid runtime: node8. Supported runtimes are: python2.7, python3.4, python3.6, nodejs6, nodejs8, ruby2.4, php7.2, go1.10
+```
+
+In the above we can see that we have a typo in the runtime. It should be `nodejs8` instead of `node8`.
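A check of this kind can be sketched in a few lines of Python. This is only an illustration and not the CLI's actual implementation: the runtime list is copied from the error message above, and `suggest_runtime` is a hypothetical helper name.

```python
import difflib

# Runtime names copied from the error message above; other Kubeless
# releases may support a different set.
SUPPORTED_RUNTIMES = [
    "python2.7", "python3.4", "python3.6", "nodejs6",
    "nodejs8", "ruby2.4", "php7.2", "go1.10",
]


def suggest_runtime(name):
    """Return `name` if it is supported, else the closest supported name (or None)."""
    if name in SUPPORTED_RUNTIMES:
        return name
    # Fuzzy match against the known names to catch typos such as "node8".
    matches = difflib.get_close_matches(name, SUPPORTED_RUNTIMES, n=1)
    return matches[0] if matches else None


print(suggest_runtime("node8"))
```

Running a check like this before calling `kubeless function deploy` can catch a mistyped runtime without a round trip to the CLI.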
+
+## "kubeless function ls" returns "MISSING: Check controller logs"
+
+There will be cases in which the validations done in the CLI won't be enough to spot a problem in the given parameters. If that is the case, the function `Deployment` will never appear. To debug this kind of issue, check the error reported in the controller logs. The following session reproduces the problem and retrieves those logs with `kubectl logs`:
+
+```console
+$ kubeless function deploy foo --from-file hellowithdata.py --handler hello,foo --runtime python3.6
+INFO[0000] Deploying function...
+INFO[0000] Function foo submitted for deployment
+INFO[0000] Check the deployment status executing 'kubeless function ls foo'
+$ kubeless function ls
+NAME 	NAMESPACE	HANDLER  	RUNTIME  	DEPENDENCIES	STATUS
+foo  	default  	hello,foo	python3.6	            	MISSING: Check controller logs
+$ kubectl logs -n kubeless -l kubeless=controller
+time="2018-04-27T15:12:28Z" level=info msg="Processing update to function object foo Namespace: default" controller=cronjob-trigger-controller
+time="2018-04-27T15:12:28Z" level=error msg="Function can not be created/updated: failed: incorrect handler format. It should be module_name.handler_name" pkg=function-controller
+time="2018-04-27T15:12:28Z" level=error msg="Error processing default/foo (will retry): failed: incorrect handler format. It should be module_name.handler_name" pkg=function-controller
+```
+
+From the logs we can see that there is a problem with the handler: we specified `hello,foo` while the correct value is `hello.foo`.
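The `module_name.handler_name` format can be checked locally before deploying. The sketch below is an assumption about what a reasonable check looks like; the controller's real validation may be looser or stricter, and `is_valid_handler` is a hypothetical helper.

```python
import re

# Assumed pattern: two dot-separated names made of letters, digits,
# underscores or hyphens. This is not taken from the Kubeless source.
HANDLER_PATTERN = re.compile(r"[\w-]+\.[\w-]+")


def is_valid_handler(handler):
    """Return True if `handler` looks like module_name.handler_name."""
    return HANDLER_PATTERN.fullmatch(handler) is not None


for candidate in ("hello,foo", "hello.foo"):
    print(candidate, is_valid_handler(candidate))
```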
+
+## Function pod is crashing
+
+The most common error is finding that the `Deployment` is generated successfully but the function remains with the status `0/1 Not ready`. This is usually caused by a syntax error in our function or in the dependencies we specify.
+
+If our function doesn't start, we should check the status of the pods by executing:
+
+```console
+$ kubectl get pods -l function=foo
+```
+
+### Function pod crashes with Init:CrashLoopBackOff
+
+If our function fails with an `Init` error that could mean that:
+
+ - It fails to retrieve the function content.
+ - It fails to install dependencies.
+ - It fails to compile our function (in compiled languages).
+
+For any of the above we should first identify which container is failing (since each step is performed in a different container):
+
+```console
+$ kubectl get pods -l function=foo
+NAME                   READY     STATUS                  RESTARTS   AGE
+foo-74978bbf45-9xb4p   0/1       Init:CrashLoopBackOff   1         6m
+$ kubectl get pods -l function=foo -o yaml
+...
+      name: install
+      ready: false
+      restartCount: 2
+...
+```
+
+From the above we can see that the `install` container is the one with the problem. Depending on the runtime, the logs of the container may be shown as well, so we can spot the issue directly. Unfortunately that is not the case here, so let's manually retrieve the logs of the `install` container:
+
+```console
+$ kubectl logs foo-74978bbf45-9xb4p -c install --previous
+...
+Collecting twiter (from -r /kubeless/requirements.txt (line 1))
+  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f10eb4d7400>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/twiter/
+```
+
+Now we can spot that the problem is a typo in our requirements: `twiter` should be `twitter`.
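Finding the failing init container by reading the pod description by eye can be tedious when there are several pods. The sketch below shows one way to automate it, assuming the pod list has been exported as JSON and uses the standard `status.initContainerStatuses` field. The helper name and the sample data are made up for illustration; only the pod name and the `install` container come from the output above.

```python
import json


def failing_init_containers(pod_list):
    """Map each pod name to the names of its init containers that are not ready."""
    failing = {}
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("initContainerStatuses", [])
        not_ready = [s["name"] for s in statuses if not s.get("ready", False)]
        if not_ready:
            failing[pod["metadata"]["name"]] = not_ready
    return failing


# Hand-written sample mimicking a small part of a pod list.
# The "prepare" container is an illustrative placeholder.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "foo-74978bbf45-9xb4p"},
      "status": {
        "initContainerStatuses": [
          {"name": "prepare", "ready": true, "restartCount": 0},
          {"name": "install", "ready": false, "restartCount": 2}
        ]
      }
    }
  ]
}
""")

print(failing_init_containers(sample))
```

With real data, the output of `kubectl get pods -l function=foo -o json` could be loaded with `json.load(sys.stdin)` and passed to the same helper.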
+
+### Function pod crashes with CrashLoopBackOff
+
+If the Pod remains in the `CrashLoopBackOff` state, we should retrieve the logs of the runtime container:
+
+```console
+$ kubectl get pods -l function=bar
+NAME                   READY     STATUS             RESTARTS   AGE
+bar-7d458f6d7c-2gsh7   0/1       CrashLoopBackOff   7          15m
+$ kubectl logs -l function=bar
+Traceback (most recent call last):
+...
+  File "/kubeless/hello.py", line 2
+    return Hello world
+                     ^
+SyntaxError: invalid syntax
+```
+
+We can see that we have a syntax error: `return Hello world` should be replaced with `return "Hello world"`.
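For reference, a corrected `hello.py` could look like the sketch below. The function name `foo` is an assumption, since the logs only show the file name and not the handler.

```python
# hello.py (sketch): the handler name `foo` is hypothetical.
def foo(event, context):
    # The string literal must be quoted; without the quotes Python
    # raises the SyntaxError shown in the logs above.
    return "Hello world"
```

Syntax errors like this one can also be caught before deploying by compiling the file locally, for example with `python -m py_compile hello.py`.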
+
+### Function returns an "Internal Server Error"
+
+There will be cases in which the pod doesn't crash but the function returns an error:
+
+```console
+$ kubectl get pods -l function=test
+NAME                    READY     STATUS    RESTARTS   AGE
+test-6845ff45cb-6q865   1/1       Running   0          1m
+$ kubeless function call test --data '{"username": "test"}'
+ERRO[0000]
+FATA[0000] an error on the server ("Internal Server Error") has prevented the request from succeeding
+```
+
+This usually means that the function is syntactically correct but has a bug. Again, to spot the issue we should check the function logs:
+
+```console
+$ kubectl logs -l function=test
+...
+[27/Apr/2018:15:45:33 +0000] "GET /healthz HTTP/1.1" 200 2 "-" "kube-probe/."
+Function failed to execute: TypeError: Cannot read property 'name' of undefined
+    at handler (/kubeless/hello.js:3:39)
+    ...
+```
+
+We can see that it is raising an error on line 3 of our function:
+
+```js
+module.exports = {
+  handler: (event, context) => {
+    return "Hello " + event.data.user.name;
+  },
+};
+```
+
+We are trying to access the property `name` of the property `user`, but the data we sent only contains `username`, so `event.data.user` is undefined.
+
+## Conclusion
+
+These are just some tips to quickly identify what's gone wrong with a function. If, after checking the controller and function logs (or any other information that Kubernetes may provide), you are not able to spot the error, you can open an [Issue in our GitHub repository](https://github.com/kubeless/kubeless/issues) or contact us through [Slack](http://slack.k8s.io) in the #kubeless channel.
diff --git a/docs/kubeless-functions.md b/docs/kubeless-functions.md
@@ -0,0 +1,56 @@
+# Kubeless Functions
+
+Functions are the main entity in Kubeless. It is possible to write Functions in different languages but all of them share common properties like the generic interface, the default timeout or the runtime UID. In this document we explain some of these common properties and the different runtimes available in Kubeless. You can find in-depth details about the Function specification [here](/docs/advanced-function-deployment).
+
+## Functions Interface
+
+Every function receives two arguments: `event` and `context`. The first argument contains information about the source of the event that the function has received. The second contains general information about the function like its name or maximum timeout. This is a representation in YAML of a Kafka event:
+
+```yaml
+event:                                  
+  data:                                         # Event data
+    foo: "bar"                                  # The data is parsed as JSON when required
+  event-id: "2ebb072eb24264f55b3fff"            # Event ID
+  event-type: "application/json"                # Event content type
+  event-time: "2009-11-10 23:00:00 +0000 UTC"   # Timestamp of the event source
+  event-namespace: "kafkatriggers.kubeless.io"  # Event emitter
+  extensions:                                   # Optional parameters
+    request: ...                                # Reference to the request received 
+                                                # (specific properties will depend on the function language)
+context:
+    function-name: "pubsub-nodejs"
+    timeout: "180"
+    runtime: "nodejs6"
+    memory-limit: "128M"
+```
+
+Functions should return a string that will be used as the HTTP response for the caller. Some runtimes may support different types (like objects) for the returned values.
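As an illustration of the interface described above, here is a minimal Python sketch of a function that reads both arguments and returns a string. It assumes the runtime passes `event` and `context` as dictionaries keyed as in the YAML representation; the name `handler` and the sample values are only for illustration.

```python
def handler(event, context):
    """Build a response string from the event data and the function context."""
    data = event["data"]
    # `data` may already be parsed (for example a dict for JSON payloads)
    # or still be a plain string, depending on the event content type.
    value = data.get("foo") if isinstance(data, dict) else data
    return "{} received {}".format(context["function-name"], value)


# Local call with a hand-built event that mirrors the YAML above.
sample_event = {
    "data": {"foo": "bar"},
    "event-id": "2ebb072eb24264f55b3fff",
    "event-type": "application/json",
}
sample_context = {"function-name": "pubsub-nodejs", "timeout": "180"}

print(handler(sample_event, sample_context))
```

Calling the function locally with a hand-built event like this is a quick way to exercise the logic before deploying it.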
+
+You can check basic examples of every language supported in the [examples](https://github.com/kubeless/kubeless/tree/master/examples) folder.
+
+## Functions Timeout
+
+Runtimes have a maximum timeout set by the environment variable `FUNC_TIMEOUT`. This environment variable can be set using the CLI option `--timeout`. The default value is 180 seconds. If a function takes longer than that to execute, the process will be terminated.
+
+## Runtime User
+
+Through a [Security Context](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/), functions are configured to run with an unprivileged user (UID 1000) by default (except for OpenShift, where the UID is set automatically). This prevents functions from having root privileges. This default behaviour can be overridden by specifying a different Security Context in the `Deployment` template that is part of the Function Spec.
+
+## Scheduled functions
+
+It is possible to deploy functions that should be triggered following a certain schedule. To specify the execution frequency we use the [Cron](https://en.wikipedia.org/wiki/Cron) format. Every time a scheduled function is executed, a [Job](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/) is started. This Job will do an HTTP GET request to the function service and will be successful as long as the function returns 200 OK.
+
+To execute scheduled functions we use Kubernetes [CronJobs](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/) with mostly the default options, which means:
+ - If a Job fails, it won't be restarted but it will be retried at the next scheduled event. The maximum time that a Job will exist is set by the function timeout (180 seconds by default).
+ - The concurrency policy is set to `Allow`, so concurrent jobs may exist.
+ - The history limit is set to keep at most three successful jobs (and one failed job).
+
+If for some reason you want to modify one of the default values for a certain function you can execute `kubectl edit cronjob trigger-<func_name>` (where `func_name` is the name of your function) and modify the fields required. Once it is saved the CronJob will be updated.
+
+## Monitoring functions
+
+Some Kubeless runtimes expose metrics at the `/metrics` endpoint and these metrics will be collected by Prometheus. We also include a Prometheus setup in [`manifests/monitoring`](https://github.com/kubeless/kubeless/blob/master/manifests/monitoring/prometheus.yaml) to help you set it up more easily. The metrics collected are the number of calls, the number of successful and failed executions, and the time spent per call.
+
+## Runtime variants
+
+Check [this document](/docs/runtimes) to get more details about supported runtimes and languages.
diff --git a/docs/pubsub-functions.md b/docs/pubsub-functions.md
@@ -0,0 +1,141 @@
+# PubSub events
+
+You can trigger any Kubeless function by a PubSub mechanism. The PubSub function is expected to consume input messages from a predefined topic from a messaging system. Kubeless currently supports using events from Kafka and NATS messaging systems.
+
+## Kafka
+
+On the Kubeless [release page](https://github.com/kubeless/kubeless/releases), you can find the manifest to quickly deploy a collection of Kafka and Zookeeper statefulsets. If you have a Kafka cluster already running in the same Kubernetes environment, you can also deploy PubSub functions with it. Check out [this tutorial](/docs/use-existing-kafka) for more details on how to do that.
+
+To deploy the Kafka and Zookeeper manifest we provide, execute the following commands:
+
+```console
+$ export RELEASE=$(curl -s https://api.github.com/repos/kubeless/kubeless/releases/latest | grep tag_name | cut -d '"' -f 4)
+$ kubectl create -f https://github.com/kubeless/kubeless/releases/download/$RELEASE/kafka-zookeeper-$RELEASE.yaml
+```
+
+> NOTE: The Kafka statefulset uses a PVC (Persistent Volume Claim). Depending on the configuration of your cluster, you may need to provision a PV (Persistent Volume) that matches the PVC or configure dynamic storage provisioning. Otherwise the Kafka pod will fail to get scheduled. Also note that Kafka is only required for PubSub functions; you can still use HTTP-triggered functions without it. Please refer to the [PV](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) documentation on how to provision storage for a PVC.
+
+Once deployed, you can verify that the two statefulsets are up and running:
+
+```console
+$ kubectl -n kubeless get statefulset
+NAME      DESIRED   CURRENT   AGE
+kafka     1         1         40s
+zoo       1         1         42s
+
+$ kubectl -n kubeless get svc
+NAME        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
+broker      ClusterIP   None            <none>        9092/TCP            1m
+kafka       ClusterIP   10.55.250.89    <none>        9092/TCP            1m
+zoo         ClusterIP   None            <none>        9092/TCP,3888/TCP   1m
+zookeeper   ClusterIP   10.55.249.102   <none>        2181/TCP            1m
+```
+
+A function can be as simple as:
+
+```python
+def foobar(event, context):
+  print event['data']
+  return event['data']
+```
+
+Now you can deploy a pubsub function. 
+
+```console
+$ kubeless function deploy test --runtime python2.7 \
+                                --handler test.foobar \
+                                --from-file test.py
+```
+
+You need to create a _Kafka_ trigger that lets you associate a function with a topic specified by `--trigger-topic` as below:
+
+```console
+$ kubeless trigger kafka create test --function-selector created-by=kubeless,function=test --trigger-topic test-topic
+```
+
+After that you can invoke the function by publishing messages to that topic. To let you manage topics easily, `kubeless` provides the convenience command `kubeless topic`, which lets you create, delete and publish to a topic.
+
+```console
+$ kubeless topic create test-topic
+$ kubeless topic publish --topic test-topic --data "Hello World!"
+```
+
+You can check the result in the pod logs:
+
+```console
+$ kubectl logs test-695251588-cxwmc
+...
+Hello World!
+```
+
+## NATS
+
+If you do not have a NATS cluster, it is pretty easy to set one up. Run the command below to deploy a [NATS operator](https://github.com/nats-io/nats-operator):
+
+```console
+$ kubectl apply -f https://raw.githubusercontent.com/nats-io/nats-operator/master/example/deployment-rbac.yaml
+```
+
+Once the NATS operator is up and running, run the command below to deploy a NATS cluster:
+
+```console
+echo '
+apiVersion: "nats.io/v1alpha2"
+kind: "NatsCluster"
+metadata:
+  name: "nats"
+spec:
+  size: 3
+  version: "1.1.0"
+' | kubectl apply -f - -n nats-io
+```
+
+The above command creates the NATS cluster IP service `nats.nats-io.svc.cluster.local:4222`, which is the default URL that the Kubeless NATS trigger controller expects.
+
+Now use this manifest to deploy the Kubeless NATS trigger controller.
+
+```console
+kubectl create -f https://github.com/kubeless/kubeless/releases/download/$RELEASE/nats-$RELEASE.yaml
+```
+
+By default the NATS trigger controller expects the NATS cluster to be available as the Kubernetes cluster service `nats.nats-io.svc.cluster.local:4222`. You can override the default NATS cluster URL by setting the environment variable `NATS_URL` in the manifest. Once the NATS trigger controller is set up, you can deploy the function and associate it with a topic on the NATS cluster.
+
+```console
+$ kubeless function deploy pubsub-python-nats --runtime python2.7 \
+                                --handler test.foobar \
+                                --from-file test.py
+```
+
+After the function is deployed you can use the `kubeless trigger nats` CLI command to associate the function with a topic on the NATS cluster as below.
+
+```console
+$ kubeless trigger nats create pubsub-python-nats --function-selector created-by=kubeless,function=pubsub-python-nats --trigger-topic test
+```
+
+At this point you are all set to try Kubeless NATS triggers.
+
+You can quickly test the functionality by publishing a message to the topic and verifying that the message is seen by the pod running the function.
+
+```console
+$ kubeless trigger nats publish --url nats://nats-server-ip:4222 --topic test --message "Hello World!"
+```
+
+You can check the result in the pod logs:
+
+```console
+$ kubectl logs pubsub-python-nats-5b9c849fc-tvq2l
+...
+Hello World!
+```
+
+## Other commands
+
+You can create, list and delete PubSub topics (for Kafka):
+
+```console
+$ kubeless topic create another-topic
+Created topic "another-topic".
+
+$ kubeless topic delete another-topic
+
+$ kubeless topic ls
+```