[skip ci] Add ROBO test plans #7297
Conversation
Force-pushed from d421c42 to a352d6d
Force-pushed from 9e6d7f8 to 6e5edeb
Force-pushed from 6e5edeb to 36bf9bc
Force-pushed from 36bf9bc to ef97184
lgtm - just one small non-blocking comment, take it or leave it
* All steps should succeed.
* In Step 2, the VCH should be placed on a host that satisfies the license and other feature requirements.
* In Steps 3 and 6, containers shouldn't fail to be created/started unless the cluster resources/limits are exhausted.
* In Steps 4 and 7, containers should be placed according to the criteria defined in [Purpose](#purpose). More details are TBD.
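A minimal sketch of the placement expectation in Steps 2, 4 and 7 — purely illustrative, since the actual placement criteria are still TBD; the host model and field names here are assumptions, not VIC internals:

```python
# Sketch: place a containerVM on a host that satisfies the license/feature
# requirements and has enough free resources; fail only when exhausted.
# The host dicts and the "most free memory" tie-breaker are assumptions.

def place(hosts: list[dict], required_mb: int):
    """Return the name of the chosen host, or None if placement must fail."""
    eligible = [
        h for h in hosts
        if h["license_ok"] and h["free_mb"] >= required_mb
    ]
    if not eligible:
        return None  # Steps 3/6: creation fails only when resources exhausted
    return max(eligible, key=lambda h: h["free_mb"])["name"]

hosts = [
    {"name": "esx1", "license_ok": True,  "free_mb": 4096},
    {"name": "esx2", "license_ok": True,  "free_mb": 8192},
    {"name": "esx3", "license_ok": False, "free_mb": 16384},  # wrong SKU
]
print(place(hosts, 2048))  # esx2: most free memory among licensed hosts
print(place(hosts, 9000))  # None: no licensed host has enough free memory
```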
I would actually number the references and refer here to the 2nd ref, not the purpose
Force-pushed from ef97184 to a71fd90
@@ -0,0 +1,36 @@
Test 19-3 - ROBO - VM Placement
We should be able to control host resource consumption and test with that in mind. For example, deploy `progrium/stress` containerVMs that will consume resources in a predictable manner - once the hosts are running at a known level of consumption, deploy test containerVMs (ubuntu, busybox, etc.) and ensure they are placed on the host that we know has the necessary free resources.
Additionally, we need a negative test -- i.e. all hosts are consumed, so deployment fails.
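The scenario above can be sketched as a small model — illustrative only; a real run would drive `docker run progrium/stress` against the VCH, and all sizes and host names here are assumptions:

```python
# Sketch: drive hosts to a known consumption level with stress containerVMs,
# check that a test containerVM lands on the host with free resources, and
# that deployment fails once every host is exhausted (the negative test).

def free_mb(host):
    return host["capacity_mb"] - host["used_mb"]

def deploy(hosts, size_mb):
    """Place on the host with the most free memory; None == deployment fails."""
    best = max(hosts, key=free_mb)
    if free_mb(best) < size_mb:
        return None  # negative test: all hosts consumed
    best["used_mb"] += size_mb
    return best["name"]

hosts = [
    {"name": "esx1", "capacity_mb": 8192, "used_mb": 7168},  # stressed
    {"name": "esx2", "capacity_mb": 8192, "used_mb": 2048},  # mostly free
]
print(deploy(hosts, 1024))  # esx2 has the necessary free resources
for _ in range(6):          # exhaust the cluster with more stress cVMs
    deploy(hosts, 1024)
print(deploy(hosts, 1024))  # None: all hosts consumed, deployment fails
```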
Good call, I'll add these details.
Fixed.
Force-pushed from a71fd90 to 63ba0fb
This is a solid start - as we discussed via Slack we'll need to address the
@@ -15,10 +15,10 @@ This test requires access to VMware Nimbus cluster for dynamic ESXi and vCenter
2. Add the ROBO SKU license to the vCenter appliance
3. Assign the ROBO SKU license to each of the hosts within the vCenter
4. Install the VIC appliance onto one of the hosts in the vCenter
Think this should be explicit about standalone hosts, single host clusters, or multi-host clusters.
# Purpose:
To verify that the total container VM limit feature works as expected in a vSphere ROBO Advanced environment.
This configuration option should not be unique to ROBO - I suggest the main body of this test be in the CI test buckets and this test, in the robo specific setting, uses it in the same way the previous robo test references the regression tests.
I brought this up with @mhagen-vmware since the container VM limit feature doesn't apply just to ROBO, but to vic-machine in general. This pull request is intended only for ROBO-focused test plans. A CI test plan for this would come when closing #6529, for example.
I've updated #6529's acceptance criteria.
I'm not sure I understand this response... but I'm not going to push it at this time. We'll simply end up committing the tests directly into the CI test suites and refactoring this test runbook to reference those.
# Test Steps:
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on vCenter with a container VM limit of y
Which cluster are we installing to? Is it random, round robin, each in turn?
We could pick any cluster to install into, unless we want to run this test in each cluster configuration, in which case I can note that in the Environment section above.
I'll make this line specific.
13. Delete/stop some containers so the current container VM count is lower than the set limit
14. Attempt to create/run more containers until the set limit
15. Delete the VCH
Suggest we also try:
- running run/delete in series to ensure that we're only enforcing concurrently running limits
- creating more than y cVMs but not starting them (assuming the enforcement is for running cVMs for the first drop)
- starting and deleting containers at the same time so we stay close to the limit but are bouncing off it, and ensure after we stop the concurrent operations that we can still hit the limit and are not over it - this tests the concurrency of the bound tracking.
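The concurrent bound tracking being described could be sketched as follows — a hypothetical VCH-side counter, not actual VIC code; the class and names are illustrative assumptions:

```python
# Sketch: a thread-safe running-cVM counter enforcing the concurrent limit y.
# Threads create/delete containers while bouncing off the limit; afterwards
# we verify we can still hit the limit exactly and are not over it.
import threading

class RunningLimit:
    def __init__(self, limit):
        self.limit = limit
        self.running = 0
        self.high_water = 0
        self.lock = threading.Lock()

    def start(self):
        with self.lock:
            if self.running >= self.limit:
                return False  # enforcement is on *running* cVMs only
            self.running += 1
            self.high_water = max(self.high_water, self.running)
            return True

    def stop(self):
        with self.lock:
            self.running -= 1

def churn(bound, n):
    # Start and immediately delete, staying close to the limit.
    for _ in range(n):
        if bound.start():
            bound.stop()

y = 4
bound = RunningLimit(y)
threads = [threading.Thread(target=churn, args=(bound, 1000)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
# After the concurrent churn we can still hit the limit and are not over it.
started = sum(bound.start() for _ in range(y + 1))
print(started)                # y starts succeed, the (y+1)th is rejected
print(bound.high_water <= y)  # True: the bound was never exceeded
```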
Fixed - thanks for clarifying!
# Test Steps:
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on vCenter
Which cluster/clusters? I'm assuming one of the multi-host clusters.
Correct - one of the multi/single-host clusters. I'll make it specific.
3. Deploy containers that will consume resources predictably (e.g. the `progrium/stress` image)
4. Measure ESX host metrics and gather resource consumption
5. Create and run regular containers such as `busybox`
6. Create and run enough containers to consume all available host resources
now I'm thinking single host cluster... and this test is focused on testing the resource exhaustion case.
But in that case I'd expect another test that ensures we can consume the entire cluster resource.
My intent for this test is to target a particular cluster (multi-host or otherwise) - I'll make the steps clearer and specific.
@@ -15,10 +15,10 @@ This test requires access to VMware Nimbus cluster for dynamic ESXi and vCenter
2. Add the ROBO SKU license to the vCenter appliance
I believe the environment of primary interest has an Enterprise license for VC but ROBO for the hosts.
Thanks - I'm changing this line to `Add the Enterprise license to the vCenter appliance`. cc @mhagen-vmware since he wrote this particular test.
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on vCenter
3. Visit the VCH Admin page and verify that the License and Feature Status sections show that required license and features are present
4. Assign a more restrictive license such as ROBO (unadvanced) or Standard that does not have the required features (VDS, VSPC) to vCenter
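The license check in Steps 3-4 amounts to a feature-set containment test. A hedged sketch — the feature names come from this review, but the check itself and its output format are illustrative assumptions, not the VCH Admin implementation:

```python
# Sketch: verify the assigned license carries the features VIC requires.
# REQUIRED reflects the features named in this review (VDS, VSPC).
REQUIRED = {"VDS", "VSPC"}

def license_status(license_features: set) -> str:
    """Return "OK" or a list of missing required features."""
    missing = REQUIRED - license_features
    return "OK" if not missing else "missing: " + ", ".join(sorted(missing))

print(license_status({"VDS", "VSPC", "DRS"}))  # OK (ROBO Advanced-like SKU)
print(license_status({"VDS"}))                 # missing: VSPC (restrictive SKU)
```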
`ROBO (unadvanced)` → `ROBO Standard`
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on vCenter
3. Create and start some container services such as nginx, wordpress or a database
4. Run a containerized application with docker-compose
I would like this to be an explicitly multi-container application so we exercise bridge communications, etc.
2. Install the VIC appliance on vCenter
3. Create and start some container services such as nginx, wordpress or a database
4. Run a containerized application with docker-compose
5. For each ESXi host that hosts containerVM(s), disconnect it from vCenter
If we're emulating WAN link outage then it should be all hosts in the cluster.
We should also ensure that the hosts can continue to talk to each other.
I'm unsure what is meant by "disconnect" - it needs to be unexpected from both the VC and ESX side so that we don't end up with polite behaviours.
Might be possible with firewall rules on ESX or in Nimbus. Alternatively, if the VC is addressed via a separate network than the other hosts in the cluster.
Force-pushed from 63ba0fb to 4759438
# Test Steps:
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on a particular (multi/single-host) cluster on vCenter
The use of a single host cluster for this test would be:
a. to ensure we deal cleanly with that scenario
b. to check resource exhaustion behaviour in a simple setting.
We must test in a multi-host cluster as that is what this placement logic is expressly for. These should be pulled out as two distinct variants of the test rather than incidentally noted in the current manner.
It's okay to leave comments in for the multi-host test saying we're not sure of the exact behaviour at this time, as it will be contingent on the algorithm design for the placement, but we should note that we are able to reach whatever cluster utilization level we would expect from the cVM size, cluster capacity and placement logic.
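The expected utilization level mentioned above can be derived from cVM size and cluster capacity alone. A sketch with hypothetical numbers — the real figures depend on the testbed sizing and the eventual placement algorithm:

```python
# Sketch: compute the utilization level we would expect to reach before
# placement starts failing. A cVM cannot span hosts, so per-host remainders
# are stranded capacity. All numbers here are illustrative assumptions.

def expected_utilization(cvm_memory_mb: int, host_capacity_mb: list) -> float:
    """Fraction of total cluster memory consumable by whole cVMs."""
    total = sum(host_capacity_mb)
    placeable = sum((cap // cvm_memory_mb) * cvm_memory_mb
                    for cap in host_capacity_mb)
    return placeable / total

# Three 8 GB hosts, 2 GB cVMs: each host fits 4 cVMs exactly.
print(expected_utilization(2048, [8192, 8192, 8192]))  # 1.0
# 3 GB cVMs: 2 per host, stranding 2 GB on each 8 GB host.
print(expected_utilization(3072, [8192, 8192, 8192]))  # 0.75
```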
2. Install the VIC appliance on vCenter
3. Visit the VCH Admin page and verify that the License and Feature Status sections show that required license and features are present
4. Assign a more restrictive license such as ROBO Standard or Standard that does not have the required features (VDS, VSPC) to vCenter
5. Assign the above license to each of the hosts within the vCenter host
`vCenter host` → `vCenter cluster`
We will also need to decide what we will do if not all hosts in the cluster are licensed with ROBO advanced. Do we place only onto the hosts that have met the requirements (which is kind of what we do with incompletely connected cluster datastores and networks via DRS currently) or do we refuse to install in a heterogeneous cluster. @cgtexmex ?
# Test Steps:
1. Deploy a ROBO Advanced vCenter testbed for both environments above
2. Install the VIC appliance on vCenter
`Install the VIC appliance on vCenter` → `Install a VCH in a cluster`
I'm going to be increasingly pedantic about word usage going forwards, as part of my responsibilities to ensure clear/concise communication. This same comment applies to other uses of this term - VIC appliance is an overloaded term at this time, for example I think you're talking about a VCH here, but you may well be talking about testing the VIC appliance with Harbor/Admiral across a WAN link.
Harbor/Admiral over WAN is a good test and we should likely add a section for it, even if it's a statement that we're explicitly not testing that facet at this time, @cgtexmex.
> VIC appliance is an overloaded term at this time

Agree completely. I'm replacing all occurrences of `VIC appliance` with `VCH` in this change. For this particular test, I'll add some steps to test Harbor/Admiral with Engine through the VIC appliance.
Force-pushed from ad3e4dd to 6d3cfa2
This commit adds test plans for the ROBO support features in a new directory (Group19-ROBO) under manual test cases. The existing ROBO-SKU test has been moved into this directory. The test plans include tests for the container limit feature, placement without DRS, the license/feature checks and WAN connectivity. Fixes vmware#7294
Force-pushed from 6d3cfa2 to 3d6969b