[Storage] Deal with RBAC replication in live tests. #20559

kasobol-msft · 2021-04-20T23:53:55Z

This is to address #17384.

The idea is to annotate problematic tests with an attribute that keeps retrying the test when it fails with AuthorizationPermissionMismatch.

Before this I tried to use this:

azure-sdk-for-net/sdk/storage/Azure.Storage.Common/tests/Shared/StorageTestBase.cs

Lines 415 to 428 in bf4b77e

    
    protected async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        Func<RequestFailedException, bool> shouldRetry,
        int retryDelay = TestConstants.RetryDelay,
        int retryAttempts = Constants.MaxReliabilityRetries) =>
        await RetryAsync(Mode, operation, shouldRetry, retryDelay, retryAttempts);

    public static async Task<T> RetryAsync<T>(
        RecordedTestMode mode,
        Func<Task<T>> operation,
        Func<RequestFailedException, bool> shouldRetry,
        int retryDelay = TestConstants.RetryDelay,
        int retryAttempts = Constants.MaxReliabilityRetries)
    {

However, the amount of code involved made me cry... and we'll have more tests that depend on OAuth in the future, so I'm looking for something more concise.

Scale of the problem can be seen in https://dev.azure.com/azure-sdk/internal/_build/results?buildId=849732&view=ms.vss-test-web.build-test-results-tab&runId=18450850&resultId=105746&paneView=debug .
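
For readers without the repository at hand, here is a minimal sketch of what a predicate-based retry helper in the spirit of the RetryAsync excerpt above might look like. It is not the code from StorageTestBase.cs (that excerpt is cut off before the method body); `FakeRequestFailedException` and `RetrySketch` are hypothetical stand-ins so the example compiles without the Azure SDK, and the delay and attempt count are plain parameters rather than the real constants.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for Azure.RequestFailedException (assumption, not the SDK type).
public class FakeRequestFailedException : Exception
{
    public string ErrorCode { get; }

    public FakeRequestFailedException(string errorCode) : base(errorCode)
    {
        ErrorCode = errorCode;
    }
}

public static class RetrySketch
{
    // Runs `operation`, retrying while `shouldRetry` accepts the failure
    // and fewer than `retryAttempts` attempts have been made.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        Func<FakeRequestFailedException, bool> shouldRetry,
        int retryDelayMs,
        int retryAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (FakeRequestFailedException ex)
                when (attempt < retryAttempts && shouldRetry(ex))
            {
                // Give RBAC replication a moment before trying again.
                await Task.Delay(retryDelayMs);
            }
        }
    }
}
```

With a helper like this, every OAuth-dependent call in every affected test has to be wrapped individually, with a predicate such as `ex => ex.ErrorCode == "AuthorizationPermissionMismatch"`, which is where the verbosity comes from.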

kasobol-msft · 2021-04-20T23:54:28Z

sdk/storage/Azure.Storage.Common/tests/Shared/RetryOnFailedRequestAttribute.cs

+namespace Azure.Storage.Tests.Shared
+{
+    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
+    public class RetryOnFailedRequestAttribute : NUnitAttribute, IRepeatTest


borrowed from https://github.com/nunit/nunit/blob/b34eba3ac1aa6957157857bddd116256c634afab/src/NUnitFramework/framework/Attributes/RetryAttribute.cs#L17 .

I'm afraid of attributes like this :) Can we just have the RetryOnAuthorizationPermissionMismatch and avoid giving people a gun?

kasobol-msft · 2021-04-20T23:55:37Z

sdk/storage/Azure.Storage.Common/tests/Shared/RetryOnFailedRequestAttribute.cs

+                return context.CurrentResult;
+            }
+
+            private bool ShouldRetry(TestResult testResult)


this is borrowed from

azure-sdk-for-net/sdk/core/Azure.Core.TestFramework/src/RecordedTestAttribute.cs

Line 73 in bf4b77e

private static bool IsTestFailedWithRecordingMismatch(TestExecutionContext context)

pakrym · 2021-04-21T00:01:24Z

Wonder if this is something we can address in our live resource deployment framework. Add a post-deploy script that would try to access the resource until it's available.

pakrym · 2021-04-21T00:02:41Z

We already have a bad version of it https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/eventhub/test-resources-post.ps1#L15-L16 :)

pakrym · 2021-04-21T00:03:12Z

Yeah, I think it's a common enough problem that it should be solved outside the storage project.

pakrym · 2021-04-21T00:08:07Z

Filed Azure/azure-sdk-tools#1567

kasobol-msft · 2021-04-21T00:11:50Z

@pakrym lol, looks like my old workaround got copied, see

azure-sdk-for-net/sdk/storage/test-resources-post.ps1

Lines 177 to 179 in bf4b77e

    
    # Wait until RBAC replicates. It has 5min SLA. https://github.com/Azure/azure-sdk-for-net/issues/17384 to find better solution.
    Write-Verbose "Sleeping for 90 seconds to let RBAC replicate"
    Start-Sleep -s 90

Here are a few considerations:

- Unfortunately, there isn't a good way to check whether replication has finished. The only way is to execute an arbitrary scenario and check whether it works, which is going to be a pain in PowerShell. A small dotnet test suite with a handful of tests could be an option, so that we can leverage the test setup.
- There's plenty of time between the ARM template execution and the moment the problematic tests run (the time DevOps spends switching between tasks, doing checkouts and builds, and running other tests in the same run). Waiting eagerly for replication to complete would waste that time. This is the primary motivation for moving the wait from the ps1 script into the tests.

Maybe we can have something like

azure-sdk-for-net/sdk/storage/Azure.Storage.Common/tests/Shared/AzuriteNUnitFixture.cs

Lines 11 to 17 in bf4b77e

    
    [SetUpFixture]
    public class AzuriteNUnitFixture
    {
        public static AzuriteFixture Instance { get; private set; }

        [OneTimeSetUp]
        public void SetUp()

where we block and wait for some "oauth" scenario to succeed.
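
A rough sketch of the "block and wait" part, under assumptions: `RbacWaitSketch` and `WaitUntilAsync` are hypothetical names, and the scenario is passed in as a delegate that returns true once an OAuth-authenticated call succeeds. How such a scenario detects AuthorizationPermissionMismatch against a real service is left out here.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class RbacWaitSketch
{
    // Polls `scenario` until it reports success or `timeout` has elapsed.
    // Returns true on success and false if the timeout was reached first.
    public static async Task<bool> WaitUntilAsync(
        Func<Task<bool>> scenario,
        TimeSpan timeout,
        TimeSpan pollInterval)
    {
        var stopwatch = Stopwatch.StartNew();
        while (true)
        {
            if (await scenario())
            {
                return true;
            }

            if (stopwatch.Elapsed >= timeout)
            {
                return false;
            }

            await Task.Delay(pollInterval);
        }
    }
}
```

A one-time setup method could call this once per assembly with a timeout near the 5 minute SLA mentioned in the script comment above, so that only the first OAuth-dependent run pays for whatever replication time is still outstanding.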

pakrym · 2021-04-21T00:13:21Z

Yeah, one conclusion might be that we put something in the test framework and maybe add a method to the TestEnvironment to verify permissions that would get executed before the assembly runs.

kasobol-msft · 2021-04-21T00:23:00Z

> Yeah, one conclusion might be that we put something in the test framework and maybe add a method to the TestEnvironment to verify permissions that would get executed before the assembly runs.

It should be something where the test assembly can provide a "sampling scenario" to the framework. Unfortunately, the scenario will be different for each service, and the easiest way to code it is with a service client in hand.

I'm going to try to build a simple SetUpFixture in the storage suite as an interim solution for us.

pakrym · 2021-04-21T00:29:15Z

> add a method to the TestEnvironment

Sorry, I meant a virtual placeholder that libraries can choose to override with their sampling function.
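
One possible shape for such a placeholder, as a sketch only: `TestEnvironmentSketch`, `IsEnvironmentReadyAsync`, and `WaitForEnvironmentAsync` are hypothetical names and not the actual test framework API. The base class assumes the environment is ready, and a library overrides the hook with its own sampling function.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical base type; the real TestEnvironment may look different.
public abstract class TestEnvironmentSketch
{
    // Default placeholder: libraries that need no check inherit "ready".
    protected virtual Task<bool> IsEnvironmentReadyAsync() => Task.FromResult(true);

    // The framework would call this once before tests run.
    public async Task WaitForEnvironmentAsync(int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await IsEnvironmentReadyAsync())
            {
                return;
            }

            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }

        throw new InvalidOperationException("Environment did not become ready in time.");
    }
}

// Hypothetical library-side override with its own sampling function.
public sealed class StorageTestEnvironmentSketch : TestEnvironmentSketch
{
    public int Probes { get; private set; }

    protected override Task<bool> IsEnvironmentReadyAsync()
    {
        // Placeholder for a real OAuth-authenticated call; in this sketch
        // the third probe is the first one that reports success.
        Probes++;
        return Task.FromResult(Probes >= 3);
    }
}
```

Keeping the hook virtual rather than abstract means libraries without an RBAC dependency do not have to change anything.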

pakrym · 2021-04-21T00:34:51Z

We might not even need a fixture, as we can call the method from RecordedTestBase.SetUp.

kasobol-msft added 3 commits April 20, 2021 12:37

fix flaky test.

babed3f

draft.

da574be

tweak.

91f8828

ghost added the Storage Storage Service (Queues, Blobs, Files) label Apr 20, 2021

kasobol-msft added 2 commits April 20, 2021 16:55

remove that...

d3d3f95

Merge remote-tracking branch 'upstream/master' into deal-with-rbac-replication-lag

dbd30fa

kasobol-msft requested review from seanmcc-msft, amnguye and pakrym April 20, 2021 23:57

pakrym mentioned this pull request Apr 21, 2021

Provide a pattern to wait for RBAC propagation Azure/azure-sdk-tools#1567

Open

kasobol-msft mentioned this pull request Apr 21, 2021

[Storage][Core] Wait for environment to become eventually consistent. #20574

Merged

kasobol-msft closed this Apr 21, 2021

kasobol-msft deleted the deal-with-rbac-replication-lag branch April 21, 2021 19:28
