[Storage] Deal with RBAC replication in live tests. #20559

Closed

Contributor

@kasobol-msft kasobol-msft commented Apr 20, 2021

This addresses #17384.

The idea is to annotate problematic tests with an attribute that keeps retrying on AuthorizationPermissionMismatch.

Before this I tried to use this:

    protected async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        Func<RequestFailedException, bool> shouldRetry,
        int retryDelay = TestConstants.RetryDelay,
        int retryAttempts = Constants.MaxReliabilityRetries) =>
        await RetryAsync(Mode, operation, shouldRetry, retryDelay, retryAttempts);

    public static async Task<T> RetryAsync<T>(
        RecordedTestMode mode,
        Func<Task<T>> operation,
        Func<RequestFailedException, bool> shouldRetry,
        int retryDelay = TestConstants.RetryDelay,
        int retryAttempts = Constants.MaxReliabilityRetries)
    {

However, the amount of code involved made me cry, and we'll have more tests that depend on OAuth in the future, so I'm looking for something more concise.

The scale of the problem can be seen in https://dev.azure.com/azure-sdk/internal/_build/results?buildId=849732&view=ms.vss-test-web.build-test-results-tab&runId=18450850&resultId=105746&paneView=debug.
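A minimal sketch of the retry idea described above, stripped of NUnit: rerun an operation while it fails with the AuthorizationPermissionMismatch error code, and let every other failure surface immediately. `FakeRequestFailedException` is a hypothetical stand-in for `RequestFailedException` so the block compiles without the Azure SDK; `RbacRetry`, the attempt count, and the delay are assumptions for illustration, not the code in this PR.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for RequestFailedException (keeps the sketch dependency-free).
public class FakeRequestFailedException : Exception
{
    public string ErrorCode { get; }

    public FakeRequestFailedException(string errorCode) : base(errorCode) => ErrorCode = errorCode;
}

public static class RbacRetry
{
    public const string PermissionMismatch = "AuthorizationPermissionMismatch";

    // Runs the operation, retrying only while it fails with the RBAC mismatch code.
    // Any other exception, or a mismatch on the final attempt, propagates to the caller.
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 5,
        TimeSpan? delay = null)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (FakeRequestFailedException ex)
                when (ex.ErrorCode == PermissionMismatch && attempt < maxAttempts)
            {
                // Give role assignment replication some time before the next attempt.
                await Task.Delay(delay ?? TimeSpan.FromSeconds(1));
            }
        }
    }
}
```

Keeping the predicate fixed to a single error code, rather than accepting an arbitrary `shouldRetry` delegate, is a deliberate narrowing in this sketch so that genuine test failures are not hidden by retries.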


@ghost ghost added the Storage Storage Service (Queues, Blobs, Files) label Apr 20, 2021

    namespace Azure.Storage.Tests.Shared
    {
        [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
        public class RetryOnFailedRequestAttribute : NUnitAttribute, IRepeatTest
Contributor

I'm afraid of attributes like this :) Can we just have the RetryOnAuthorizationPermissionMismatch and avoid giving people a gun?

            return context.CurrentResult;
        }

        private bool ShouldRetry(TestResult testResult)
Contributor Author

this is borrowed from

private static bool IsTestFailedWithRecordingMismatch(TestExecutionContext context)

Contributor

pakrym commented Apr 21, 2021

I wonder if this is something we can address in our live resource deployment framework: add a post-deploy script that tries to access the resource until it's available.

Contributor

pakrym commented Apr 21, 2021

Yeah, I think it's a common enough problem that it should be solved outside the storage project.

Contributor

pakrym commented Apr 21, 2021

Filed Azure/azure-sdk-tools#1567

@kasobol-msft
Contributor Author

@pakrym lol, looks like my old workaround got copied, see:

    # Wait until RBAC replicates. It has 5min SLA. https://github.com/Azure/azure-sdk-for-net/issues/17384 to find better solution.
    Write-Verbose "Sleeping for 90 seconds to let RBAC replicate"
    Start-Sleep -s 90

Here are a few considerations:

  • Unfortunately, there isn't a good way to check whether replication has finished. The only way is to execute an arbitrary scenario and check whether it works, which is going to be a pain in PowerShell. A small dotnet test suite with a handful of tests could be an option, so we can leverage the test setup.
  • There's plenty of time between the ARM template execution and the moment the problematic tests run (DevOps switching between tasks, doing checkouts, builds and so on, plus other tests running as part of the run). It would be a waste of time to wait for replication eagerly. This is the primary motivation for moving the wait from the ps1 to the tests.
  • Maybe we can have something like
        [SetUpFixture]
        public class AzuriteNUnitFixture
        {
            public static AzuriteFixture Instance { get; private set; }

            [OneTimeSetUp]
            public void SetUp()
    where we block and wait for some "oauth" scenario to succeed.
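
The blocking wait in the last bullet might look roughly like the sketch below: a one-time gate that keeps running a cheap OAuth-authenticated "sampling" call until it succeeds or a deadline passes. In a real suite it would be called from a `[OneTimeSetUp]` method. `ReplicationGate`, `WaitAsync`, and the probe delegate are hypothetical names, and the 5-minute default only mirrors the SLA quoted in the script comment above.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ReplicationGate
{
    // Polls the probe until it returns true, or throws TimeoutException once
    // another poll interval would exceed the overall timeout.
    public static async Task WaitAsync(
        Func<Task<bool>> probe,
        TimeSpan? timeout = null,
        TimeSpan? pollInterval = null)
    {
        TimeSpan limit = timeout ?? TimeSpan.FromMinutes(5);        // assumed default
        TimeSpan interval = pollInterval ?? TimeSpan.FromSeconds(10); // assumed default
        Stopwatch elapsed = Stopwatch.StartNew();

        while (true)
        {
            // The probe is expected to catch the permission error itself and return false.
            if (await probe())
            {
                return;
            }

            if (elapsed.Elapsed + interval > limit)
            {
                throw new TimeoutException($"RBAC assignment not visible after {limit}.");
            }

            await Task.Delay(interval);
        }
    }
}
```

Because the gate runs lazily when the tests start, any time already spent on checkouts and builds counts toward replication, unlike a fixed sleep in the deployment script.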

Contributor

pakrym commented Apr 21, 2021

Yeah, one conclusion might be that we put something in the test framework and maybe add a method to the TestEnvironment to verify permissions that would get executed before the assembly runs.

@kasobol-msft
Contributor Author

> Yeah, one conclusion might be that we put something in the test framework and maybe add a method to the TestEnvironment to verify permissions that would get executed before the assembly runs.

It should be something where the test assembly can provide a "sampling scenario" to the framework. Unfortunately, each service will need a different one, and the easiest way to code it is with a service client in hand.

I'm going to try to build a simple SetUpFixture in the storage suite as an interim solution for us.

Contributor

pakrym commented Apr 21, 2021

> add a method to the TestEnvironment

Sorry, I meant a virtual placeholder that libraries can choose to override with their sampling function.
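
One way to read the "virtual placeholder" suggestion, as a sketch only: the shared base class exposes a virtual readiness check that defaults to "ready", and a library overrides it with its own sampling call. `TestEnvironmentSketch`, `IsEnvironmentReadyAsync`, `WaitForEnvironmentAsync`, and `StorageTestEnvironmentSketch` are hypothetical names, not members of the actual test framework.

```csharp
using System;
using System.Threading.Tasks;

public abstract class TestEnvironmentSketch
{
    // Default: libraries without replication lag are considered ready immediately.
    protected virtual Task<bool> IsEnvironmentReadyAsync() => Task.FromResult(true);

    // Would be called once by the framework before tests run; polls the library's check.
    public async Task WaitForEnvironmentAsync(int maxAttempts = 30, TimeSpan? delay = null)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await IsEnvironmentReadyAsync())
            {
                return;
            }

            if (attempt < maxAttempts)
            {
                await Task.Delay(delay ?? TimeSpan.FromSeconds(10));
            }
        }

        throw new InvalidOperationException("Environment did not become ready.");
    }
}

// A library-specific environment plugs in its own sampling scenario.
public class StorageTestEnvironmentSketch : TestEnvironmentSketch
{
    private readonly Func<Task<bool>> _samplingScenario;

    public StorageTestEnvironmentSketch(Func<Task<bool>> samplingScenario) =>
        _samplingScenario = samplingScenario;

    protected override Task<bool> IsEnvironmentReadyAsync() => _samplingScenario();
}
```

The sampling scenario is injected as a delegate here only to keep the sketch free of service clients; a real override would presumably build a client from the environment's credentials and attempt one OAuth-authenticated call.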

Contributor

pakrym commented Apr 21, 2021

We might not even need a fixture, as we can call the method from RecordedTestBase.SetUp.

Labels
Storage Storage Service (Queues, Blobs, Files)

2 participants