Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dynamic impermission user V2 #17305

Open
wants to merge 4 commits into
base: master-2.x
Choose a base branch
from

Conversation

qian0817
Copy link
Contributor

What changes are proposed in this pull request?

Fix #17295.
Another implementation of #17296.
Need to wait for #17301 merge

Why are the changes needed?

support dynamic refresh impersonation user.

Does this PR introduce any user facing changes?

@alluxio-bot
Copy link
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word must be capitalized
      • Title must not end with punctuation
      • First word of title ("feat:") is not an imperative verb. Please use one of the valid words

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

@qian0817 qian0817 changed the title feat: support dynamic impermission user v2. Support dynamic impermission user V2 Apr 23, 2023
@alluxio-bot
Copy link
Contributor

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

All checks passed!

Copy link
Contributor

@maobaolong maobaolong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better than v1 version, left some comment.

private void loadImpersonationUser(AlluxioConfiguration conf) {
Map<String, Set<String>> impersonationGroups = new HashMap<>();
Map<String, Set<String>> impersonationUsers = new HashMap<>();
for (PropertyKey key : conf.keySet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just concern to the effect of old implementation, we have to iterate all PropertyKey to find the template property, although the updated propertykeys are not them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, there are approximately 1000 property keys in Alluxio. Traversing them does not significantly affect performance. However, it would be more effective to operate based on the PropertyKey that changes with modifications.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For current implementation, we can have two idea to reduce waste loop templete key improve the performance

  • Improve the update configuration and reconfig framework, support fire change and listen changed properties only.
  • Improve the propertykey framework to support index template keys, so we can get the template key efficiently.

@jja725
Copy link
Contributor

jja725 commented Apr 24, 2023

@HelloHorizon fyi

jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request May 17, 2023
…7305: Reduce time cost of checking bucket path in S3 API

- I create `BUCKET_PATH_CACHE` as a static object to
`S3RestHandlerService`.

- I create property key `PROXY_S3_BUCKETPATHCACHE_TIMEOUT_MS` to
determine how long each entry should be automatically removed from the
cache.

- I create function `checkPathIsAlluxioDirectory(FileSystem,
String,@nullable S3AuditContext, Cache<AlluxioURI, Boolean>
bucketPathCache)` ,which plays the same role as
`checkPathIsAlluxioDirectory(FileSystem, String,@nullable
S3AuditContext)` but more efficient.

- I change S3RestHandlerService.java to call
`checkPathIsAlluxioDirectory` with argument `BUCKET_PATH_CACHE`.

- I add boolean value `checkS3BucketPath` as a field of
`CreateFilePOptions` and `CreateDirectoryPOptions`. This flag is used in
`FileSystemMasterClientServiceHandler` to check the bucket path existent
if `checkS3BucketPath` is `TRUE`.

1. `BUCKET_PATH_CACHE` highly improves the efficiency while creating
objects.
For example:
Set up an Alluxio cluster with 1 master(aws ec2 m5.2xlarge) and 2
workers(aws ec2 m5.4xlarge);
benchmark S3 API with warp cmd.
```
WARP_ACCESS_KEY=minioadmin WARP_SECRET_KEY=minioadmin ./warp put
--concurrent=80 --obj.size=64KiB --host=cluster0120-masters-1
--disable-multipart --insecure --md5 --duration=3m
```
Benchmark result is like:

|PUT operation|before|after|
|-|-|-|
| Average Throughput |48.49 MiB/s, 775.82 obj/s|59.66 MiB/s, 954.57
obj/s|
|Time percentage:
checkPathIsAlluxioDirectory()/createObjectOrUploadPart()|7.70%|0.50%|
|Time percentage: checkPathIsAlluxioDirectory()/total_time|3.31%|0.21%|

2. Users can modify `alluxio.proxy.s3.bucketpathcache.timeout` in
`conf/alluxio-site.properties` to change the time period of cache
expiration.

Users can modify `alluxio.proxy.s3.bucketpathcache.timeout` in
`conf/alluxio-site.properties` to change the time period of cache
expiration. The default value is `1min`. Users can set it `0min` to
close the function.
|Property Name|Default|Description|
|-|-|-|
|alluxio.proxy.s3.bucketpathcache.timeout|1min| Expire bucket path
statistics in cache for this time period. Set `0min` to disable the
cache.|

> Note: `create file` and `create bucket` operations are always accurate
, but other operations bring incosistent results because of the cache to
bucket path. All in all, alluxio makes sure that every modification is
cosistent and correct.

pr-link: Alluxio#16806
change-id: cid-fdaff4b5a6fdd3eca834cfaf8caea3fe38618005
jiacheliu3 pushed a commit to jiacheliu3/alluxio that referenced this pull request May 17, 2023
…7305: Reduce time cost of checking bucket path in S3 API

- I create `BUCKET_PATH_CACHE` as a static object to
`S3RestHandlerService`.

- I create property key `PROXY_S3_BUCKETPATHCACHE_TIMEOUT_MS` to
determine how long each entry should be automatically removed from the
cache.

- I create function `checkPathIsAlluxioDirectory(FileSystem,
String,@nullable S3AuditContext, Cache<AlluxioURI, Boolean>
bucketPathCache)` ,which plays the same role as
`checkPathIsAlluxioDirectory(FileSystem, String,@nullable
S3AuditContext)` but more efficient.

- I change S3RestHandlerService.java to call
`checkPathIsAlluxioDirectory` with argument `BUCKET_PATH_CACHE`.

- I add boolean value `checkS3BucketPath` as a field of
`CreateFilePOptions` and `CreateDirectoryPOptions`. This flag is used in
`FileSystemMasterClientServiceHandler` to check the bucket path existent
if `checkS3BucketPath` is `TRUE`.

1. `BUCKET_PATH_CACHE` highly improves the efficiency while creating
objects.
For example:
Set up an Alluxio cluster with 1 master(aws ec2 m5.2xlarge) and 2
workers(aws ec2 m5.4xlarge);
benchmark S3 API with warp cmd.
```
WARP_ACCESS_KEY=minioadmin WARP_SECRET_KEY=minioadmin ./warp put
--concurrent=80 --obj.size=64KiB --host=cluster0120-masters-1
--disable-multipart --insecure --md5 --duration=3m
```
Benchmark result is like:

|PUT operation|before|after|
|-|-|-|
| Average Throughput |48.49 MiB/s, 775.82 obj/s|59.66 MiB/s, 954.57
obj/s|
|Time percentage:
checkPathIsAlluxioDirectory()/createObjectOrUploadPart()|7.70%|0.50%|
|Time percentage: checkPathIsAlluxioDirectory()/total_time|3.31%|0.21%|

2. Users can modify `alluxio.proxy.s3.bucketpathcache.timeout` in
`conf/alluxio-site.properties` to change the time period of cache
expiration.

Users can modify `alluxio.proxy.s3.bucketpathcache.timeout` in
`conf/alluxio-site.properties` to change the time period of cache
expiration. The default value is `1min`. Users can set it `0min` to
close the function.
|Property Name|Default|Description|
|-|-|-|
|alluxio.proxy.s3.bucketpathcache.timeout|1min| Expire bucket path
statistics in cache for this time period. Set `0min` to disable the
cache.|

> Note: `create file` and `create bucket` operations are always accurate
, but other operations bring incosistent results because of the cache to
bucket path. All in all, alluxio makes sure that every modification is
cosistent and correct.

pr-link: Alluxio#16806
change-id: cid-fdaff4b5a6fdd3eca834cfaf8caea3fe38618005
@qian0817
Copy link
Contributor Author

Add time-related code to record before and after the call

public void update() {
    long start = System.currentTimeMillis();
    loadImpersonationUser(Configuration.global());
    long end = System.currentTimeMillis();
    System.out.println("time: " + (end - start));
}

You can see that the loadImpersonationUser method only takes 1ms.

maobaolong added a commit to maobaolong/alluxio that referenced this pull request Oct 20, 2023
alluxio-bot pushed a commit that referenced this pull request Oct 26, 2023
### What changes are proposed in this pull request?

Fix to support update new key, for example, template key.

This is the dependency order. so we can review the bottom one first.

#17305#17780#17850#17856

### Why are the changes needed?

Without this PR, we cannot update template key or any other key didn't exist in PropertyKeys.

### Does this PR introduce any user facing changes?

No.

			pr-link: #17856
			change-id: cid-5caa981f178aa3581cd49ed845e3ebbfa6214a15
@@ -33,7 +33,7 @@ public void unknownOption() {
@Test
public void updateUnknownKey() {
int ret = mFsAdminShell.run("updateConf", "unknown-key=unknown-value");
Assert.assertEquals(-2, ret);
Assert.assertEquals(0, ret);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be updated if you rebase your code

@qian0817
Copy link
Contributor Author

qian0817 commented Nov 5, 2023

@maobaolong discuess in #18335

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support dynamic refreshing impersonation user configuration.
5 participants