-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fabtests: Add extra configure options for efa and lpp tests #10419
Conversation
@zachdworkin @j-xiong Sigh, I think 6901162 should be a better fix for this issue for EFA provider. Sorry that I haven't pushed in that direction earlier |
I believe you may have trouble with |
#10411 broke intel CI because our nodes don't have an updated rdma-core package with the EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE flag. I need some solution to completely disable efa in fabtests. Can we have both options? One will conditionally disable if the header is missing and the other can force disable it. |
@zachdworkin sure we can do both |
@shijin-aws I think having an option to disable it is fine, but I also think the efa config needs to fully check for that support by default and be smarter like your fix is doing so our CI shouldn't need the disabling option - I tried your proposed fix and it does not fix the issue for me |
@aingerson it needs one more diff, let me open a PR and let you try |
@aingerson does your efadv.h file have the added flag? Even if the header is detected it might fail to compile because the package is too old |
@zachdworkin I can add AC_CHECK_DECL for EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE |
@zachdworkin @aingerson can you test this #10420 ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't necessarily have a problem with including these changes but I don't think they are really solving an issue and I don't think we should disable the efa and lpp builds on our CI. Our CI should probably try building as many things as possible to catch all build issues we can catch.
The thing I would change about these patches is the name of the config define (EFA and LPP) to make them more descriptive and avoid any potential duplicate defines
I agree that the CI should build as many things as possible. I can change the names but we need this option until we do our cluster upgrade next quarter. We can remove the command line --enable-efa=no and --enable-lpp=no when that upgrade is complete. |
Why do we need this option? Doesn't Shi's configure change fix it? |
I guess we don't need it but it could be nice to have for specific types of builds |
In libfabric we have per provider configure script that check if a provider can be built and global option to enable/disable providers. We can have the same options in fabtests, so I am fine with both this and #10420 in. |
Yeah, I'm totally fine with it being included - just wanted to clarify that it wasn't actually needed |
@shijin-aws whats the aws failure? |
bot:aws:retest |
@shijin-aws @darrylabbate whats the AWS failure? Rerun doesn't seem to let it pass |
@zachdworkin you need to rebase on the latest main branch, there was a bug in efa provider which was fixed by #10421. It's not blocking your PR though |
Add --enable-efa argument to fabtests to disable efa building. Default is enabled. This is to disable efa for Intel CI. Signed-off-by: Zach Dworkin <zachary.dworkin@intel.com>
Add --enable-lpp option to configure. Default on. This is to turn it off for CI that doesn't need to build this provider. Signed-off-by: Zach Dworkin <zachary.dworkin@intel.com>
@shijin-aws can you share the AWS CI failure? |
It's socket provider failure again
|
bot:aws:retest |
add --enable-efa=yes/no --enable-lpp=yes/no, defaulted to yes to turn off these providers if they are not needed to be built.