Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebuild 1.46.4 with abseil 20211102.0 #222

Closed
wants to merge 10 commits into from

Conversation

ngam
Copy link

@ngam ngam commented Aug 16, 2022

fixes #221

Comment:

rebuild: grpc-cpp 1.46.4 with abseil 20211102.0

motivation: mamba install grpc-cpp==1.46.4 yields:

package tensorflow-base-2.9.1-cpu_py310h048a3e3_0 requires abseil-cpp >=20211102.0,<20211102.1.0a0, but none of the providers can be installed

These pins are crazy wild... :tear

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@ngam
Copy link
Author

ngam commented Aug 16, 2022

@conda-forge-admin, please rerender.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@ngam ngam marked this pull request as ready for review August 16, 2022 13:00
@github-actions
Copy link
Contributor

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you but ran into some issues. Please check the output logs of the latest rerendering GutHub actions workflow run for errors. You can also ping conda-forge/core for further assistance or try re-rendering locally.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/grpc-cpp-feedstock/actions/runs/2868115265.

@ngam
Copy link
Author

ngam commented Aug 16, 2022

@h-vetinari or should I do libabseil here instead?

@github-actions
Copy link
Contributor

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you but ran into some issues. Please check the output logs of the latest rerendering GutHub actions workflow run for errors. You can also ping conda-forge/core for further assistance or try re-rendering locally.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/grpc-cpp-feedstock/actions/runs/2868120800.

@github-actions
Copy link
Contributor

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/grpc-cpp-feedstock/actions/runs/2868138017.

@h-vetinari
Copy link
Member

I'd like to understand first why 1.46.4 needs to be rebuilt specifically, and the 1.46.3 builds aren't usable

@ngam
Copy link
Author

ngam commented Aug 16, 2022

I'd like to understand first why 1.46.4 needs to be rebuilt specifically, and the 1.46.3 builds aren't usable

I believe the general pins are wrong somehow because they tend to pin for the patch...

So, we have conditions on jaxlib and tensorflow that are exactly identical, right?

But for some bizarre reason, now they cannot coexist, even though nothing has changed:

  • tensorflow-base built with hard pin on abseil-cpp 20211102.0 zipped with grpc-cpp 1.46 ends up with a condition on grpc-cpp like grpc-cpp >=1.46.3,<1.47 --- all good, right?
  • jaxlib build with the same exact hard pin on abseil-cpp 20211102.0 zipped with grpc-cpp 1.46 ends up with a condition on grpc-cpp like grpc-cpp >=1.46.4,<1.47

So, somehow the pin is taking a floor in the patch, rather than a minor. Hence, the inconsistency.

Now the problem is, the latter grpc-cpp 1.46.4 has not been built with abseil-cpp and for some reason that just worked because jaxlib didn't actually use abseil-cpp (for complicated reasons). I am happy resubmit the builds but force jaxlib to confront abseil-cpp and so it will be forced to downgrade grpc-cpp, but that may cause further problems (jaxlib takes 5 hours to build and it's fragile).

We could just rebuild grpc-cpp to keep it completely in sync instead.

Thoughts?

Either way, a rebuild somewhere is necessary to avoid this jam. I am happy to pursue a build in jaxlib if you object to yet another build here :)

@ngam
Copy link
Author

ngam commented Aug 16, 2022

My concern is, we really have to keep tiptoeing around for these fragile yet egomaniacal packages like tensorflow, jaxlib, pytorch, and co. Such a pain 😿

@ngam
Copy link
Author

ngam commented Aug 16, 2022

I never understood these pins and I raised some issue sin the past (I will try to find them) but I was told to relax because that was how they were supposed to work. I am so confused why the pins work by taking the current patch as minimum and keeping the next minor as maximum. I would think they should stay minor to minor, but... 🤷

@ngam ngam mentioned this pull request Aug 16, 2022
@h-vetinari
Copy link
Member

h-vetinari commented Aug 16, 2022

So in general, this situation should hopefully improve substantially, now that we have the wildcard of the abseil ABI better under control. But there's a whole bunch of transition pain, and I'm sorry about that. The silent run-dependence of grpc already plays a big role in that; that at least would be solved by my PR to mark those builds broken.

The lower bound that you get from pinning against 1.46.4 is correct. It's always >= the version used to build (imagine an ABI-relevant thing being fixed between x.y.0 and x.y.1); the interesting part is how far it goes up.

In this case, you'd have to either add a separate bound to not pick up the 20220623-only grpc 1.46.4, make the dependence on abseil explicit, or migrate to the newest abseil.

Or we merge this PR... It's not ideal to undo a migration on a branch, but in this case I guess we can "soft-undo" it in cbc & then revert for the next PR.

@ngam
Copy link
Author

ngam commented Aug 16, 2022

I am happy to submit a jaxlib rebuild if you don't want to upset the order here. It is straightforward there, it will just take 5 hours, that's all.

Let me know what you prefer.

@ngam
Copy link
Author

ngam commented Aug 16, 2022

But there's a whole bunch of transition pain, and I'm sorry about that.

PLEASE. No pain at all! This is the fault of the silent dependency. I like how this is really organized and streamlined. It's great! The fact that I could easily find the place where the problem was coming from is a testament to your good work!

@ngam ngam closed this Aug 16, 2022
@ngam ngam reopened this Aug 16, 2022
@@ -2,3 +2,6 @@ c_compiler_version: # [linux]
- '11' # [linux]
cxx_compiler_version: # [linux]
- '11' # [linux]

libabseil: # [unix]
- '20211102.0' # [unix]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add the same for libabseil_static

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, then that's why it didn't rerender

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it didn't get picked up because you added pins in meta.yaml. The rerender will only pick up completely unpinned lines (well, except selectors) that have an entry in the global pinning or an ongoing migration.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@ngam ngam Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See it returned to 2022 ... 46fbfd3

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the migration file takes precedence. Let me remove it and test... but this is changing a lot of stuff now in this branch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's strange to me... The local cbc always should take precedence

recipe/meta.yaml Outdated
@@ -34,13 +34,13 @@ outputs:
- libprotobuf
- ninja
# We need all host deps also in build for cross-compiling
- libabseil # [build_platform != target_platform]
- libabseil 20211102.0 # [build_platform != target_platform]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these pins please. Should be done by cbc+rerender

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see if it works now after adding -static

@ngam
Copy link
Author

ngam commented Aug 16, 2022

Sorry, I accidentally closed this

@h-vetinari
Copy link
Member

Well, not least since we're gonna revert this again right after, this doesn't need to win a beauty or purity contest. I suggest to skip touching the migration & the cbc, and just force the pin in the meta.yaml like you had done before. Sorry for the back and forth

@h-vetinari
Copy link
Member

Just also add the pin to libabseil-static so we do this homogeneously across platforms

@ngam
Copy link
Author

ngam commented Aug 16, 2022

I will start a cleaner PR with meta.yaml only changes.

@ngam ngam closed this Aug 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants