Backports for 1.8-rc1/beta3 #44710

Merged
merged 13 commits into from
Mar 28, 2022


KristofferC
Member

@KristofferC KristofferC commented Mar 23, 2022

fingolfin and others added 4 commits March 23, 2022 15:17
On AArch64, the `fpcr` register is 64 bits wide, although the top 32 bits
are currently unused and reserved for future use. Nevertheless, we
should save and restore the full 64 bits, not just 32 bits. This also
silences a compiler warning. Reference:
<https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/FPCR--Floating-point-Control-Register>

(cherry picked from commit 5bd0545)
It also fixes `round(Integer, big(NaN))`.

Solves #44662

(cherry picked from commit ecf3558)
@KristofferC KristofferC added the release Release management and versioning. label Mar 23, 2022
staticfloat and others added 3 commits March 23, 2022 15:24
* More flexibly test affinity setting

When running on a machine with `cpusets` applied, we are unable to
assign CPU affinity to CPUs 1 and 2; we may be locked to CPUs 9-16, for
example.  So we must inspect what our current cpumask is, and from that
select CPUs that we can safely assign affinity to in our tests.

* Import `uv_thread_getaffinity` from `print_process_affinity.jl`

* Call `uv_thread_getaffinity` only if `AFFINITY_SUPPORTED`

* Fix a syntax error

Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
(cherry picked from commit 32b1305)
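
For illustration only, the CPU selection described in the affinity commit above could be sketched as follows. The hard-coded `mask` and the helper `pick_test_cpus` are hypothetical stand-ins (the actual test reads the mask via `uv_thread_getaffinity`), and 1-based CPU numbering is an assumption.

```julia
# Hypothetical helper: pick `n` CPUs that the current cpumask allows.
function pick_test_cpus(mask::AbstractVector{Bool}, n::Integer)
    allowed = findall(mask)   # indices of the CPUs we are allowed to run on
    length(allowed) >= n || error("need at least $n usable CPUs, found $(length(allowed))")
    return first(allowed, n)
end

# Assumed example: a 16-CPU machine where a cpuset restricts us to CPUs 9-16.
mask = [cpu in 9:16 for cpu in 1:16]
cpus = pick_test_cpus(mask, 2)
```

With this mask, hard-coding CPUs 1 and 2 would fail, whereas the helper only ever returns CPUs from the allowed set.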
Fix #44614

(cherry picked from commit e9ba166)
@KristofferC
Member Author

@nanosoldier runtests(["AbstractDifferentiation", "AbstractTrees", "AdaStress", "AdventOfCode", "AnyMOD", "ApproxFun", "ApproxFunFourier", "ApproxFunOrthogonalPolynomials", "ApproxFunSingularities", "ArrayLayouts", "AssociativeArrays", "AtomicGraphNets", "AxisKeys", "AxisSets", "AxisTables", "BEASTXMLConstructor", "BIDSTools", "BPFnative", "BSON", "BlockArrays", "CBinding", "CGAL", "CUDAKernels", "Caching", "Causal", "ChainRulesCore", "ChainRulesOverloadGeneration", "ClassicalOrthogonalPolynomials", "ClimaCorePlots", "ClimaCoreVTK", "ClimatePlots", "ClimateTools", "Cloudy", "CombinedParsers", "CompactBases", "CompilerPluginTools", "Conductor", "CxxWrap", "DailyTreasuryYieldCurve", "Dash", "DataFrames", "DataStructures", "DeIdentification", "DeepDiffs", "DiffEqCallbacks", "DiffEqParamEstim", "DiscreteEvents", "Dispatcher", "DynamicBoundsBase", "DynamicalBilliards", "EconJobMarket", "Enzyme", "EquationsOfStateOfSolids", "ExSup", "FMIFlux", "FastJet", "FeatureEng", "FinEtoolsVoxelMesher", "FindClosest", "FlightMechanics", "FlxQTL", "FormattedTables", "GeneralizedSVD", "GenericSchur", "GeoEfficiency", "GeoStatsDevTools", "GoogleCodeSearch", "GroebnerBasis", "HarmonicOrthogonalPolynomials", "HierarchicalUtils", "HomotopyContinuation", "Hypatia", "IBMQClient", "ImageCore", "ImageQuilting", "ImageSegmentation", "ImplicitArrays", "InfiniteArrays", "InverseLaplace", "JET", "JSONPointer", "JuliennedArrays", "KernelGradients", "LCIO", "LMDB", "LRSLib", "LazyArrays", "LegibleLambdas", "LinearMaps", "LiterateOrg", "LogParser", "LogRoller", "LoweredCodeUtils", "MEstimation", "MIRT", "MIRTjim", "MambaModels", "MappedArrays", "MarketCycles", "Memento", "MeshCore", "MetaArrays", "Metatheory", "MinimallyDisruptiveCurves", "MultinomialRegression", "MultipleTesting", "MultivariateOrthogonalPolynomials", "NMRTools", "NeuralArithmetic", "NicePipes", "NotebookToLaTeX", "OpenEphysLoader", "OrbitalTrajectories", "OscillatoryIntegrals", "PLCTag", "POMDPPolicies", "PProf", "PSIS", 
"PackageAnalyzer", "ParallelUtilities", "Parallelism", "Pathfinder", "PkgDeps", "Plots", "Poltergeist", "Polyhedra", "PredictMDExtra", "ProfileSVG", "QuantizedArrays", "QuantumOptics", "QuasiArrays", "Qwind", "ROCKernels", "RSCG", "RedefStructs", "RiemannHilbert", "RobotDescriptions", "RobotOSData", "Runner", "SIAN", "SOM", "SemiclassicalOrthogonalPolynomials", "SemiseparableMatrices", "Signals", "SimplePadics", "SingularIntegralEquations", "SnoopCompile", "SocialSolver", "Soss", "SparseTimeSeries", "SpatialBoundaries", "StableDQMC", "StackOverflow", "StarAlgebras", "StatProfilerHTML", "StaticTrafficAssignment", "StochasticDiffEq", "StrRegex", "StructuralIdentifiability", "StructuredArrays", "TORA", "TensorKitManifolds", "Tracker", "TraitSimulation", "TurbulenceConvection", "Wikidata", "ZfpCompression", "Zomato"], vs = ":release-1.7")

@nanosoldier
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

aviatesk and others added 4 commits March 24, 2022 11:48
…44668)

I found that a tricky thing can happen when constant inference derives a
`Const` result while non-constant inference has already derived a (non-constant)
`InterConditional` result. In such a case, we currently discard
the constant-inference result (since `!(Const ⊑ InterConditional)`),
but we can achieve more accuracy by keeping that `Const` information, e.g.:
```julia
julia> using Test

julia> iszero_simple(x) = x === 0
iszero_simple (generic function with 1 method)

julia> @test Base.return_types() do
           iszero_simple(0) ? nothing : missing
       end |> only === Nothing
Test Passed
```
* avoid using `@sync_add` on remotecalls

It seems like `@sync_add` adds the Futures to a queue (Channel) for `@sync`, which
in turn calls wait() on all the futures synchronously. Not only is that
slightly detrimental for network operations (latencies add up), but in the case of
Distributed the call to wait() may actually cause some compilation on remote
processes, which is also wait()ed for. As a result, some operations took a great
amount of "serial" processing time if executed on many workers at once.
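
To make the difference concrete, here is a minimal sketch (not the actual patch) that contrasts waiting one after another with waiting concurrently. `wait_one` is a hypothetical stand-in for a wait() on a remote Future, simulated with `sleep`, so no workers are needed to run it.

```julia
# Hypothetical stand-in for wait()ing on one remote Future:
# each wait costs `delay` seconds of latency/remote work.
wait_one(delay) = sleep(delay)

# Serial: every wait starts only after the previous one has finished.
serial_wait(n, delay) = foreach(_ -> wait_one(delay), 1:n)

# Concurrent: each wait runs in its own task; @sync returns once all are done.
function concurrent_wait(n, delay)
    @sync for _ in 1:n
        @async wait_one(delay)
    end
end

t_serial = @elapsed serial_wait(10, 0.1)
t_concurrent = @elapsed concurrent_wait(10, 0.1)
println("serial: ", round(t_serial; digits=2), " s, concurrent: ",
        round(t_concurrent; digits=2), " s")
```

The serial variant grows linearly with the number of waits, while the concurrent one stays close to the cost of a single wait.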

For me, this closes #44645.

The major change can be illustrated as follows: First add some workers:

```
using Distributed
addprocs(10)
```

and then trigger something that, for example, causes package imports on the
workers:

```
using SomeTinyPackage
```

In my case (importing UnicodePlots on 10 workers), this improves the loading
time over 10 workers from ~11s to ~5.5s.

This is a far bigger issue when the worker count gets high. The time of the
processing on each worker is usually around 0.3s, so triggering this problem
even on a relatively small cluster (64 workers) causes a really annoying delay,
and running `@everywhere` for the first time on reasonable clusters (I tested
with 1024 workers, see #44645) usually takes more than 5 minutes. Which sucks.

Anyway, on 64 workers this reduces the "first import" time from ~30s to ~6s,
and on 1024 workers this seems to reduce the time from over 5 minutes (I didn't
bother to measure that precisely now, sorry) to ~11s.
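
As a rough sanity check on these numbers, and assuming the ~0.3s per-worker processing is waited on strictly one worker after another (network latency ignored), the serial cost can be estimated like this; `serial_estimate` is a name introduced only for this sketch.

```julia
per_worker = 0.3                      # seconds per worker, the figure quoted above
serial_estimate(n) = n * per_worker   # assumes the waits never overlap

for n in (64, 1024)
    println(n, " workers: about ", round(Int, serial_estimate(n)), " s if fully serial")
end
```

That gives about 19 s for 64 workers and about 307 s (just over 5 minutes) for 1024, the same order of magnitude as the "before" timings reported above.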

Related issues:
- Probably fixes #39291.
- #42156 is kind of complementary: it removes the most painful source of
  slowness (the 0.3s precompilation on the workers), but the fact that the
  wait()ing is serial remains a problem if the network latencies are high.

May help with #38931

Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
(cherry picked from commit 62e0729)
Fix #44734.

(cherry picked from commit 68e2969)
@KristofferC
Member Author

@nanosoldier runtests(["AbstractDifferentiation", "AbstractTrees", "AdaStress", "AdventOfCode", "AnyMOD", "ApproxFun", "ApproxFunFourier", "ApproxFunOrthogonalPolynomials", "ApproxFunSingularities", "AssociativeArrays", "AtomicGraphNets", "AxisKeys", "AxisSets", "AxisTables", "BEASTXMLConstructor", "BIDSTools", "BPFnative", "BSON", "CBinding", "CGAL", "CUDAKernels", "Caching", "Causal", "ChainRulesCore", "ChainRulesOverloadGeneration", "ClimaCorePlots", "ClimaCoreVTK", "ClimatePlots", "ClimateTools", "Cloudy", "CombinedParsers", "CompactBases", "CompilerPluginTools", "Conductor", "CxxWrap", "DailyTreasuryYieldCurve", "Dash", "DataFrames", "DataStructures", "DeIdentification", "DeepDiffs", "DiffEqCallbacks", "DiscreteEvents", "Dispatcher", "DynamicBoundsBase", "DynamicalBilliards", "EconJobMarket", "Enzyme", "EquationsOfStateOfSolids", "ExSup", "FastJet", "FeatureEng", "FinEtoolsVoxelMesher", "FindClosest", "FlightMechanics", "FlxQTL", "FormattedTables", "GeneralizedSVD", "GenericSchur", "GeoEfficiency", "GeoStatsDevTools", "GoogleCodeSearch", "GroebnerBasis", "HarmonicOrthogonalPolynomials", "HierarchicalUtils", "HomotopyContinuation", "Hypatia", "IBMQClient", "ImageCore", "ImageQuilting", "ImageSegmentation", "ImplicitArrays", "InverseLaplace", "JET", "JSONPointer", "JuliennedArrays", "KernelGradients", "LCIO", "LMDB", "LRSLib", "LazyArrays", "LegibleLambdas", "LiterateOrg", "LogParser", "LogRoller", "LoweredCodeUtils", "MEstimation", "MIRT", "MIRTjim", "MambaModels", "MappedArrays", "MarketCycles", "Memento", "MeshCore", "MetaArrays", "Metatheory", "MinimallyDisruptiveCurves", "MultinomialRegression", "MultipleTesting", "NMRTools", "NeuralArithmetic", "NicePipes", "NotebookToLaTeX", "OpenEphysLoader", "OscillatoryIntegrals", "PLCTag", "POMDPPolicies", "PProf", "PSIS", "PackageAnalyzer", "ParallelUtilities", "Parallelism", "Pathfinder", "PkgDeps", "Plots", "Poltergeist", "Polyhedra", "PredictMDExtra", "ProfileSVG", "QuantizedArrays", "QuantumOptics", "ROCKernels", 
"RSCG", "RedefStructs", "RiemannHilbert", "RobotDescriptions", "RobotOSData", "Runner", "SIAN", "SOM", "SemiseparableMatrices", "Signals", "SimplePadics", "SingularIntegralEquations", "SnoopCompile", "SocialSolver", "Soss", "SparseTimeSeries", "SpatialBoundaries", "StableDQMC", "StackOverflow", "StarAlgebras", "StatProfilerHTML", "StaticTrafficAssignment", "StochasticDiffEq", "StrRegex", "StructuralIdentifiability", "StructuredArrays", "TORA", "TensorKitManifolds", "Tracker", "TraitSimulation", "TurbulenceConvection", "ZfpCompression", "Zomato"], vs = ":release-1.7")

@nanosoldier
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

dlfivefifty and others added 2 commits March 28, 2022 09:44
fixes #44723

Co-authored-by: Takafumi Arakaki <aka.tkf@gmail.com>
(cherry picked from commit 19eb307)
@KristofferC KristofferC merged commit b9b0bcf into release-1.8 Mar 28, 2022
@KristofferC KristofferC deleted the backports-release-1.8 branch March 28, 2022 13:00