Make narrowing float->int casts on wasm go via wider ints #7973

abadams · 2023-12-01T19:04:03Z

steven-johnson · 2023-12-03T21:34:43Z

Looks like there is a CUDA error happening in BoundaryConditions. Did that code change recently?

abadams · 2023-12-04T22:34:14Z

Somewhat recently, but the tests passed at that time. I suspect a driver update has made the CUDA API fussier about something. It doesn't repro locally, so I'm attempting to diagnose on the bots.

abadams · 2023-12-04T22:49:53Z

I'm seeing us emit an aligned vector store instruction for an address which is not aligned. I now suspect a new LLVM bug. My local LLVM checkout emits eight 1-byte store instructions, but LLVM main on the bots emits a single 64-bit store instruction.
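To make the distinction in the comment above concrete, here is a minimal sketch in plain C++ of the two store contracts involved. It is only an illustration, not the generated code or anything from LLVM: `is_aligned`, `store_u64_unaligned`, `store_u64_aligned`, and `load_u64_unaligned` are hypothetical helpers, and the 8-byte alignment requirement is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p is a multiple of `alignment` bytes.
// `alignment` must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Safe for any address: memcpy makes no alignment assumption, so a compiler
// may lower it to byte-wise stores when it cannot prove alignment.
inline void store_u64_unaligned(unsigned char* dst, std::uint64_t v) {
    std::memcpy(dst, &v, sizeof v);
}

// Models an "aligned store" contract: the caller promises 8-byte alignment.
// The assert catches the mismatch described above (aligned store, unaligned address).
inline void store_u64_aligned(unsigned char* dst, std::uint64_t v) {
    assert(is_aligned(dst, 8) && "aligned store to an unaligned address");
    std::memcpy(dst, &v, sizeof v);
}

// Reads 8 bytes back from any address, again without assuming alignment.
inline std::uint64_t load_u64_unaligned(const unsigned char* src) {
    std::uint64_t v;
    std::memcpy(&v, src, sizeof v);
    return v;
}
```

With a buffer declared `alignas(8)`, `store_u64_aligned(buf, v)` is fine, while `buf + 1` has to go through `store_u64_unaligned`.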

abadams · 2023-12-04T22:55:16Z

Looks like it was llvm/llvm-project@173fcf7

steven-johnson · 2023-12-07T16:06:55Z

Failures unrelated, landing

Fixes #7972

* Half-plumbed
* Revert "Half-plumbed" This reverts commit eb9dd02.
* Interface for double buffer
* Update Provides, Calls and Realizes for double buffering
* Proper sync for double buffering
* Use proper name for the semaphor and use correct initial value
* Rename the class
* Pass expression for index
* Adds storage for double buffering index
* Use a separate index to go through the double buffer
* Failing test
* Better handling of hoisted storage in all of the async-related passes
* New test and clean-up the generated IR
* More tests
* Allow double buffering without async and add corresponding test
* Filter out incorrect double_buffer schedules
* Add tests to the cmake files
* Clean up
* Update the comment
* Clean up
* Clean up
* Update serialization
* complete_x86_target() should enable F16C and FMA when AVX2 is present (#7971) All known AVX2-enabled architectures definitely have these features.
* Add two new tail strategies for update definitions (#7949)
* Add two new tail strategies for update definitions
* Stop printing asm
* Update expected number of partitions for Partition::Always
* Add a comment explaining why the blend safety check is per dimension
* Add serialization support for the new tail strategies
* trigger buildbots
* Add comment
---------
Co-authored-by: Steven Johnson <srj@google.com>
* Add appropriate mattrs for arm-32 extensions (#7978)
* Add appropriate mattrs for arm-32 extensions Fixes #7976
* Pull clauses out of if
* Move canonical version numbers into source, not build system (#7980) (#7981)
* Move canonical version numbers into source, not build system (#7980)
* Fixes
* Silence useless "Insufficient parallelism" autoscheduler warning (#7990)
* Add a notebook with a visualization of the aprrox_* functions and their errors (#7974)
* Add a notebook with a visualization of the aprrox_* functions and their errors
* Fix spelling error
* Make narrowing float->int casts on wasm go via wider ints (#7973) Fixes #7972
* Fix handling of assert statements whose conditions get vectorized (#7989)
* Fix handling of assert statements whose conditions get vectorized
* Fix test name
* Fix all "unscheduled update()" warnings in our code (#7991)
* Fix all "unscheduled update()" warnings in our code And also fix the Mullapudi scheduler to explicitly touch all update stages. This allows us to mark this warning as an error if we so choose.
* fixes
* fixes
* Update recursive_box_filters.cpp
* Silence useless 'Outer dim vectorization of var' warning in Mullapudi… (#7992) Silence useless 'Outer dim vectorization of var' warning in Mullapudi scheduler
* Add a tutorial for async and double_buffer
* Renamed double_buffer to ring_buffer
* ring_buffer() now expects an extent Expr
* Actually use extent for ring_buffer()
* Address some of the comments
* Provide an example of the code structure for producer-consumer async example
* Comments updates
* Fix clang-format and clang-tidy
* Add Python binding for Func::ring_buffer()
* Don't use a separate index for ring buffer + add a new test
* Rename the tests
* Clean up the old name
* Add &
* Move test to the right folder
* Move expr
* Add comments for InjectRingBuffering
* Improve ring_buffer doc
* Fix comments
* Comments
* A better error message
* Mention that extent is expected to be a positive integer
* Add another code structure and explain how the indices for ring buffer are computed
* Expand test comments
* Fix spelling
---------
Co-authored-by: Steven Johnson <srj@google.com>
Co-authored-by: Andrew Adams <andrew.b.adams@gmail.com>

Make narrowing float->int casts on wasm go via wider ints

f501a64

Fixes #7972
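For readers unfamiliar with the idea in the commit title, here is a minimal sketch in plain C++ of a float-to-int8 conversion that goes through a wider integer first. It is an illustration only, not the code from this PR: `narrow_via_int32` is a hypothetical helper, and the wrap-around behavior of the narrowing step is an assumption about the intended semantics rather than something stated in this thread.

```cpp
#include <cstdint>

// Hypothetical helper (not from the PR): convert a float to int8_t in two steps.
// Assumes |x| < 2^31 so that the float -> int32 step is in range.
constexpr std::int8_t narrow_via_int32(float x) {
    // Step 1: float -> int32. Truncates toward zero.
    const std::int32_t wide = static_cast<std::int32_t>(x);
    // Step 2: int32 -> int8. Keep the low 8 bits, then reinterpret them as signed.
    const std::uint8_t low = static_cast<std::uint8_t>(static_cast<std::uint32_t>(wide) & 0xFFu);
    return static_cast<std::int8_t>(low);
}
```

The point of the detour is that a direct `static_cast<std::int8_t>(x)` on a float outside the int8 range is undefined behavior in C++, whereas each of the two steps here is well defined for the assumed input range.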

steven-johnson approved these changes Dec 1, 2023

Merge remote-tracking branch 'origin/main' into abadams/fix_7972

b6bad30

abadams force-pushed the abadams/fix_7972 branch from 3c99931 to b6bad30 on December 5, 2023 18:07

steven-johnson merged commit d1ecc1f into main Dec 7, 2023
21 of 22 checks passed

steven-johnson deleted the abadams/fix_7972 branch December 7, 2023 16:07

vksnk pushed a commit that referenced this pull request Dec 7, 2023

Make narrowing float->int casts on wasm go via wider ints (#7973)

2dc021b

Fixes #7972

BrewTestBot mentioned this pull request Feb 2, 2024

halide 17.0.0 Homebrew/homebrew-core#161602

Closed

ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024

Make narrowing float->int casts on wasm go via wider ints (halide#7973)

1aec8f1

Fixes halide#7972
