Verilog module and testbench code for a low-cost, synchronous, generic multiplier that uses a single adder whose bit width equals that of its inputs. The design is well suited to cases where high latency is acceptable but minimal area is required, such as performing large multiplications (128-bit * 128-bit, 256-bit * 256-bit, etc.) on a small FPGA or ASIC.
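
To illustrate how a single input-width adder can be enough, here is a minimal behavioral sketch in Python. It is not the repository's Verilog. It assumes the classic shift-and-add scheme with unsigned operands, where one multiplier bit is consumed per clock cycle. The function name `shift_add_multiply` and its parameters are hypothetical and chosen only for this example.

```python
def shift_add_multiply(a: int, b: int, width: int) -> int:
    """Bit-accurate model of an unsigned shift-and-add multiplier.

    Assumption: one loop iteration stands for one clock cycle, and the
    only adder used is `width` bits wide (plus its carry-out).
    """
    mask = (1 << width) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must fit in `width` unsigned bits")

    upper = 0  # upper half of the product register (accumulator)
    lower = b  # lower half initially holds the multiplier

    for _ in range(width):
        carry = 0
        if lower & 1:
            # The single width-bit addition performed in this cycle.
            total = upper + a
            upper = total & mask
            carry = total >> width
        # Shift {carry, upper, lower} right by one bit.
        lower = (lower >> 1) | ((upper & 1) << (width - 1))
        upper = (upper >> 1) | (carry << (width - 1))

    # After `width` cycles the register pair holds the 2*width-bit product.
    return (upper << width) | lower


if __name__ == "__main__":
    x = (1 << 128) - 1
    y = (1 << 127) + 12345
    assert shift_add_multiply(x, y, 128) == x * y
    print("128-bit check passed")
```

Under these assumptions the model needs `width` iterations per product, which reflects the latency-for-area trade-off described above: the adder never grows beyond the operand width, while the cycle count grows linearly with it.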
Co-authored-by: Emre Aydin Guzel https://github.com/aydinemreguzel