Releases · coreylammie/MemTorch
1.1.6 Release
Added
- The `random_crossbar_init` argument to `memtorch.bh.Crossbar`. If true, crossbars are initialized with random device conductances between 1/R_on and 1/R_off.
- `CUDA_device_idx` to `setup.py`, to allow users to specify the CUDA device to use when installing MemTorch from source.
- Implementations of CUDA-accelerated passive crossbar programming routines for the 2021 Data-Driven model.
- A BibTeX entry, which can be used to cite the corresponding OSP paper.
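Conceptually, random crossbar initialization amounts to sampling each device conductance uniformly between 1/R_off and 1/R_on. A minimal standalone sketch (the function name and device values below are illustrative, not MemTorch's internal API):

```python
import random

def random_conductances(rows, cols, r_on, r_off, seed=None):
    """Sample each device conductance uniformly between 1/r_off and 1/r_on.
    Illustrative sketch only; not MemTorch's internal routine."""
    rng = random.Random(seed)
    g_min, g_max = 1.0 / r_off, 1.0 / r_on
    return [[rng.uniform(g_min, g_max) for _ in range(cols)]
            for _ in range(rows)]

crossbar = random_conductances(4, 4, r_on=1.0e3, r_off=1.0e5, seed=0)
```

Every sampled conductance then lies within the physically attainable range of the device.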
Fixed
- In the getting started tutorial, Section 4.1 was a code cell. This has since been converted to a markdown cell.
- OOM errors encountered when modeling passive inference routines of crossbars.
Enhanced
- Templated `quantize` bindings and fixed a semantic error in `memtorch.bh.nonideality.FiniteConductanceStates`.
- The memory consumption when modeling passive inference routines.
- The sparse factorization method used to solve sparse linear matrix systems.
- The `naive_program` routine for crossbar programming. The maximum number of crossbar programming iterations is now configurable.
- Updated the ReadTheDocs documentation for `memtorch.bh.Crossbar`.
- Updated the version of PyTorch used to build Python wheels from 1.9.0 to 1.10.0.
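A configurable iteration cap in an iterative programming routine can be sketched as a toy loop (the update rule, names, and values below are hypothetical stand-ins, not MemTorch's actual `naive_program`, which operates on memristive device models):

```python
def program_device(g, g_target, lr=0.5, tol=1e-9, max_iterations=100):
    """Toy iterative programming loop: nudge a conductance toward a target,
    stopping once within `tol` or after `max_iterations` attempts.
    Hypothetical stand-in, not MemTorch's naive_program."""
    iterations = 0
    while abs(g - g_target) > tol and iterations < max_iterations:
        g += lr * (g_target - g)  # apply one programming pulse's worth of change
        iterations += 1
    return g, iterations

g, n = program_device(1e-5, 1e-3)
```

Exposing `max_iterations` bounds the work spent on devices that converge slowly (or are stuck) instead of looping indefinitely.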
1.1.5 Release
Added
- Partial support for the `groups` argument for convolutional layers.
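The `groups` argument partitions input and output channels into independent groups, each of which can then be mapped to its own crossbar(s). A small sketch of that channel partitioning (function name is illustrative):

```python
def group_channel_slices(in_channels, out_channels, groups):
    """Partition channels into per-group (input_range, output_range) pairs,
    mirroring the `groups` semantics of torch.nn.Conv2d. Illustrative only."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    ic, oc = in_channels // groups, out_channels // groups
    return [((g * ic, (g + 1) * ic), (g * oc, (g + 1) * oc))
            for g in range(groups)]

slices = group_channel_slices(8, 16, groups=4)
# each group maps 2 input channels to 4 output channels
```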
Fixed
- Patching procedure in `memtorch.mn.module.patch_model` and `memtorch.bh.nonideality.apply_nonidealities` to fix a semantic error in `Tutorial.ipynb`.
- Import statement in `Exemplar_Simulations.ipynb`.
Enhanced
- Further modularized patching logic in `memtorch.bh.nonideality.NonIdeality` and `memtorch.mn.Module`.
- Modified the default number of workers in `memtorch.utils` from 2 to 1.
1.1.4 Release
Added
- Added patching support for `torch.nn.Sequential` containers.
- Added support for modeling source and line resistances for passive crossbars/tiles.
- Added C++ and CUDA bindings for modeling source and line resistances for passive crossbars/tiles*.
- Added a new MemTorch logo to `README.md`.
- Added the `set_cuda_malloc_heap_size` routine to patched `memtorch.mn` modules.
- Added unit tests for source and line resistance modeling.
- Relaxed requirements for programming passive crossbars/tiles.

*Note: it is strongly suggested to set `cuda_malloc_heap_size` manually using `m.set_cuda_malloc_heap_size` when simulating source and line resistances using CUDA bindings.
Enhanced
- Modularized patching logic in `memtorch.bh.nonideality.NonIdeality` and `memtorch.mn.Module`.
- Updated the ReadTheDocs documentation.
- Transitioned from Gitter to GitHub Discussions for general discussion.
1.1.3 Release
Added
- Added another version of the Data-Driven model, defined using `memtorch.bh.memristor.Data_Driven2021`.
- Added CPU- and GPU-bound C++ bindings for `gen_tiles`.
- Exposed `use_bindings`.
- Added unit tests for `use_bindings`.
- Added the `exemptAssignees` tag to `scale.yml`.
- Created `memtorch.map.Input` to encapsulate customizable input scaling methods.
- Added the `force_scale` input argument to the default scaling method, to specify whether inputs are force-scaled if they do not exceed `max_input_voltage`.
- Added CPU and GPU bindings for `tiled_inference`.
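The described default-scaling behavior can be sketched as follows: inputs are normalized to `max_input_voltage` only when their peak exceeds it, unless `force_scale` requests scaling unconditionally (the function name and flat-list inputs below are illustrative, not the `memtorch.map.Input` implementation):

```python
def scale_input(values, max_input_voltage, force_scale=False):
    """Scale inputs so their peak magnitude equals max_input_voltage.
    If force_scale is False, inputs already within range are left unchanged.
    Illustrative sketch of the behavior described in the release notes."""
    peak = max(abs(v) for v in values)
    if peak == 0 or (peak <= max_input_voltage and not force_scale):
        return list(values)  # nothing to do: zero input, or within range
    k = max_input_voltage / peak
    return [v * k for v in values]
```

With `force_scale=True`, even in-range inputs are stretched to the full voltage swing, which can improve utilization of the ADC's dynamic range.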
Enhanced
- Modularized input scaling logic for all layer types.
- Modularized `tile_inference` for all layer types.
- Updated the ReadTheDocs documentation.
Fixed
- Fixed GitHub Action Workflows for external pull requests.
- Fixed an error raised by `memtorch.map.Parameter` when `p_l` is defined.
- Fixed a semantic error in `memtorch.cpp.gen_tiles`.
1.1.2 Release
Added
- C++ and CUDA bindings for `memtorch.bh.crossbar.Tile.tile_matmul`.

  Using an NVIDIA GeForce GTX 1080, a tile shape of (25, 25), and two tensors of size (500, 500), the runtime of `tile_matmul` without quantization support is reduced by 2.45x and 5.48x for CPU-bound and GPU-bound operation, respectively. With an ADC resolution of 4 bits and an overflow rate of 0.0, the runtime of `tile_matmul` with quantization support is reduced by 2.30x and 105.27x for CPU-bound and GPU-bound operation, respectively.

  | Implementation | Runtime Without Quantization Support (s) | Runtime With Quantization Support (s) |
  |---|---|---|
  | Pure Python (previous) | 6.917784 | 27.099764 |
  | C++ (CPU-bound) | 2.822265 | 11.736974 |
  | CUDA (GPU-bound) | 1.262861 | 0.2574267 |

- `Eigen` integration with C++ and CUDA bindings.
- Additional unit tests.
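The tiled matrix multiplication being accelerated can be illustrated in pure Python: the operands are partitioned into fixed-size tiles and partial products are accumulated per tile, much as each tile would be evaluated on a separate crossbar (a naive reference sketch, not the bound C++/CUDA code):

```python
def tile_matmul_ref(a, b, tile_shape):
    """Blocked (tiled) matrix multiply over nested lists: partial products are
    accumulated tile-by-tile, mirroring how a large matmul is split across
    crossbar tiles. Naive reference sketch only."""
    n, k, m = len(a), len(b), len(b[0])
    th, tw = tile_shape
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, th):            # iterate over tiles of rows of a
        for k0 in range(0, k, th):        # shared (contraction) dimension
            for j0 in range(0, m, tw):    # tiles of columns of b
                for i in range(i0, min(i0 + th, n)):
                    for j in range(j0, min(j0 + tw, m)):
                        c[i][j] += sum(a[i][kk] * b[kk][j]
                                       for kk in range(k0, min(k0 + th, k)))
    return c
```

Each inner accumulation corresponds to one tile's analog dot product; summing across `k0` tiles reconstructs the full result.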
Enhanced
- Modularized C++ and CUDA `quantize` bindings.
- Enhanced functionality of `naive_program` and added additional input arguments to dictate logic for stuck devices.
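A generic uniform quantizer conveys what the `quantize` bindings compute: each value is snapped to one of 2^bits evenly spaced levels over a range (a simplified sketch; the bindings additionally handle an overflow rate, omitted here):

```python
def uniform_quantize(values, bits):
    """Snap each value to the nearest of 2**bits evenly spaced levels spanning
    the observed range. Simplified sketch of uniform quantization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)            # degenerate range: nothing to quantize
    levels = (1 << bits) - 1           # number of steps between lo and hi
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]
```

At an ADC resolution of 4 bits, for example, every tile output is reduced to one of 16 representable levels before accumulation.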
Fixed
- Removed debugging code from `naive_program`.
1.1.0 Release
Added
- Unit tests, and removed the system CUDA dependency;
- Support for Conv1d and Conv3d layers;
- Legacy support;
- MANIFEST.in, and resolved a header dependency;
- Native toggle for `forward_legacy` and size arguments to tune;
- `codecov` integration;
- Support for all `torch.distributions`;
- 1R programming routine and non-linear device simulation during inference;
- Stanford-PKU and Data-Driven Verilog-A ReRAM memristor models;
- Modular crossbar tile support;
- ADC and variable input voltage range support, and modularized all `memtorch.mn` modules;
- `cibuildwheel` integration to automatically generate build wheels.
Enhanced
- Mapping functionality;
- Reduced pooling memory usage with `maxtasksperchild`;
- Programming routine;
- `set_conductance`;
- `apply_cycle_variability`.
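Cycle-to-cycle variability of the kind `apply_cycle_variability` models can be sketched by perturbing each conductance with noise drawn from a distribution (MemTorch samples from `torch.distributions`; the pure-Python stand-in below uses multiplicative Gaussian noise with illustrative parameters):

```python
import random

def perturb_conductances(conductances, sigma_rel, seed=None):
    """Apply multiplicative Gaussian noise to each conductance, clamped at
    zero. Toy stand-in for sampling device variability from a distribution."""
    rng = random.Random(seed)
    return [max(g * rng.gauss(1.0, sigma_rel), 0.0) for g in conductances]

noisy = perturb_conductances([1e-4] * 8, sigma_rel=0.05, seed=42)
```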
Fixed
- Dimension mismatch error for convolutional layers with non-zero padding;
- `reg.coef_` and `reg.intercept_` extraction process for N-dimensional arrays;
- Various semantic errors.
Initial Release
v1.0.0 Initial Release