Raman Jana ramjana

Machine Learning, hardware-software co-design, GPU kernel development, ML performance modeling.

AMD, Inc.
Austin, TX
https://ramjana.github.io

Achievements

Pinned

Tensile Public

Forked from ROCm/Tensile

Stretching GPU performance for GEMMs and tensor contractions.

Python
fp8_quant_simulation Public

Analytical model for FP8 quantization (power of exponent)

Jupyter Notebook
LLM-Inference-Modeling Public

throughput inference modeling for AMD GPU(s)

Jupyter Notebook
LLM_scaling_laws Public

code for published papers

Jupyter Notebook
composable_kernel Public

Forked from ROCm/composable_kernel

Composable C++ Template abstractions for implementing Tensor contraction operators (GEMM, iGEMM).

C++