
[BUG/Help] RuntimeError: MPS backend out of memory (MPS allocated: 18.05 GB, other allocations: 4.48 MB, max allowed: 18.13 GB). Tried to allocate 192.00 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure). #311

Closed
1 task done
dragononly opened this issue Mar 31, 2023 · 9 comments

Comments

@dragononly

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

It just won't run.
Is it because I don't have enough memory?
I have 16 GB of RAM.
It runs fine on CPU and in the regular mode.
The environment itself is fine.
M1 Pro chip.

Expected Behavior

No response

Steps To Reproduce

It just won't run.
Is it because I don't have enough memory?
I have 16 GB of RAM.
It runs fine on CPU and in the regular mode.
The environment itself is fine.
M1 Pro chip.

Environment

- OS: macOS 13.3
- Python: 3.8
- Transformers: 27
- PyTorch: 2.1 nightly
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): true
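As an aside on the environment form: the issue template asks about CUDA, but on Apple silicon the GPU is reached through the MPS backend instead. A minimal check (my suggestion, not part of the template) would be:

```python
import torch

# On Apple silicon the GPU is exposed through the MPS backend, not CUDA,
# so torch.cuda.is_available() is normally False there. The relevant
# checks for this bug report are:
print(torch.backends.mps.is_available())  # an MPS device can be used now
print(torch.backends.mps.is_built())      # this PyTorch build includes MPS
```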

Anything else?

No response

@YIZXIY
Contributor

YIZXIY commented Apr 1, 2023

Yes, you're running out of memory.
Also, that `CUDA Support: true` of yours is quite interesting.
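For scale, a back-of-envelope estimate (my numbers, not from the thread) of the raw weight footprint of a 6-billion-parameter model shows why 16 GB of unified memory is tight:

```python
# Rough weight sizes for a 6B-parameter model. Activations, KV cache,
# and allocator overhead come on top of this, so actual usage is higher.
params = 6_000_000_000
fp16_gb = params * 2 / 1024**3  # ~11.2 GB at 2 bytes per fp16 weight
fp32_gb = params * 4 / 1024**3  # ~22.4 GB at 4 bytes per fp32 weight
print(round(fp16_gb, 1), round(fp32_gb, 1))
```

The fp16 weights alone nearly fill the ~18 GB MPS allocation cap quoted in the error, and fp32 exceeds 16 GB of RAM outright.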

@qitao052

qitao052 commented Apr 1, 2023

I suspect this is a macOS 13.3 problem. My MacBook is on 13.2 and my Mac Studio is on 13.3, and I found:

  1. With the model in `.half()`, it runs fine on the MacBook: results come back in about a second and the Python process uses around 14 GB. On the Mac Studio it never produces a result and never raises an error, while Python's memory usage climbs straight to 70 GB.
  2. On the 13.3 Mac Studio, using `.float()` instead of `.half()` does run and does produce results, but total memory usage reaches about 90%. Far too high.
  3. I foolishly upgraded the Mac Studio from 13.2 to 13.3. On 13.2 I had already verified everything worked; the problem only appeared after the upgrade.
  4. Both machines run Python 3.9.16 with identical PyTorch and Transformers versions: '2.1.0.dev20230324' and '4.26.1'.

So I suspect 13.3 is the culprit. I upgraded to 13.3 in the first place because the program emitted this warning:
UserWarning: MPS: no support for int64 for min_max, downcasting to a smaller data type (int32/float32). Native support for int64 has been added in macOS 13.3.
But after upgrading to 13.3, things actually got worse.

@chenguokai

> (Quoting @qitao052's comment above.)

Same behavior (1 and 2) on my MacBook Pro M1Pro with macOS 13.3

@wukaiyu

wukaiyu commented Apr 2, 2023

I hit the same problem, exactly as qitao052 described: on 13.3 only the CPU works, GPU acceleration fails, and it reports out of memory.

@vvanglro

vvanglro commented Apr 7, 2023

32 GB, macOS 13.3.
At first I used `.half()`: it ran for ages, memory shot past 40 GB with no result, so I killed it.
After switching to `.float()` I did get a result, but slowly, in 148 s. There was also the following output; I don't know whether it matters:

The dtype of attention mask (torch.int64) is not bool
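That line is only a warning: the attention mask tensor is int64 where a bool mask is expected. A minimal illustration (my sketch, and not a fix for the slowness) of the harmless cast it refers to:

```python
import torch

# An int64 0/1 mask like the one the warning complains about.
mask = torch.ones(1, 4, dtype=torch.int64)

# Casting to bool gives the dtype attention code expects; values are preserved.
bool_mask = mask.bool()
print(bool_mask.dtype)  # torch.bool
```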

@kanson1996

+1. Using MPS raises an error and the process exits: RuntimeError: MPS backend out of memory (MPS allocated: 11.82 GB, other allocations: 6.30 GB, max allowed: 18.13 GB). Tried to allocate 5.72 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
The only way I can get it to run at the moment is to avoid MPS:

model = AutoModel.from_pretrained("./chatglm-6b", trust_remote_code=True).float()  # .half().to('mps')

Downside: responses are extremely slow, coming back almost one character at a time.
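The two knobs in this thread can be combined into one hedged sketch: the watermark override the error message itself suggests (with its own stability caveat), plus the fp16-on-MPS versus fp32-on-CPU fallback the comments converge on. `choose_dtype_and_device` is a hypothetical helper, not part of the repo.

```python
import os

# From the error text: 0.0 disables the MPS allocation cap entirely, and the
# message warns this "may cause system failure". The variable must be set
# before PyTorch initializes MPS, so set it before importing torch.
os.environ.setdefault("PYTORCH_MPS_HIGH_WATERMARK_RATIO", "0.0")

import torch

def choose_dtype_and_device():
    # The fallback described in this thread: fp16 on MPS when it works,
    # otherwise fp32 on CPU (stable but very slow).
    if torch.backends.mps.is_available():
        return torch.float16, torch.device("mps")
    return torch.float32, torch.device("cpu")
```

With transformers this would plug in roughly as `AutoModel.from_pretrained("./chatglm-6b", trust_remote_code=True).to(device=device, dtype=dtype)`, mirroring the workaround line above.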
(Screenshots: WX20230409-124037, WX20230409-123815)

Hardware: 16 GB RAM, M1 chip, macOS 13.3.1 (22E261), 24.87 GB free storage
Software: Python 3.10.10, torch 2.1.0.dev20230407, transformers 4.27.1
Model: chatglm-6b

@fquirin

fquirin commented Apr 9, 2023

> (Quoting @vvanglro's comment above:)
The dtype of attention mask (torch.int64) is not bool

Same error/warning(?) on Intel 12th Gen x86 CPU, 16 GB RAM.
Model: "THUDM/chatglm-6b-int4-qe"
Answering a question takes about 2-5 minutes.

@duzx16
Member

duzx16 commented Apr 9, 2023

The previous implementation triggered a PyTorch bug; it has now been fixed. See #462.

@zhuyeqingqing

I selected trust-remote-code and that resolved it, but I am using the facebook_galactica-6.7b model.

10 participants