Change virt type from INT to BIGINT #902

Merged 4 commits on Feb 7, 2021

@splhack (Contributor) commented Feb 2, 2021

Problem

Some frames are always killed by Cuebot after the "Virtual proc did not exist" warning. This doesn't make sense, because the frame is actually running on the host.

com.imageworks.spcue.dispatcher.HostReportHandler - warning, the proc PPPP on host HHHH was running for N minutes FFFF the DB did not reflect this Virtual proc did not exist.
com.imageworks.spcue.rqd.RqdClientGrpc - killing frame on HHHH, source: OpenCue could not verify this frame.

Root cause

com.imageworks.spcue.dao.postgres.ProcDaoJdbc - The proc for frame FFFF could not be updated with new memory stats: org.springframework.dao.DataIntegrityViolationException: PreparedStatementCallback; SQL [UPDATE proc SET int_mem_used = ?, int_mem_max_used = ?,int_virt_used = ?, int_virt_max_used = ?, ts_ping = current_timestamp WHERE pk_frame = ?]; ERROR: integer out of range; nested exception is org.postgresql.util.PSQLException: ERROR: integer out of range

The root cause is that the proc row was not updated: ProcDao hit an "integer out of range" error while updating the frame's memory stats. These are the INT columns in the proc table that the update writes to:

int_virt_used INT DEFAULT 0 NOT NULL,
int_virt_max_used INT DEFAULT 0 NOT NULL,

INT is not large enough for VSIZE (virtual memory size) on 64-bit Linux. The value can exceed INT_MAX (2147483647) even after it is divided by 1024. These are the actual values Cuebot received when the "integer out of range" error occurred:

max_vsize: 107422045712
vsize: 21487748948
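
As a quick sanity check, the following minimal sketch compares the two reported values against the range of a 4-byte signed integer, which is the range of both a PostgreSQL INT and a Java `int`. The class and method names are hypothetical and are not part of Cuebot.

```java
// Sketch only: VirtRangeCheck and fitsInInt are hypothetical names, not Cuebot code.
final class VirtRangeCheck {
    // Values reported in this PR when the update failed.
    static final long MAX_VSIZE = 107_422_045_712L;
    static final long VSIZE = 21_487_748_948L;

    /** True if the value fits a 4-byte signed integer (PostgreSQL INT, Java int). */
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("INT max:            " + Integer.MAX_VALUE);
        System.out.println("vsize fits INT:     " + fitsInInt(VSIZE));
        System.out.println("max_vsize fits INT: " + fitsInInt(MAX_VSIZE));
        // Both values are held in a Java long here, which has the same
        // 8-byte signed range as a PostgreSQL BIGINT.
        System.out.println("BIGINT max:         " + Long.MAX_VALUE);
    }
}
```

Both values are above the INT maximum, so neither can be stored in an INT column, while both fit comfortably in an 8-byte signed integer.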

Solution

Use BIGINT for int_virt_used and int_virt_max_used.
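
The column change could be expressed with PostgreSQL `ALTER TABLE ... ALTER COLUMN ... TYPE BIGINT` statements. The sketch below only builds those statements as strings. It is an illustration under assumptions, not the migration shipped in this PR: the class name and helper methods are hypothetical, and only the table and column names come from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: VirtColumnMigration is a hypothetical helper, not the PR's migration.
final class VirtColumnMigration {
    // Table and column names taken from the PR description.
    static final String TABLE = "proc";
    static final List<String> COLUMNS = Arrays.asList("int_virt_used", "int_virt_max_used");

    /** Builds one PostgreSQL statement that widens a column to BIGINT. */
    static String alterToBigint(String table, String column) {
        return "ALTER TABLE " + table + " ALTER COLUMN " + column + " TYPE BIGINT;";
    }

    /** Builds the statements for every column listed in COLUMNS. */
    static List<String> statements() {
        List<String> result = new ArrayList<>();
        for (String column : COLUMNS) {
            result.add(alterToBigint(TABLE, column));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String sql : statements()) {
            System.out.println(sql);
        }
    }
}
```

Widening INT to BIGINT keeps the existing DEFAULT 0 and NOT NULL constraints, since only the column type changes.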

@larsbijl larsbijl merged commit 014bcef into AcademySoftwareFoundation:master Feb 7, 2021