Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[swig]: Fix swig template memory leak on issue 17025 #876

Merged
merged 3 commits into from
May 28, 2024

Conversation

Pterosaur
Copy link
Contributor

Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description

The issue reports a memory leak on the Redis set operations

Reason

  1. Didn't decrease the reference count after PySequence_GetItem
  2. Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string

Fix:

  1. Refer PR: Fix swig template memory leak #859 from @praveenraja1
  2. Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string

Add unit test

To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
for _ in range(OP_COUNT):
t.set("big_data", fvs)
t.get("big_data")
assert psutil.Process(os.getpid()).memory_info().rss - rss < OP_COUNT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pterosaur os.getpid()).memory_info().rss - rss shouldn't this be close to ZERO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be close to ZERO, but if there is a memory leak on the set/get OPs, we will got the following log:

$ python3 -m pytest --pdb -s -v tests/test_redis_ut.py::test_TableOpsMemoryLeak
============================================================= test session starts ==============================================================
platform linux -- Python 3.10.12, pytest-6.2.5, py-1.10.0, pluggy-0.13.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/zegan/github/sonic/sonic-swss-common
plugins: leaks-0.3.1
collected 1 item

tests/test_redis_ut.py::test_TableOpsMemoryLeak FAILED
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    def test_TableOpsMemoryLeak():
        OP_COUNT = 50000
        app_db = swsscommon.DBConnector("APPL_DB", 0, True)
        t = swsscommon.Table(app_db, "TABLE")
        long_data = "x" * 100
        fvs = swsscommon.FieldValuePairs([(long_data, long_data)])
        rss = psutil.Process(os.getpid()).memory_info().rss
        for _ in range(OP_COUNT):
            t.set("long_data", fvs)
            t.get("long_data")
>       assert psutil.Process(os.getpid()).memory_info().rss - rss < OP_COUNT
E       AssertionError: assert (82194432 - 43737088) < 50000
E        +  where 82194432 = pmem(rss=82194432, vms=104538112, shared=21266432, text=2818048, lib=0, data=64155648, dirty=0).rss
E        +    where pmem(rss=82194432, vms=104538112, shared=21266432, text=2818048, lib=0, data=64155648, dirty=0) = <bound method Process.memory_info of psutil.Process(pid=282163, name='python3', status='running', started='15:26:59')>()
E        +      where <bound method Process.memory_info of psutil.Process(pid=282163, name='python3', status='running', started='15:26:59')> = psutil.Process(pid=282163, name='python3', status='running', started='15:26:59').memory_info
E        +        where psutil.Process(pid=282163, name='python3', status='running', started='15:26:59') = <class 'psutil.Process'>(282163)
E        +          where <class 'psutil.Process'> = psutil.Process
E        +          and   282163 = <built-in function getpid>()
E        +            where <built-in function getpid> = os.getpid

tests/test_redis_ut.py:880: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/zegan/github/sonic/sonic-swss-common/tests/test_redis_ut.py(880)test_TableOpsMemoryLeak()
-> assert psutil.Process(os.getpid()).memory_info().rss - rss < OP_COUNT
(Pdb) q

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pterosaur if so, please delete the "long_data" key from the table and

assert psutil.Process(os.getpid()).memory_info().rss - rss == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, I will create another PR for teardown.

rss = psutil.Process(os.getpid()).memory_info().rss
for _ in range(OP_COUNT):
t.set("big_data", fvs)
t.get("big_data")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pterosaur delete the "big_data" also after use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why delete it? I didn't test the del OPs, because the del OPs will not come to my SWIG code

@@ -156,7 +156,7 @@
SWIG_Python_AppendOutput($result, temp);
}

%typemap(in, fragment="SWIG_AsPtr_std_string")
%typemap(in, fragment="SWIG_AsVal_std_string")
const std::vector<std::pair< std::string,std::string >,std::allocator< std::pair< std::string,std::string > > > &
(std::vector< std::pair< std::string,std::string >,std::allocator< std::pair< std::string,std::string > > > temp,
Copy link
Contributor

@qiluo-msft qiluo-msft May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

temp is not a reference. Is it heavy copy? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a function argument, just to tell the SWIG to generate a temp variable. The generated code looks like:

image

So, reference isn't available here.

@@ -205,8 +211,10 @@
&& !PyUnicode_Check(value)
&& !PyString_Check(value)) {
$1 = 0;
Py_DECREF(item);
Copy link
Contributor

@qiluo-msft qiluo-msft May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Py_DECREF

It's easy to forget one Py_DECREF. could you encapsulate and improve systematically? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, pleas check it.

Signed-off-by: Ze Gan <zegan@microsoft.com>
@Pterosaur Pterosaur requested a review from qiluo-msft May 25, 2024 01:11
PySequence_GetItem($input, i),
[](PyObject *ptr){
Py_DECREF(ptr);
});
Copy link
Contributor

@qiluo-msft qiluo-msft May 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This long statement is duplicated. Could you further refactor? #WontFix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Options I see:

  1. use a new class to wrap the pointer
  2. make_unique

1 is better.

Copy link
Contributor Author

@Pterosaur Pterosaur May 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like not a good idea to use a new class/function in SWIG code.
Because I need to use some internal functions/MACRO that are generated in the wrap.cpp. So I cannot access them in the outside of this file.

For example, I tried to create a new file like swig_utils.h but got following error:

pyext/swig_utils.h:26:16: error: ‘SWIG_AsVal_std_string’ was not declared in this scope; did you mean ‘SWIGTYPE_p_std__string’?
   26 |     } else if (SWIG_AsVal_std_string(obj, &buffer) != SWIG_ERROR) {
      |                ^~~~~~~~~~~~~~~~~~~~~
      |                SWIGTYPE_p_std__string

Because this function is generated in the swsscommon_wrap.c at compiling time, So I cannot access it in the outside.
image

This function signature isn't fixed so I cannot use a forward-declaration yet.

} else if (SWIG_AsPtr_std_string(value, &ptr)) {
temp.back().second = *ptr;
} else if (SWIG_AsVal_std_string(value, &str) != SWIG_ERROR) {
temp.back().second = str;
} else {
SWIG_fail;
}
Copy link
Contributor

@qiluo-msft qiluo-msft May 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the if-elesif-else block is duplicated. Could you further refactor? #WontFix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel to extract a new function isn't a good idea in our case, same reason as above.

@Pterosaur
Copy link
Contributor Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@Pterosaur Pterosaur requested review from qiluo-msft and prgeor May 26, 2024 01:06
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@qiluo-msft
Copy link
Contributor

@prgeor Do you have more concern?

@lguohan lguohan merged commit 138b32e into sonic-net:master May 28, 2024
17 checks passed
@lguohan
Copy link
Contributor

lguohan commented May 28, 2024

@Pterosaur , thanks for fixing this, can you submit pr to get it fixed in 202311?

Pterosaur added a commit to Pterosaur/sonic-swss-common that referenced this pull request May 29, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
Pterosaur added a commit to Pterosaur/sonic-swss-common that referenced this pull request May 29, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
Pterosaur added a commit to Pterosaur/sonic-swss-common that referenced this pull request May 29, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
@prgeor
Copy link
Contributor

prgeor commented May 30, 2024

@prgeor Do you have more concern?

@Pterosaur @qiluo-msft so its OK to leak < 50000 bytes but not more than that

@prgeor
Copy link
Contributor

prgeor commented May 30, 2024

I am referring to test_TableOpsMemoryLeak() this test

@Pterosaur
Copy link
Contributor Author

@prgeor Do you have more concern?

@Pterosaur @qiluo-msft so its OK to leak < 50000 bytes but not more than that

@prgeor
This test will call set/get 50000 times in the loop, so as long as it leaks more than ONE byte in an iteration, it will be detected.
This case is just for testing the leakage of set/get, if there is another leakage point not in this loop, I don't expect that can be detected by this test case.

yxieca pushed a commit that referenced this pull request Jun 3, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak #859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
arfeigin pushed a commit to arfeigin/sonic-swss-common that referenced this pull request Jun 27, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
Pterosaur added a commit to Pterosaur/sonic-swss-common that referenced this pull request Aug 28, 2024
Fix the issue sonic-net/sonic-buildimage#17025 about Redis set activity

Description
The issue reports a memory leak on the Redis set operations

Reason
Didn't decrease the reference count after PySequence_GetItem
Use the inappropriate Swig API and didn't release the string of SWIG_AsPtr_std_string
Fix:
Refer PR: Fix swig template memory leak sonic-net#859 from @praveenraja1
Replace the API SWIG_AsPtr_std_string to SWIG_AsVal_std_string
Add unit test
To monitor there is no dramatic memory increment after a huge amount of Redis set operations.

Signed-off-by: Ze Gan <zegan@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants