Poor performances and getting stuck #2
Comments
Hi,
That's weird indeed! It's been a while since I ran this code, but I definitely ran more than 61 episodes. In the paper I ran 6 million steps. I'm not sure why it's so slow now, but I often had issues with memory leaks in TensorFlow. That could be something to check.
Poor performance could be related to the reward range of CartPole. You get +1 for every step you survive, while UCT is tailored to total returns in [0,1]. Increasing your c parameter in UCT, or scaling your reward function down, might help.
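To make the scaling point concrete, here is a minimal sketch. It assumes a generic UCB1-style score, which is not necessarily the exact selection rule used in this repository; `uct_score` and `scale_reward` are hypothetical helper names introduced only for illustration.

```python
import math


def uct_score(mean_return, parent_visits, child_visits, c=1.0):
    """Generic UCB1-style score: mean return plus an exploration bonus.

    Assumption: this is a textbook form, not the repository's own formula.
    """
    if child_visits == 0:
        return math.inf  # unvisited actions are tried first
    bonus = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_return + bonus


def scale_reward(reward, max_episode_len):
    """Scale a per-step reward so that the total return stays in [0, 1]."""
    return reward / max_episode_len


# Unscaled: a mean return of 150 dwarfs the exploration bonus (below 1 here).
raw = uct_score(mean_return=150.0, parent_visits=100, child_visits=10)

# Scaled by an assumed episode limit of 200 steps: both terms are comparable.
scaled = uct_score(mean_return=150.0 / 200, parent_visits=100, child_visits=10)
```

With unscaled returns the exploration bonus barely influences which action is selected, so the search behaves almost greedily. Either dividing the rewards by the episode limit or raising `c` to the order of the return range restores the balance.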
Hope that helps!
Best regards,
Thomas
T.M. (Thomas) Moerland
Post-doctoral researcher
Leiden University
https://thomasmoerland.nl
|
Dear Thomas,
thank you again for sharing your work with the international community and for answering my post. I didn't give up and let the script run to see what would happen next. I have copied the log so far at the end of this message; right now it has been running for almost 2 days. As a beginner, I notice the following:
1. When an episode achieves poor results (few steps), the script is relatively fast: seconds or even less.
2. When an episode reaches the limit of 5000 steps, it takes hours, even 10 hours.
3. I also noticed something very strange: two consecutive episodes (112 and 113) both reached the limit of 5000 steps, but one was about 10 times faster than the other. For the same result, the execution times are very different.
4. It is very unstable: it can reach the maximum number of steps, which I assume means "full learning reached", and then forget everything and drop down to 20 steps.
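The difference between episodes 112 and 113 can be quantified as seconds per step. The sketch below parses log lines in the format shown in this thread and assumes that, on CartPole, the total return equals the number of steps (one reward of +1 per step). `seconds_per_step` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# "Finished episode 112, total return: *5000.0, total time: 2791.7 sec*"
LINE = re.compile(
    r"Finished episode (\d+), total return: \*?([\d.]+), total time: ([\d.]+) sec"
)


def seconds_per_step(log_text):
    """Map episode number to wall-clock seconds per environment step."""
    rates = {}
    for match in LINE.finditer(log_text):
        episode = int(match.group(1))
        steps = float(match.group(2))  # return == steps with +1 per step
        seconds = float(match.group(3))
        if steps > 0:
            rates[episode] = seconds / steps
    return rates


log = """\
Finished episode 111, total return: 52.0, total time: 1.0 sec
Finished episode 112, total return: *5000.0, total time: 2791.7 sec*
Finished episode 113, total return: *5000.0, total time: 273.2 sec*
"""

rates = seconds_per_step(log)
for episode, rate in sorted(rates.items()):
    print(f"episode {episode}: {rate:.4f} s/step")
```

On these three lines, episode 112 costs about 0.56 s per step and episode 113 about 0.055 s per step, while the short episode 111 costs about 0.02 s per step. So the cost per step is not constant, even between two episodes of identical length.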
When launching the script I see this warning, which might hint at the reason for the slow performance:
2023-04-03 01:29:30.414774: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-03 01:29:30.422381: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
Do you think I can adjust something to make it run in a more predictable (in terms of computational effort), faster, and more stable (avoiding catastrophic forgetting) way?
Thank you very much.
Federico.
Here is the log:
Finished episode 0, total return: 81.0, total time: 1.7 sec
Finished episode 1, total return: 11.0, total time: 0.1 sec
Finished episode 2, total return: 9.0, total time: 0.1 sec
Finished episode 3, total return: 10.0, total time: 0.1 sec
Finished episode 4, total return: 11.0, total time: 0.1 sec
Finished episode 5, total return: 11.0, total time: 0.2 sec
Finished episode 6, total return: 10.0, total time: 0.1 sec
Finished episode 7, total return: 9.0, total time: 0.1 sec
Finished episode 8, total return: 8.0, total time: 0.1 sec
Finished episode 9, total return: 11.0, total time: 0.1 sec
Finished episode 10, total return: 9.0, total time: 0.1 sec
Finished episode 11, total return: 86.0, total time: 2.6 sec
Finished episode 12, total return: 95.0, total time: 3.0 sec
Finished episode 13, total return: 14.0, total time: 0.2 sec
Finished episode 14, total return: 115.0, total time: 2.8 sec
Finished episode 15, total return: 11.0, total time: 0.1 sec
Finished episode 16, total return: 12.0, total time: 0.1 sec
Finished episode 17, total return: 23.0, total time: 0.4 sec
Finished episode 18, total return: 21.0, total time: 0.4 sec
Finished episode 19, total return: 13.0, total time: 0.2 sec
Finished episode 20, total return: 23.0, total time: 0.3 sec
Finished episode 21, total return: 18.0, total time: 0.2 sec
Finished episode 22, total return: 20.0, total time: 0.2 sec
Finished episode 23, total return: 32.0, total time: 0.5 sec
Finished episode 24, total return: 22.0, total time: 0.3 sec
Finished episode 25, total return: 20.0, total time: 0.3 sec
Finished episode 26, total return: 21.0, total time: 0.3 sec
Finished episode 27, total return: 34.0, total time: 0.7 sec
Finished episode 28, total return: 25.0, total time: 0.4 sec
Finished episode 29, total return: 22.0, total time: 0.3 sec
Finished episode 30, total return: 18.0, total time: 0.3 sec
Finished episode 31, total return: 21.0, total time: 0.3 sec
Finished episode 32, total return: 24.0, total time: 0.3 sec
Finished episode 33, total return: 31.0, total time: 0.6 sec
Finished episode 34, total return: 16.0, total time: 0.3 sec
Finished episode 35, total return: 39.0, total time: 0.7 sec
Finished episode 36, total return: 18.0, total time: 0.3 sec
Finished episode 37, total return: 17.0, total time: 0.2 sec
Finished episode 38, total return: 21.0, total time: 0.3 sec
Finished episode 39, total return: 25.0, total time: 0.3 sec
Finished episode 40, total return: 30.0, total time: 0.5 sec
Finished episode 41, total return: 26.0, total time: 0.5 sec
Finished episode 42, total return: 31.0, total time: 0.5 sec
Finished episode 43, total return: 127.0, total time: 2.6 sec
Finished episode 44, total return: 221.0, total time: 17.0 sec
Finished episode 45, total return: 22.0, total time: 0.4 sec
Finished episode 46, total return: 88.0, total time: 1.7 sec
Finished episode 47, total return: 15.0, total time: 0.3 sec
Finished episode 48, total return: 13.0, total time: 0.2 sec
Finished episode 49, total return: 56.0, total time: 0.9 sec
Finished episode 50, total return: 19.0, total time: 0.3 sec
Finished episode 51, total return: 20.0, total time: 0.4 sec
Finished episode 52, total return: 47.0, total time: 0.7 sec
Finished episode 53, total return: 21.0, total time: 0.3 sec
Finished episode 54, total return: 40.0, total time: 0.6 sec
Finished episode 55, total return: 42.0, total time: 0.9 sec
Finished episode 56, total return: 13.0, total time: 0.2 sec
Finished episode 57, total return: 40.0, total time: 0.6 sec
Finished episode 58, total return: 63.0, total time: 1.5 sec
Finished episode 59, total return: 80.0, total time: 1.4 sec
Finished episode 60, total return: 36.0, total time: 0.9 sec
Finished episode 61, total return: 81.0, total time: 1.3 sec
Finished episode 62, total return: 5000.0, total time: 36745.2 sec
Finished episode 63, total return: 92.0, total time: 3.1 sec
Finished episode 64, total return: 34.0, total time: 0.7 sec
Finished episode 65, total return: 24.0, total time: 0.4 sec
Finished episode 66, total return: 32.0, total time: 0.6 sec
Finished episode 67, total return: 103.0, total time: 2.1 sec
Finished episode 68, total return: 17.0, total time: 0.2 sec
Finished episode 69, total return: 10.0, total time: 0.1 sec
Finished episode 70, total return: 17.0, total time: 0.3 sec
Finished episode 71, total return: 13.0, total time: 0.2 sec
Finished episode 72, total return: 29.0, total time: 0.5 sec
Finished episode 73, total return: 52.0, total time: 0.9 sec
Finished episode 74, total return: 117.0, total time: 1.9 sec
Finished episode 75, total return: 79.0, total time: 1.5 sec
Finished episode 76, total return: 122.0, total time: 2.2 sec
Finished episode 77, total return: 177.0, total time: 6.5 sec
Finished episode 78, total return: 77.0, total time: 1.4 sec
Finished episode 79, total return: 71.0, total time: 1.2 sec
Finished episode 80, total return: 17.0, total time: 0.2 sec
Finished episode 81, total return: 44.0, total time: 0.8 sec
Finished episode 82, total return: 28.0, total time: 0.5 sec
Finished episode 83, total return: 58.0, total time: 1.0 sec
Finished episode 84, total return: 11.0, total time: 0.2 sec
Finished episode 85, total return: 14.0, total time: 0.2 sec
Finished episode 86, total return: 10.0, total time: 0.2 sec
Finished episode 87, total return: 11.0, total time: 0.2 sec
Finished episode 88, total return: 29.0, total time: 0.5 sec
Finished episode 89, total return: 49.0, total time: 0.9 sec
Finished episode 90, total return: 18.0, total time: 0.3 sec
Finished episode 91, total return: 21.0, total time: 0.4 sec
Finished episode 92, total return: 41.0, total time: 1.0 sec
Finished episode 93, total return: 83.0, total time: 1.2 sec
Finished episode 94, total return: 305.0, total time: 9.6 sec
Finished episode 95, total return: 46.0, total time: 0.9 sec
Finished episode 96, total return: 119.0, total time: 4.1 sec
Finished episode 97, total return: 11.0, total time: 0.2 sec
Finished episode 98, total return: 229.0, total time: 6.9 sec
Finished episode 99, total return: 294.0, total time: 25.2 sec
Finished episode 100, total return: 43.0, total time: 0.7 sec
Finished episode 101, total return: 52.0, total time: 0.8 sec
Finished episode 102, total return: 32.0, total time: 0.6 sec
Finished episode 103, total return: 12.0, total time: 0.2 sec
Finished episode 104, total return: 20.0, total time: 0.4 sec
Finished episode 105, total return: 12.0, total time: 0.2 sec
Finished episode 106, total return: 112.0, total time: 2.5 sec
Finished episode 107, total return: 35.0, total time: 0.6 sec
Finished episode 108, total return: 38.0, total time: 0.7 sec
Finished episode 109, total return: 70.0, total time: 1.1 sec
Finished episode 110, total return: 275.0, total time: 8.1 sec
Finished episode 111, total return: 52.0, total time: 1.0 sec
Finished episode 112, total return: *5000.0, total time: 2791.7 sec*
Finished episode 113, total return: *5000.0, total time: 273.2 sec*
Finished episode 114, total return: 144.0, total time: 2.8 sec
Finished episode 115, total return: 63.0, total time: 1.0 sec
Finished episode 116, total return: 75.0, total time: 1.3 sec
Finished episode 117, total return: 35.0, total time: 0.6 sec
Finished episode 118, total return: 83.0, total time: 1.4 sec
Finished episode 119, total return: 46.0, total time: 0.7 sec
Finished episode 120, total return: 98.0, total time: 1.6 sec
Finished episode 121, total return: 80.0, total time: 1.3 sec
Finished episode 122, total return: 5000.0, total time: 621.0 sec
Finished episode 123, total return: 44.0, total time: 0.8 sec
Finished episode 124, total return: 30.0, total time: 0.5 sec
Finished episode 125, total return: 46.0, total time: 0.8 sec
Finished episode 126, total return: 17.0, total time: 0.2 sec
Finished episode 127, total return: 234.0, total time: 6.3 sec
Finished episode 128, total return: 5000.0, total time: 26232.6 sec
Finished episode 129, total return: 261.0, total time: 6.1 sec
Finished episode 130, total return: 5000.0, total time: 13568.5 sec
Finished episode 131, total return: 5000.0, total time: 17371.6 sec
Finished episode 132, total return: 5000.0, total time: 3384.3 sec
Finished episode 133, total return: 82.0, total time: 1.3 sec
Finished episode 134, total return: 27.0, total time: 0.5 sec
Finished episode 135, total return: 46.0, total time: 0.8 sec
Finished episode 136, total return: 168.0, total time: 10.9 sec
Finished episode 137, total return: 5000.0, total time: 3458.9 sec
Finished episode 138, total return: 124.0, total time: 1.9 sec
Finished episode 139, total return: 53.0, total time: 0.8 sec
Finished episode 140, total return: 4158.0, total time: 3789.5 sec
Finished episode 141, total return: 78.0, total time: 1.2 sec
Finished episode 142, total return: 56.0, total time: 0.8 sec
Finished episode 143, total return: 16.0, total time: 0.2 sec
Finished episode 144, total return: 51.0, total time: 0.8 sec
Finished episode 145, total return: 49.0, total time: 0.8 sec
Finished episode 146, total return: 13.0, total time: 0.2 sec
Finished episode 147, total return: 84.0, total time: 1.4 sec
Finished episode 148, total return: 1100.0, total time: 52.8 sec
Finished episode 149, total return: 44.0, total time: 0.9 sec
|
Hi Federico,
I wouldn't run 5000 steps per episode on CartPole, but 500 or 200 (I guess that's the default?).
It's weird that episodes of similar length have such different runtimes. I have no clue what could cause that. You'd have to log the runtime of each function call to figure out where the issue occurs. But the first thing I would look at is memory leakage. Can you check whether your RAM fills up after a while?
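One way to log the runtime per function is a small timing decorator, sketched below with the standard library only. The decorated `search_step` is a hypothetical stand-in, not a function from this repository; in practice the decorator would go on the search, environment-copy, and network-call functions.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated statistics per function name.
call_stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def timed(fn):
    """Record call count and total wall-clock time of the wrapped function."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            stats = call_stats[fn.__name__]
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


@timed
def search_step(n):
    """Hypothetical stand-in for an expensive function in the training loop."""
    return sum(i * i for i in range(n))


def report():
    """Print functions sorted by total time spent, slowest first."""
    ranked = sorted(call_stats.items(), key=lambda kv: -kv[1]["seconds"])
    for name, stats in ranked:
        print(f"{name}: {stats['calls']} calls, {stats['seconds']:.4f} s total")
```

Calling `report()` at the end of each episode would show which function accounts for the extra time in the slow episodes. For a finer breakdown without touching the code, the standard `cProfile` module is an alternative.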
Best,
Thomas
T.M. (Thomas) Moerland
Post-doctoral researcher
Leiden University
https://thomasmoerland.nl
|
Hi again, as suggested I tracked the RSS (physical) and VMS (virtual) memory used by the running process at the end of each episode, and kept track of the maximum values over the steps of each episode. The long log is at the bottom. This time I ran at most 400 episodes with a maximum length of 200 steps per episode.
What I can see is that the RSS memory increases slightly, more often and by larger amounts when the execution time is longer, but this is not a strict rule. At a certain point, RSS stops increasing. It never decreases. The VMS increases at first, but from a certain point on it alternately increases and decreases. I think memory leakage is not the reason for the delay. Could you share the performance on your system for "max_ep_len 500"? Also, do you see the same warning I see?
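For reference, per-episode memory tracking of this kind could look roughly like the sketch below. It is not the script actually used for the measurements above. It assumes the third-party `psutil` package for the default sampler; `MemoryTracker` and `rss_still_growing` are hypothetical names.

```python
def psutil_sample():
    """Return (rss, vms) in bytes for the current process.

    Assumption: the third-party psutil package is installed. It is imported
    lazily so the tracker can also be used with a custom sampler.
    """
    import psutil
    info = psutil.Process().memory_info()
    return info.rss, info.vms


class MemoryTracker:
    """Record RSS and VMS once per episode and check whether RSS plateaus."""

    def __init__(self, sample=psutil_sample):
        self.sample = sample
        self.history = []  # list of (episode, rss, vms)

    def record(self, episode):
        rss, vms = self.sample()
        self.history.append((episode, rss, vms))

    def rss_still_growing(self, window=50):
        """True if RSS is higher now than `window` records earlier."""
        if len(self.history) < 2:
            return False
        past = self.history[max(0, len(self.history) - 1 - window)]
        return self.history[-1][1] > past[1]
```

Calling `tracker.record(ep)` at the end of every episode and checking `rss_still_growing()` late in the run distinguishes a plateau, as described above, from steady growth, which would point to a leak.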
Here is the log:
|
Hi Thomas, I hope everything is fine. Did the last attempt at tracking memory usage give you any ideas on how to fix the problem? What runtimes do you see on your machine when running the algorithm with a higher number of max steps and episodes? Thank you.
Dear,
thank you for sharing your code. I made some modifications just to get it running under TensorFlow 2, avoiding some errors with placeholders and with importing the slim library. I ran it on CartPole with the default settings; I only increased the maximum number of steps to see how much it could learn. I have to say that the performance is quite poor, both in terms of speed (though that could be a problem with my OS and hardware configuration) and in terms of results. At a certain point it gets stuck: even though the log shows a few seconds per episode, each one takes hours. Now, after 10 runs, it has completed just 61 episodes, with a number of steps only slightly higher than a random run. Have you ever tried running it with the max number of steps increased to 5000? What were your results? Thank you.