Renoir Tuning Guide

Falcosc edited this page Jan 5, 2022 · 39 revisions

How to properly remove limits on your Mobile Renoir

Preparation

Look for a monitoring tool that gives you access to the prochot thermal throttle reason and to frequencies. All other values don't need advanced monitoring tools and can be accessed directly through the RyzenAdj PM table integration.

RyzenAdj has an --info and a --dump-table option. The --info option already contains the most useful values, but it isn't a proper monitoring tool.

If you are on Windows, HWiNFO is a perfect addition to ryzenAdj --info. It is great at monitoring frequency and prochot problems. For all other use cases the --info option is good enough. If HWiNFO doesn't have all the values you need, you can extend it (an example of how to do this can be found in readjustService.ps1).

For Linux, you need some special things too

Selection of the right values

Before we talk about battery management, multiple power profiles or unexpected value resets from hidden software, you should first check which values actually need to be changed.

To see all of them in action, just run any kind of stress test and execute ryzenAdj --info during peak load. Execute some benchmark and keep track of your score to monitor your optimization progress.
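To compare limits and values during load without reading the table by eye, the --info output can be parsed. This is a minimal sketch, assuming the output is a pipe-delimited table with Name, Value and Parameter columns and that each "... LIMIT ..." row has a matching "... VALUE ..." row. The sample text and the helper names (parse_info_table, limits_in_reach) are made up for illustration, so check them against what your RyzenAdj version actually prints.

```python
def parse_info_table(text):
    """Return {name: value} for every table row that has a numeric value."""
    values = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        try:
            values[cells[0]] = float(cells[1])
        except ValueError:
            continue  # header, separator or non-numeric row
    return values


def limits_in_reach(values, margin=0.98):
    """List the limits whose measured value is at or above margin * limit."""
    hits = []
    for name, limit in values.items():
        if "LIMIT" not in name or limit <= 0:
            continue
        measured = values.get(name.replace("LIMIT", "VALUE", 1))
        if measured is not None and measured >= margin * limit:
            hits.append(name)
    return hits


# Hypothetical excerpt, typed by hand for this example (not real measurements).
SAMPLE = """\
|        Name         |   Value   |      Parameter      |
|---------------------|-----------|---------------------|
| PPT LIMIT FAST      |    30.000 | fast-limit          |
| PPT VALUE FAST      |    28.500 |                     |
| PPT LIMIT SLOW      |    25.000 | slow-limit          |
| PPT VALUE SLOW      |    24.900 |                     |
"""

if __name__ == "__main__":
    print(limits_in_reach(parse_info_table(SAMPLE)))
```

In real use the text would come from running ryzenAdj --info during peak load; the limits returned are the ones worth looking at first.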

--stapm-limit / STAPM Limit

Has no effect on Renoir if STTv2 is enabled (the default). STT overwrites the STAPM limit with the fast limit or, if STT detects over-temperature, with the STT power value (--skin-temp-limit).

--fast-limit / PPT Limit Fast

Limits your PPT VALUE FAST (actual power draw).

This is your maximum boost limit. In simple terms: the power limit for roughly the number of seconds set by SlowPPTTimeConstant (--slow-time).

In practice, it only applies as long as PPT VALUE SLOW has not reached PPT LIMIT SLOW.

  • PPT VALUE SLOW (average power draw) gets controlled by SlowPPTTimeConstant(--slow-time)
  • PPT VALUE FAST is the current power draw

--slow-limit / PPT Limit Slow

Limits your PPT VALUE SLOW (average power draw).

Don't set your average power draw too high, or you will lose your boost performance.

Select a value that keeps long-term temperatures 10°C to 20°C below your maximum temperature. That 10-20°C of headroom is what boosting draws on, and it makes your system more responsive after a demanding load has finished.
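The selection rule can be written down as a small helper. This is only a sketch: the function name pick_slow_limit and the measurement pairs are hypothetical, and you would replace them with steady-state temperatures you measured yourself for each --slow-limit you tried.

```python
def pick_slow_limit(measurements, max_temp_c, headroom_c=10.0):
    """Pick the highest tested slow limit that leaves the wanted headroom.

    measurements: list of (slow_limit_w, steady_temp_c) pairs from your own
    stress tests. Returns None if no tested limit leaves enough headroom.
    """
    target_c = max_temp_c - headroom_c
    candidates = [limit for limit, temp in measurements if temp <= target_c]
    return max(candidates) if candidates else None


# Made-up example data: (slow limit in W, long-term temperature in °C).
EXAMPLE_RUNS = [(15, 78), (20, 86), (25, 93), (30, 99)]

if __name__ == "__main__":
    print(pick_slow_limit(EXAMPLE_RUNS, max_temp_c=100, headroom_c=10))
```

A larger headroom_c trades sustained performance for more room to boost after a long load.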

--slow-time / SlowPPTTimeConstant

The time constant controls how long you are able to use PPT LIMIT FAST (--fast-limit), but it does not mean that boost lasts exactly X seconds. Instead, the number of seconds is used for the average power draw calculation of PPT VALUE SLOW. Basically, it determines how many past power consumption values go into the average. The larger this value, the longer it takes for PPT VALUE SLOW to climb to PPT LIMIT SLOW (--slow-limit).
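To get a feeling for this, the average can be modelled as an exponential moving average. This is an assumption made for illustration, since the real SMU calculation is not documented here; the function name and the wattages are hypothetical too. The sketch estimates how long a constant load at the fast limit can run before the average reaches the slow limit.

```python
import math


def seconds_until_slow_limit(fast_limit_w, slow_limit_w, slow_time_s,
                             start_avg_w=5.0, step_s=0.1):
    """Estimate boost duration under an assumed exponential moving average.

    The load is assumed to draw exactly fast_limit_w the whole time and the
    average is assumed to start at start_avg_w (roughly idle).
    """
    if fast_limit_w <= slow_limit_w:
        return math.inf  # the average can never climb up to the slow limit
    # Smoothing factor of an EMA with time constant slow_time_s.
    alpha = 1.0 - math.exp(-step_s / slow_time_s)
    avg = start_avg_w
    elapsed = 0.0
    while avg < slow_limit_w:
        avg += alpha * (fast_limit_w - avg)
        elapsed += step_s
    return elapsed


if __name__ == "__main__":
    for slow_time in (5, 10, 30):
        t = seconds_until_slow_limit(30.0, 25.0, slow_time)
        print(f"slow-time {slow_time}s -> about {t:.1f}s at the fast limit")
```

Under this model, doubling --slow-time doubles the time spent at the fast limit, and the boost lasts longer when the average starts lower, which matches the point that the setting is not a fixed boost duration.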

--stapm-time / StapmTimeConstant

Has no effect on Renoir; the value doesn't even get applied to the PM table.

That's a shame, because a second time constant would be very useful for controlling long-term heat soak. Without it, we have to turn down PPT LIMIT SLOW (--slow-limit) to keep a little temperature headroom for boosting. This planned temperature headroom is basically wasted performance. I'm not sure whether other AMD platforms support --stapm-limit and --stapm-time as a second long-term control.

On the other hand, not having a second long-term control does make some sense to me. I don't know how it would improve the ability to boost after a long period of system load. In my opinion, a second power average, long enough to account for heat soak of the whole device, would behave very much like --tctl-temp. Maybe this is a question for a thermodynamic engineer.

--tctl-temp / THM Limit (Core, GFX, SOC)

Put your temperature target here. But be careful: there is a hidden prochot fail-safe thermal limit. If you select too high a value, prochot gets triggered and limits your CPU power to below 4W, which nearly freezes the system. So your number one goal should be prochot prevention, because all the small tuning gains mean nothing if the overall system experience suffers.

For example:

  • My device has a max temp of 100°C; if I enter 102°C, only 100°C gets applied.
  • I guess my prochot limit is around 102-105°C, so you might think 100°C is safe. Wrong!
  • Sustained workloads are fine; the temperature is held between 99.9°C and 100.3°C.
  • Workload demand spikes are not fine, because thermal management has a delay of around 1000ms. This is a problem with extreme changes in workload. Even with a sensor polling interval of 2000ms I caught single readings of 103°C; prochot gets triggered and the whole system is limited to 4W for some seconds.
  • Using 97°C fixed the prochot issue for transient workloads.

Because the prochot temperature limit is unknown, you need to check whether it was triggered whenever you see an unexpected system slowdown. HWiNFO has values for this: Thermal Throttling (PROCHOT EXT) and Thermal Throttling (PROCHOT CPU). Lower your thermal limit step by step until prochot no longer triggers. Each vendor can define the prochot limit differently, which means you can't rely on recommendations and have to monitor your own device until you manage to avoid it.
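The step-by-step lowering can be sketched as follows. The helper names, the 1°C step and the floor of 85°C are assumptions chosen for illustration; the prochot observations have to come from your own monitoring (for example the HWiNFO throttling values), one entry per test run.

```python
def settle_tctl_target(start_c, prochot_per_run, step_c=1, floor_c=85):
    """Lower the target after every run in which prochot was observed.

    prochot_per_run: one boolean per test run, in order, telling whether
    prochot triggered while running at the target that was current then.
    floor_c is an arbitrary safety floor so the loop cannot run away.
    """
    target = start_c
    for prochot_seen in prochot_per_run:
        if not prochot_seen:
            break  # this target survived a run, keep it
        target = max(floor_c, target - step_c)
    return target


def tctl_command(target_c):
    """Build the argument list for applying the target with ryzenadj."""
    return ["ryzenadj", f"--tctl-temp={target_c}"]


if __name__ == "__main__":
    # Example: prochot seen at 100, 99 and 98, clean run at 97.
    target = settle_tctl_target(100, [True, True, True, False])
    print(tctl_command(target))
```

The returned list could be passed to subprocess.run with the required administrator rights; the sketch only builds the command and does not execute anything.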

Here is an example why the --tctl-temp=100 is not checked/enforced fast enough and why you need to set it 5-10°C lower than your prochot limit:

--vrm-current / TDC Limit VDD

You only need to change this limit if the PM table reports this value as the reason for not reaching your target package wattage.

--vrmmax-current / EDC Limit VDD

Similar to TDC Limit VDD (--vrm-current): only raise this if it holds you back. This value matters more for the GPU, though. The GPU can pull a lot more amps without hitting your temperature or power target. Maybe the very aggressive GPU boost algorithm is the reason for that. So don't forget to test GPU benchmarks during your power management setup.

--vrmsoc-current / TDC Limit SoC and --vrmsocmax-current / EDC Limit SoC

SoC power is very unlikely to cause trouble because it only covers things like the memory controller.

--psi0-current / PSI0 Limit VDD and --psi0soc-current / PSI0 Limit SoC

These are not used on my Renoir device. You will see your value changes in the PM table, but my device reports no power measurement for them.

--prochot-deassertion-ramp

It would be nice to make the prochot limit a little less harsh on your system. Unfortunately, this option only controls the power after you have endured the 5 seconds of painfully slow 4W performance. I still don't know how prochot-deassertion-ramp works exactly; at least its unit is neither seconds nor milliseconds. I can only confirm that it governs the limit enforced after prochot gets released. With values below ~256 (0x100) nothing seems to happen: the limit after prochot appears to exceed my thermal capacity right after a prochot. That means that after a fixed duration of prochot the limit is so loose that power jumps straight from 4W (the prochot power limit) to the maximum draw possible at my current tctl temperature (maybe 25W, because I was already at 100°C). Larger values, however, do limit my power draw after prochot:

  • Before prochot: 45W to 30W depending on temperature, until prochot kicks in.
  • During prochot: 4W.
  • After prochot gets released: 15W for a value of ~300, or 6W for values like ~20,000.
  • After prochot gets released, the limit stays for at least 10 minutes (I didn't test longer than 10 minutes for the small value of 300).

After 10 minutes of running Cinebench at 15W I set a value of 64, which is so low that power draw goes up to maximum right after applying it. I then did a run with 20,000, which cools my laptop down very quickly because such large values leave me only ~6W after prochot gets released. Given the fast-falling temperature I expected the power draw to change. But after prochot, the limit stays at ~6W until I lower the prochot-deassertion-ramp value.

--apu-skin-temp / STT Limit APU

There is an industry standard for the maximum permitted surface temperature of electronic devices. I guess that is why Renoir got a new thermal limit, "skin temperature". In HWiNFO you will find a "CPU Skin Temperature" value; this option controls its limit. The readings of this sensor get really close to the actual surface temperature hotspot of my device. This is one of the hidden values that cause unexpected system slowdowns. If you know where you shouldn't touch your laptop, you can raise this temperature to 50°C or 55°C. But don't go crazy with this value; some devices may rely on it as a workaround for wrong material selection. Unfortunately, you cannot change the temperature power limit. My device uses a 15W limit for skin over-temperature protection.

--dgpu-skin-temp / STT Limit dGPU

Same as STT Limit APU (--apu-skin-temp), but only useful for devices with a dedicated GPU.

--skin-temp-limit

Controls the power limit used when the temperature is over STT Limit APU (--apu-skin-temp) or STT Limit dGPU (--dgpu-skin-temp), to avoid too high temperatures on the device surface. This value should allow your device to reduce the surface temperature, so it needs to be below the real cooling capability of your device. If it is too high, your device will not be able to control the skin temperature.

The value needs to be lower than PPT Limit Slow (--slow-limit) to make sense.

This value is not part of the Renoir PM table. If you want to know which value the manufacturer of your device defined, you need to find it by testing. You could set STT Limit APU (--apu-skin-temp) to 20°C; you can then see the default value in power usage monitoring.

For my device it was 15W, which is very close to my PPT Limit Slow (--slow-limit), so it already fits very well. I did not need to change it; instead I raised STT Limit APU (--apu-skin-temp).
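The rule that the skin temperature power limit has to stay below the slow limit can be checked before anything is applied. This is a sketch under assumptions: the helper name skin_temp_args is hypothetical, and it assumes --skin-temp-limit takes milliwatts and --apu-skin-temp takes °C, which you should confirm against the help output of your RyzenAdj version.

```python
def skin_temp_args(apu_skin_temp_c, skin_temp_limit_w, slow_limit_w):
    """Build ryzenadj arguments for the skin temperature settings.

    Raises ValueError if the power limit would not be below the slow limit,
    because in that case it could not bring the surface temperature down.
    """
    if skin_temp_limit_w >= slow_limit_w:
        raise ValueError(
            "skin temperature power limit must be lower than the slow limit"
        )
    return [
        f"--apu-skin-temp={apu_skin_temp_c}",
        # Assumption: power limits are passed in mW, so convert from W.
        f"--skin-temp-limit={round(skin_temp_limit_w * 1000)}",
    ]


if __name__ == "__main__":
    # Example values only: raise the skin temperature limit to 50°C and keep
    # a 15W skin temperature power limit below a 20W slow limit.
    print(["ryzenadj"] + skin_temp_args(50, 15, 20))
```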

--apu-slow-limit / PPT Limit APU

You can change the reported PM table value on Renoir, but on my device the power usage is always reported as 0, so it doesn't do anything.

--power-saving (DC-Mode-Tune with boost delay)

On Zen2 it limits the clock to 2500 MHz for about 10 seconds; on Zen+ it limits it to 2400 MHz.

This feature was introduced to reduce idle power consumption and improve idle battery runtime. If you have a lot of wasteful background apps, it can cut the package power consumption in half, from 4W to 2W.

It gets set automatically when you unplug the power cable. On most devices it sets 4 different clock parameters to a threshold of 95%, which means it takes around 10 seconds until you get the full boost. We are not sure what else is included in this DC-mode power saving profile.

Because it gets set automatically on battery, this option is only useful if you want to save power while on AC power. You don't need to apply it manually while on battery.

If you care about the environment, using this on AC power is most effective with non-continuous workloads. Even watching hardware-accelerated video or reading webpages is sometimes treated as idle by the CPU, so the option has a noticeable effect on power draw. You can even use it to keep your notebook a bit cooler and quieter in these light workloads.

--max-performance (AC-Mode-Tune without boost delay)

This affects how fast you reach your max boost clock. It is the opposite of the --power-saving option; only one of them can be applied.

It gets set automatically when you plug your power cable in. On most devices it sets 4 different clock parameters to a threshold of 50%, which means you get the full boost quickly. We are not sure what else is included in this AC-mode performance profile.

Because it gets set automatically on AC, this option is only useful if you want to improve responsiveness while on battery. You don't need to apply it manually if your AC power adapter is connected.

Gaming is not affected by the boost delay because you have a constant load on the CPU.

If your use case jumps a lot between idle and load, you could see up to 50% improvement from using --max-performance on battery. But you will trade away some of your idle battery runtime, so it is best to use this option only on demand.
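Since each profile is applied automatically for its own power source, a manual flag only makes sense when you want the opposite of the default. A minimal sketch of that decision, with a hypothetical helper name and without any real power source detection (you would supply on_ac yourself):

```python
def tune_flag(on_ac, want_responsiveness):
    """Return the flag that overrides the automatic profile, or None.

    On AC the performance profile is already active, and on battery the
    power saving profile is already active, so those cases need no flag.
    """
    if on_ac and not want_responsiveness:
        return "--power-saving"
    if not on_ac and want_responsiveness:
        return "--max-performance"
    return None


if __name__ == "__main__":
    flag = tune_flag(on_ac=False, want_responsiveness=True)
    command = ["ryzenadj"] + ([flag] if flag else [])
    print(command)
```

Returning None for the default cases keeps the tool from being called when the automatic profile already matches what you want.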