[v2,3/4] lib/power: add get and set API for scaling freq min and max with pstate mode
Commit Message
Add new get/set API to allow the user or application to set the minimum
and maximum frequencies to use when scaling.
Previously, the frequency range was determined by the HW capabilities of
the CPU. With this new API, the user or application can constrain this
if required.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
---
lib/power/power_pstate_cpufreq.c | 22 +++++++--
lib/power/rte_power_pmd_mgmt.c | 65 ++++++++++++++++++++++++++
lib/power/rte_power_pmd_mgmt.h | 80 ++++++++++++++++++++++++++++++++
lib/power/version.map | 4 ++
4 files changed, 166 insertions(+), 5 deletions(-)
Comments
Kevin Laatz <kevin.laatz@intel.com> writes:
> Add new get/set API to allow the user or application to set the minimum
> and maximum frequencies to use when scaling.
> Previously, the frequency range was determined by the HW capabilities of
> the CPU. With this new API, the user or application can constrain this
> if required.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> ---
> lib/power/power_pstate_cpufreq.c | 22 +++++++--
> lib/power/rte_power_pmd_mgmt.c | 65 ++++++++++++++++++++++++++
> lib/power/rte_power_pmd_mgmt.h | 80 ++++++++++++++++++++++++++++++++
> lib/power/version.map | 4 ++
> 4 files changed, 166 insertions(+), 5 deletions(-)
>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
On 19-Apr-22 12:25 PM, Kevin Laatz wrote:
> Add new get/set API to allow the user or application to set the minimum
> and maximum frequencies to use when scaling.
> Previously, the frequency range was determined by the HW capabilities of
> the CPU. With this new API, the user or application can constrain this
> if required.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> ---
<snip>
>
> +int
> +rte_power_pmd_mgmt_set_scaling_freq_min(unsigned int lcore, unsigned int min)
> +{
> + if (lcore >= RTE_MAX_LCORE) {
> + RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
> + rte_errno = EINVAL;
> + return -1;
> + }
> + scale_freq_min[lcore] = min;
Are there any constraints on the value ranges, or are we just going to
accept any and all values? If the idea was to allow valid values plus
some special "default" value, you can still restrict the range, but
allow 0 as a special case?
> +
> + return 0;
> +}
> +
> +int
> +rte_power_pmd_mgmt_set_scaling_freq_max(unsigned int lcore, unsigned int max)
> +{
> + if (lcore >= RTE_MAX_LCORE) {
> + RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
> + rte_errno = EINVAL;
> + return -1;
> + }
> + scale_freq_max[lcore] = max;
Same as above. Also, do we want UINT32_MAX to be the "special" value for
the "max" case? What do you think of having "0" as "not set", but maybe
set it internally to UINT32_MAX if you still want to keep using the
RTE_MIN/MAX macros?
> +
> + return 0;
> +}
> +
> +int
> +rte_power_pmd_mgmt_get_scaling_freq_min(unsigned int lcore)
> +{
<snip>
> diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
> index 18a9c3abb5..74e3fa710b 100644
> --- a/lib/power/rte_power_pmd_mgmt.h
> +++ b/lib/power/rte_power_pmd_mgmt.h
> @@ -148,6 +148,86 @@ __rte_experimental
> unsigned int
> rte_power_pmd_mgmt_get_pause_duration(void);
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
> + *
> + * Set the min frequency to be used for frequency scaling.
> + *
> + * @note Supported by: Pstate mode.
> + *
> + * @param lcore
> + * The ID of the lcore to set the min frequency for.
> + * @param min
> + * The value, in Hertz, to set the minimum frequency to.
Is it really in Hertz? As far as I can tell, it's in steps of 100MHz
(BUS_FREQ).
On 18/05/2022 10:05, Burakov, Anatoly wrote:
> On 19-Apr-22 12:25 PM, Kevin Laatz wrote:
>> Add new get/set API to allow the user or application to set the minimum
>> and maximum frequencies to use when scaling.
>> Previously, the frequency range was determined by the HW capabilities of
>> the CPU. With this new API, the user or application can constrain this
>> if required.
>>
>> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
>> ---
>
> <snip>
>
>> +int
>> +rte_power_pmd_mgmt_set_scaling_freq_min(unsigned int lcore, unsigned int min)
>> +{
>> + if (lcore >= RTE_MAX_LCORE) {
>> + RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
>> + rte_errno = EINVAL;
>> + return -1;
>> + }
>> + scale_freq_min[lcore] = min;
>
> Are there any constraints on the value ranges, or are we just going to
> accept any and all values? If the idea was to allow valid values plus
> some special "default" value, you can still restrict the range, but
> allow 0 as a special case?
When writing min/max values to HW the values are clamped. Since the API
takes unsigned integer for the frequency value (in this case 'min'), any
value can be considered as 'valid'.
That being said, this should at least check that min <= max for the same
lcore. I'll add this for v3.
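
As an illustration of the clamping described above, here is a minimal sketch of how a configured range could be narrowed against the range read from sysfs. It mirrors the RTE_MAX/RTE_MIN step in power_get_available_freqs() from this patch, but `struct freq_range` and `effective_freq_range()` are hypothetical names used only for this example, and all values are assumed to be in kHz.

```c
#include <stdint.h>

/* Hypothetical helper type: an inclusive frequency range in kHz. */
struct freq_range {
	uint32_t min;
	uint32_t max;
};

/*
 * Narrow the sysfs range [sys_min, sys_max] by the configured limits.
 * cfg_min == 0 and cfg_max == UINT32_MAX leave the sysfs range unchanged,
 * which is why they work as "not set" defaults.
 * Returns 0 and fills *out, or -1 if the resulting range would be empty.
 */
int
effective_freq_range(uint32_t sys_min, uint32_t sys_max,
		uint32_t cfg_min, uint32_t cfg_max, struct freq_range *out)
{
	uint32_t lo = cfg_min > sys_min ? cfg_min : sys_min;
	uint32_t hi = cfg_max < sys_max ? cfg_max : sys_max;

	if (hi < lo)
		return -1;

	out->min = lo;
	out->max = hi;
	return 0;
}
```

With a sysfs range of 800000..3500000 kHz, the defaults return that range untouched, while a configured minimum above the sysfs maximum is reported as an error rather than silently accepted.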
>
>> +
>> + return 0;
>> +}
>> +
>> +int
>> +rte_power_pmd_mgmt_set_scaling_freq_max(unsigned int lcore, unsigned int max)
>> +{
>> + if (lcore >= RTE_MAX_LCORE) {
>> + RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
>> + rte_errno = EINVAL;
>> + return -1;
>> + }
>> + scale_freq_max[lcore] = max;
>
> Same as above. Also, do we want UINT32_MAX to be the "special" value for
> the "max" case? What do you think of having "0" as "not set", but
> maybe set it internally to UINT32_MAX if you still want to keep using
> the RTE_MIN/MAX macros?
Similar to 'set_scaling_freq_min', the value will be clamped by HW so
any value can be considered 'valid'. I don't see the benefit of having
"0" for not set, since UINT32_MAX will achieve the same result, i.e. the
value won't be used (it will fall back to the max value in sysfs). Do you
have a use-case for it if we don't need a 'special case'?
Will add a check to make sure max >= min for v3.
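
For reference, a minimal sketch of what the min <= max ordering checks could look like. This is not the v3 code: `MAX_LCORE`, `scaling_freq_reset()` and the `-EINVAL` return convention are assumptions made to keep the example standalone (the functions in this patch return -1 and set rte_errno instead).

```c
#include <errno.h>
#include <stdint.h>

#define MAX_LCORE 128 /* hypothetical stand-in for RTE_MAX_LCORE */

/* Per-lcore limits in kHz; 0 and UINT32_MAX mean "not set", as in the patch. */
static uint32_t scale_freq_min[MAX_LCORE];
static uint32_t scale_freq_max[MAX_LCORE];

/* Reset every lcore to the "not set" defaults. */
void
scaling_freq_reset(void)
{
	for (unsigned int i = 0; i < MAX_LCORE; i++) {
		scale_freq_min[i] = 0;
		scale_freq_max[i] = UINT32_MAX;
	}
}

/* Returns 0 on success, -EINVAL on a bad lcore or if min exceeds the current max. */
int
set_scaling_freq_min(unsigned int lcore, uint32_t min)
{
	if (lcore >= MAX_LCORE)
		return -EINVAL;
	if (min > scale_freq_max[lcore])
		return -EINVAL;

	scale_freq_min[lcore] = min;
	return 0;
}

/* Returns 0 on success, -EINVAL on a bad lcore or if max is below the current min. */
int
set_scaling_freq_max(unsigned int lcore, uint32_t max)
{
	if (lcore >= MAX_LCORE)
		return -EINVAL;
	if (max < scale_freq_min[lcore])
		return -EINVAL;

	scale_freq_max[lcore] = max;
	return 0;
}
```

One consequence of checking each setter against the other stored value is that the call order matters when moving both limits at once: raising the range means setting max first, lowering it means setting min first.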
>
>> +
>> + return 0;
>> +}
>> +
>> +int
>> +rte_power_pmd_mgmt_get_scaling_freq_min(unsigned int lcore)
>> +{
>
> <snip>
>
>> diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
>> index 18a9c3abb5..74e3fa710b 100644
>> --- a/lib/power/rte_power_pmd_mgmt.h
>> +++ b/lib/power/rte_power_pmd_mgmt.h
>> @@ -148,6 +148,86 @@ __rte_experimental
>> unsigned int
>> rte_power_pmd_mgmt_get_pause_duration(void);
>> +/**
>> + * @warning
>> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
>> + *
>> + * Set the min frequency to be used for frequency scaling.
>> + *
>> + * @note Supported by: Pstate mode.
>> + *
>> + * @param lcore
>> + * The ID of the lcore to set the min frequency for.
>> + * @param min
>> + * The value, in Hertz, to set the minimum frequency to.
>
> Is it really in Hertz? As far as I can tell, it's in steps of 100MHz
> (BUS_FREQ).
Correct, the frequency changes in steps of 100MHz, but the value passed
to 'min' is in kHz - will amend the comments.
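
To make the units concrete, here is a small sketch of the step arithmetic, assuming values in kHz and a 100 MHz step. `BUS_FREQ_KHZ` and `count_freq_steps()` are hypothetical names for this example; the formula follows the num_freqs computation in power_get_available_freqs() and assumes max_khz >= min_khz.

```c
#include <stdint.h>

#define BUS_FREQ_KHZ 100000u /* 100 MHz step expressed in kHz */

/*
 * Number of selectable frequencies between min_khz and max_khz (inclusive)
 * in 100 MHz steps, plus one extra bucket when turbo is available.
 * The caller must ensure max_khz >= min_khz.
 */
uint32_t
count_freq_steps(uint32_t min_khz, uint32_t max_khz, int turbo_available)
{
	return (max_khz - min_khz) / BUS_FREQ_KHZ + 1 + (turbo_available ? 1 : 0);
}
```

For example, constraining an lcore to 1.2 GHz..2.0 GHz means passing 1200000 and 2000000, which gives nine buckets without turbo and ten with it.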
@@ -12,6 +12,7 @@
#include <rte_memcpy.h>
+#include "rte_power_pmd_mgmt.h"
#include "power_pstate_cpufreq.h"
#include "power_common.h"
@@ -354,6 +355,7 @@ power_get_available_freqs(struct pstate_power_info *pi)
FILE *f_min = NULL, *f_max = NULL;
int ret = -1;
uint32_t sys_min_freq = 0, sys_max_freq = 0, base_max_freq = 0;
+ int config_min_freq, config_max_freq;
uint32_t i, num_freqs = 0;
/* open all files */
@@ -388,6 +390,16 @@ power_get_available_freqs(struct pstate_power_info *pi)
goto out;
}
+ /* check for config set by user or application to limit frequency range */
+ config_min_freq = rte_power_pmd_mgmt_get_scaling_freq_min(pi->lcore_id);
+ if (config_min_freq < 0)
+ goto out;
+ config_max_freq = rte_power_pmd_mgmt_get_scaling_freq_max(pi->lcore_id);
+ if (config_max_freq < 0)
+ goto out;
+ sys_min_freq = RTE_MAX(sys_min_freq, (uint32_t)config_min_freq);
+ sys_max_freq = RTE_MIN(sys_max_freq, (uint32_t)config_max_freq);
+
if (sys_max_freq < sys_min_freq)
goto out;
@@ -411,8 +423,8 @@ power_get_available_freqs(struct pstate_power_info *pi)
/* If turbo is available then there is one extra freq bucket
* to store the sys max freq which value is base_max +1
*/
- num_freqs = (base_max_freq - sys_min_freq) / BUS_FREQ + 1 +
- pi->turbo_available;
+ num_freqs = (RTE_MIN(base_max_freq, sys_max_freq) - sys_min_freq) / BUS_FREQ
+ + 1 + pi->turbo_available;
if (num_freqs >= RTE_MAX_LCORE_FREQS) {
RTE_LOG(ERR, POWER, "Too many available frequencies: %d\n",
num_freqs);
@@ -427,10 +439,10 @@ power_get_available_freqs(struct pstate_power_info *pi)
*/
for (i = 0, pi->nb_freqs = 0; i < num_freqs; i++) {
if ((i == 0) && pi->turbo_available)
- pi->freqs[pi->nb_freqs++] = base_max_freq + 1;
+ pi->freqs[pi->nb_freqs++] = RTE_MIN(base_max_freq, sys_max_freq) + 1;
else
- pi->freqs[pi->nb_freqs++] =
- base_max_freq - (i - pi->turbo_available) * BUS_FREQ;
+ pi->freqs[pi->nb_freqs++] = RTE_MIN(base_max_freq, sys_max_freq) -
+ (i - pi->turbo_available) * BUS_FREQ;
}
ret = 0;
@@ -10,9 +10,12 @@
#include <rte_power_intrinsics.h>
#include "rte_power_pmd_mgmt.h"
+#include "power_common.h"
unsigned int emptypoll_max;
unsigned int pause_duration;
+unsigned int scale_freq_min[RTE_MAX_LCORE];
+unsigned int scale_freq_max[RTE_MAX_LCORE];
/* store some internal state */
static struct pmd_conf_data {
@@ -694,8 +697,65 @@ rte_power_pmd_mgmt_get_pause_duration(void)
return pause_duration;
}
+int
+rte_power_pmd_mgmt_set_scaling_freq_min(unsigned int lcore, unsigned int min)
+{
+ if (lcore >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
+ rte_errno = EINVAL;
+ return -1;
+ }
+ scale_freq_min[lcore] = min;
+
+ return 0;
+}
+
+int
+rte_power_pmd_mgmt_set_scaling_freq_max(unsigned int lcore, unsigned int max)
+{
+ if (lcore >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
+ rte_errno = EINVAL;
+ return -1;
+ }
+ scale_freq_max[lcore] = max;
+
+ return 0;
+}
+
+int
+rte_power_pmd_mgmt_get_scaling_freq_min(unsigned int lcore)
+{
+ if (lcore >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
+ rte_errno = EINVAL;
+ return -1;
+ }
+
+ if (scale_freq_min[lcore] == 0)
+ RTE_LOG(DEBUG, POWER, "Scaling freq min config not set. Using sysfs min freq.\n");
+
+ return scale_freq_min[lcore];
+}
+
+int
+rte_power_pmd_mgmt_get_scaling_freq_max(unsigned int lcore)
+{
+ if (lcore >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID: %u\n", lcore);
+ rte_errno = EINVAL;
+ return -1;
+ }
+
+ if (scale_freq_max[lcore] == UINT32_MAX)
+ RTE_LOG(DEBUG, POWER, "Scaling freq max config not set. Using sysfs max freq.\n");
+
+ return scale_freq_max[lcore];
+}
+
RTE_INIT(rte_power_ethdev_pmgmt_init) {
size_t i;
+ int j;
/* initialize all tailqs */
for (i = 0; i < RTE_DIM(lcore_cfgs); i++) {
@@ -706,4 +766,9 @@ RTE_INIT(rte_power_ethdev_pmgmt_init) {
/* initialize config defaults */
emptypoll_max = 512;
pause_duration = 1;
+ /* scaling defaults out of range to ensure not used unless set by user or app */
+ for (j = 0; j < RTE_MAX_LCORE; j++) {
+ scale_freq_min[j] = 0;
+ scale_freq_max[j] = UINT32_MAX;
+ }
}
@@ -148,6 +148,86 @@ __rte_experimental
unsigned int
rte_power_pmd_mgmt_get_pause_duration(void);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Set the min frequency to be used for frequency scaling.
+ *
+ * @note Supported by: Pstate mode.
+ *
+ * @param lcore
+ * The ID of the lcore to set the min frequency for.
+ * @param min
+ * The value, in Hertz, to set the minimum frequency to.
+ * @return
+ * 0 on success
+ * <0 on error
+ */
+__rte_experimental
+int
+rte_power_pmd_mgmt_set_scaling_freq_min(unsigned int lcore, unsigned int min);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Set the max frequency to be used for frequency scaling.
+ *
+ * @note Supported by: Pstate mode.
+ *
+ * @param lcore
+ * The ID of the lcore to set the max frequency for.
+ * @param max
+ * The value, in Hertz, to set the maximum frequency to.
+ * @return
+ * 0 on success
+ * <0 on error
+ */
+__rte_experimental
+int
+rte_power_pmd_mgmt_set_scaling_freq_max(unsigned int lcore, unsigned int max);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Get the current configured min frequency used for frequency scaling.
+ *
+ * @note Supported by: Pstate mode.
+ *
+ * @param lcore
+ * The ID of the lcore to get the min frequency for.
+ * @return
+ * 0 if no value has been configured via the 'set' API.
+ * >0 if a minimum frequency has been configured. Value is the minimum frequency
+ * , in Hertz, used for frequency scaling.
+ * <0 on error
+ */
+__rte_experimental
+int
+rte_power_pmd_mgmt_get_scaling_freq_min(unsigned int lcore);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Get the current configured max frequency used for frequency scaling.
+ *
+ * @note Supported by: Pstate mode.
+ *
+ * @param lcore
+ * The ID of the lcore to get the max frequency for.
+ * @return
+ * UINT32_MAX if no value has been configured via the 'set' API.
+ * On success, the current configured maximum frequency, in Hertz, used for
+ * frequency scaling.
+ * <0 on error
+ */
+__rte_experimental
+int
+rte_power_pmd_mgmt_get_scaling_freq_max(unsigned int lcore);
+
#ifdef __cplusplus
}
#endif
@@ -42,6 +42,10 @@ EXPERIMENTAL {
# added in 22.07
rte_power_pmd_mgmt_get_emptypoll_max;
rte_power_pmd_mgmt_get_pause_duration;
+ rte_power_pmd_mgmt_get_scaling_freq_max;
+ rte_power_pmd_mgmt_get_scaling_freq_min;
rte_power_pmd_mgmt_set_emptypoll_max;
rte_power_pmd_mgmt_set_pause_duration;
+ rte_power_pmd_mgmt_set_scaling_freq_max;
+ rte_power_pmd_mgmt_set_scaling_freq_min;
};