[dpdk-dev,v1,1/4] lib/librte_power: add per-core turbo capability
Commit Message
Add a new set of APIs to enable and disable Turbo Boost on a
per-core basis.
Signed-off-by: David Hunt <david.hunt@intel.com>
---
lib/librte_power/channel_commands.h | 2 +
lib/librte_power/rte_power.c | 9 ++
lib/librte_power/rte_power.h | 41 +++++++++
lib/librte_power/rte_power_acpi_cpufreq.c | 143 ++++++++++++++++++++++++++++++
lib/librte_power/rte_power_acpi_cpufreq.h | 40 +++++++++
lib/librte_power/rte_power_kvm_vm.c | 19 ++++
lib/librte_power/rte_power_kvm_vm.h | 35 +++++++-
7 files changed, 288 insertions(+), 1 deletion(-)
Comments
Recent generations of the Intel® Xeon® family processors allow Turbo Boost
to be enabled/disabled on a per-core basis.
This patch set introduces additional API calls to the librte_power library
to allow users to enable/disable Turbo Boost on particular cores.
Changes in patchset v2:
* Removed wrmsr/rdmsr functions as they were architecture-specific.
Now using the scaling_setspeed file in sysfs, as this is a more
standard cross-platform method of changing frequencies (where available).
* Removed the patch that checks for particular models of CPU, as it is no
longer needed with the above change.
* Added APIs to the docs.
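For reference, a minimal sketch of what a scaling_setspeed write looks like. The helper names are hypothetical and not part of the patch set; the path template is the one librte_power already defines as POWER_SYSFILE_SETSPEED. The value is in kHz, and the kernel only accepts it when the userspace governor is active on that core.

```c
#include <stdio.h>

/* Path template, as defined by POWER_SYSFILE_SETSPEED in librte_power. */
#define SETSPEED_FMT "/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"

/*
 * Hypothetical helper: write a target frequency (in kHz) to an already
 * formatted scaling_setspeed path. Returns 0 on success, -1 on error.
 */
static int
write_setspeed(const char *path, unsigned int freq_khz)
{
	FILE *f = fopen(path, "w");

	if (f == NULL)
		return -1;
	if (fprintf(f, "%u", freq_khz) < 0) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: build the per-core path, then write the frequency. */
static int
set_core_freq(unsigned int core, unsigned int freq_khz)
{
	char path[128];

	snprintf(path, sizeof(path), SETSPEED_FMT, core);
	return write_setspeed(path, freq_khz);
}
```

With this sketch, set_core_freq(24, 3300000) would request 3300MHz on core 24, provided the process has write access to the sysfs file.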
Additionally, the use of the library is demonstrated by additions to the
vm_power_manager example application, where the new commands have been
added to allow the turbo status of cores to be changed dynamically.
Extra message types have been added to the virtio-serial channels between the
guest_vm_power_manager app and the vm_power_manager apps to demonstrate
turbo change requests from a virtual machine. In this case, the guest will
send a request to the physical host, which in turn will change the state of
the turbo status.
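The request path described above can be sketched roughly as follows. This is a simplified stand-in, not the code from the example applications: struct turbo_request, the "unit" field name, the vCPU table and the turbo_enabled[] array are all assumptions made for illustration. Only the two command values come from channel_commands.h in this patch.

```c
#include <stdint.h>

/* Command values added to channel_commands.h by this patch. */
#define CPU_POWER_ENABLE_TURBO  5
#define CPU_POWER_DISABLE_TURBO 6

/*
 * Simplified stand-in for struct channel_packet. The "unit" field name
 * is an assumption; only resource_id is visible in this patch.
 */
struct turbo_request {
	uint64_t resource_id; /* guest lcore (vCPU) number */
	uint32_t unit;        /* requested action */
};

/* Hypothetical vCPU -> physical core table for an 8-vCPU guest. */
static const unsigned int vcpu_to_pcpu[8] = {20, 21, 22, 23, 24, 25, 26, 27};

/* Stub: records the turbo state the host would apply per physical core. */
static int turbo_enabled[64];

/* Host side: translate the guest request and apply it. Returns 0 or -1. */
static int
handle_turbo_request(const struct turbo_request *req)
{
	unsigned int pcpu;

	if (req->resource_id >= 8)
		return -1;
	pcpu = vcpu_to_pcpu[req->resource_id];

	switch (req->unit) {
	case CPU_POWER_ENABLE_TURBO:
		turbo_enabled[pcpu] = 1;
		return 0;
	case CPU_POWER_DISABLE_TURBO:
		turbo_enabled[pcpu] = 0;
		return 0;
	default:
		return -1; /* not a turbo request */
	}
}
```

In the real host application the two cases would call into librte_power rather than update an array.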
Usage Example:
--------------
A VM has been created using 8 CPU cores, and 8 virtio-serial channels have
been created as per-core communications channels between the host and the VM.
See: http://www.dpdk.org/doc/guides/sample_app_ug/vm_power_management.html
for more information on setting up the vm_power applications.
In the vm_power_manager app on the host, we can query these channels:
vmpower> show_vm ubuntu2
VM: 'ubuntu2', status = ACTIVE
Channels 8
[0]: /tmp/powermonitor/ubuntu2.0, status = CONNECTED
[1]: /tmp/powermonitor/ubuntu2.1, status = CONNECTED
[2]: /tmp/powermonitor/ubuntu2.2, status = CONNECTED
[3]: /tmp/powermonitor/ubuntu2.3, status = CONNECTED
[4]: /tmp/powermonitor/ubuntu2.4, status = CONNECTED
[5]: /tmp/powermonitor/ubuntu2.5, status = CONNECTED
[6]: /tmp/powermonitor/ubuntu2.6, status = CONNECTED
[7]: /tmp/powermonitor/ubuntu2.7, status = CONNECTED
Virtual CPU(s): 8
[0]: Physical CPU Mask 0x100000
[1]: Physical CPU Mask 0x200000
[2]: Physical CPU Mask 0x400000
[3]: Physical CPU Mask 0x800000
[4]: Physical CPU Mask 0x1000000
[5]: Physical CPU Mask 0x2000000
[6]: Physical CPU Mask 0x4000000
[7]: Physical CPU Mask 0x8000000
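Each mask above has a single bit set, and the bit position is the physical core number, so 0x100000 (bit 20) is core 20 and 0x1000000 (bit 24) is core 24. A minimal sketch of that decoding, using a hypothetical helper name that is not part of the patch set:

```c
#include <stdint.h>

/*
 * Return the index of the lowest set bit in a physical CPU mask,
 * or -1 if the mask is empty.
 */
static int
mask_to_core(uint64_t mask)
{
	int core = 0;

	if (mask == 0)
		return -1;
	while ((mask & 1) == 0) {
		mask >>= 1;
		core++;
	}
	return core;
}
```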
Once the VM is up and running, if we exercise all the cores on the guest, we
can use turbostat on the host to see the frequencies of the guest cores. In
this example, it's cores 20-27:
19 0 0.01 2500 2500
20 2498 100.00 2500 2498
21 2498 100.00 2500 2498
22 2498 100.00 2500 2498
23 2498 100.00 2500 2498
24 *2498 100.00 2500 2498
25 2498 100.00 2500 2498
26 2498 100.00 2500 2498
27 2498 100.00 2500 2498
28 0 0.01 2032 2498
We can then issue a command in the vmpower app on the guest:
vmpower(guest)> set_cpu_freq 4 enable_turbo
This command will pass a message down through virtio-serial to the host, which
will enable turbo on core 24, the physical core underlying lcore_id 4 in the
guest. We can then see the change by running turbostat on the host:
19 0 0.01 2500 2496
20 2498 100.00 2500 2498
21 2498 100.00 2500 2498
22 2498 100.00 2500 2498
23 2498 100.00 2500 2498
24 *3297 100.00 3300 2498
25 2498 100.00 2500 2498
26 2498 100.00 2500 2498
27 2498 100.00 2500 2498
28 0 0.01 1016 2498
Core 24 is now running at 3300MHz, whereas the remainder are still running
at 2500MHz.
We can issue a similar command in the vm_power_manager running on the host
to disable turbo on that core, but this time we use the physical core id:
vmpower> set_cpu_freq 24 disable_turbo
Turbostat then shows that turbo is disabled on that core again:
19 0 0.00 2500 2495
20 2499 100.00 2500 2499
21 2499 100.00 2500 2499
22 2499 100.00 2500 2499
23 2499 100.00 2500 2499
24 *2499 100.00 2500 2499
25 2499 100.00 2500 2499
26 2499 100.00 2500 2499
27 2499 100.00 2500 2499
28 0 0.01 1000 2499
[1/4] lib/librte_power: add turbo boost API
[2/4] examples/vm_power_manager: add per-core turbo
[3/4] examples/vm_power_cli_guest: add per-core turbo
[4/4] doc/power: add information on per-core turbo APIs
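As a rough illustration of how an application might use the calls from patch 1/4, the sketch below enables turbo on an lcore only when the status query reports it off. The stub_* functions and ensure_turbo() are hypothetical stand-ins so that the example builds without DPDK; in a real application the two pointers would be rte_power_turbo_status and rte_power_freq_enable_turbo, which rte_power_set_env() assigns.

```c
#define MAX_LCORE 128

/* Same shape as rte_power_freq_change_t in rte_power.h. */
typedef int (*freq_change_t)(unsigned int lcore_id);

/* Stub backend standing in for a real power management environment. */
static int stub_turbo[MAX_LCORE];

static int
stub_status(unsigned int lcore_id)
{
	return lcore_id < MAX_LCORE ? stub_turbo[lcore_id] : -1;
}

static int
stub_enable(unsigned int lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		return -1;
	stub_turbo[lcore_id] = 1;
	return 0;
}

/* In DPDK these pointers are assigned by rte_power_set_env(). */
static freq_change_t turbo_status = stub_status;
static freq_change_t enable_turbo = stub_enable;

/* Enable turbo only if it is currently off. Returns 0 or a negative value. */
static int
ensure_turbo(unsigned int lcore_id)
{
	int st = turbo_status(lcore_id);

	if (st < 0)
		return st; /* query failed or is unsupported */
	if (st == 1)
		return 0;  /* already enabled */
	return enable_turbo(lcore_id);
}
```

Note that in the KVM VM environment the status query in this patch returns -ENOTSUP, so a guest would have to call the enable function directly instead of checking first.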
13/09/2017 12:44, David Hunt:
> Recent generations of the Intel® Xeon® family processors allow Turbo Boost
> to be enabled/disabled on a per-core basis.
>
> This patch set introduces additional API calls to the librte_power library
> to allow users to enable/disable Turbo Boost on particular cores.
Applied, thanks
@@ -52,6 +52,8 @@ extern "C" {
#define CPU_POWER_SCALE_DOWN 2
#define CPU_POWER_SCALE_MAX 3
#define CPU_POWER_SCALE_MIN 4
+#define CPU_POWER_ENABLE_TURBO 5
+#define CPU_POWER_DISABLE_TURBO 6
struct channel_packet {
uint64_t resource_id; /**< core_num, device */
@@ -50,6 +50,9 @@ rte_power_freq_change_t rte_power_freq_up = NULL;
rte_power_freq_change_t rte_power_freq_down = NULL;
rte_power_freq_change_t rte_power_freq_max = NULL;
rte_power_freq_change_t rte_power_freq_min = NULL;
+rte_power_freq_change_t rte_power_turbo_status;
+rte_power_freq_change_t rte_power_freq_enable_turbo;
+rte_power_freq_change_t rte_power_freq_disable_turbo;
int
rte_power_set_env(enum power_management_env env)
@@ -65,6 +68,9 @@ rte_power_set_env(enum power_management_env env)
rte_power_freq_down = rte_power_acpi_cpufreq_freq_down;
rte_power_freq_min = rte_power_acpi_cpufreq_freq_min;
rte_power_freq_max = rte_power_acpi_cpufreq_freq_max;
+ rte_power_turbo_status = rte_power_acpi_turbo_status;
+ rte_power_freq_enable_turbo = rte_power_acpi_enable_turbo;
+ rte_power_freq_disable_turbo = rte_power_acpi_disable_turbo;
} else if (env == PM_ENV_KVM_VM) {
rte_power_freqs = rte_power_kvm_vm_freqs;
rte_power_get_freq = rte_power_kvm_vm_get_freq;
@@ -73,6 +79,9 @@ rte_power_set_env(enum power_management_env env)
rte_power_freq_down = rte_power_kvm_vm_freq_down;
rte_power_freq_min = rte_power_kvm_vm_freq_min;
rte_power_freq_max = rte_power_kvm_vm_freq_max;
+ rte_power_turbo_status = rte_power_kvm_vm_turbo_status;
+ rte_power_freq_enable_turbo = rte_power_kvm_vm_enable_turbo;
+ rte_power_freq_disable_turbo = rte_power_kvm_vm_disable_turbo;
} else {
RTE_LOG(ERR, POWER, "Invalid Power Management Environment(%d) set\n",
env);
@@ -236,6 +236,47 @@ extern rte_power_freq_change_t rte_power_freq_max;
*/
extern rte_power_freq_change_t rte_power_freq_min;
+/**
+ * Query the Turbo Boost status of a specific lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 Turbo Boost is enabled for this lcore.
+ * - 0 Turbo Boost is disabled for this lcore.
+ * - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_turbo_status;
+
+/**
+ * Enable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_enable_turbo;
+
+/**
+ * Disable Turbo Boost for this lcore.
+ * Review each environment's specific documentation for usage.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 on success.
+ * - Negative on error.
+ */
+extern rte_power_freq_change_t rte_power_freq_disable_turbo;
+
+
#ifdef __cplusplus
}
#endif
@@ -87,6 +87,14 @@
#define POWER_SYSFILE_SETSPEED \
"/sys/devices/system/cpu/cpu%u/cpufreq/scaling_setspeed"
+/*
+ * MSR related
+ */
+#define PLATFORM_INFO 0x0CE
+#define TURBO_RATIO_LIMIT 0x1AD
+#define IA32_PERF_CTL 0x199
+#define CORE_TURBO_DISABLE_BIT ((uint64_t)1<<32)
+
enum power_state {
POWER_IDLE = 0,
POWER_ONGOING,
@@ -543,3 +551,138 @@ rte_power_acpi_cpufreq_freq_min(unsigned lcore_id)
/* Frequencies in the array are from high to low. */
return set_freq_internal(pi, pi->nb_freqs - 1);
}
+
+
+static int
+rdmsr(int lcore, int msr, uint64_t *val)
+{
+ char filename[32];
+ int fd;
+ int retval;
+
+ sprintf(filename, "/dev/cpu/%d/msr", lcore);
+ fd = open(filename, O_RDONLY);
+ if (fd < 0)
+ return fd;
+
+ retval = pread(fd, val, sizeof(uint64_t), msr);
+ if (retval < 0) {
+ close(fd);
+ return retval;
+ }
+ close(fd);
+ return 0;
+}
+
+static int
+wrmsr(int lcore, int msr, uint64_t val)
+{
+ char filename[32];
+ int fd;
+ int retval;
+
+ sprintf(filename, "/dev/cpu/%d/msr", lcore);
+ fd = open(filename, O_WRONLY);
+ if (fd < 0)
+ return fd;
+
+ retval = pwrite(fd, (void *)&val, sizeof(uint64_t), msr);
+ if (retval < 0) {
+ close(fd);
+ return retval;
+ }
+ close(fd);
+ return 0;
+}
+
+int
+rte_power_acpi_turbo_status(unsigned int lcore_id)
+{
+ uint64_t val;
+ int retval;
+
+ if (lcore_id >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+ return -1;
+ }
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+ retval = rdmsr(lcore_id, IA32_PERF_CTL, &val);
+ if (retval)
+ return retval;
+ else
+ return(!(val & CORE_TURBO_DISABLE_BIT));
+#else
+ return 0;
+#endif
+}
+
+
+int
+rte_power_acpi_enable_turbo(unsigned int lcore_id)
+{
+ uint64_t val;
+ int retval;
+
+ if (lcore_id >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+ return -1;
+ }
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+ /*
+ * The low byte of 1ADh MSR contains max recommended ratio when a small
+ * number of cores are active. Use this ratio when turbo is enabled.
+ */
+ retval = rdmsr(lcore_id, TURBO_RATIO_LIMIT, &val);
+ if (retval)
+ return retval;
+
+ val = (val & 0x00ff) << 8; /* Move to second lowest byte */
+ val &= ~CORE_TURBO_DISABLE_BIT; /* Switch bit off to enable turbo */
+
+ retval = wrmsr(lcore_id, IA32_PERF_CTL, val);
+ if (retval)
+ return retval;
+ else
+ return 0;
+#else
+ return 0;
+#endif
+}
+
+int
+rte_power_acpi_disable_turbo(unsigned int lcore_id)
+{
+ uint64_t val;
+ int retval;
+
+ if (lcore_id >= RTE_MAX_LCORE) {
+ RTE_LOG(ERR, POWER, "Invalid lcore ID\n");
+ return -1;
+ }
+
+#if defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_64)
+ /*
+ * 0CEh MSR contains max non-turbo ratio in bits 8-15. Use this
+ * for the freq when turbo is disabled for that core.
+ */
+ retval = rdmsr(lcore_id, PLATFORM_INFO, &val);
+ if (retval)
+ return retval;
+
+ val = val & 0xff00; /* Only need second lowest byte */
+ val |= CORE_TURBO_DISABLE_BIT; /* Switch bit on to disable turbo */
+
+ retval = wrmsr(lcore_id, IA32_PERF_CTL, val);
+ if (retval)
+ return retval;
+
+ /* Try to set freq to max by default coming out of turbo */
+ if (rte_power_acpi_cpufreq_freq_max(lcore_id) < 0) {
+ RTE_LOG(ERR, POWER, "Failed to set frequency of lcore %u to max\n",
+ lcore_id);
+ }
+#endif
+ return 0;
+}
@@ -185,6 +185,46 @@ int rte_power_acpi_cpufreq_freq_max(unsigned lcore_id);
*/
int rte_power_acpi_cpufreq_freq_min(unsigned lcore_id);
+/**
+ * Get the turbo status of a specific lcore.
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 Turbo Boost is enabled on this lcore.
+ * - 0 Turbo Boost is disabled on this lcore.
+ * - Negative on error.
+ */
+int rte_power_acpi_turbo_status(unsigned int lcore_id);
+
+/**
+ * Enable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 Turbo Boost is enabled successfully on this lcore.
+ * - Negative on error.
+ */
+int rte_power_acpi_enable_turbo(unsigned int lcore_id);
+
+/**
+ * Disable Turbo Boost on a specific lcore.
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 0 Turbo Boost disabled successfully on this lcore.
+ * - Negative on error.
+ */
+int rte_power_acpi_disable_turbo(unsigned int lcore_id);
+
#ifdef __cplusplus
}
#endif
@@ -134,3 +134,22 @@ rte_power_kvm_vm_freq_min(unsigned lcore_id)
{
return send_msg(lcore_id, CPU_POWER_SCALE_MIN);
}
+
+int
+rte_power_kvm_vm_turbo_status(__attribute__((unused)) unsigned int lcore_id)
+{
+ RTE_LOG(ERR, POWER, "rte_power_turbo_status is not implemented for Virtual Machine Power Management\n");
+ return -ENOTSUP;
+}
+
+int
+rte_power_kvm_vm_enable_turbo(unsigned int lcore_id)
+{
+ return send_msg(lcore_id, CPU_POWER_ENABLE_TURBO);
+}
+
+int
+rte_power_kvm_vm_disable_turbo(unsigned int lcore_id)
+{
+ return send_msg(lcore_id, CPU_POWER_DISABLE_TURBO);
+}
@@ -172,8 +172,41 @@ int rte_power_kvm_vm_freq_max(unsigned lcore_id);
*/
int rte_power_kvm_vm_freq_min(unsigned lcore_id);
+/**
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * -ENOTSUP
+ */
+int rte_power_kvm_vm_turbo_status(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success.
+ * - Negative on error.
+ */
+int rte_power_kvm_vm_enable_turbo(unsigned int lcore_id);
+
+/**
+ * It should be protected outside of this function for thread safety.
+ *
+ * @param lcore_id
+ * lcore id.
+ *
+ * @return
+ * - 1 on success.
+ * - Negative on error.
+ */
+int rte_power_kvm_vm_disable_turbo(unsigned int lcore_id);
#ifdef __cplusplus
}
#endif
-
#endif
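The MSR arithmetic in rte_power_acpi_cpufreq.c above can be checked in isolation. The sketch below repeats the bit manipulation as pure functions with hypothetical names, so that no /dev/cpu/N/msr access is needed. The ratio values in the comments are illustrative, and the 100MHz bus clock used to convert a ratio to a frequency is an assumption that is not stated in the patch.

```c
#include <stdint.h>

#define CORE_TURBO_DISABLE_BIT ((uint64_t)1 << 32)

/*
 * Build the IA32_PERF_CTL value used to enable turbo: take the low byte
 * of TURBO_RATIO_LIMIT (1ADh), move it to bits 8-15, clear bit 32.
 */
static uint64_t
perf_ctl_enable(uint64_t turbo_ratio_limit)
{
	uint64_t val = (turbo_ratio_limit & 0x00ff) << 8;

	return val & ~CORE_TURBO_DISABLE_BIT;
}

/*
 * Build the IA32_PERF_CTL value used to disable turbo: keep bits 8-15
 * of PLATFORM_INFO (0CEh), set bit 32.
 */
static uint64_t
perf_ctl_disable(uint64_t platform_info)
{
	return (platform_info & 0xff00) | CORE_TURBO_DISABLE_BIT;
}

/* Mirror of the status check: 1 if turbo is enabled, 0 if disabled. */
static int
perf_ctl_turbo_on(uint64_t perf_ctl)
{
	return !(perf_ctl & CORE_TURBO_DISABLE_BIT);
}
```

For example, a turbo ratio of 0x21 (33) gives a PERF_CTL value of 0x2100, and a non-turbo ratio of 0x19 (25) gives 0x100001900. Under the 100MHz assumption these correspond to 3300MHz and 2500MHz.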