[dpdk-dev,v3,3/3] doc: update timer lib docs
Commit Message
This change updates the timer library documentation to
reflect a change to the organization of the skiplists
in the implementation.
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v3
* Updated implementation details section of timer_lib.rst to reflect the
addition of the option to use multiple pending timer lists per lcore.
* Updated release notes to reflect the addition of new function in timer
lib API.
doc/guides/prog_guide/timer_lib.rst | 27 +++++++++++++++++----------
doc/guides/rel_notes/release_17_11.rst | 7 +++++++
2 files changed, 24 insertions(+), 10 deletions(-)
Comments
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Erik Gabriel Carrillo
> Sent: Wednesday, September 13, 2017 11:05 PM
> To: rsanford@akamai.com
> Cc: dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org; Wiles, Keith <keith.wiles@intel.com>; Vangati,
> Narender <narender.vangati@intel.com>
> Subject: [dpdk-dev] [PATCH v3 3/3] doc: update timer lib docs
>
> This change updates the timer library documentation to reflect a change to
> the organization of the skiplists in the implementation.
>
> Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
> +Each skiplist data structure has ten levels and each entry in the table
> appears in each level with probability ¼^level.
Probably best not to use that Unicode, or otherwise, 1/4 character since
it can mess up the PDF output. Use "0.25^level" instead.
Apart from that:
Acked-by: John McNamara <john.mcnamara@intel.com>
Thanks for the review, John. I've submitted the change you suggest in a new patch series.
Regards,
Gabriel
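For readers following the 0.25^level discussion, the arithmetic in the patched paragraph can be checked with a few lines of C. This is only an illustrative sketch and is not taken from the timer library: the names `SKETCH_MAX_LEVELS`, `sketch_capacity()` and `sketch_expected_at_level()` are hypothetical, and the only inputs drawn from the documentation are the ten levels and the per-level probability of 0.25^level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the documentation states ten levels per skiplist. */
#define SKETCH_MAX_LEVELS 10

/* Returns 4^levels, the entry count at which the highest level is
 * still expected to hold only a handful of entries. */
static uint64_t sketch_capacity(unsigned int levels)
{
    uint64_t cap = 1;

    assert(levels <= 31); /* 4^31 still fits in 64 bits */
    for (unsigned int i = 0; i < levels; i++)
        cap *= 4;
    return cap;
}

/* Expected number of entries present at a given level when each entry
 * appears at that level with probability 0.25^level. */
static double sketch_expected_at_level(uint64_t n_entries, unsigned int level)
{
    double expected = (double)n_entries;

    assert(level < SKETCH_MAX_LEVELS);
    for (unsigned int i = 0; i < level; i++)
        expected *= 0.25;
    return expected;
}
```

With `sketch_capacity(SKETCH_MAX_LEVELS)` entries (4^10 = 1,048,576, the "approximately 1,000,000 timers" in the text), level 0 is expected to hold every entry, level 1 a quarter of them, and level 9 about four.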
> -----Original Message-----
> From: Mcnamara, John
> Sent: Monday, September 18, 2017 11:20 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; rsanford@akamai.com
> Cc: dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org; Wiles, Keith <keith.wiles@intel.com>;
> Vangati, Narender <narender.vangati@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 3/3] doc: update timer lib docs
@@ -1,5 +1,5 @@
.. BSD LICENSE
- Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
All rights reserved.
Redistribution and use in source and binary forms, with or without
@@ -53,16 +53,19 @@ Refer to the `callout manual <http://www.daemon-systems.org/man/callout.9.html>`
Implementation Details
----------------------
-Timers are tracked on a per-lcore basis,
-with all pending timers for a core being maintained in order of timer expiry in a skiplist data structure.
-The skiplist used has ten levels and each entry in the table appears in each level with probability ¼^level.
+Timers are tracked on a per-lcore basis, with all pending timers for a core being maintained in order of timer
+expiry in either a single skiplist data structure or an array of skiplists, depending on whether
+the lcore has been configured for multiple pending lists. Multiple pending lists can be enabled when an
+application experiences contention for a single list for that lcore; skiplists corresponding to every other
+enabled lcore will be created.
+Each skiplist data structure has ten levels and each entry in the table appears in each level with probability ¼^level.
This means that all entries are present in level 0, 1 in every 4 entries is present at level 1,
one in every 16 at level 2 and so on up to level 9.
This means that adding and removing entries from the timer list for a core can be done in log(n) time,
up to 4^10 entries, that is, approximately 1,000,000 timers per lcore.
A timer structure contains a special field called status,
-which is a union of a timer state (stopped, pending, running, config) and an owner (lcore id).
+which is a union of a timer state (stopped, pending, running, config), an installer (lcore id), and an owner (lcore id).
Depending on the timer state, we know if a timer is present in a list or not:
* STOPPED: no owner, not in a list
@@ -77,17 +80,21 @@ Resetting or stopping a timer while it is in a CONFIG or RUNNING state is not al
When modifying the state of a timer,
a Compare And Swap instruction should be used to guarantee that the status (state+owner) is modified atomically.
-Inside the rte_timer_manage() function,
-the skiplist is used as a regular list by iterating along the level 0 list, which contains all timer entries,
-until an entry which has not yet expired has been encountered.
-To improve performance in the case where there are entries in the timer list but none of those timers have yet expired,
+Inside the rte_timer_manage() function, the timer lists are processed.
+If multiple pending lists have been enabled for an lcore, then each skiplist will
+be traversed sequentially, and run lists will be broken out and then processed.
+If multiple pending lists are not enabled for an lcore, then only a single skiplist will be traversed.
+A skiplist is used as a regular list by iterating along the level
+0 list, which contains all timer entries, until an entry which has not yet expired has been encountered.
+To improve performance in the case where there are entries in a skiplist but none of those timers have yet expired,
the expiry time of the first list entry is maintained within the per-core timer list structure itself.
On 64-bit platforms, this value can be checked without the need to take a lock on the overall structure.
(Since expiry times are maintained as 64-bit values,
a check on the value cannot be done on 32-bit platforms without using either a compare-and-swap (CAS) instruction or using a lock,
so this additional check is skipped in favor of checking as normal once the lock has been taken.)
On both 64-bit and 32-bit platforms,
-a call to rte_timer_manage() returns without taking a lock in the case where the timer list for the calling core is empty.
+rte_timer_manage() can either return or continue on to an lcore's next skiplist without taking a lock in the case where a timer list is empty,
+depending on whether or not the lcore has multiple pending lists.
Use Cases
---------
@@ -110,6 +110,13 @@ API Changes
Also, make sure to start the actual text at the margin.
=========================================================
+* **Updated timer library.**
+
+ The timer library has been updated; it can now support multiple timer lists
+ per lcore where it previously only had one. This functionality is off by
+ default but can be enabled in cases where contention for a single list is
+ an issue with the new function ``rte_timer_subsystem_set_multi_pendlists()``.
+
ABI Changes
-----------
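The rte_timer_manage() behaviour described in the patched text (visit each pending list in turn, skip a list that is empty or whose first entry has not yet expired, break expired entries out into a run list, then process the run list) can be sketched roughly as follows. This is a simplified illustration and not the library's code: every `sketch_` name is hypothetical, each skiplist is reduced to its level 0 list, and locking and the timer status field are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound standing in for "one pending list per enabled lcore". */
#define SKETCH_MAX_LISTS 4

struct sketch_timer {
    uint64_t expire;                   /* absolute expiry time */
    void (*cb)(struct sketch_timer *); /* callback; may be NULL in this sketch */
    struct sketch_timer *next;         /* level 0 link only */
};

struct sketch_list {
    struct sketch_timer *head; /* kept sorted by ascending expiry */
    uint64_t first_expire;     /* cached expiry of head, UINT64_MAX if empty */
};

static void sketch_list_init(struct sketch_list *l)
{
    l->head = NULL;
    l->first_expire = UINT64_MAX;
}

/* Sorted insert; a real skiplist would use the upper levels to find the slot. */
static void sketch_list_add(struct sketch_list *l, struct sketch_timer *t)
{
    struct sketch_timer **pp = &l->head;

    while (*pp != NULL && (*pp)->expire <= t->expire)
        pp = &(*pp)->next;
    t->next = *pp;
    *pp = t;
    l->first_expire = l->head->expire;
}

/* Walks each pending list, moves expired timers to a run list, then runs
 * them. Returns the number of timers processed. */
static unsigned int sketch_manage(struct sketch_list *lists,
                                  unsigned int n_lists, uint64_t now)
{
    struct sketch_timer *run_head = NULL;
    struct sketch_timer **run_tail = &run_head;
    unsigned int fired = 0;

    assert(n_lists <= SKETCH_MAX_LISTS);
    for (unsigned int i = 0; i < n_lists; i++) {
        struct sketch_list *l = &lists[i];

        /* Cheap check: nothing pending, or nothing expired yet. */
        if (l->head == NULL || l->first_expire > now)
            continue;

        /* Iterate along level 0 until an unexpired entry is found. */
        while (l->head != NULL && l->head->expire <= now) {
            struct sketch_timer *t = l->head;

            l->head = t->next;
            t->next = NULL;
            *run_tail = t;
            run_tail = &t->next;
        }
        l->first_expire = (l->head != NULL) ? l->head->expire : UINT64_MAX;
    }

    /* Process the run list after all pending lists have been traversed. */
    while (run_head != NULL) {
        struct sketch_timer *t = run_head;

        run_head = t->next;
        t->next = NULL;
        if (t->cb != NULL)
            t->cb(t);
        fired++;
    }
    return fired;
}
```

With a single pending list (`n_lists == 1`) the loop reduces to the original one-skiplist behaviour. The cached `first_expire` field mirrors the documented optimisation of storing the first entry's expiry in the per-core list structure so that a list with nothing to run can be skipped cheaply.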