telemetry: fix "in-memory" process socket conflicts

Message ID 20210915141030.23514-1-bruce.richardson@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series telemetry: fix "in-memory" process socket conflicts

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/github-robot: build           fail     github build: failed
ci/iol-x86_64-unit-testing       fail     Testing issues
ci/iol-x86_64-compile-testing    fail     Testing issues
ci/Intel-compilation             fail     Compilation issues
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/intel-Testing                 fail     Testing issues

Commit Message

Bruce Richardson Sept. 15, 2021, 2:10 p.m. UTC
When DPDK is run in --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it starts up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to resolve the error:
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it is not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

--
2.30.2
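
The check described in the commit message can be sketched outside of DPDK as follows. This is only an illustration under stated assumptions, not the patch itself: socket_path_is_active() and bind_or_reclaim() are hypothetical names, the sketch probes with a separate socket (the patch reuses the socket it is about to bind), and it uses SOCK_STREAM for portability, which may differ from the socket type in telemetry.c.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: 1 if something accepts connections on path, else 0. */
static int
socket_path_is_active(const char *path)
{
	struct sockaddr_un addr;
	int probe, active;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

	probe = socket(AF_UNIX, SOCK_STREAM, 0);
	if (probe < 0)
		return 0; /* sketch only: cannot probe, so treat as inactive */

	/* connect() succeeds only if a live listener owns the path */
	active = connect(probe, (const struct sockaddr *)&addr, sizeof(addr)) == 0;
	close(probe);
	return active;
}

/* Hypothetical helper: returns a listening fd bound to path, or -1. */
static int
bind_or_reclaim(const char *path)
{
	struct sockaddr_un addr;
	int sock;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

	sock = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock < 0)
		return -1;

	if (bind(sock, (const struct sockaddr *)&addr, sizeof(addr)) < 0) {
		if (socket_path_is_active(path)) {
			/* another process owns the socket: do not remove it */
			fprintf(stderr, "socket path already in use: %s\n", path);
			close(sock);
			return -1;
		}
		/* stale socket file: remove it and retry the bind once */
		unlink(path);
		if (bind(sock, (const struct sockaddr *)&addr, sizeof(addr)) < 0) {
			fprintf(stderr, "bind failed: %s\n", strerror(errno));
			close(sock);
			return -1;
		}
	}

	if (listen(sock, 1) < 0) {
		close(sock);
		return -1;
	}
	return sock;
}
```

Calling bind_or_reclaim() twice on the same path while the first fd is still open should fail the second time; after the first fd is closed, the leftover socket file is treated as stale and reclaimed. A window remains between the probe and the unlink() in which another process could bind the path, so this narrows the conflict rather than eliminating it.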
  

Comments

Power, Ciara Sept. 22, 2021, 8:43 a.m. UTC | #1
Hi Bruce,


>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Wednesday 15 September 2021 15:11
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Richardson, Bruce
><bruce.richardson@intel.com>; stable@dpdk.org; David Marchand
><david.marchand@redhat.com>
>Subject: [PATCH] telemetry: fix "in-memory" process socket conflicts
>
>When DPDK is run in --in-memory mode, multiple processes can run
>simultaneously using the same runtime dir. This leads to each process
>removing another process' telemetry socket as it starts up, giving
>unexpected behaviour.
>
>This patch changes that behaviour to first check if the existing socket is active.
>If not, it's an old socket to be cleaned up and can be removed. If it is active,
>telemetry initialization fails and an error message is printed out giving
>instructions on how to resolve the error: either by using file-prefix to have a
>different runtime dir (and therefore socket path) or by disabling telemetry if it
>is not needed.
>
>Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>Cc: stable@dpdk.org
>
>Reported-by: David Marchand <david.marchand@redhat.com>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
> lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
<snip>

Patch looks good overall, although I see CI is reporting some problems for FreeBSD:

../lib/telemetry/telemetry.c:435:21: error: incompatible pointer types passing 'struct sockaddr_un *' to parameter of type 'const struct sockaddr *' [-Werror,-Wincompatible-pointer-types]
                if (connect(sock, &sun, sizeof(sun)) == 0) {
                                  ^~~~
/usr/include/sys/socket.h:662:41: note: passing argument to parameter here
int     connect(int, const struct sockaddr *, socklen_t);
                                            ^
1 error generated.

Thanks,
Ciara
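
For reference, the error above comes from passing a 'struct sockaddr_un *' where connect() is declared to take a 'const struct sockaddr *'. The bind() calls in the same hunk avoid it with a '(void *)' cast. A minimal standalone sketch of a call that should compile cleanly on both platforms is below; connect_unix() is a hypothetical wrapper used only for illustration, not a function from telemetry.c. As far as I know, Linux builds may not flag the original line because glibc declares the address parameter as a transparent union when _GNU_SOURCE is defined.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical wrapper: connect an AF_UNIX socket to the given path. */
static int
connect_unix(int sock, const char *path)
{
	struct sockaddr_un addr = {.sun_family = AF_UNIX};

	/* remaining fields are zero-initialized, so the path stays NUL-terminated */
	strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

	/* explicit cast to the generic address type that connect() declares */
	return connect(sock, (const struct sockaddr *)&addr, sizeof(addr));
}
```

The same cast (or the '(void *)' form already used for bind()) applied to the connect() call in create_socket() would likely address the FreeBSD failure without changing behaviour.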
  

Patch

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8665db8d03..5be2834757 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -421,15 +421,30 @@  create_socket(char *path)

 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;

-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, &sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}

 	if (listen(sock, 1) < 0) {