[v2] telemetry: fix "in-memory" process socket conflicts

Message ID 20210924161842.2879019-1-bruce.richardson@intel.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Series [v2] telemetry: fix "in-memory" process socket conflicts

Checks

Context               Check    Description
ci/checkpatch         warning  coding style issues
ci/Intel-compilation  success  Compilation OK
ci/intel-Testing      success  Testing PASS

Commit Message

Bruce Richardson Sept. 24, 2021, 4:18 p.m. UTC
When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it starts up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to resolve the error:
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it is not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
v2: fix build error on FreeBSD
---
 lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)
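
For readers who want to try the approach outside DPDK, the check described in the commit message (bind, and on failure probe the existing socket before removing it) can be sketched as a standalone helper. This is only an illustration under stated assumptions, not the patch itself: `bind_unless_active` is a hypothetical name, SOCK_STREAM is chosen for the sketch, and a separate probe socket is used for the connect() test, whereas the patch calls connect() on the same descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): create a listening Unix socket at
 * 'path'. A stale socket file is removed; a socket that still accepts
 * connections is left alone. Returns the fd, or -1 with errno set.
 */
static int
bind_unless_active(const char *path)
{
	struct sockaddr_un sun = {.sun_family = AF_UNIX};
	int sock, probe, saved;

	assert(path != NULL && strlen(path) < sizeof(sun.sun_path));
	strncpy(sun.sun_path, path, sizeof(sun.sun_path) - 1);

	/* SOCK_STREAM is an assumption made for this sketch. */
	sock = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock < 0)
		return -1;

	if (bind(sock, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
		if (errno != EADDRINUSE)
			goto fail;

		/* The path exists: is anybody listening on it? */
		probe = socket(AF_UNIX, SOCK_STREAM, 0);
		if (probe < 0)
			goto fail;
		if (connect(probe, (struct sockaddr *)&sun, sizeof(sun)) == 0) {
			close(probe);
			errno = EADDRINUSE; /* active owner, leave it alone */
			goto fail;
		}
		close(probe);

		/* Nobody answered: treat it as stale, remove and rebind. */
		unlink(sun.sun_path);
		if (bind(sock, (struct sockaddr *)&sun, sizeof(sun)) < 0)
			goto fail;
	}

	if (listen(sock, 1) < 0)
		goto fail;
	return sock;

fail:
	saved = errno;
	close(sock);
	errno = saved;
	return -1;
}
```

With this helper, a second caller using a path that a live process is listening on gets -1 with errno set to EADDRINUSE, while a socket file left behind by a dead process is removed and reused. The check is best-effort: another process could bind the path in the window between the probe and the unlink().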
  

Comments

Power, Ciara Sept. 29, 2021, 8:50 a.m. UTC | #1
Hi Bruce,

>-----Original Message-----
>From: Richardson, Bruce <bruce.richardson@intel.com>
>Sent: Friday 24 September 2021 17:19
>To: dev@dpdk.org
>Cc: Power, Ciara <ciara.power@intel.com>; Burakov, Anatoly
><anatoly.burakov@intel.com>; Richardson, Bruce
><bruce.richardson@intel.com>; stable@dpdk.org; David Marchand
><david.marchand@redhat.com>
>Subject: [PATCH v2] telemetry: fix "in-memory" process socket conflicts
>
>When DPDK is run with --in-memory mode, multiple processes can run
>simultaneously using the same runtime dir. This leads to each process
>removing another process' telemetry socket as it started up, giving
>unexpected behaviour.
>
>This patch changes that behaviour to first check if the existing socket is active.
>If not, it's an old socket to be cleaned up and can be removed. If it is active,
>telemetry initialization fails and an error message is printed out giving
>instructions on how to remove the error; either by using file-prefix to have a
>different runtime dir (and therefore socket path) or by disabling telemetry if it
>is not needed.
>
>Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
>Cc: stable@dpdk.org
>
>Reported-by: David Marchand <david.marchand@redhat.com>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
>v2: fix build error on FreeBSD
>---
<snip>

Acked-by: Ciara Power <ciara.power@intel.com>

Thanks!
  
Kevin Traynor Sept. 29, 2021, 12:28 p.m. UTC | #2
Hi Bruce,

On 24/09/2021 17:18, Bruce Richardson wrote:
> When DPDK is run with --in-memory mode, multiple processes can run
> simultaneously using the same runtime dir. This leads to each process
> removing another process' telemetry socket as it started up, giving
> unexpected behaviour.
> 
> This patch changes that behaviour to first check if the existing socket
> is active. If not, it's an old socket to be cleaned up and can be
> removed. If it is active, telemetry initialization fails and an error
> message is printed out giving instructions on how to remove the error;
> either by using file-prefix to have a different runtime dir (and
> therefore socket path) or by disabling telemetry if it is not needed.
> 

Telemetry is enabled by default but it may not be used by the 
application. Hitting this issue will cause rte_eal_init() to fail, which 
will probably stop or severely limit the application.

So it could change a working application to a non-working one (albeit 
one that doesn't interfere with other process' sockets).

Can it just print a warning that telemetry will not be enabled and 
continue so it's not returning an rte_eal_init failure?

A more minor thing: I see it changes the behaviour from "last one runs 
with telemetry" to "first one runs with telemetry". Though it can be 
figured out from the commit message, it might be worth calling that change 
out explicitly.

thanks,
Kevin.
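
The non-fatal behaviour suggested above can be pictured with a small standalone sketch. Everything in it is hypothetical (the `app_state` struct, the `telemetry_init_fn` callback type and the two stub initialisers are made up for illustration), and it does not reflect how rte_eal_init() is actually structured. It only shows the idea of downgrading a telemetry failure to a warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a telemetry setup routine: 0 on success. */
typedef int (*telemetry_init_fn)(const char *runtime_dir);

/* Hypothetical application state, used only by this sketch. */
struct app_state {
	bool telemetry_enabled;
};

/* Stub: pretend the socket path is already owned by another process. */
static int
telemetry_init_busy(const char *runtime_dir)
{
	(void)runtime_dir;
	return -1;
}

/* Stub: pretend the socket was created without problems. */
static int
telemetry_init_ok(const char *runtime_dir)
{
	(void)runtime_dir;
	return 0;
}

/*
 * Start-up step that never fails because of telemetry: a failed telemetry
 * init is logged as a warning and recorded in the state instead.
 */
static int
init_optional_telemetry(struct app_state *st, telemetry_init_fn init,
		const char *runtime_dir)
{
	assert(st != NULL && init != NULL && runtime_dir != NULL);

	st->telemetry_enabled = (init(runtime_dir) == 0);
	if (!st->telemetry_enabled)
		fprintf(stderr,
			"WARNING: telemetry could not be initialised in %s, continuing without it\n",
			runtime_dir);
	return 0;
}
```

The caller can still inspect `telemetry_enabled` afterwards if it does depend on telemetry, which keeps the decision with the application.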

> Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
> Cc: stable@dpdk.org
> 
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> v2: fix build error on FreeBSD
> ---
>   lib/telemetry/telemetry.c | 25 ++++++++++++++++++++-----
>   1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 8304fbf6e9..78508c1a1d 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -457,15 +457,30 @@ create_socket(char *path)
>   
>   	struct sockaddr_un sun = {.sun_family = AF_UNIX};
>   	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
> -	unlink(sun.sun_path);
> +
>   	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
>   		struct stat st;
>   
> -		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> -		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
> +		/* first check if we have a runtime dir */
> +		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
>   			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
> -		sun.sun_path[0] = 0;
> -		goto error;
> +			goto error;
> +		}
> +
> +		/* check if current socket is active */
> +		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
> +			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
> +			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
> +			path[0] = 0;
> +			goto error;
> +		}
> +
> +		/* socket is not active, delete and attempt rebind */
> +		unlink(sun.sun_path);
> +		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
> +			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
> +			goto error;
> +		}
>   	}
>   
>   	if (listen(sock, 1) < 0) {
>
  
Bruce Richardson Sept. 29, 2021, 1:32 p.m. UTC | #3
On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> Hi Bruce,
> 
> On 24/09/2021 17:18, Bruce Richardson wrote:
> > When DPDK is run with --in-memory mode, multiple processes can run
> > simultaneously using the same runtime dir. This leads to each process
> > removing another process' telemetry socket as it started up, giving
> > unexpected behaviour.
> > 
> > This patch changes that behaviour to first check if the existing socket
> > is active. If not, it's an old socket to be cleaned up and can be
> > removed. If it is active, telemetry initialization fails and an error
> > message is printed out giving instructions on how to remove the error;
> > either by using file-prefix to have a different runtime dir (and
> > therefore socket path) or by disabling telemetry if it is not needed.
> > 
> 
> telemetry is enabled by default but it may not be used by the application.
> Hitting this issue will cause rte_eal_init() to fail which will probably
> stop or severely limit the application.
> 
> So it could change a working application to a non-working one (albeit one
> that doesn't interfere with other process' sockets).
> 
> Can it just print a warning that telemetry will not be enabled and continue
> so it's not returning an rte_eal_init failure?
> 

For a backported fix, yes, that would probably be better behaviour, but for
the latest branch, I think returning error and having the user explicitly
choose the resolution they want to occur is best. I'll see about doing a
separate backport patch for 20.11.

> A more minor thing, I see it changes the behaviour from, last one runs with
> telemetry, to, first one runs with telemetry. Though it can be figured from
> the commit message, it might be worth calling that change out explicitly.
> 

Sure. I'll resubmit a new version of this without stable CC'ed and include
that behaviour change explicitly in the commit log.

/Bruce
  
Bruce Richardson Sept. 29, 2021, 1:51 p.m. UTC | #4
On Wed, Sep 29, 2021 at 02:32:02PM +0100, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > Hi Bruce,
> > 
> > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > When DPDK is run with --in-memory mode, multiple processes can run
> > > simultaneously using the same runtime dir. This leads to each process
> > > removing another process' telemetry socket as it started up, giving
> > > unexpected behaviour.
> > > 
> > > This patch changes that behaviour to first check if the existing socket
> > > is active. If not, it's an old socket to be cleaned up and can be
> > > removed. If it is active, telemetry initialization fails and an error
> > > message is printed out giving instructions on how to remove the error;
> > > either by using file-prefix to have a different runtime dir (and
> > > therefore socket path) or by disabling telemetry if it is not needed.
> > > 
> > 
> > telemetry is enabled by default but it may not be used by the application.
> > Hitting this issue will cause rte_eal_init() to fail which will probably
> > stop or severely limit the application.
> > 
> > So it could change a working application to a non-working one (albeit one
> > that doesn't interfere with other process' sockets).
> > 
> > Can it just print a warning that telemetry will not be enabled and continue
> > so it's not returning an rte_eal_init failure?
> > 
> 
> For a backported fix, yes, that would probably be better behaviour, but for
> the latest branch, I think returning error and having the user explicitly
> choose the resolution they want to occur is best. I'll see about doing a
> separate backport patch for 20.11.
> 
> > A more minor thing, I see it changes the behaviour from, last one runs with
> > telemetry, to, first one runs with telemetry. Though it can be figured from
> > the commit message, it might be worth calling that change out explicitly.
> > 
> 
> Sure. I'll resubmit a new version of this without stable CC'ed and include
> that behaviour change explicitly in the commit log.
> 
Actually, the subtle behaviour change would be in the backport version that
doesn't error out, so I'll note it there when doing that patch, not in the
v3 of this one.

/Bruce
  
Kevin Traynor Sept. 29, 2021, 2:54 p.m. UTC | #5
On 29/09/2021 14:32, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
>> Hi Bruce,
>>
>> On 24/09/2021 17:18, Bruce Richardson wrote:
>>> When DPDK is run with --in-memory mode, multiple processes can run
>>> simultaneously using the same runtime dir. This leads to each process
>>> removing another process' telemetry socket as it started up, giving
>>> unexpected behaviour.
>>>
>>> This patch changes that behaviour to first check if the existing socket
>>> is active. If not, it's an old socket to be cleaned up and can be
>>> removed. If it is active, telemetry initialization fails and an error
>>> message is printed out giving instructions on how to remove the error;
>>> either by using file-prefix to have a different runtime dir (and
>>> therefore socket path) or by disabling telemetry if it is not needed.
>>>
>>
>> telemetry is enabled by default but it may not be used by the application.
>> Hitting this issue will cause rte_eal_init() to fail which will probably
>> stop or severely limit the application.
>>
>> So it could change a working application to a non-working one (albeit one
>> that doesn't interfere with other process' sockets).
>>
>> Can it just print a warning that telemetry will not be enabled and continue
>> so it's not returning an rte_eal_init failure?
>>
> 
> For a backported fix, yes, that would probably be better behaviour, but for
> the latest branch, I think returning error and having the user explicitly
> choose the resolution they want to occur is best. I'll see about doing a
> separate backport patch for 20.11.
> 

But this is a runtime message dependent on runtime environment. The user 
may not have access or know how to change eal parameters.

In the case where the application doesn't care about telemetry, they 
have gone from not having telemetry to rte_eal_init() failing, which 
probably has severe consequences.

I could maybe agree if telemetry was disabled by default and the 
application had set the --telemetry flag indicating that they want/need 
it. As it is, it feels like it's possibly a worse outcome for the user.

thanks,
Kevin.

>> A more minor thing, I see it changes the behaviour from, last one runs with
>> telemetry, to, first one runs with telemetry. Though it can be figured from
>> the commit message, it might be worth calling that change out explicitly.
>>
> 
> Sure. I'll resubmit a new version of this without stable CC'ed and include
> that behaviour change explicitly in the commit log.
> 
> /Bruce
>
  
Bruce Richardson Sept. 29, 2021, 3:24 p.m. UTC | #6
On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
> On 29/09/2021 14:32, Bruce Richardson wrote:
> > On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > > Hi Bruce,
> > > 
> > > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > > When DPDK is run with --in-memory mode, multiple processes can run
> > > > simultaneously using the same runtime dir. This leads to each process
> > > > removing another process' telemetry socket as it started up, giving
> > > > unexpected behaviour.
> > > > 
> > > > This patch changes that behaviour to first check if the existing socket
> > > > is active. If not, it's an old socket to be cleaned up and can be
> > > > removed. If it is active, telemetry initialization fails and an error
> > > > message is printed out giving instructions on how to remove the error;
> > > > either by using file-prefix to have a different runtime dir (and
> > > > therefore socket path) or by disabling telemetry if it is not needed.
> > > > 
> > > 
> > > telemetry is enabled by default but it may not be used by the application.
> > > Hitting this issue will cause rte_eal_init() to fail which will probably
> > > stop or severely limit the application.
> > > 
> > > So it could change a working application to a non-working one (albeit one
> > > that doesn't interfere with other process' sockets).
> > > 
> > > Can it just print a warning that telemetry will not be enabled and continue
> > > so it's not returning an rte_eal_init failure?
> > > 
> > 
> > For a backported fix, yes, that would probably be better behaviour, but for
> > the latest branch, I think returning error and having the user explicitly
> > choose the resolution they want to occur is best. I'll see about doing a
> > separate backport patch for 20.11.
> > 
> 
> But this is a runtime message dependent on runtime environment. The user may
> not have access or know how to change eal parameters.

True. But on the other hand, this problem only occurs with non-default EAL
parameters anyway, so someone must have configured this with the
--in-memory flag.

> 
> In the case where the application doesn't care about telemetry, they have
> gone from not having telemetry to rte_eal_init() failing, which probably has
> severe consequence.
> 

Yes, I agree, which is why I would suggest that for any backport of this
fix, the error be made non-fatal as you suggest. [Having looked into it,
having it as a non-fatal error is rather awkward, so it may be best just
left unfixed and the current behaviour documented as a known issue].

However, for any application being updated and rebuilt against 21.11, I
would have thought it reasonable to flag this as an error, as any such
application would require revalidation anyway.

> I could maybe agree if telemetry was default disable and the application had
> set the --telemetry flag indicating that they want/need it. As it is, it
> feels like it's possibly a worse outcome for the user.
> 

Perhaps, but I believe the only case of there being an issue would be where:
1) a user who cannot modify the EAL parameters
2) runs an application which has been updated and rebuilt against 21.11
3) where that application is hard-coded to use in-memory mode and
4) has never been verified with two or more instances of that running?
Or am I missing something here?

Regards,
/Bruce
  
Bruce Richardson Sept. 29, 2021, 3:31 p.m. UTC | #7
On Wed, Sep 29, 2021 at 04:24:06PM +0100, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
> > On 29/09/2021 14:32, Bruce Richardson wrote:
> > > On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
> > > > Hi Bruce,
> > > > 
> > > > On 24/09/2021 17:18, Bruce Richardson wrote:
> > > > > When DPDK is run with --in-memory mode, multiple processes can run
> > > > > simultaneously using the same runtime dir. This leads to each process
> > > > > removing another process' telemetry socket as it started up, giving
> > > > > unexpected behaviour.
> > > > > 
> > > > > This patch changes that behaviour to first check if the existing socket
> > > > > is active. If not, it's an old socket to be cleaned up and can be
> > > > > removed. If it is active, telemetry initialization fails and an error
> > > > > message is printed out giving instructions on how to remove the error;
> > > > > either by using file-prefix to have a different runtime dir (and
> > > > > therefore socket path) or by disabling telemetry if it is not needed.
> > > > > 
> > > > 
> > > > telemetry is enabled by default but it may not be used by the application.
> > > > Hitting this issue will cause rte_eal_init() to fail which will probably
> > > > stop or severely limit the application.
> > > > 
> > > > So it could change a working application to a non-working one (albeit one
> > > > that doesn't interfere with other process' sockets).
> > > > 
> > > > Can it just print a warning that telemetry will not be enabled and continue
> > > > so it's not returning an rte_eal_init failure?
> > > > 
> > > 
> > > For a backported fix, yes, that would probably be better behaviour, but for
> > > the latest branch, I think returning error and having the user explicitly
> > > choose the resolution they want to occur is best. I'll see about doing a
> > > separate backport patch for 20.11.
> > > 
> > 
> > But this is a runtime message dependent on runtime environment. The user may
> > not have access or know how to change eal parameters.
> 
> True. But on the other hand, this problem only occurs with non-default EAL
> parameters anyway, so someone must have configured this with the
> --in-memory flag.
> 
> > 
> > In the case where the application doesn't care about telemetry, they have
> > gone from not having telemetry to rte_eal_init() failing, which probably has
> > severe consequence.
> > 
> 
> Yes, I agree, which is why I would suggest that for any backport of this
> fix, the error be made non-fatal as you suggest. [Having looked into it,
> having it as a non-fatal error is rather awkward, so it may be best just
> left unfixed and the current behaviour documented as known-issue].
> 
> However, for any application being updated and rebuilt against 21.11, I
> would have thought it reasonable to flag this as an error, as any such
> application would require revalidation anyway.
> 
> > I could maybe agree if telemetry was default disable and the application had
> > set the --telemetry flag indicating that they want/need it. As it is, it
> > feels like it's possibly a worse outcome for the user.
> > 
> 
> Perhaps, but I believe the only case of there being an issue would be where:
> 1) a user who cannot modify the EAL parameters
> 2) runs an application which has been updated and rebuilt against 21.11
> 3) where that application is hard-coded to use in-memory mode and
> 4) has never been verified with two or more instances of that running?
> Or am I missing something here?
> 

Let me also go back to the drawing board on the solution here a bit, and
see if I can come up with something better. If I can find a reasonable way
to make it so that we can always create a socket in in-memory mode, despite
other processes running, it would sidestep this problem completely. Not
sure if it's possible, but let me see if I can come up with some ideas.
[One idea I did try is using abstract sockets on Linux, but with those we
lose out on the permissions/protection we get from having a filesystem
path, so they were a no-go for me because of that]

/Bruce
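
For context on the abstract-socket idea mentioned above: on Linux, an AF_UNIX address whose sun_path starts with a NUL byte lives in an abstract namespace rather than in the filesystem. The name disappears when the socket is closed, so no stale file can be left behind, but there is also no file whose ownership or mode could restrict who connects. A minimal Linux-only sketch follows; `bind_abstract` and the name passed to it are hypothetical and not taken from DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (Linux only): bind a Unix socket to 'name' in the
 * abstract namespace. Returns the fd, or -1 on failure.
 */
static int
bind_abstract(const char *name)
{
	struct sockaddr_un sun = {.sun_family = AF_UNIX};
	socklen_t addrlen;
	size_t len;
	int sock;

	assert(name != NULL);
	len = strlen(name);
	assert(len > 0 && len < sizeof(sun.sun_path) - 1);

	/* sun_path[0] stays '\0' to mark the address as abstract. */
	memcpy(&sun.sun_path[1], name, len);
	/* The length covers the leading NUL plus the name, nothing more. */
	addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);

	sock = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock < 0)
		return -1;
	if (bind(sock, (struct sockaddr *)&sun, addrlen) < 0) {
		close(sock);
		return -1;
	}
	return sock;
}
```

A second bind to the same name fails while the first socket is open and succeeds again once it is closed, with no unlink() involved. Any process in the same network namespace can connect to it, which is the loss of filesystem protection referred to above.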
  
Kevin Traynor Sept. 29, 2021, 4:01 p.m. UTC | #8
On 29/09/2021 16:31, Bruce Richardson wrote:
> On Wed, Sep 29, 2021 at 04:24:06PM +0100, Bruce Richardson wrote:
>> On Wed, Sep 29, 2021 at 03:54:48PM +0100, Kevin Traynor wrote:
>>> On 29/09/2021 14:32, Bruce Richardson wrote:
>>>> On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
>>>>> Hi Bruce,
>>>>>
>>>>> On 24/09/2021 17:18, Bruce Richardson wrote:
>>>>>> When DPDK is run with --in-memory mode, multiple processes can run
>>>>>> simultaneously using the same runtime dir. This leads to each process
>>>>>> removing another process' telemetry socket as it started up, giving
>>>>>> unexpected behaviour.
>>>>>>
>>>>>> This patch changes that behaviour to first check if the existing socket
>>>>>> is active. If not, it's an old socket to be cleaned up and can be
>>>>>> removed. If it is active, telemetry initialization fails and an error
>>>>>> message is printed out giving instructions on how to remove the error;
>>>>>> either by using file-prefix to have a different runtime dir (and
>>>>>> therefore socket path) or by disabling telemetry if it is not needed.
>>>>>>
>>>>>
>>>>> telemetry is enabled by default but it may not be used by the application.
>>>>> Hitting this issue will cause rte_eal_init() to fail which will probably
>>>>> stop or severely limit the application.
>>>>>
>>>>> So it could change a working application to a non-working one (albeit one
>>>>> that doesn't interfere with other process' sockets).
>>>>>
>>>>> Can it just print a warning that telemetry will not be enabled and continue
>>>>> so it's not returning an rte_eal_init failure?
>>>>>
>>>>
>>>> For a backported fix, yes, that would probably be better behaviour, but for
>>>> the latest branch, I think returning error and having the user explicitly
>>>> choose the resolution they want to occur is best. I'll see about doing a
>>>> separate backport patch for 20.11.
>>>>
>>>
>>> But this is a runtime message dependent on runtime environment. The user may
>>> not have access or know how to change eal parameters.
>>
>> True. But on the other hand, this problem only occurs with non-default EAL
>> parameters anyway, so someone must have configured this with the
>> --in-memory flag.
>>
>>>
>>> In the case where the application doesn't care about telemetry, they have
>>> gone from not having telemetry to rte_eal_init() failing, which probably has
>>> severe consequence.
>>>
>>
>> Yes, I agree, which is why I would suggest that for any backport of this
>> fix, the error be made non-fatal as you suggest. [Having looked into it,
>> having it as a non-fatal error is rather awkward, so it may be best just
>> left unfixed and the current behaviour documented as known-issue].
>>
>> However, for any application being updated and rebuilt against 21.11, I
>> would have thought it reasonable to flag this as an error, as any such
>> application would require revalidation anyway.
>>
>>> I could maybe agree if telemetry was default disable and the application had
>>> set the --telemetry flag indicating that they want/need it. As it is, it
>>> feels like it's possibly a worse outcome for the user.
>>>
>>
>> Perhaps, but I believe the only case of there being an issue would be where:
>> 1) a user who cannot modify the EAL parameters
>> 2) runs an application which has been updated and rebuilt against 21.11
>> 3) where that application is hard-coded to use in-memory mode and
>> 4) has never been verified with two or more instances of that running?

That's a reasonable point that if it has in-memory hardcoded you might 
expect it to be tested with two or more, and if it's not hardcoded, it 
is added by the user so they are able to set eal params.

I still see an extra step for the user but I agree if they can set eal 
params then it is a lot less impactful. For OVS, a user could update the 
dpdk-extra ovsdb entry for the additional eal flags.

>> Or am I missing something here?
>>
> 
> Let me also go back to the drawing board on the solution here a bit, and
> see if I can come up with something better. If I can find a reasonable way
> to make it so that we can always create a socket in in-memory mode, despite
> other processes running, it would sidestep this problem completely. Not
> sure if it's possible, but let me see if I can come up with some ideas.
> [One idea I did try is using abstract sockets on linux, but with those we
> lose out on the permissions/protection we get from having a filesystem
> path, so they were a no-go for me because of that]
> 

ok, thanks Bruce. I think you got the concerns anyway. I suppose a part 
of it goes back to: telemetry is on by default, but does that imply that it 
is required and that DPDK should error out if it is not available?

Kevin.

> /Bruce
>
  

Patch

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8304fbf6e9..78508c1a1d 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -457,15 +457,30 @@  create_socket(char *path)
 
 	struct sockaddr_un sun = {.sun_family = AF_UNIX};
 	strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-	unlink(sun.sun_path);
+
 	if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
 		struct stat st;
 
-		TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+		/* first check if we have a runtime dir */
+		if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
 			TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-		sun.sun_path[0] = 0;
-		goto error;
+			goto error;
+		}
+
+		/* check if current socket is active */
+		if (connect(sock, (void *)&sun, sizeof(sun)) == 0) {
+			TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+			TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+			path[0] = 0;
+			goto error;
+		}
+
+		/* socket is not active, delete and attempt rebind */
+		unlink(sun.sun_path);
+		if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+			TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+			goto error;
+		}
 	}
 
 	if (listen(sock, 1) < 0) {