[virt-tools-list] [vhostmd] add SIGPIPE handler and reconnect, v2

Jim Fehlig jfehlig at suse.com
Wed Jun 20 21:27:14 UTC 2018


Side note: The 'v2' should be in the subject prefix, not the subject. E.g. 'git 
send-email --subject-prefix="vhostmd PATCH V2" ...'.

On 06/19/2018 07:33 AM, Michael Trapp wrote:
> vhostmd has no signal handler for SIGPIPE and a restart of libvirtd results in a
> stopped vhostmd. The root cause seems to be a UDS socket between vhostmd and
> libvirtd which is closed by a libvirtd restart. In addition to the signal handler
> the connection to libvirtd has to be opened again otherwise vhostmd can't read
> any data from libvirtd and doesn't update the metrics.

I think most of the text below here can be removed from the commit message. IMO 
it is information that is interesting for patch reviewers (and thus should 
included after the '---') but not necessary in the commit message. It is 
sufficient to describe the problem and a brief comment on how the patch fixes it.

> 
> Based on Daniel's reply, I've added a callback function for the connection close.
> 
> Well, the current patch is a minor change of the vhostmd connection handling but
> it might be sufficient to solve the reconnect issue in the available vhostmd implementation.

Ensure any lines in the commit message > 80 columns are trimmed.

> 
> The reconnect can be checked with
>     service libvirtd stop
> This results in a closed UDS socket of the vhostmd process
>     /var/run/libvirt/libvirt-sock-ro is not available anymore
>     and just to see the connection status on libvirt layer
>     virConnectIsAlive() returns -1 for the lost connection
> after a
>     service libvirtd start
> vhostmd connects to libvirtd with
>     connect(6, {sa_family=AF_FILE, path="/var/run/libvirt/libvirt-sock-ro"}, 110) = 0
> and /proc/PID/fd contains
>     lrwx------ 1 root root 64 ... 6 -> socket:[173298]
> and the VM metrics are updated again.
> 
> The libvirt API documents that virConnectIsAlive() 'Returns 1 if alive, 0 if dead, -1 on error'.
>>From vhostmd perspective '-1' should definitely result in a reconnect, because that's what we see
> when libvirtd is restarted and vhostmd can recover from the closed connection.
> At the moment I don't see a situation where vhostmd shouldn't try to reconnect and therefore
> virConnectIsAlive() is not used.
> As there seems to be no 'reconnect' function in the libvirt API, the connection is closed and
> opened again with the assumption that this does not result in a leak in libvirt.
> 
> The reconnect is triggered once in a 'metric collection loop' and in case of a failed connection the error
> is reported to the caller. If there is any additional error handling required, it should be implemented
> in an upper layer because the error is already noticed in vhostmd_run() and a stop (restart) of vhostmd
> must be handled in this function.
> 
> ---
>   vhostmd/vhostmd.c   |  2 ++
>   vhostmd/virt-util.c | 45 +++++++++++++++++++++++++++++++++++++--------
>   2 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/vhostmd/vhostmd.c b/vhostmd/vhostmd.c
> index 7f04705..4cf4630 100644
> --- a/vhostmd/vhostmd.c
> +++ b/vhostmd/vhostmd.c
> @@ -117,6 +117,7 @@ static void sig_handler(int sig, siginfo_t *siginfo ATTRIBUTE_UNUSED,
>         case SIGQUIT:
>            down = 1;
>            break;
> +      case SIGPIPE:
>         default:
>            break;
>      }
> @@ -1053,6 +1054,7 @@ int main(int argc, char *argv[])
>      sigaction(SIGINT, &sig_action, NULL);
>      sigaction(SIGQUIT, &sig_action, NULL);
>      sigaction(SIGTERM, &sig_action, NULL);
> +   sigaction(SIGPIPE, &sig_action, NULL);
>   
>      xmlInitParser();
>   
> diff --git a/vhostmd/virt-util.c b/vhostmd/virt-util.c
> index 1c31305..0358826 100644
> --- a/vhostmd/virt-util.c
> +++ b/vhostmd/virt-util.c
> @@ -26,21 +26,48 @@
>   
>   #include "util.h"
>   
> +enum {
> +    CLOSED = 0,
> +    ESTABLISHED
> +} connection = CLOSED;
> +
>   static virConnectPtr conn = NULL;
>   
>   const char *libvirt_uri = NULL;
>   
> +void
> +conn_close_cb (virConnectPtr c,
> +               int           reason,
> +               void         *p)

I realize the coding style in vhostmd can make your eyes hurt, but let's try to 
stick with the prevailing style of no space between function name and the parens 
of its argument list. Also all parameters on a single line if they fit.

> +{
> +    (void) p;
> +    (void) reason;

You can use ATTRIBUTE_UNUSED instead. E.g.

void
conn_close_cb(virConnectPtr c,
               int reason ATTRIBUTE_UNUSED,
               void *p ATTRIBUTE_UNUSED)

> +
> +    if (c == conn) connection = CLOSED;
> +}
> +
>   static int
>   do_connect (void)
>   {
> +    if (connection == ESTABLISHED) return 0;
> +
> +    if (conn != NULL) virConnectClose (conn);

This is the style I dislike the most in the vhostmd code :-). I know there is a 
fair bit of it, but whitespace is cheap so lets make use of it and improve 
readability

     if (connection == ESTABLISHED)
         return 0;

     if (conn != NULL)
         virConnectClose (conn);

> +
> +    conn = virConnectOpenReadOnly (libvirt_uri);
>       if (conn == NULL) {
> -	conn = virConnectOpenReadOnly (libvirt_uri);
> -	if (conn == NULL) {
> -	    vu_log (VHOSTMD_ERR, "Unable to open libvirt connection to %s",
> -		    libvirt_uri ? libvirt_uri : "default hypervisor");
> -	    return -1;
> -	}
> +        vu_log (VHOSTMD_ERR, "Unable to open libvirt connection to %s",
> +                libvirt_uri ? libvirt_uri : "default hypervisor");
> +        return -1;
>       }
> +
> +    if (virConnectRegisterCloseCallback (conn, conn_close_cb, NULL, NULL)) {
> +        vu_log (VHOSTMD_ERR, "Unable to register callback 'virConnectCloseFunc'");
> +        virConnectClose (conn);
> +        conn = NULL;
> +        return -1;
> +    }

Same comment here about space between function name and parens.

> +
> +    connection = ESTABLISHED;
>       return 0;
>   }
>   
> @@ -98,8 +125,8 @@ vu_vm *vu_get_vm(int id)
>   void vu_vm_free(vu_vm *vm)
>   {
>      if (vm) {
> -      free(vm->name);
> -      free(vm->uuid);
> +      if (vm->name) free(vm->name);
> +      if (vm->uuid) free(vm->uuid);

Separate bug fix for a separate patch? And as part of it call vu_vm_free() in 
vu_get_vm() error path?

>         free(vm);
>      }
>   }
> @@ -107,8 +134,10 @@ void vu_vm_free(vu_vm *vm)
>   void vu_vm_connect_close()
>   {
>      if (conn) {
> +      virConnectUnregisterCloseCallback(conn, conn_close_cb);
>         virConnectClose(conn);
>         conn = NULL;
>      }
> +   connection = CLOSED;
>   }

The patch works fine in my testing. Can you fix the nits and send a V3? I was 
going to fix them myself until I came across the unrelated bug fix mentioned 
above, which should be submitted as a separate patch.

Regards,
Jim




More information about the virt-tools-list mailing list