<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><tt>I could feel that you want to express "Oh, shit! How terrible
the code it<br>
is." from the reply, although across the screen. Anyway, I'm
really glad<br>
to thanks for your review and suggestions. <span
class="moz-smiley-s1"><span>:-)</span></span><br>
</tt><br>
<tt>On 2018年02月08日 07:55, John Ferlan wrote:</tt></p>
<p><tt>> That's where I'd expect to find the details of why
someone would want to<br>
> choose dlm over lockd or sanlock. What goodies are
provided? What<br>
> advantage is there?<br>
<br>
For those, assuming this scene: <font color="#330099">one
customer has used cluster software <br>
with DLM in their environment, why not choose DLM plugin
instead of<br>
sanlock/lockd ? Beside that, compared with sanlock/lockd, DLM
don't need<br>
the share storage, and sanlock would bring a little
heavyheavily on disk<br>
IO according to lockd RFC(2011-July/msg00337.html), but DLM
only requires<br>
network.<br>
</font><br>
I have sent a RFC in 12/2017 to ack how the opinion is. Daniel
said that<br>
"it can be considered in scope for a libvirt lock manager plugin
if you<br>
want to try writing one"(2017-December/msg00706.html). So I
draft this <br>
patch set which could only make DLM run in libvirt(with
configure flag<br>
`--with-dlm` to compile 'dlm.so' module by force). Hope some
suggestion<br>
bringed from libvirt community to help me improve, if accepted
DLM lock<br>
manager plugin RFC, so I don't update 'docs/locking.html.in'.<br>
<br>
-------------</tt></p>
<p><tt>The following is the reply contents, aim to answer why I do
that<br>
for most comments:<br>
<br>
1) sorry for indent in my code. I coded them in two notebooks,<br>
both using vim. However one is newer, I have not configured
'.vimrc' in<br>
that one.<br>
<br>
2)<br>
>> > +AC_DEFUN([LIBVIRT_CHECK_CPG],[<br>
>> > + dnl in some distribution, Version is `UNKNOW`
in libcpg.pc<br>
> <br>
> UNKNOWN<br>
> doesn't make it very useful does it?<br>
<br>
In some distribution, for example, openSUSE Tumbleweed:<br>
</tt><font color="#666666"><tt> $ cat
/usr/lib64/pkgconfig/libcpg.pc</tt><br>
<br>
<tt> prefix=/usr</tt><br>
<tt> exec_prefix=${prefix}</tt><br>
<tt> libdir=/usr/lib64</tt><br>
<tt> includedir=${prefix}/include</tt><br>
<br>
<tt> Name: cpg</tt><br>
<tt> Version: UNKNOWN</tt><br>
<tt> Description: cpg</tt><br>
<tt> Requires:</tt><br>
<tt> Libs: -L${libdir} -lcpg</tt><br>
<tt> Cflags: -I${includedir}</tt></font><br>
<tt><br>
using `LIBVIRT_CHECK_PKG([CPG], [libcpg], [xxx])` directly would
make<br>
build error. So I decide to ignore 'version'. I also don't know
why<br>
that is.<br>
<br>
3) <br>
>> > +#define LOCK_RECORD_FILE_PATH
"/tmp/libvirtd-dlm-file"<br>
> <br>
> Is somewhere in /tmp the best/only place to put this? What
not similar<br>
> to lockd/sanlock using /var/lib/libvirt/dlm... (which would
seemingly<br>
> need to be created on installation).<br>
<br>
In my opinion, if one node reboots, DLM cluster would drop locks
belong<br>
that node, most of Linux distribution would clean '/tmp'
directory...<br>
<br>
4)<br>
>> > +<br>
>> > +#define STATUS "STATUS"<br>
>> > +#define RESOURCE_NAME "RESOURCE_NAME"<br>
>> > +#define LOCK_ID "LOCK_ID"<br>
>> > +#define LOCK_MODE "LOCK_MODE"<br>
>> > +#define VM_PID "VM_PID"<br>
>> > +<br>
>> > +#define BUFFERLEN 128<br>
> <br>
> Yuck - for a stack variable...<br>
<br>
>> > + for (i = 0, str = raw, status = 0; ; str =
NULL, i++) {<br>
>> > + subtoken = strtok_r(str, " \n",
&saveptr);<br>
>> > + if (subtoken == NULL)<br>
>> > + break;<br>
>> > +<br>
>> > + switch(i) {<br>
>> > + case 0:<br>
> <br>
> These are magic numbers?<br>
<br>
<br>
Badly code wants to provided a readable file, like that:<br>
<br>
<font color="#666666">
STATUS
RESOURCE_NAME LOCK_MODE VM_PID<br>
0
1501e418dedd122b050c91ec61650ae30d3c217806125d3c2f365989143aa28d
EXMODE 21857</font></tt></p>
<p><tt>The first line is the form header, the following is the lock<br>
information.</tt></p>
<p><tt>5)<br>
>> > + char *name;<br>
>> > + uint32_t mode;<br>
>> > + uint32_t lkid;<br>
> <br>
> I would think uint32_t would be limiting... Why not just
unsigned int?<br>
<br>
view in '/usr/include/libdlm.h', it use `uint32_t` instead of<br>
`unsigned int`.<br>
<br>
6)<br>
>> > + virMutexLock(&(lockListWait.listMutex));<br>
>> > + virListAddTail(&lock->entry,
&(lockListWait.list));<br>
>> > +
virMutexUnlock(&(lockListWait.listMutex));<br>
><br>
> What is this list for? I think this also is completely
separable. I<br>
> wouldn't use a list either... 10s of domains probably works
fine. 100s<br>
> of domains perhaps not as well 1000s of domains starts
being a<br>
> bottleneck for sure.<br>
<br>
Hashtable is a good idea, and I wanted to use it but I had never
read<br>
it's code... Chosing list because that there won't be too much
vms in<br>
one host, is it offten common that there is more than 50 vms in
normal<br>
usage?<br>
<br>
7)<br>
>> > + if (safewrite(fd, buffer, strlen(buffer)) !=
strlen(buffer)) {<br>
>> > + virReportSystemError(errno,<br>
>> > + _("unable to write
lock information '%s' to file '%s'"),<br>
>> > + buffer,
NULLSTR(driver->lockRecordFilePath));<br>
>> > + return;<br>
>> > + }<br>
><br>
> Not sure I understand the reason of the file<br>
<br>
Compared with sanlock and lockd, this implement of DLM has no
daemon,<br>
some area is needed to record lock information in order to
resume lock<br>
after libvirtd reboot. I chose file.<br>
<br>
8)<br>
>> > + * found with another mode, and -ENOENT if
no orphan.<br>
>> > + *<br>
>> > + * cast/bast/param are (void *)1 because
the kernel<br>
>> > + * returns errors if some are null.<br>
>> > + */<br>
>> > +<br>
>> > + status = dlm_ls_lockx(lockspace, mode,
&lksb, LKF_PERSISTENT|LKF_ORPHAN,<br>
>> > + name, strlen(name), 0,<br>
>> > + (void *)1, (void *)1,
(void *)1,<br>
><br>
> ^^ These would seem to relate to the same parametes used
for<br>
> dlm_lock[_wait]. They are actually quite useful when it
comes to lock<br>
> conversions. You may want to considering investigating
that...<br>
><br>
><br>
> Still seems like a "bug" to me to need to pass 1 as a
callback function<br>
> or a blocking callback function.<br>
><br>
>> > + NULL, NULL);<br>
><br>
> Why not dlm_is_lock() instead since you're not using xid
and timeout<br>
<br>
The code is from lvm2, I try to use `dlm_lock_wait`, however it
failed,<br>
I don't know why. `(void *)1` is the same reason. Maybe a bug
from <br>
libdlm or 'dlm.ko'.<br>
<br>
9)<br>
>> > +<br>
>> > + /* create thread to receive notification from
kernel */<br>
><br>
> notification for what? what gets called?<br>
<br>
This is implemented by libdlm, the result is writted in `lksb`.<br>
<font color="#666666"><br>
int dlm_lock_wait(uint32_t mode,<br>
struct dlm_lksb *lksb,<br>
uint32_t flags,<br>
const void *name,<br>
unsigned int namelen,<br>
uint32_t parent, /* unused */<br>
void *bastarg,<br>
void (*bastaddr) (void *bastarg),<br>
void *range); /* unused */</font><br>
<br>
10)<br>
>> > + priv->hasRWDisks = true;<br>
>> > + /* ignore disk resource
without error */<br>
>> > + return 0;<br>
>> > + }<br>
>> > + }<br>
>> > +<br>
>> > + if
(virCryptoHashString(VIR_CRYPTO_HASH_SHA256, name, &newName)
< 0)<br>
><br>
>> Interesting _lockd uses SHA256 and _sanlock uses MD5...<br>
><br>
><br>
>> > + /* we need format the lock information,
so the lock name must be the constant length */<br>
>> > + if
(virCryptoHashString(VIR_CRYPTO_HASH_SHA256, name, &newName)
< 0)<br>
>> > + goto cleanup;<br>
><br>
> Neither _lockd nor _sandisk do any sort of Hash on this
name.<br>
><br>
><br>
>> > + memset(&lksb, 0, sizeof(lksb));<br>
>> > + rv = dlm_ls_lock_wait(lockspace,
priv->resources[i].mode,<br>
>> > + &lksb,
LKF_NOQUEUE|LKF_PERSISTENT,<br>
>> > +
priv->resources[i].name, strlen(priv->resources[i].name),<br>
><br>
> This is where I wonder why we cannot pass the path to the
disk to dlm<br>
> instead of the hash<br>
<br>
It's interesting here. Originally I use MD5 instead of SHA256,
but I found that:<br>
<br>
<font color="#666666"> typedef enum {<br>
VIR_CRYPTO_HASH_MD5, /* Don't use this except for
historic compat */<br>
VIR_CRYPTO_HASH_SHA256,<br>
<br>
VIR_CRYPTO_HASH_LAST<br>
} virCryptoHash;</font><br>
<br>
Futhermore, I find the bug patch related sanlock name: sanlock's
name<br>
length is limited 48, there is always someone use more than 48
characters<br>
(see <a class="moz-txt-link-freetext" href="https://bugzilla.redhat.com/show_bug.cgi?id=735443">https://bugzilla.redhat.com/show_bug.cgi?id=735443</a>). The
same<br>
situation, why not use the constant length? DLM limits name is
64 bytes<br>
to avoid some potential bugs. Besides, it's convenient to format
write<br>
it to file.<br>
<br>
11)<br>
>> > + * 'ensure sanlock socket is labelled with
the VM process label',<br>
>> > + * however, fixing sanlock socket security
labelling remove related<br>
>> > + * code. Now, `fd` parameter is useless.<br>
>> > + */<br>
>> > + if (fd)<br>
>> > + *fd = -1;<br>
><br>
> But you just set it to -1??!<br>
<br>
Maybe I can submit a patch to cut those code. Afterall it's
useless now.<br>
<br>
12)<br>
>> > + virCheckFlags(0, -1);<br>
>> > +<br>
>> > + if (state)<br>
>> > + *state = NULL;<br>
><br>
> Why would this return NULL? My recollection is there is
some sort of<br>
> inquiry API for dlm.<br>
<br>
Inquiry API is disappeared now... Only lock/unlock refered to
libdlm<br>
source code and reference book.<br>
<a class="moz-txt-link-freetext" href="http://people.redhat.com/ccaulfie/docs/rhdlmbook.pdf">http://people.redhat.com/ccaulfie/docs/rhdlmbook.pdf</a><br>
Chapter 4.3. Lock Query Operations</tt><br>
</p>
<pre class="moz-signature" cols="72">--
Regards
Lin Fu(river)</pre>
</body>
</html>