On the need to move to a merge request workflow

Fri Mar 6 11:44:07 UTC 2020

We've discussed the idea of replacing our mailing list review workflow with
a merge request workflow in various places, over the last 6 months or so,
but never made a concrete decision.

I understand that some people will not like the idea of using merge requests.
It is ok to feel that way, however, we need to consider the bigger picture and
be open to learning to make the most of the opportunities, even if we don't
like certain things about the tools. We need to recognise that there are
millions of open source developers who are successfully delivering code using
the merge request workflow, and we are just as capable as them.

To claim that merge requests won't/can't work for libvirt, means there is
something unique about the libvirt project's needs, and/or all those other
developers are doing a worse job than libvirt developers, and/or libvirt
developers are incapable of using tools that everyone else copes with.
I don't believe any of those points are true.

While it is certainly possible to come up with a list of points why we might
like mailing lists, on aggregate merge requests will be the better solution.
There are examples of high profile projects moving to a git forge based
workflow and new projects use them as a first choice. If mailing lists were
truly the better approach, then there would be a notable flow of projects
moving away from merge requests, back to mailing lists. There is no evidence
of a trend towards mailing list workflows.

Short summary
=============

This has quickly become another very long email to read though, so I'll
attempt a summary

 * The need to attract new contributors is critical to the long term
   health of the project. Since 2016 the number of new contributors has
   dropped 50% from 80-90 per year, down to just 40-45 in 2019.

 * The current libvirt mailing lists are suffering from unreliable delivery.
   Some contributors are unable to send. Other contributors are unable
   to receive.

 * The contributors affected by the problems have very limited ability to
   resolve the problem they have, aside from opening tickets or changing
   email providers entirely.

 * A mail workflow relies on all contributors agreeing on the conventions
   to use when sending & replying to messages. Many of these conventions are
   unwritten and impose a burden for new contributors, discouraging their
   participation.

 * A mail workflow has many problems even for long term contributors, most
   notably with keeping track of what mails are outstanding. Each person
   has invented their own solution to deal with the lack of central tracking
   and they are all flawed & duplicating time on triage.

 * A merge request workflow has a lower barrier to participation for new
   contributors to the project, by making libvirt use commonly found workflow
   for OSS projects.

 * A merge request workflow is technically capable of supporting projects of
   equal size, or greater, than libvirt which deliver features at an
   equivalent speed. There is nothing special or unique about libvirts needs.
   See KDE, GNOME, SystemD, Kubernetes, OpenShift, to name just a few

 * Trying to directly match each aspect of a developer's current mail setup
   to a merge request workflow is not practical. They are inherantly different
   tools with differing capabilities which will never map 1:1 in all cases.

 * The largest barrier to a change is not technical, but the relative lack of
   experience & familiarity with the tools. This can only be solved by using
   them in a real world environment. A period of productivity disruption
   is inevitable while learning to optimize use of the new tools.

 * A merge request workflow is better for new or infrequent contributors,
   and in the long term is certainly usable for frequent contributors.
   Switching to merge requests is thus better on aggregate, and with
   experience/hindsight may well turn out better for everyone.

With the last few points in mind in particular, the idea is to not do a big
bang switchover of everything at once, but instead split it into phases.
Initially the focus is to be on all the non-core projects, which means the
language bindings, application integrations, and supporting tools. After those
are all switched successfully, highlighting any areas of difficulty and
allowing for some learning/adaptation, the main libvirt.git would be switched
over. The non-core projects can be switched more or less immediately as their
review demands are much lower. It is anticipated that the timeframe for
everything would be on the order of 3-6 months, at the max.

The long story
==============

The measurement question
------------------------

The key motivation behind this proposal is that it will make libvirt a more
attractive project for new contributors. The quotes later illustrate real
world cases where our use of email workflow has hurt contributors. These are
just the cases we have become aware of from people who were motivated enough
to continue their effort despite the barriers faced. It is an unknown how many
more people faced similar barriers and just walked away without us knowing.

It has been suggested in the past that we run both workflows in parallel as a
way to test which is better. I don't believe such an approach will provide
valid data. Splitting attention between two different tools will be detrimental
to the effectiveness of both tools. It would be an unbalanced comparison
because there is no practical way to compare both tools with the same data set
unless we wish to review all patch series twice.  In addition the timeframes
needed for a meaningful comparison would be too long, spanning 1+ years to
gather statistically significant data on contribution.  If a merge request
workflow was a new, unproven concept, then it would none the less be worth
trying to do a comparison, however, difficult. This is not the case though,
merge request workflows have been around many years & proven viable. There is
nothing special about the libvirt project or developers skills that would
invalidate this track record.

Thus this mail will look at various aspects of the tools and consider how they
impact our work, with a particular focus on issues which negatively impact new
contributors, or which have a discriminatory effect on certain subsets of new
or existing contributors.

Contributors decline
--------------------

Since 2016 libvirt has seen a significant decline in the number of new authors
contributing patches to the project. This is illustrated in the 3rd slide from
this KVM Forum presentation:

  http://people.redhat.com/berrange/kvm-forum-2019/kvm-forum-2019-libvirt.pdf

For the data behind this graph, the commits to all GIT repos on libvirt.git
were analysed, taking the year recorded for the git changeset author date.
The email addresses were recorded each year, and a "new contributor" was
considered to be someone who had never made a commit in any previous calendar
year.

We can see the project grew quite fast until about 2011 and then levelled out
with a reasonably consistent number of long term vs new contributors. Since
2016, however, there has been a clear declining trend. The raw data shows for
the last two years, the non-core git repos in particular saw a big decline,
however, the same is also seen for libvirt.git itself.

There is likely no single factor responsible for this decline. A combination
of interest in containers instead of VMs, and maturity of the codebase likely
play a fairly large part. None the less, the decline of new contributors is
highly undesirable, whatever the reason, as there is still plenty of work that
needs doing in libvirt. Thus steps need to be taken to make libvirt more
attractive to new contributors. Lowering the barrier to entry for contributors
is one key aspect to this. 

New contributor mistakes
------------------------

Some of the mistakes new users are liable to make when dealing with patch
submissions

 * HTML mail instead of plain text
 * Mangled patches due to mail client breaking long lines
 * Incorrectly threaded patch series
 * Not labelling series with version numbers
 * Sending plain ‘diff’ output instead of a git patch
 * Not basing the patch on recent git master
 * Not subscribing to the list before posting
 * Corporate legal privacy / copyright signatures
 * Unintelligible mail quoting in replies (Outlook)

We've tried to steer people towards git-publish, as that has encoded many of
the conventions to prevent mistakes, but that is by no means a foolproof
solution. It still requires setup and a working mail server, as well as
following git best practice wrt branch usage.

New contributor real experiences
--------------------------------

Some quotes on this theme that I've collected from just the last three months.
I'm removing the names as this isn't about the individuals involved, rather
their first / early impressions & experience of libvirt:

 "I have sent the patches to the list. I had seen a while back
  that eventually merge requests might be fielded, so I figured
  it was worth a shot."

 "just not familiar with manual patch work, normally been used
  to gitlab, github. I installed the git-publish as per document,
  but not sure about the next step from this tool. And, I am not
  sure about how to connect my email with this tool either."

 "I'm happy to hear you are considering a switch to gitlab.
  I've never contributed to projects which are using email based
  submission"

 "Have you guys received an email from .... with the subject
  ..... I can't find it in the archives"

 "I don't know what has happened, but we seem to be missing
  this patch from the list.
  ...
  I have all the patches in my inbox from the cc, so have no idea
  what happened either. It was my first time using git-publish and
  it took a few tries to get anything done."

There are a variety of reasons behind these problems, but the effects are
clear. Contributing to libvirt via email is complex and error prone.

Mailman lost delivery
---------------------

For Mailman, we are entirely dependent on Red Hat IT support via a private
ticketing system. The reliability of Mailman leaves alot to be desired. For
the past 5 years or more there have been multiple periods when mail delivery
either stopped entirely, or randomly lost a subset of mails. When this happens
if often isn't obvious at first that we're missing mail, and once identified,
it has usually taken 2-3 days between opening a ticket and having someone
actively investigate and resolve it. In the worst case it was over 7 days with
broken mail delivery. This is not a good situation to be in, but we've
tolerated it as nothing was irretrievably lost, just delayed by days, and
every subscriber was equally affected.

No service is perfectly reliable and we have seen both GitLab and GitHub have
downtime periodically. A key difference though is how the problems are handled.
For the git forges, service downtime is a business critical problem, and so
gets the prioritized attention that it demands. Mailing list delivery for
libvirt is NOT considered a business critical problem for Red Hat, and so the
low priorization of fixing mailman problems reflects this.

MX server content mangling
--------------------------

More recently we've faced worse problems with the mailing list. The new
service acting as redhat.com border MX gateway is mangling mails in a way that
breaks DKIM signatures. Thus any time someone sends a message with a DKIM
signature, there are a large number of subscribers to the list whose mail
server will bounce the delivery from mailan due to DKIM validation failures.

We tried to fix this by removing the subject prefix modification and body text
footer in mailman, but that merely reduced the frequency of the problem by
a small amount. In theory it can be mitigated by having mailman replace the
"From" field with the list server's address. qemu-devel did this originally
before they switched to removing the subject prefix/body footer. The reason it
worked for QEMU was because only mailman was mangling their messages, whereas
for libvirt, both mailman and the redhat.com MX gateway are mangling. The old
version of mailman on redhat.com doesn't appear to support From header
rewriting.

Similar to DKIM problems, there are the same problems with ARC headers, though
we lack definitive proof on the cause. The effect is that when any SUSE
contributor sends a mail, deliveries to all other SUSE subscribers will be
bounced. We believe it affects some other domains, but SUSE is the most obvious
one.

Both these problems are pretty serious and discriminate against certain
subscribers depending on how their mail service configures their incoming
filters. Even redhat.com was rejecting messages with broken DKIM signatures
for a time, until enough people complained about missing valid emails. Many
other contributors are not lucky enough to have an IT staff who would respond
to such complaints.

MX server anti-spam blackholing
-------------------------------

A further problem has impacted mails sent by one contributor, which have been
silently dropped by redhat.com MX servers, with no bounce at all. After two
months we tracked this down to a bogus hostname in the From field of the
messages which caused redhat.com to consider them a spammer spoofing the
sender address & so blackhole it. This is actually a case where I probably
agree with the redhat.com MX server policy, but it is none the less incredibly
unhelpful for the unlucky contributor(s) affected by it, as it is hard for
them to discover what they did wrong. We were lucky this contributor noticed
they had a problem and was very persistent over months in talking to us about
it to rather than walking away, as I expect most people would.

Loosing members through attrition
---------------------------------

Even without any of the above problems, the mailing list workflow still has
the problem of being yet another account for people to keep track of. Every
week I have to purge multiple subscribers from libvirt lists due to repeated
bounced delivery attempts. Most commonly this is when their MX server starts
rejecting messages claiming the address no longer exists (due to leaving a
job).

IOW, when someone has changed their job, and thus changed their email address,
they have to re-register with any projects they are using. In some cases this
is ok, as they may no longer care about the project after switching jobs, but
it means we lack any channel to attempt to keep their interest. If another
contributor tries to contact them via their old email address, they'll just
get bounces too. With Git Forges, users have a login identity that is distinct
from their email address, and thus they one need update one place when
changing jobs, and interactions with other users are via this persistent
identifier.

Mail diagnosis / resolution
---------------------------

The difficulty in all of these cases is that the project contributors who are
affected have very limited means to either diagnose or resolve the problem.
They are dependent on raising tickets with their own IT admin staff, who might
then have to raise tickets with support from a company they outsourced to, and
or talk to libvirt mailman admins. The libvirt mailman admins are similarly
dependent on raising tickets with Red Hat IT and an external company. Depending
on the level of outsourcing, the problem can span across IT support staff from
4/5 different companies. This has a predictably bad effect on the chances of
resolving problems.

Mail lack of recovery
---------------------

Many of the problems result in permanent loss of messages to the subscriber
affected. Even if the problem was eventually solved, there is often no
straightforward way to get back the messages that were lost in the mean time.
One the mail server has a hard bounce for that person, it will stop retrying
delivery. The only option is to visit the mailman archive page and download
the mbox and re-import messages.

This is a result of email being a push based communication mechanism, with
only a weak delivery assurance. If the push fails and isn't retried, the
information is gone for good.

Contrast with a Git Forge, where the primary communication mechanism is a pull
based model. If there is a transient problem with the contributors network
connectivity, or the hosted service has an outage, the user will be unable to
pull information for a period of time. Once the problem is resolved, the next
attempt to pull information will include everything, with no loss from the
end user's POV.

Mail list providers
-------------------

Moving the libvirt mailing lists to another provider is not a magic bullet
solution. qemu-devel is hosted on nongnu.org and has suffered bounces due to
DKIM breakage, though at least their more modern mailman let them workaround
it better. nongnu.org though has had similar problems with mail delivery
getting stopped for periods of time, not as bad as libvirt suffered, but still
dependant on waiting until the US wakes up to get anyone to investigate. I
know other projects hosting lists on groups.io, which have also had serious
problems of late with mails getting silently discarded, or mangled such that
'git am' fails. There are other providers

The impact on contributors
--------------------------

All this conspires to negatively impact the ability of people to participate
in the libvirt project. If you can't reliably send & receive patches to
libvir-list you are effectively excluded from the project, given that email is
the core of our workflow. It hasn't had a huge visible impact on libvirt,
primarily because the Red Hat contributions have escaped the worst effects.
This "luck" doesn't make me at all happy, because it is important that anyone
involved the project has an equal ability to contribute, regardless of whether
they are Red Hat people or not. This is not the case

There are many conceptual reasons to like a distributed messaging system based
on open standards. The reality of the implementation of email though is quite
different from the theory. The distributed nature of email, combined with the
plethora of implementations which don't agree on what the minimum acceptable
standards are, is the direct cause of the problems we have with reliable
operation of the libvirt list, and mail in general. The problems with email
don't seem to be getting better, if anything they are worse, as the broader
world increasingly shifts away from email, sadly often to closed protocols,
or closed applications.

Overall though, I struggle the with idea of continuing to recommend email as
the best choice for collaboration since it is clearly failing & excluding or
harming some of our contributors.

Mail usage for new contributors
-------------------------------

For those of us who have been involved in libvirt for any length of time, the
usage of email feels quite straightforward and natural. Our familiarity and
experience can be quite different from that of new contribtors to the project.
The usage of email as a development workflow relies on a large number of,
often undocumented, conventions. If a contributor gets the conventions wrong
they'll have poor first experience with the project.

In the early days of libvirt those new contributors with existing open source
experiance would already be familiar with email conventions, but we had plenty
of contributors from the corporate world for which open source was a completely
new experiance. Over the years we've sent nasty-gram responses to plenty of
contributors who missed the conventions, though thankfully we're getting more
polite and friendly these days in how we point out these kind of mistakes. It
would be better to use a system that doesn't need this to the same degree.

These days open source is a well known development model even among corporate
developers from the closed source world, but their OSS knowledge is almost
always based on a GitForge workflow, not an email workflow. 

Maintainer patch submission overload
------------------------------------

Any long term contributors to libvirt will have become familiar with the mail
based development workflow, however, long term usage and familiarity does not
imply that the solution used is the optional one, rather that the chosen
solution has been optimized.

With the volume of patches submitted to libvirt there is always pressure on
reviewer bandwidth. The imbalance between incoming patches and reviewer time
is common across all major open source projects, regardless of development
workflow and tools. What matters is thus how well the project deals with the
overload, to priortize the right aspects.

The efficiency with which developers can perform reviews is one factor and is
typically the area where email is said to be most beneficial. I personally find
email is efficient for writing comments inline to patches, but that is largely
due to the use of a text based user interface (mutt), rather than the inherent
use of email as a messaging platform. There are other aspects to the use of
email which harm our productivity as reviewers.

Personal patch tracking
-----------------------

To review a patch you first have to realize and/or remember that it exists,
and this is an area where email is quite poor. It is easy for patches to go
unattended for a prolonged period of time, drowned out in the volume of
patches on the list. We encourage users to send a "ping" if they think their
patch has been ignored for too long. This is a poor workaround for our
inability to keep track of what patches are pending. New contributors won't
even know they're expected to send a "ping", or may wonder how long they
should wait.

Essentially the mail folder needs to be thought of as a mixture of signal and
noise, where 'signal' refers to patches which need reviewer attention, and
'noise' refers to patches which are already processed in some way. The most
recent mails in an folder have a high signal component, turning increasingly
to noise as they age. The key task for email processing is thus to distinguish
signal from noise, such that we never miss any valid messages.

The supposed positive of using email is that contributors have the flexibility
to customize their way of working to enable them to keep track of pending work
in whatever way suits them best. People come up with clever mail filtering
practices / rules / scripts to organize their incoming mail stream. Instead of
considering this a benefit, we should be asking why our tools are so ill-suited
to the task that every contributor has to spend (waste) time reinventing the
wheel to distinguish signal from noise when processing email, duplicating
that same work across all contributors ?

This has a particularly high burden on contributors who don't engage with the
project on a daily basis, as they get a firehose of emails, many of which are
already obsolete by the time they look at their inbox. IOW, in-frequent
contributors experiance high noise, low signal, when interacting with the
libvirt mailing list.

At first glance merge requests may feel like they are no better, since there
would be the same number of patches arriving each day. The key difference is
that there is good structure and metadata to the merge requests. Multiple
versions of the same series, appear as a single entity. Merge requests which
have been merged or abandoned no longer appear. Active merge requests can be
tagged with labels & users to organize them. Thus looking at the list of
merge requests gives you information with a high signal content, and little
noise content. This is good for both regular contributors and infrequent or
new contributors.

Patch tracking tools
--------------------

With every contributor organizing their own personal incoming stream, there is
no visibility / interaction between contributors on pending work. There is no
way of knowing whether any given patch series is on the to do list for another
person to deal with. There have been tools created which attempt to fill the
gap by providing a centralized view of incoming patches. To name just a few,
there are patchwork, patches, patchew, and patchecker. We have tried many of
the tools to some degree or another with libvirt, but none have had success.
Some are also in use in QEMU but see periodic breakage in consuming the mail
stream for a variety of reasons.

First and foremost, they suffer from the difficulty of turning free format
mail threads into well structured data. This can only be done reliably if all
contributors follow a set of well defined conventions for sending patches,
which immediately puts a burden on contributors, and is a barrier to new
contributors. These tools typically assume that all patches can apply against
git master and if this isn't possible  drop them with an error. Contributors
rarely intentionally send patches against old versions of the code, but with
the rate of commits, git master can still conflict with a newly posted patch
series, requiring human intervention. Some support special magic comments in
the mail to indicate what git version the patch is based on, but this is more
secret sauce for contributors to learn.

The problem of patch conflict resolution also creates challenges for the tools
to determine when a series has been merged & can thus be considered complete.

Then there is the matter of correlating different versions of the same
patch series. This again largely relies on contributors following a convention
of using the same subject line for the leading patch / cover letter each time
it is posted. Some people post URLs pointing to previous versions of the same
series, but this is yet more manual work to work around the limits of email
when used for patch submission. Keeping a fixed subject does not always make
sense. This again makes it hard for the tools to determine whether a patch
series has been merged or obsoleted.

So overall, while it is possible to write tools to turn incoming patch series
emails into structured data, their reliability is poor, which in turn makes
it difficult to consider relying on them. These same problems are why it is
difficult to have a bi-directional email gateway between a mailing list and a
merge request system. Sending mails from a merge request is easy. Parsing
contextual replies and getting them back into the merge request is incredibly
hard, often impossible.

Attaching CI to submissions
---------------------------

Without an ability to reliably process incoming patches in an automated manner,
it is impractical to automate downstream tasks. It is generally considered
best practice to run tests on all code before it is accepted for merge. Libvirt
CI testing is currently all done post-merge. Contributors are generally
expected to make sure code compiles and passes tests before posting it. The
testing done this way rarely has the same breadth of coverage as that done by
post-merge CI and may be done against an outdated git master. Reviewers may
or may not take the time to actually apply the patches and run builds and
tests. Where this is done, the coverage is usually just one or two platforms,
a small subset of post-merge CI.

The result of this is the frequent build & test breakage of git master, which
has to be addressed through "brown paper bag" commits, mostly pushed without
review under a "build breaker" rule. This is a prudent way to fix the breakage,
however, it would be better if the breakage was prevented in the first place.
If automation can catch the build problems before merge, there ceases to be
any need for having special exceptions to allow pushing code without review.
The result will be a higher quality output overall, with less time wasted on
manually pointing out silly mistakes, or fixing them up after the fact.

We know there is insufficient reviewer time to handle all the incoming patches
submitted, and thus we should not expect them to spend time on tasks that can
be automated. Reviewers should never be building code or running tests in
general, or spending time pointing out problems in the code that can be caught
by the tests. The tools should do all testing the moment the code is submitted,
with reviewers only spending time once CI has passed & the submitted had a
chance to re-upload any obvious fixes. The patchew tool attempts to do this
kind of automation, but as noted above it suffers from the hard problem of
reliably parsing email, determining what git master patches apply to, and needs
secret comment syntax to be used by contributors for cases where it can't do
the right thing.

Barriers to new contributors
----------------------------

The only real solution to contributor overload is to spread the work across
more people. This in turn implies a need to attract new contributors to the
project. There need to be interesting problems to attract their attention.

The task is then need to turn them into regular contributors, and this is
where first impressions of a project are important. Simply put, the first
interaction needs to be a positive experience. This might be via a bug tracker
or via a code submission, or something else. Anything which may hurt first
impressions needs to be eliminate or mitigated as much as practical.

A need to learn unusual workflows, or configure new tools, or learn about
project specific (unwritteN) conventions, are all things which stand in the
way of succesful participation.

The key value of the merge request workflow is the standardization across
projects, such that it has immediate familiarity enabling people to
contribution without having to read docs. This is a bit of an over
simplication, since there is always some project specific knowledge required.
Ideally this knowledge will be primarily around the code and not on the
development process.

I've increasingly recognised this in my own interactions with projects. Since
the widespread adoption of Git, I find it a significant turn off to find a
project is still using CVS, Subversion, or any other non-Git SCM. Increasingly
I feel the same way about development workflow and the git forge. If I can
fork a project on GitLab/GitHub, and open a pull request, I'm more likely to
send a patch upstream, than if I have to register for a new account on a
custom bug tracker or mailing list. On the flip side, while one of my own
personal projects still has a mailing list, it is more enjoyable dealing with
contributors if they open a pull request, as I see the CI results and know the
patch can be applied with 1 click if desired. The more I use the tools, the
more effective they become for me.

Keeping new contributors interested
-----------------------------------

The response to a new contributor is also important. The earlier someone
engages with them, the better. With this in mind it is useful to be able to
quickly identify new contributor and prioritize a response to them over other
work. Automation (CI) can help in this respect by picking up & reporting
problems without needing human attention.

None the less, triage of incoming requests will be needed, primarily though
tagging to classify incoming requests, and nominating certain people as
reviewers. The power of the merge request model is that such triage is
centralized & thus visible to participates, as opposed to being repeated by
every individual when using email. People can see this triage and more
effectively direct their attention where it is needed.

The Web UI problem
------------------

The use of a web UI is often seen as the biggest downside to using a Git Forge,
especially amongst people who are used to an email based code review workflow.
This is a position I have great sympathy with, as I do largely prefer text
based tools with strong keyboard navigation, over graphical tools with strong
mouse navigation. The downsides of the web UI for merge requests becomes more
apparent the larger the change is, especially if split across multiple
patches.

While common, this viewpoint on the merits of text UI over graphical UI is not
universally held. The popularity of Git Forges amongst the open souce world
demonstrates that many developers have no trouble with the web UI, or at least
are willing to accept it in order to benefit from the other features offered.

My preference for a text based UI has lead me to spend time investigating the
API support for GitLab and from there developing a custom application for
interacting with GitLab merge requests.

    https://gitlab.com/bichon-project/bichon

The goal of the tool is to provide a user interface for merge requests that
is familiar to users of the mutt email client. It is not especially advanced
in its impl yet, and certainly has rough edges, but for me it has proved the
general point that we can build useful tools to integrate with GitLab, which
can mitigate the worst aspects of a Web UI.

The need to build such a tool, however, has a cost as it diverts time away
from working on libvirt itself, which negates some of the benefits of adopting
GitLab in the first place. It is hoped, however, that this is a relatively
short term cost that could ultimately be amoritized across multiple developers
who use GitLab for their projects, while the benefits are repaid continuously
over the long term.

The review comment model
------------------------

Those familiar with a email based workflow using clients like mutt will often
cherish the deeply threaded discussion model.  It is worth bearing in mind
that not all email clients will follow this, with many turning the threaded
discussion into a flat discussion.

The GitLab review comment system certainly doesn't offer the kind of threading
that mutt does, but it is a step above what many email clients do. Against any
single patch, it is possible to one or more standalone comments, and one or
more discussion threads, attached to specific lines of code. The distinction
between the two is whether replies are permitted. The threads are, however,
flat, not deeply nested.

There are two interesting and perhaps useful things to be aware of. First each
discussion thread on a patch has a notion of whether it has been "resolved"
or not. It is possible to (optionally) have threads auto-resolve when a new
version of the patch series is posted to the merge request, or they can be
resolved manually by a contributor. This can be valuable in keeping track of
whether feedback on v1 was addressed in changes for v2, which is a challenge
with email review workflows.

The second interesting thing is that threads can be "rebased". This is both
useful and annoying at the same time. What appears to happen behind the scenes
is that the (commit hash, file, line num) that is associated with a comment
thread can be auto-updated to point to the latest commit that contains the
matching (file, line num). This is useful when a v2 is posted, as the comments
from v1 are moved to v2, helping identify if the problem was addressed. It is
annoying when dealing with patch series though, as it appears that a comment
created on patch 3, might be reported against patch 15 when later queried, if
patch 15 has context covering this same source file location. This makes some
sense when considering the web UI, because by default it will show the combined
diff of all patches in a series. It is less helpful for my bichon tool which
is trying to present a per-patch view. In theory it should be possible to
follow the diff context backwards to display the comment against every single
patch which has context for the same line of code. This would in fact make it
quite useful, but until that is done it is quite annoying.

More generally, comments have nice integration with the rest of the toolset
making it easy to refer to issue trackers, merge requests via magic markdown
extensions, and to tag individual contributors to get their opinion on a
question. 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|