[EXTERNAL] [edk2-devel] [edk2-rfc] GitHub Pull Request based Code Review Process

Mon May 25 04:09:04 UTC 2020

> On May 21, 2020, at 10:48 PM, Bret Barkelew <Bret.Barkelew at microsoft.com> wrote:
> 
> “But we do have tools that insert the Bugzilla number in all the commits of the series, assist with the squash, etc.”
>  
> Are these internal-only, or are they something we could evaluate when we move to the PR process? If not, are they based on anything we could leverage?
>  
> I believe that the plan is to stick with Bugzilla for the immediate future rather than use GitHub Issues (which is probably for the best for now, given the span across repos and access levels), so any tooling to tie that together would be interesting to evaluate.
>  

Bret,

Sorry for being confusing, and lazy.....

The lazy part is in house we have a bug tracking tool called radar, so I just replaced radar with BZ to make a general point. 

So the scripts I mentioned are useless for us as they 100% rely on massive amounts of internal framework, and are hard coded to our current process. 

Probably better to tell the story... So we revamped our internal process and that was lead by the folks who make the OS (so kind of like our edk2 process derived from the Linux kernel). So building  the OS made sense, but developers got stuck doing a bunch of manual work. The response was to get a group of smart folks together and  write good documentation, and build tools to automate common task.  That seems like a good plane for TianoCore too?

So I finally tracked down on our internal git mailing and figured out the mail reflector I needed to follow the basic edk2 rules for patches. To me it seems like we could try and automate thing more. I also found I had to Bing/Google to find the detailed instructions I needed as a developer, as the Wiki seems to assume you just know the Linux kernel patch process. That feels like an area we can improve. 

So this makes a couple of ideas pop into my head:
1) It would be good for folks that are not conversant in the  Linux mailing list process to give feedback on all the Wikis. 
2) Can we make a mail reflector that makes it easier to contribute?  The hardest thing for me was tracking down my internal git work group that had setup for how to configure a mail server. Is there a way to help others with that? 
3) We want to make sure any new process makes it as easy as possible to contribute. 

I'm reminded of the epiphany I got reading Code Complete the 1st time. The data shows the most important thing is consistency. So I'd say our reluctance to change in rooted in the science of computer science but progresses is always our goal.  

Thanks,

Andrew Fish

> <Tangent>
> In Mu we have a similar problem of keeping track of what features/bugs have already been upstreamed and when can they be dropped during an upstream integration, so that’s the more personal interest I have in such automation.
>  
> Thanks!
>  
> - Bret
>  
> From: Andrew Fish <mailto:afish at apple.com>
> Sent: Thursday, May 21, 2020 8:00 PM
> To: Laszlo Ersek <mailto:lersek at redhat.com>
> Cc: devel at edk2.groups.io <mailto:devel at edk2.groups.io>; spbrogan at outlook.com <mailto:spbrogan at outlook.com>; rfc at edk2.groups.io <mailto:rfc at edk2.groups.io>; Desimone, Nathaniel L <mailto:nathaniel.l.desimone at intel.com>; Bret Barkelew <mailto:Bret.Barkelew at microsoft.com>; Kinney, Michael D <mailto:michael.d.kinney at intel.com>; Leif Lindholm (Nuvia address) <mailto:leif at nuviainc.com>
> Subject: [EXTERNAL] Re: [edk2-devel] [edk2-rfc] GitHub Pull Request based Code Review Process
>  
> 
> 
> > On May 21, 2020, at 6:30 AM, Laszlo Ersek <lersek at redhat.com> wrote:
> > 
> > On 05/20/20 00:25, Sean wrote:
> >> On 5/19/2020 1:41 PM, Laszlo Ersek wrote:
> > 
> >>> Your proposal to "don't exclude squash merge workflows" is a trap. If
> >>> we tolerate that option -- which is obviously the sloppy, and hence
> >>> more convenient, option for some maintainers and some contributors,
> >>> to the detriment of the git history --, then almost every core
> >>> maintainer will use it as frequently as they can. In the long term,
> >>> that will hurt many consumers of the core code. It will limit the
> >>> ability of people not regularly dealing with a particular core module
> >>> to file a fine-grained bug report for that module, maybe even propose
> >>> a fix. From the regression analyst's side, if the bug report starts
> >>> with "I have a bisection log", that's already a good day. And your
> >>> proposal would destroy that option, because maintainers and people in
> >>> general are irrepairably lazy and undisciplined. We cannot post a
> >>> community member shoulder-by-shoulder with every core package
> >>> reviewer/maintainer to prevent the latter from approving a
> >>> squash-on-merge, out of pure laziness. I'm 100% sure the "option" to
> >>> squash-on-merge would *immediately* be abused for a lot more than
> >>> just "typo fixes". There isn't enough manpower to watch the watchers,
> >>> so "no squash-on-merge" needs to be a general rule.
> >> 
> >> 
> >> I have trouble with this line of thinking. The maintainers are and
> >> should be considered the representatives of this code base.   They
> >> have a vested interest to enable this repository to work for them.  If
> >> they really are viewed as "sloppy" or "lazy" then we are destined to
> >> fail anyway.
> > 
> > You put it very well. "They have a vested interest to enable this
> > repository to work for them." Key part being "*for them*".
> > 
> > Core maintainers are responsible for making this repository work for a
> > lot larger camp than just themselves. Even if squash-on-merge satisfied
> > the requirements that core maintainers presented, squash-on-merge would
> > still hurt the larger community that depends on those packages.
> > 
> > The core-consumer community may not necessarily participate in the
> > day-to-day maintenance of the core packages, but they do report bugs and
> > even contributes bugfixes / occasional features, when their particular
> > use cases require those actions.
> > 
> > And squash-on-merge hurts those activities, down the road, because the
> > git history is instrumental to analyzing and learning the code base.
> > 
> > For example, the question "why do we call this function here?"
> > immediately leads to running "git blame" (possibly a series of git-blame
> > commands, to navigate past code movements and such). In the end
> > git-blame leads to a particular commit, and that commit is supposed to
> > answer the question. If the commit is huge (e.g. a squash of an entire
> > feature), then the question is not answered, and git-blame has been
> > rendered useless.
> > 
> > 
> >> Nothing in my statement of "don't exclude squash merge workflow"
> >> requested that we allow a PR to be squashed into a single commit that
> >> you believe should be a patch series.
> > 
> > If the button is there, maintainers will click it even in cases when
> > they shouldn't, and I won't be able to catch them. The result will not
> > necessarily hurt the maintainer (not at once, anyway), but it will harm
> > others that investigate the git history afterwards -- possibly years
> > later.
> > 
> > I can't watch all CoreFoobarPkg pull requests on github to prevent this.
> > On the other hand, I can, and do, monitor the edk2-devel list for
> > seriously mis-organized patch sets, especially for core packages where
> > I've formed an "I had better watch out for this core package"
> > impression.
> > 
> > I have made requests under core patch sets where I was mostly unfamiliar
> > with the technical subject *for the time being*, asking just for
> > improvements to the granularity of the series. Knowing the improved
> > granularity might very well help me *in the future*.
> > 
> > 
> > The mailing list equivalent of "squash-on-merge" would be the following:
> > 
> > - contributor posts v1 with patches 1/5 .. 5/5 (for example),
> > 
> > - reviewer requests updates A, B, and C,
> > 
> > - contributor posts (in response to the v1 blurb, i.e. 0/5) further
> >  patches 6/8, 7/8, 8/8
> > 
> > - reviewer checks the new patches and approves them, functionally,
> > 
> > - maintainer says "OK let me merge this",
> > 
> > - maintainer applies the patches (all 8 of them) from the list, on a
> >  local branch,
> > 
> > - maintainer runs a git rebase squashing the whole thing into a single
> >  patch,
> > 
> > - maintainer does *not* review the result,
> > 
> > - maintainer opens a PR with the resultant single patch,
> > 
> > - CI passes,
> > 
> > - the patch is merged.
> > 
> > 
> > With the list-based process, the disaster in the last step is mitigated
> > in at least three spots:
> > 
> > - All subscribers have a reasonably good chance to notice and intervene
> >  when the incremental fixups 6/8, 7/8, 8/8 are posted as followups to
> >  the v1 blurb, clearly with an intent to squash.
> > 
> > - Because the maintainer has to do *extra work* for the squashing, the
> >  natural laziness of the maintainer works *against* the disaster. Thus
> >  he or she will likely not perform the local rebase & squash. Instead
> >  they will ask the contributor to perform a *fine-grained* squash (i.e.
> >  squash each fixup into the one original patch where the fixup
> >  belongs), and to submit a v2 series.
> > 
> > - If someone interested in the git history catches (after the fact) that
> >  the maintainer merged a significantly different patch set from what
> >  had been posted and reviewed, they can raise a stern complaint on the
> >  list, and next time the maintainer will now better.
> > 
> > (This is not a theoretical option; I low-key follow both the list
> > traffic and the new commits in the git history (whenever I pull). In the
> > past I had reported several patch application violations (mismanaged
> > feedback tags, intrusive updates post-review, etc). Nowadays it's gotten
> > quite OK, thankfully, and I'm terrified of losing those improvements.)
> > 
> > 
> > If we just plaster a huge squash-on-merge button or checkbox over the
> > web UI, it *will* be abused (maintainer laziness will work *towards* the
> > disaster), with only a microscopic chance for me to prevent the abuse.
> > 
> > It's not that "I believe" that this or that *particular* series should
> > not be squashed. "Not squashing" is not the exception but the rule. The
> > *default* approach is that the submitter incorporates incremental fixes
> > into the series at the right stages, they maintain a proper series
> > structure over the iterations, and they propose revised versions of the
> > series in full. Squashing is the exception; for example one reason is,
> > "if you separate these changes from each other, then the tree will not
> > build in the middle; they belong together, please squash them, and
> > resubmit for review".
> > 
> > 
> >> I do think those rules will need to be defined but that is needed
> >> today anyway.
> > 
> > Rules are only as good as their enforcement is.
> > 
> 
> In my work world we require code review by a manager and that is the de facto enforcement mechanism. Basically there is always an owner to make sure the process was followed :)
> 
> Also in our world the squash is a developer choice. But we do have tools that insert the Bugzilla number in all the commits of the series, assist with the squash, etc. 
> 
> > The question is not how nice it is to use squash-on-merge in the
> > minuscule set of situations when it might be justified; the question is
> > how difficult it would be to  prevent the inevitable abuses.
> > 
> > The list lets me advocate for proper git history hygiene reasonably
> > efficiently (although I still miss a bunch of warts due to lack of
> > capacity). With the squash-on-merge button or checkbox, the flood gates
> > would fly open. I won't stand for that (not as a steward anyway).
> > 
> > I think our world views differ fundamentally. I value the git history
> > *way* above my own comfort, and everyone else's (accounting for both
> > contributor and day-to-day maintainer roles). I guess you prefer the
> > reciprocal of that ratio.
> > 
> 
> I'd also point out that the processes you chose kind of defines your quanta of work. It is likely you would be willing to tackle a really big change as a large patch set, that you would likely break up into multiple PRs in a squash on commit world. In a squash on commit world you also might break a Bugzilla (BZ) up into dependent BZs, a tree of BZs. That might sound crazy, but when you work on a bigger project and there are BZs for EFI, T2, macOS, the Installer, and the RecoveryOS for a customer visible feature this tree of BZ might be familiar and make sense  to you. 
> 
> But I think the real argument for consistency is we have a rich git history that has value. We have made resource tradeoffs to have that rich git history so to me it makes the most sense, for these project, to try to preserve our past investment in git history. 
> 
> Thanks,
> 
> Andrew Fish
> 
> > Thanks,
> > Laszlo
> > 
> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#60204): https://edk2.groups.io/g/devel/message/60204
Mute This Topic: https://groups.io/mt/74450665/1813853
Group Owner: devel+owner at edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [edk2-devel-archive at redhat.com]
-=-=-=-=-=-=-=-=-=-=-=-

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/edk2-devel-archive/attachments/20200524/6ba3e81a/attachment.htm>