[EnMasse] Proposal: fewer repositories

Ulf Lilleengen ulilleen at redhat.com
Fri Jun 30 10:21:48 UTC 2017


Hi,

I've made a prototype of a single repo where I've merged all 
repositories into one:

https://github.com/EnMasseProject/enmasse/tree/travis-builtibuild-test

To build and unit test all components (and bundle artifacts):

./gradlew build pack

To build docker images:

./gradlew buildImage

I have also defined 'pushImage', 'tagVersion' and 'pushVersion' for 
pushing the commit image, tagging a version of an image, and pushing a 
snapshot to docker hub. A 'systemtests' target for running the 
systemtests could be added as well. Having a single consistent way of 
building everything simplifies development IMO.

If you look at 
https://travis-ci.org/EnMasseProject/enmasse/builds/248721272, the whole 
build of EnMasse (without pushing docker images and running systemtests) 
takes ~10 minutes, basically just running

./gradlew build pack buildImage -i

In this PoC I used gradle, but that could just as easily have been make 
or any build system that supports building projects in different 
languages and does parallel builds well.

The reason I used gradle is mainly because most components were already 
using it and it has a DSL that allows us to move a lot of the 
functionality in travis shell scripts into the build system (allowing us 
to run on other build infrastructure with less duplication). It supports 
doing generic scripty things as easily as make+shell, but also doing 
language-specific things like interacting with a maven repo (with make 
we'd have to invoke mvn or gradle).
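
To illustrate how little shell is left for the CI system once the logic 
lives in the build, here is a minimal sketch of a CI entry point. It is 
not taken from the prototype: the function name ci_tasks, the 
ENMASSE_CI_RUN switch and the rule "push images only on non-PR builds" 
are assumptions for illustration. TRAVIS_PULL_REQUEST is the variable 
Travis CI sets to "false" on non-PR builds; the task names are the ones 
listed above.

```shell
#!/bin/sh
# Hypothetical CI entry point; the only CI-specific input is one
# environment variable, everything else is delegated to the build.

# Print the Gradle tasks to run. Pushing images only for non-PR builds
# is an assumed policy, not something the prototype is known to do.
ci_tasks() {
    tasks="build pack buildImage"
    # Unset (e.g. a local run) is treated like a PR build: no push.
    if [ "${TRAVIS_PULL_REQUEST:-true}" = "false" ]; then
        tasks="$tasks pushImage"
    fi
    echo "$tasks"
}

# ENMASSE_CI_RUN is a made-up switch: print the command by default,
# execute it only when explicitly asked to.
if [ "${ENMASSE_CI_RUN:-0}" = "1" ]; then
    ./gradlew $(ci_tasks) -i
else
    echo "./gradlew $(ci_tasks) -i"
fi
```

Moving to other build infrastructure would then mostly mean replacing 
the TRAVIS_PULL_REQUEST check with that system's equivalent.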


Ulf


On 29. juni 2017 09:26, Ulf Lilleengen wrote:
> Hi all,
> 
> Today, the components that we release in EnMasse are spread across 12 
> github repositories. There are a couple of advantages to this approach:
> 
>     * We may release them independently. This is not something we do today
>     * Each component may be built independently by travis
>     * Makes it easy to do incremental builds and only build the 
> component that was changed
>     * Developers don't have to be 'disturbed' by other components and 
> merge conflicts
> 
> There are, however, a few disadvantages:
> 
>     * Duplication of build configuration across components. Right now we 
> have a ~50 line .travis.yml in each repository that essentially does the 
> same thing. Whenever we need to change something (e.g. push artifacts to 
> bintray, change credentials, or change other configuration), we have to 
> update all 12 repositories.
> 
>     * Doing integration testing between components. This is 'solved', 
> but there is a set of fragile scripts maintained to get it working.
> 
>     * Different build systems for different components. This is kind of 
> a feature of travis which assumes each repo contains only code in 1 
> programming language, but it has resulted in 3 different ways to build 
> components (make, gradle, maven).
> 
>     * Keeping track of all repositories 'in use' in release scripts. To 
> release, we need to tag all 12 git repositories. The list of 
> repositories needs to be maintained somewhere and is currently hardcoded 
> in the scripts.
> 
>     * It is confusing for new developers where to find the source code, 
> which repo to look at, where to file issues etc.
> 
>     * It is sometimes confusing for us working on EnMasse already to 
> file issues for the correct component, and this again makes it harder to 
> keep track of what development work is being done.
> 
> The current repositories we have (that are released together) are:
> 
>     * enmasse            - openshift/k8s templates + documentation
>     * admin              - address-controller + configserv + 
> queue-scheduler + common lib
>     * ragent             - router agent
>     * subserv            - subscription service
>     * artemis-image      - artemis docker file + plugins + shutdown hooks
>     * router-image       - router docker file + configuration
>     * router-metrics     - router metrics collector + docker file
>     * routilities        - console
>     * topic-forwarder    - forwarding of messages between brokers in a 
> cluster
>     * mqtt-gateway       - MQTT gateway
>     * mqtt-lwt           - MQTT last will and testament service
>     * (amqp-kafka-bridge - AMQP-Kafka bridge) - ignored for the rest of 
> this post
> 
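The release-script point above can be made concrete with a small 
dry-run sketch. The repository names are the 12 listed above 
(amqp-kafka-bridge included, to match the count); the function names, 
the tag message and the example version are hypothetical, and the 
script only prints the git commands rather than running them.

```shell
#!/bin/sh
# Hardcoded list, as the current release scripts are described to have.
REPOS="enmasse admin ragent subserv artemis-image router-image \
router-metrics routilities topic-forwarder mqtt-gateway mqtt-lwt \
amqp-kafka-bridge"

# Print the commands a multi-repo release needs: one tag and one push
# per repository, assuming each is checked out in a directory of the
# same name.
multi_repo_release() {
    version="$1"
    for repo in $REPOS; do
        echo "git -C $repo tag -a $version -m 'Release $version'"
        echo "git -C $repo push origin $version"
    done
}

# With a single repository the same release is one tag and one push.
single_repo_release() {
    version="$1"
    echo "git tag -a $version -m 'Release $version'"
    echo "git push origin $version"
}

# Example version number, made up for illustration.
multi_repo_release 0.0.0-example
single_repo_release 0.0.0-example
```

The list in REPOS is the part that has to be kept in sync by hand; it 
disappears entirely in the single-repo case.
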
> From reading the above you can probably feel my desire to merge these 
> repositories into fewer. I'm proposing a few alternatives where the 
> repositories are named without regard to the CI system or 
> programming language used.
> 
> # By deployment
> 
>     * enmasse - templates and documentation
>     * admin   - admin, routilities, router agent
>     * router  - router-metrics, router-image
>     * broker  - artemis-image, topic-forwarder, subserv
>     * mqtt    - mqtt-gateway, mqtt-lwt
> 
> The repositories in this list each contain components that are deployed 
> together. Building and testing changes to code in each of these 
> repositories makes sense I think, and changes to each of them _should_ 
> have minimal impact on the other components. On the other hand, the way 
> we deploy components can change over time, so we might have to move 
> components around in the future, which I don't like.
> 
> # By address space types
> 
>     * enmasse            - templates, documentation
>     * enmasse-common     - address-controller, configserv, (console?)
>     * enmasse-standard   - router-image, queue-scheduler, artemis-image, 
> routilities, ragent, router-metrics, topic-forwarder, subserv, 
> mqtt-gateway, mqtt-lwt
> 
> 
> This structure groups the repositories into pieces that match the 
> address space types. The argument is that we could release the 
> components of each address space type individually. I would, however, 
> argue that the real requirement is not releasing them independently, but 
> having the enmasse-common components work with multiple versions of 
> enmasse-standard. I don't think we need to release them independently to 
> guarantee that.
> 
> # Single repo
> 
>      * enmasse
> 
> This is my personal favorite. Today's repos would be modules within the 
> repo. If we find that some of these modules need to be released 
> independently, we can move them into separate repos at that point. Some 
> work is required for travis to work, but it should be doable. I also think 
> that using a common build system that supports incremental builds would 
> be valuable here.
> 
> There are 2 common issues with this approach: build times and merge 
> conflicts. Build times can be addressed by incremental builds. Merge 
> conflicts can be avoided by having a proper structure of the components 
> within the repo. It also warrants strong guidelines on having clear 
> boundaries between components and a review process that ensures that the 
> 'right thing' is done.
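
One coarse way to keep build times down in a single repo is to build 
only the modules a change touches. The sketch below is a CI-level 
filter, not the build system's own incremental (up-to-date) checking. 
It assumes each module lives in a top-level directory whose name is 
also its Gradle project name; the function names and the example paths 
are hypothetical.

```shell
#!/bin/sh
# Read changed file paths on stdin (one per line) and print the distinct
# top-level directories. cut -s drops paths without a "/", i.e. files in
# the repository root.
changed_modules() {
    cut -s -d/ -f1 | sort -u
}

# Turn the module names into Gradle task paths such as ":ragent:build",
# joined on one line so they can be passed to ./gradlew.
build_tasks() {
    changed_modules | sed 's/.*/:&:build/' | paste -s -d ' ' -
}

# Example input; in CI the paths might instead come from
# `git diff --name-only` against the target branch.
printf '%s\n' ragent/lib/a.js ragent/test/b.js mqtt-lwt/pom.xml README.md \
    | build_tasks
```

A filter like this ignores dependencies between modules (for example a 
change to a common library), so it would need to be combined with a 
full build before merging or releasing.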
> 
> All in all I think having a simple way to build and test the whole 
> project is a key feature here.
> 

-- 
Ulf



