[Freeipa-devel] [RANT] pytest fixture scope is braindead

Mon Dec 8 11:27:27 UTC 2014

On 12/05/2014 11:51 PM, Nathaniel McCallum wrote:
> So I've been working this week on porting the OTP tests that were
> manually coded to the new pytest framework. Now, I'm the first to want
> better tooling to make our job easier. But, pytest? Meh. I'm having to
> write just as much code (or more) to get my tests on par with pytest,
> and they are riddled with subtle bugs.

Hi,
Yeah, IPA runs into specific situations that are very different from how 
tests usually work in other projects. And you took one of the harder cases.
IPA's existing tests also don't really follow best practices – they 
pretty much assume you run the whole test suite, or at least the whole 
module, so selecting individual tests is currently not reliable. On the 
other hand, this assumption made some things easy to code – like 
complicated ordering and setup/teardown.

One thing we want to do during the move to pytest is to have the tests 
be more independent of the current state of the database, and also of 
each other.

If certain assertions only make sense in a specific order, either put 
them in the same test*, or write additional code that explicitly handles 
dependencies and ordering. (Sorry, that's the price to pay for isolated 
tests – I don't see a way around it.)

If on the other hand the order doesn't matter – any order of any subset 
of individual tests should pass, which is our goal – but pytest's 
default ordering makes the suite slow, there is a hook that allows you 
to override the order: pytest_collection_modifyitems [0]. Stick it in 
the affected module (or conftest.py file for multiple modules, or a 
plugin for the whole suite), and sort the items explicitly.

If your test depends on something not existing or being disabled, you 
should try removing/disabling it before the test, or write a fixture to 
do that.

* might make sense, for starters?
[0] 
http://pytest.org/latest/plugins.html#_pytest.hookspec.pytest_collection_modifyitems

> Most painful, is pytest.fixture(). It is a cool idea, but the scoping
> and lifecycle is horribly broken. Here is an example:
>
> Here is an example. I want to create a user, enable otp in two different
> ways, test with two different types of tokens under two different
> authentication scenarios. This is a total of 8 tests.
>
> @pytest.fixture(scope="function") # This scope is default.
> def user(request):
>    user = create_user('tuser1')
>    request.addfinalizer(lambda: delete_user('tuser1'))
>    return user
>
> @pytest.fixture(params=["per-user", "global"])
> def enablement(request, user):
>    output = enable(user, request.param)
>    request.addfinalizer(lambda: disable(user))
>    return output
>
> @pytest.fixture(params=[{'type': 'HOTP'}, {'disabled': True}])
> def token(request, user, enablement):
>    output = create_token(request.param)
>    request.addfinalizer(lambda: delete_token(output.id))
>    return output
>
> @pytest.mark.parametrize("opts", [foo, bar])
> def test_authentication(user, token, opts):
>    return auth(user.uid, token.code(), opts)
>
> Because the default scope is "function", a new user is created for every
> single run of token. This is *way* more work than necessary, and leads
> to slow runtimes. Fortunately, we can fix this by changing the scope of
> the user fixture to "module". In this case, only one user will be
> created per module. So far so good.
>
> The enablement scope is still "function", so it will still be called
> eight times (#tokens * #enablements * #opts). Now, this is a problem
> because (let's say) enablement takes a long time to switch tests. Here
> is problem #1 with pytest: it presumes that all variants of fixtures
> should be grouped by test parameters. This is a bad assumption. It may
> make sense in some cases, but there is no way to change it.
>
> Now, we can work around this problem just like with the user fixture:
> change the scope to "module". Unfortunately, we've just caused a second
> bug. Now, enablement() will only be run twice; but the finalizer will
> not be called until all tests are run. This means that "per-user"
> enablement will still be active when "global" enablement runs; both will
> be torn down simultaneously.
>
> This same problem arises for the token fixture: eight tokens will be
> created and destroyed when only four are necessary. You guessed it, we
> can fix this by changing the scope. But by doing this all four tokens
> will be deleted simultaneously. This means we cannot test the state
> where only a disabled token exists.
>
> Any attempt to use a fixture to create long-lived states (precisely what
> fixtures are designed for) with mutually exclusive parameters is
> extremely error prone as it is not clear which items are alive in the
> LDAP db at any point. This leads to both long runtimes *and* extremely
> subtle bugs in the tests that are multiplied by every layer of fixtures.
> Call me nonplussed.
>
> Now, let's look at how this same test can be implemented without pytest:
>
> user = create_user('tuser1')
> try:
>    for enablement in ('per-user', 'global'):
>      enable(user, enablement)
>      try:
>        for token in [{'type': 'HOTP'}, {'disabled': True}]:
>          object = create_token(token)
>          try:
>            for opt in [foo, bar]:
>              auth(user.uid, object.code(), opt)
>          finally:
>            delete_token(object.id)
>      finally:
>        disable(user)
> finally:
>    delete_user('tuser1')
>
> Comparing the non-pytest implementation to the pytest one we can see a
> few things. First, the pytest code has some nice separation of roles.
> This is the big benefit of pytest. Second, the non-pytest implementation
> is less code. Third, the non-pytest implementation has no subtle scoping
> bugs: we know exactly when each database record is created and
> destroyed.

This "non-pytest" implementation is a *single test*, since it hard-codes 
dependencies and ordering. If you write it as as a single test function 
in pytest, it should look much the same, and get the 
ordering/dependencies right (except the case of the test user already 
existing).

> I *want* to like pytest. Honestly I do. But this scoping idiocy has
> already cost me roughly two days tracking down subtle bugs in the tests
> (while not finding any bugs in the OTP code itself). And I can't think
> that OTP tests written using pytest will be maintainable in any regard;
> not when such small changes can cause such hard to find bugs.
>
> Now, it may be possible that I have just missed something obvious. If I
> have, forgive me this rant. But I've spent a lot of time reading docs
> and I haven't seen any obvious solutions to these problems.
>
> Nathaniel

-- 
Petr³