[Pulp-dev] 3.0 Validation update
jortel at redhat.com
Mon Apr 24 19:12:21 UTC 2017
On 04/21/2017 01:16 PM, Austin Macdonald wrote:
> I think we have arrived at a consensus around these points:
> * Django's ModelForm validations that are used during `full_clean` is an anti-pattern in DjangoRestFramework
> and should not be used
> * Validation, by our definition, will be done by the serializers in the API layer
> * The Task layer should run business logic checks at the time of use
> We have not arrived at a consensus on whether Updates/Deletes for Importers/Publishers will be synchronous or
> asynchronous. I've tried to think through each way, and what it would need to work.
> *Synchronous Update and Delete:
> _Validation Heuristic_
> The serializers validate in the API layer only, and it is desirable to have that validation occur at the time
> of the write.
> _Concurrent Writes:_
> Let's assume synchronous Updates and Deletes. There is a possibility of concurrent writes because sync/publish
> tasks write to the importer/publisher tables respectively and these tables can also be written by the Update
> API at any time. If there is a collision the last write wins.
> Concurrent writes may not be a big deal because they should be not affect each other. In platform, sync and
> publish only write to the last_sync/last_publish fields, which should never be written by the update API. If
> the writes are restricted to only the affected fields, then there would be no clobbering. As long as plugin
> writers' implementations of sync and publish do not write to the same fields allowed in an API update, this
> concurrency is safe. We would obviously need to be clearly documented for plugin writers.
> _Dealing With Broken Configurations_
> For example, while a sync task is WAITING, the user updates an importer in a way that breaks sync. The
> importer configuration should be read only once at the start of a sync. If this configuration has been deleted
> or is not syncable, business logic checks catch this and fail immediately. Even if the configuration is
> changed during the sync task, the sync task will continue with the original configuration. The sync task will
> update last_sync, but will not write to other fields. I don't see a problem here, as long as this is
> documented for the plugin writer.
> I don't see any concurrency problems here. The downside of this design is that sync and publish, (implemented
> by plugin writers), should not write to the same fields as API updates.
> *Asynchronous Update and Delete*
> _Validation Heuristic_
> Assuming async updates, the data comes into the API layer where it is validated by the serializers. The update
> reserves a task (reservation is based on repository) and waits to be performed. It is usually undesirable to
> separate validation from use. There could be concern that data validated in the API layer could become
> unacceptable by the time the Task layer actually performs the write.
> Practically, this separation may not be an issue. Valid data becoming invalid involves validations that are
> dependent on state. We can split these validations into two categories, uniqueness validations and multi-field
> validations. Fortunately, the Django ORM will enforce uniqueness validation, so the database is protected, but
> we might have to handle write failures. Multi-field validations are probably not validations by our
> definition, but are business logic, which could be checked in the Task layer.
> _Concurrent write prevention_
> Leveraging the reserved resource system, concurrent writes could not be a problem because writes could never
> be concurrent at all. Syncs, Updates, Deletes, Publishes all make reservations based on the repository (this
> is an assumption based on what we do in Pulp 2), so each would run in the order that they were called.
> _User Controlled Order_
> A user can issue a [sync][update][sync] and be confident that the first sync will use the original importer
> configuration, and the second sync will use the updated configuration.
This is not deterministic (from user point of view) even with task reservation. You still have a race
condition queuing tasks. In the following example, the update made by user B will win and the update made by
user A will not be used for the 2nd sync. By making this async, seems we have done is moved the race
condition from the DB to the tasking queue.
1. user-A [sync]
2. user-A [update]
3. user-B [update] (winner)
4. user-A [sync]
> _Uncertain State_
> The configuration of an importer is unpredictable if there is a waiting update task from another user.
> Pulp-dev mailing list
> Pulp-dev at redhat.com
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 847 bytes
Desc: OpenPGP digital signature
More information about the Pulp-dev