<div dir="ltr">Thanks for the idea. I think we're pretty close to a design. More thoughts would be great. I wrote back some inline.<br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Dec 14, 2017 at 4:39 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<p><br>
</p>
<br>
<div class="m_3796635442788111230moz-cite-prefix">On 12/14/2017 12:55 PM, Brian Bouterse
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>The behavior brings me back to an attribute name like
'user_visible' and it would default to False. Thus you
have to explicitly take the step to make it user
visible. Whatever the name, I think this would this
apply to both RepoVersion and Publication objects.
Plugin writers who produce these objects also need docs
that identify they need to set user_visible=True.<br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br></span>
Agreed, except for field name.<br>
<br>
My concern with the name user_visible is that rather than describing
the incomplete state of the resource it describes only one aspect of
how the resources should be handled. That is, that a non-visible
resource should be hidden from the user. But there's more to it.
For example, associating a publication to a distribution should be
prevented by the viewset. Not based on user visibility but the
incomplete state of the publication.<span class=""><br></span></div></blockquote><div><br></div><div>I'm convinced by this. What about having it be called 'complete' and having it default to False. Only complete Publications and RepositoryVersion objects are returned in the list, detail, or filter views.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><span class="">
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div><br>
</div>
If an exception is raised while creating the repo_version
or publication, or from the plugin code, the core catches
it, deletes the repo_version/publication and re-raises the
exception. This will cause the task the user is tracking
to error and report the error.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br></span>
Agreed.<span class=""></span> <br></div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><span class="">
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div><br>
</div>
<div>We had some challenges on irc in finding a working
design for the crash case. If a crash occurs though the db
record will just be there with user_visible=False. We need
some way to clean those up. We can't assume that there
will be just one outstanding one for us to cleanup next
time for a variety of reasons I won't recap here. During
the irc convo, @jortel suggested we consider if the
tasking system can help cleanup the work like it cleans up
other areas and I think that is a good idea. We could
record on the Task model a list of objects to deleted if
the tasking system cleans up a task that crashed while
running. For example, when a publication is made, the
first thing done it to associate it with the running task
as an object that needs to be deleted if the task crashes.
We would also hide this objects_to_delete list from the
user in the Task serializer which would omit this data. If
we don't omit that data from a Task serialization when the
user tries to load the url they will get a 404 because
that object has user_visible=False.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br></span>
I think it would be best to omit from the task serializer. <br></div></blockquote><div><br></div><div>I agree leaving them out for now I think makes sense. It's more a relationship for internal correctness.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF">
<br>
All seems reasonable but want to note that for this to be crash
proof it is imperative that the resource insert and the insert into
<i>things-to-be-deleted-when-the-<wbr>task-crashes</i> must be committed
in the same transaction in order to be crash proof. The same is
true for when the task completes successfully. Updating the
(valid|visible|?) field on the resource, inserting into
CreatedResources and deleting from <i>things-to-be-deleted-when-the-<wbr>task-crashes</i>
needs to be done in the same transaction. This is trivial for the
core because it can be done in the task code. Relying on plugin
writers to do this is a little concerning.<br></div></blockquote><div><br></div><div>I agree completely that the creation of the Publication needs to be saved in the same transaction that we update the Task with a link to the Publication to be cleaned up. Similarly for a RepositoryVersion. We could probably do this transaction and the updating of the Tasks' foreign key automatically in a fancy save() method on both Publication and RepositoryVersion. Something like: in save() start the transaction, save the self object, if in a task, update the Task with the FK reference, end transaction and return control back from save(). This would cause anyone who creates one of them to receive the "safe" transactional behavior automatically. At that point the only thing people need to know is that they have to later call a:</div><div><br></div><div>my_publication.complete = True</div><div>my_publication.save()</div><div><br></div><div>This would be in exceptional situations where plugin code has to make the RepositoryVersion on their own (i.e. depsolving copy). Also for the sync() and publish() methods, core would be handling the setting of complete=True for plugin writers.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
<br>
Perhaps we can do something simpler. Given the frequency of crash
or worker restart, I wonder if we could delete incomplete things
based on another event that ensures that no tasks are running. Kind
of like how <i>/tmp </i>is cleaned up on system reboot. I don't
think having some of these things hanging around in the DB is a
problem. It's mainly that we don't want to leak them indefinitely.
Any ideas?</div></blockquote><div><br></div><div>Not cleaning them up could be ok in the near term. In terms of a simple cleanup mechanism I think the one proposed is almost as simple as we can afford. Otherwise we'll end up in the "singleton concurrency" design, which I think we should avoid at almost all costs. The "singleton concurrency" is an idea; it's the pattern is one where the resource manager stops routing work, waits for all workers to finish their tasks, and then runs code on itself. That is the only way the pulp cluster can be safely guaranteed that some set of code is the *only* thing running. The work will backup significantly while waiting for all existing work to finish, especially if one of those sync's is long running which is probable.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><div class="h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div><br>
</div>
What are thoughts on these approaches, behaviors, and the
attribute name? Should this be moved into Redmine?<br>
<br>
</div>
<div>
<div>
<div><br>
<br>
</div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Dec 14, 2017 at 11:14 AM, Jeff
Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span>
<p><br>
</p>
<br>
<div class="m_3796635442788111230m_-4293885011110332990moz-cite-prefix">On
12/13/2017 01:54 PM, Brian Bouterse wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Defining the field's behaivor a bit more could
help us with the name. Will it actually be shown
to the user in viewsets and filter results?</div>
<div><br>
</div>
<div>I think the answer should be no, not until it's
fully finished. I can't think of a reason why a
user would want to see inconsistent content during
a sync or publish. </div>
</div>
</blockquote>
<br>
</span> Agreed.<span><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>There are some downsides when users thinking
things are done when they aren't. For instance,
the user could mistakenly think the publish is
done when its not, trigger package updates, and
many machines will still receive the old content
because it hasn't been switched over to
auto-publish for the expected distribution.</div>
<div><br>
</div>
Also how is this related to when the
'created_resources' field is set on a Task? I had
imagined core would set that at as the last thing it
does so that when the user sees it everthing is
"consistent" and "live" already.<br>
</div>
</blockquote>
<br>
</span> Agreed.
<div>
<div class="m_3796635442788111230h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">-Brian</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">
<div class="gmail_quote">On Wed, Dec 13, 2017 at
2:42 PM, David Davis <span dir="ltr"><<a href="mailto:daviddavis@redhat.com" target="_blank">daviddavis@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Thanks for answering my
questions. I agree on not using an “is_”
prefix and avoiding “visible.”
<div><br>
</div>
<div>Your suggestion of “valid” sounds
fine. Maybe some other options:
finished, complete[d], ready.</div>
<div class="gmail_extra"><span class="m_3796635442788111230m_-4293885011110332990m_-4631390161871604727HOEnZb"><font color="#888888"><br clear="all">
<div>
<div class="m_3796635442788111230m_-4293885011110332990m_-4631390161871604727m_-7869206984782099237m_4346958101146153224gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div><br>
</div>
<div>David<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</font></span>
<div>
<div class="m_3796635442788111230m_-4293885011110332990m_-4631390161871604727h5">
<br>
<div class="gmail_quote">On Wed, Dec
13, 2017 at 2:15 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span>
<p><br>
</p>
<br>
<div class="m_3796635442788111230m_-4293885011110332990m_-4631390161871604727m_-7869206984782099237m_4346958101146153224m_1657149683956893774moz-cite-prefix">On
12/13/2017 12:46 PM, David
Davis wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">A few
questions. First, what
is meant by incomplete?
I’m assuming it refers
to a version in the
process of being created
or one that was not
successfully created?</div>
</blockquote>
<br>
</span> Both.<span><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Also, what’s the
motivation behind
storing this
information? Is there
something in Pulp that
needs to know this or
is this so that the
user can know?</div>
</div>
</blockquote>
<br>
</span> There may be others
but an importer needs to be
passed the new version so it
can add/remove content. It
needs to exist in the DB so
that it can add/remove content
in separate transaction(s).<span><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Lastly, I imagine
that a task will be
associated with the
creation of a version.
Does knowing its state
not suffice for
determining if a
version is
visible/valid?</div>
</div>
</blockquote>
<br>
</span> IMHO, absolutely not.
That is not what tasks records
in the DB are for. Completed
task records can be deleted at
any time.<span><br>
<br>
<blockquote type="cite">
<div class="gmail_extra"><br clear="all">
<div>
<div class="m_3796635442788111230m_-4293885011110332990m_-4631390161871604727m_-7869206984782099237m_4346958101146153224m_1657149683956893774gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div><br>
</div>
<div>David<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="gmail_quote">On
Wed, Dec 13, 2017 at
12:16 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>There has been
discussion on
IRC about a
matter related
to versioned
repositories
that needs to be
broadened. It
dealt with
situations
whereby a new
repository
version exists
in the DB in an
incomplete
state. The
incomplete state
exists because
conceptually a
repository
version includes
both the version
record itself
and all of the
DB records that
associate
content. For
several reasons,
the entire
version cannot
be constructed
in the DB in a
single DB
transaction.
The problem of <i>Incomplete
State</i> is
not unique to
repository
versions. It
applies to
publications as
well. I would
like to discuss
and decide on a
standard
approach to
resolving this
throughout the
data model. <br>
</p>
<p>The IRC
discussion (as
related to me)
suggested we use
a common
approach of
having a field
in the DB that
indicates this
state. This
seems reasonable
to me. As
noted, it's a
common
approach.
Thoughts?</p>
<p>Assume we did
use a field,
let's discuss
name. It's my
understanding
that a field
named <i>is_visible</i>
or just <i>visible</i>
was discussed.
I would argue
two things. 1)
the is_ prefix
is redundant to
the fact it's a
boolean field
and we have not
used this
convention
anywhere else in
the model. 2)
Historically,
the term <i>"visible"</i>
has strong ties
to user
interfaces and
is used to mask
fields or
records from
being displayed
to users. I
propose we use a
more appropriate
field name.
Perhaps <i>"valid"</i>.
Thoughts?<br>
</p>
</div>
<br>
______________________________<wbr>_________________<br>
Pulp-dev mailing
list<br>
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</span></div>
<br>
______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
<br>
______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
<br>
______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></div></div>
<br>______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/<wbr>mailman/listinfo/pulp-dev</a><br>
<br></blockquote></div><br></div></div>