[katello-devel] Comments on https://github.com/Katello/katello/pull/1180

Miroslav Suchý msuchy at redhat.com
Wed Dec 5 16:10:43 UTC 2012


On 12/05/2012 03:36 PM, Bryan Kearney wrote:
> I can not comment on a comment, so I figured I would bring them back here:
>
> This is about migrating from spacewalk to katello. The pattern which is
> in the pull request is an ETL (Extract/Transform/Load) process, where
> the Extract and Transform are done in the export steps, and the Load is
> done by the CLI in the import actions. The main comment from Lzap and
> Msuchy was around using DB to DB tools as opposed to using scripts.
> The main reason for this is that today, katello is 4 data stores and
> will soon be 5. When we create an activation key, it goes into katello,
> elastic search, and candlepin. Systems will go into pulp, candlepin,
> katello, ES, and foreman. If we do postgres to postgres, we will end up
> having to write something which says "go into the Katello DB, and
> push the changes out to all the other systems". TBF, we may need that
> anyway in the future.

Yes, it can be hard.
But I have seen instances with more than ten thousand channels (and we
are importing channels as well, aren't we?).
And I would not be surprised if such a huge import over the API
took a week.
It *can* be reasonably fast. But it *can* be slow as well. And I would
bet on the latter.
If you give me, together with the PR, some preliminary data showing that
e.g. an import of 2000 packages takes 15 minutes, then I will applaud.
But if it takes 8 hours, then we have a problem. And then we would have
to talk to those 5 stores directly, no matter how hard that would be to
implement.
Can you run some benchmarks with this first-shot script?
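
To make the benchmark request concrete, here is a minimal sketch of how
such a measurement could be taken and extrapolated. The names
measure_seconds_per_item and extrapolate_seconds are hypothetical, the
import_one callable stands in for whatever the PR's import step does for
a single item, and the extrapolation assumes import time scales linearly
with the number of items, which may not hold in practice.

```python
import time


def measure_seconds_per_item(import_one, items):
    """Time a hypothetical per-item import callable over a sample.

    Returns the average wall-clock seconds spent per item.
    """
    items = list(items)
    if not items:
        raise ValueError("need at least one item to benchmark")
    start = time.perf_counter()
    for item in items:
        import_one(item)
    elapsed = time.perf_counter() - start
    return elapsed / len(items)


def extrapolate_seconds(sample_count, sample_seconds, total_count):
    """Estimate total import time, assuming strictly linear scaling."""
    return sample_seconds * total_count / sample_count
```

Under the linear assumption, 2000 items in 15 minutes extrapolates to 75
minutes for 10,000 items, while 2000 items in 8 hours extrapolates to 40
hours.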

> Some other questions:
>
> * How to deal with None: This is treated as an empty column in the CSV/JSON
> * CSV Sucks! : Yes, but the import portions can be used without the
> export.. so you can create your data in $YOUR_SPREADSHEET_TOOL and then
> load them up.
> * The performance will suck: Could be. But I do not believe we need to
> migrate over either the package manifest or the action history. I do not
> think users will care about this.

Why do you believe that? :)
I, as a sysadmin, would expect the package manifest to be migrated.
And to be precise, what is the definition of package manifest? The list
of packages installed on a machine registered to Katello?

> * Why do it in private: I did send out an RFC a while ago. Then I figured
> something running would be easier to discuss.

Yes, I recall it (it was in September), with your proposal on our
internal wiki. But the proposal was never updated based on the decision.
And the conclusion (at least for me) was Jan's idea to use a PostgreSQL
dump of the Spacewalk database as the entry point for our import.

And as we can see, using Spacewalk XMLRPC calls can be tricky, because
the call you used, "activationkey.listActivationKeys", is available only
in Spacewalk 0.6 (Sat 5.3) or higher. So it will not work for migrations
from older instances.
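
One way an export script could cope with this is to treat an XML-RPC
fault as "call not available" and skip that step instead of aborting.
The sketch below uses Python's standard xmlrpc.client module; the helper
name call_or_none is hypothetical, and treating every Fault as a missing
method is a simplifying assumption, since a server can raise a Fault for
other reasons too (bad session key, permissions).

```python
import xmlrpc.client


def call_or_none(method, *args):
    """Invoke an XML-RPC method; return None if the server faults.

    Assumption: any Fault is read as "method unavailable on this
    server version". A real script would want to inspect the fault
    code and message before deciding to skip.
    """
    try:
        return method(*args)
    except xmlrpc.client.Fault:
        return None
```

With a ServerProxy named client and a session key, this could be called
as call_or_none(client.activationkey.listActivationKeys, session_key),
and a None result would tell the script to fall back or skip activation
keys for that instance.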


-- 
Miroslav Suchy
Red Hat Systems Management Engineering



