Factoring RPM sets (and parsing comps.xml)

Philip Prindeville philipp_subx at redfish-solutions.com
Fri Oct 27 23:44:41 UTC 2006


Oden, James wrote:

>>Ok, understood.
>>
>>Well, the code to read the comps.xml file and stick it into a reasonably
>>meaningful data structure must exist at least, right?
>
>There is something that parses the comps.xml.  Go to the anaconda web
>page:
>
>  http://rhlinux.redhat.com/anaconda/comps.html
>

Ok, thanks.  Jeremy:  minor comment...  it would be more
useful to see real excerpts from a production version of the
file, rather than hypothetical sample groups, etc.  An
example of using the "requires" attribute (perhaps more than
once) on a package, for instance, would be handy.

I guess I'm going to have to learn Python if I want to
use rhpl.comps...  too bad there isn't a Perl equivalent.
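
As a stopgap, here is a rough sketch of reading group/package data
with only the Python standard library instead of rhpl.comps.  The
element and attribute names (group, id, packagelist, packagereq,
type, requires) are my assumption about the file layout, and the
sample document and the parse_groups() helper are hypothetical, not
taken from a production comps.xml:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real comps.xml carries many more elements.
SAMPLE = """\
<comps>
  <group>
    <id>base</id>
    <packagelist>
      <packagereq type="mandatory">bash</packagereq>
      <packagereq type="default">grep</packagereq>
    </packagelist>
  </group>
  <group>
    <id>editors</id>
    <packagelist>
      <packagereq type="optional">vim-enhanced</packagereq>
      <packagereq type="optional" requires="xorg-x11">gvim</packagereq>
    </packagelist>
  </group>
</comps>
"""


def parse_groups(xml_text):
    """Return {group id: {package name: {"type": ..., "requires": ...}}}."""
    root = ET.fromstring(xml_text)
    groups = {}
    for group in root.findall("group"):
        group_id = group.findtext("id")
        packages = {}
        for req in group.findall("packagelist/packagereq"):
            packages[req.text.strip()] = {
                "type": req.get("type"),          # None if the attribute is absent
                "requires": req.get("requires"),  # None if the attribute is absent
            }
        groups[group_id] = packages
    return groups


groups = parse_groups(SAMPLE)
for group_id, packages in sorted(groups.items()):
    print(group_id, sorted(packages))
```

The nested dict is only one possible shape; anything that lets you
ask "which packages are in this group" would do.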


>The problem is that what you want is something that basically performs
>set operations on the components, and looks at a list of packages and
>determines which components best fit that list, which the library does
>not provide.  The latter may not be too hard starting with the existing
>parser.  Just off the top of my head:
>
>   - Parse with the existing library.
>   - Walk through the packages in each component and build a dictionary
>     whose key is the package name and whose value is the component name.
>   - Now, armed with this dictionary, go through your list of packages
>     and see which components are used.
>   - When you find a component is used, keep a counter of how many of
>     that component's packages appear in your list.
>
>That would give you, in raw form, which components are used, plus the
>raw information to do a final pass at the end and apply some heuristic
>to pick which components have been added to or subtracted from.  For
>example, if you have less than 50% utilization of a component, then
>throw that component away and treat its packages as individual
>additions (this is one possible heuristic).
>
>This is of course a real rough sketch of an algorithm.  YMMV.
>

Sounds about what I came up with.
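
For the archive, a minimal Python sketch of that outline.  It assumes
the comps data has already been parsed into a plain dict mapping
component name to a list of package names.  The function names, the
0.5 default threshold, and the sample data are all made up for
illustration; none of this is part of rhpl.comps.  One deviation from
the outline: a package can belong to more than one component, so the
index maps each package to a set of components rather than to a
single name.

```python
from collections import defaultdict


def build_package_index(components):
    """Map each package name to the set of components that list it."""
    index = defaultdict(set)
    for comp_name, packages in components.items():
        for pkg in packages:
            index[pkg].add(comp_name)
    return index


def factor_packages(components, installed, threshold=0.5):
    """Pick components that fit the installed list.

    Returns (kept, extras, removed):
      kept    - {component: utilization} for components at or above threshold
      extras  - installed packages not covered by any kept component
      removed - packages in kept components that are not installed
    """
    index = build_package_index(components)
    installed = set(installed)

    # Count how many installed packages fall into each component.
    hits = defaultdict(int)
    for pkg in installed:
        for comp_name in index.get(pkg, ()):
            hits[comp_name] += 1

    # Final pass: keep only components with enough utilization.
    kept = {}
    for comp_name, count in hits.items():
        size = len(set(components[comp_name]))
        if size and count / size >= threshold:
            kept[comp_name] = count / size

    covered = set()
    for comp_name in kept:
        covered |= set(components[comp_name])

    extras = sorted(installed - covered)
    removed = sorted(covered - installed)
    return kept, extras, removed


# Hypothetical data, only to exercise the functions.
components = {
    "base": ["bash", "coreutils", "grep", "sed"],
    "editors": ["vim", "emacs", "nano", "joe"],
}
installed = ["bash", "coreutils", "grep", "vim", "strace"]

kept, extras, removed = factor_packages(components, installed)
print(kept)     # {'base': 0.75}
print(extras)   # ['strace', 'vim']
print(removed)  # ['sed']
```

With that sample, "base" is kept at 75% utilization with sed
subtracted, while "editors" falls under the 50% cutoff, so vim is
listed individually alongside strace.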

-Philip

>>I'll dig around the Anaconda and Kickstart sources when I'm back in
>>front of a development machine...  hopefully there'll be something that
>>can be reused.
>
>Unfortunately there are not a lot of higher-level distribution build and
>management tools out there (there are some, but they're just not at the
>level of management you are looking for), so you're going into, AFAICT,
>uncharted territory.
>
>Seriously, good luck...james
>
>_______________________________________________
>Kickstart-list mailing list
>Kickstart-list at redhat.com
>https://www.redhat.com/mailman/listinfo/kickstart-list



