[katello-devel] search over rest api - interface design

Dmitri Dolguikh dmitri at redhat.com
Mon Jul 18 13:59:24 UTC 2011


On 11-07-14 10:07 AM, Lukas Zapletal wrote:
> On 07/14/2011 12:01 PM, Amos Benari wrote:
>>
>> I hope this mail is not too long. I am waiting for comments on the
>> interface suggestion.
>
> Well, I am surprised that scoped_search is not a full-text engine. As a
> long-time Apache Lucene user I have to recommend this project. It is
> the de facto standard in full-text searching and it has been ported to
> many languages (including Ruby, I guess).
>
> Main advantages of fulltext:
>
> - speed (usually faster than an RDBMS for text search)
> - comes with very rich query language
>  - what you describe here is already implemented in Lucene
> - works like Google (TM)
>  - scoring
>  - relevancy calculation
>  - bonus scores
>  - fuzzy search (correcting typos...)
> - it works with special structures like ids, dates, package names,
> versions - developers just need to implement "tokenizers"
> - good scaling
> - Spacewalk already uses Apache Lucene, so it works under our scenario :-)
>
> A very rough overview of how I would implement it:
>
> - collector component
>  - periodically checks database (or backend systems - Pulp etc.)
>  - downloads new/changed/deleted data
>  - updates index database
>
> - search component
>  - provides searching capabilities
>  - easy to implement
>
> Maybe a full-text engine could be the answer in this case. The only
> drawback is that it builds its own data files, but it's just some files
> on disk. It also takes some time to index new data, but that does not
> hurt much.
>
The tricky part is that we have several backends with different models 
(pulp is not even relational). Would it make sense to "de-normalize" the 
data - crawl external systems for the data we need, store the results 
locally, and use that data set for searches?
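
To make the question concrete, here is a rough sketch (in Python, just for
illustration) of what "crawl, store locally, search" could look like. It is
not tied to Lucene, scoped_search, or any Katello code: the names LocalIndex
and collect, the "backend:id" key scheme, and the record fields are all made
up for this example, and the in-memory index only stands in for whatever
engine we would really use.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class LocalIndex:
    """In-memory inverted index over flat (de-normalized) records.

    Hypothetical stand-in for a real full-text engine.
    """

    def __init__(self):
        self.docs = {}                     # doc_id -> flat dict of fields
        self.postings = defaultdict(set)   # token -> set of doc_ids

    def remove(self, doc_id):
        """Drop a document and its tokens from the index, if present."""
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for value in old.values():
            for token in tokenize(str(value)):
                self.postings[token].discard(doc_id)

    def upsert(self, doc_id, fields):
        """Insert or replace a document."""
        self.remove(doc_id)
        self.docs[doc_id] = dict(fields)
        for value in fields.values():
            for token in tokenize(str(value)):
                self.postings[token].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(
            *(self.postings.get(t, set()) for t in tokens)
        )
        return sorted(hits)


def collect(index, backend_name, fetch_records):
    """Crawl one backend and sync its records into the local index.

    fetch_records is an assumed callable returning dicts with an "id" key;
    how each backend is actually queried is left out on purpose.
    """
    prefix = backend_name + ":"
    seen = set()
    for record in fetch_records():
        doc_id = prefix + str(record["id"])
        index.upsert(doc_id, record)
        seen.add(doc_id)
    # Remove documents that disappeared from this backend since last crawl.
    stale = [d for d in index.docs if d.startswith(prefix) and d not in seen]
    for doc_id in stale:
        index.remove(doc_id)
```

Each backend would get its own fetch function that flattens its model into
plain dicts, so the search side never needs to know whether the data came
from a relational store or from Pulp. The cost is that results are only as
fresh as the last crawl.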

-d