[katello-devel] search over rest api - interface design

Amos Benari abenari at redhat.com
Sun Jul 17 08:55:42 UTC 2011



----- Original Message -----
> From: "Bryan Kearney" <bkearney at redhat.com>
> To: "Amos Benari" <abenari at redhat.com>
> Cc: it-eng-tower at redhat.com, katello-devel at redhat.com
> Sent: Thursday, July 14, 2011 6:57:26 PM
> Subject: Re: [katello-devel] search over rest api - interface design
> On 07/14/2011 06:01 AM, Amos Benari wrote:
> > Hi All,
> > I want to start a discussion on our REST search interface. I prefer
> > to do that by email because there are a lot of details involved.
> >
> > Background:
> > -----------
> > The search in the Katello UI is built on top of a Rails package
> > called scoped_search.
> > scoped_search enables free-text search as well as search on
> > specific keys. It supports a powerful query language that includes
> > logical operators, negation, brackets and more, and it helps users
> > become familiar with the syntax by offering a syntax
> > auto-completer.
> > scoped_search translates the user's query into an SQL query; it
> > relies on Rails ActiveRecord to be database agnostic.
> >
> > Since Katello is built out of several entities (Pulp, Candlepin and
> > Foreman), a large portion of the data is managed by the other
> > entities.
> > Katello doesn't have direct access to the respective databases.
> > This raises the question: how do we support a search on remote
> > entities?
> >
> > To understand the issues involved I suggest looking at an example:
> > Let's say that a user is looking for a package that was updated
> > yesterday and is called something like pulp.
> >    The user can type the following in the GUI:
> >    Package search: "updated = yesterday and name ~ pulp*"
> >    If the data were in the Katello db, scoped_search could parse it
> >    and create the following query:
> >    "select * from packages where updated_at >= July-13-2011 and
> >    updated_at < July-14-2011 and name like pulp%"
> >
> > The query is processed by scoped_search in two steps:
> > 1. The user query is validated and parsed into an abstract tree.
> > 2. An SQL query is built, using the language definition, the
> > application model, and the specific database SQL dialect.
> >
> > To learn more about scoped_search:
> > ---------------------------------
> > source: https://github.com/wvanbergen/scoped_search
> > documentation: https://github.com/wvanbergen/scoped_search/wiki
> > blog: http://scopedsearch.wordpress.com/
> >
> >
> > To the point:
> > -------------
> > Three options come to mind when looking at where to put the
> > interface boundary:
> > 1. At the user language end: let's make all our participating
> > parties understand the user query and convert it into a query for
> > the data source that they are using.
> > Essentially, in our case, this means porting scoped_search to
> > Java and Python.
> > 2. Use an intermediate language.
> > 3. Use SQL or MongoDB JSON as an interface.
> >
> > The 3rd option was rejected in a previous discussion because it is
> > very insecure and it exposes internal structure that we probably
> > don't want as a stable API.
> > The first option, porting scoped_search to Python and Java, will
> > take about 6 weeks per port, and then we will need to maintain all
> > three projects. This might make sense in the long run, depending
> > on my ability to form an active community around the original
> > project.
> >
> > That leaves us with the second option: an intermediate interface.
> >
> > Now is this going to get to the point?
> > --------------------------------------
> > I started my quest for an interface by looking at the vast field
> > of existing query languages. I looked at the languages listed here:
> > http://en.wikipedia.org/wiki/Query_language
> > I did find some interesting ideas (YQL) but found them to be mostly
> > overkill for our purpose.
> > To make the search interface simple enough to be easily translated
> > into a query on the receiving end, and yet powerful enough to be
> > useful, I suggest the following guidelines and limitations on the
> > search interface:
> > a. It will be made of a number of conditions.
> > b. A condition will always describe a single infix term (for
> > example: "arch = i386") and will always refer to a single property
> > (name, arch, create_date, etc.).
> > c. A logical AND will be used between conditions. This means that
> > every additional condition will further limit the result set.
> >     A user who wants to expand the result set will not be able to
> >     use a logical OR and will have to run another query.
> >     This limitation simplifies the interface a lot because it also
> >     eliminates the need for brackets.
> > d. No prefix negation. The interface will not allow writing "NOT
> > updated = yesterday".
> >     It can be proved that the combination of limitations c and d
> >     means that some searches cannot be expressed in a single query.
> >     However, it greatly simplifies the interface.
> > e. An exception to (b) and (c) is the free-text element. The
> > interface will allow a query such as "free-text ~ pulp*", which
> > will be translated
> >     into a query where each "string/text" property in the searched
> >     element will be matched against the value.
> >     For example, the result of the above query can be something
> >     like "name like pulp% OR description like pulp%".
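
To make guidelines (a) to (e) concrete, here is a minimal sketch of how a receiving end might turn such conditions into a parameterized WHERE clause. It is an illustration only, written in Ruby for consistency with the rest of this mail. The names SQL_OPERATORS, parse_term and to_where, the operator list, and the list of text columns are all assumptions, not part of scoped_search, Pulp, or any existing code.

```ruby
# Hypothetical receiving-end translation of AND-only conditions into a
# parameterized SQL WHERE clause. All names here are illustrative.
SQL_OPERATORS = {
  '=' => '=', '>' => '>', '<' => '<', '>=' => '>=', '<=' => '<=', '~' => 'LIKE'
}
TERM = /\A\s*(>=|<=|=|>|<|~)\s*(.+)\z/

# Split one infix term such as ">= July-13-2011" into operator and value.
def parse_term(term)
  match = TERM.match(term) or raise ArgumentError, "bad term: #{term.inspect}"
  operator, value = match[1], match[2]
  value = value.tr('*', '%') if operator == '~' # user wildcard -> SQL wildcard
  [SQL_OPERATORS.fetch(operator), value]
end

# conditions: { property => [terms] }; every clause is joined with AND (rule c).
def to_where(conditions, text_columns)
  clauses = []
  binds = []
  conditions.each do |property, terms|
    terms.each do |term|
      operator, value = parse_term(term)
      if property == 'free-text'
        # Rule (e): match the value against every string/text column.
        alternatives = text_columns.map { |column| "#{column} #{operator} ?" }
        clauses << "(#{alternatives.join(' OR ')})"
        binds.concat([value] * text_columns.size)
      else
        unless property =~ /\A[a-z_]+\z/
          raise ArgumentError, "unknown property: #{property}"
        end
        clauses << "#{property} #{operator} ?"
        binds << value
      end
    end
  end
  [clauses.join(' AND '), binds]
end

where, binds = to_where(
  { 'name' => ['~ pulp*'],
    'updated_at' => ['>= July-13-2011', '< July-14-2011'] },
  %w[name description]
)
puts where
p binds
```

Values are passed as bind parameters rather than pasted into the SQL text, which is one way to avoid the security problem that ruled out option 3. A real implementation would also check each property against the properties the entity actually exposes.
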
> >
> > So what will the suggested interface look like?
> > Going back to the example:
> > the user typed:
> > "updated = yesterday and name ~ pulp*"
> >
> > query = [name = [" ~ pulp*"]
> >           updated_at = [">= July-13-2011", "< July-14-2011"]
> >          ]
> > Pulp.get_package_by_repo(repoId, query)
> >
> > The REST call will look like:
> > http-get: /pulp/api/repos/repo_id/packages?query[]=&name[]= ~
> > pulp*&updated_at[]=>= July-13-2011&updated_at[]=< July-14-2011
> 
> I like it being a query parameter, but would a body element be better?
> Dunno.. probably less resty as a body parameter.
> 
> >
> > I haven't encoded the url for obvious reasons :)
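
For reference, a small sketch of how the encoded form of such a call could be produced with Ruby's standard library. The helper name encode_search is made up for illustration, and the sketch leaves out the empty query[] parameter shown above; it only covers the per-property condition lists.

```ruby
require 'uri'

# Hypothetical helper: turn { property => [terms] } into an encoded
# query string of the form property[]=term&property[]=term...
def encode_search(conditions)
  pairs = conditions.flat_map do |property, terms|
    terms.map { |term| ["#{property}[]", term] }
  end
  URI.encode_www_form(pairs)
end

query = {
  'name'       => ['~ pulp*'],
  'updated_at' => ['>= July-13-2011', '< July-14-2011']
}

# repo_id is kept as a literal placeholder, as in the example above.
puts "/pulp/api/repos/repo_id/packages?#{encode_search(query)}"
```

The receiving end can recover the original pairs with URI.decode_www_form, so operators such as ">=" and "<" survive the round trip.
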
> >
> >
> > I hope this mail is not too long. I am waiting for comments on the
> > interface suggestion.
> > ______________________________
> > katello-devel mailing list
> > katello-devel at redhat.com
> > https://www.redhat.com/mailman/listinfo/katello-devel
> 
> couple of questions:
> 
> 1) How would I annotate the scoped search in the ruby code to get the
> auto complete features?

This is not the final syntax, but here are a few options from my proof-of-concept code:
  rest_search :on => :name, :rename => :vm, :operators => ['= ', '> ', '< ']
  rest_search :on => :disk, :rename => :image_size, :field_type => :integer
  rest_search :on => :status, :field_type => :integer, :complete_value => {:down => 0, :up => 1}
  rest_search :on => :created_at, :rename => :created, :field_type => :time, :complete_value => true
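
To show the general shape of such an annotation, here is a sketch of how a class-level rest_search macro could record its options. This is not the poc code: the RestSearch module, the rest_search_fields accessor, the :string default, and the VirtualMachine class are all hypothetical, and the example deliberately avoids ActiveRecord so it runs on its own.

```ruby
# Hypothetical sketch: a class-level macro that records search definitions.
module RestSearch
  # Register one searchable field; :on is required, the rest is optional.
  def rest_search(options)
    field = options.fetch(:on)
    name  = options.fetch(:rename, field) # name exposed to the user
    rest_search_fields[name] = {
      :field          => field,
      :field_type     => options.fetch(:field_type, :string),
      :operators      => options[:operators],
      :complete_value => options[:complete_value]
    }
  end

  # Definitions collected so far, keyed by the user-facing name.
  def rest_search_fields
    @rest_search_fields ||= {}
  end
end

# Hypothetical model; a plain Ruby class stands in for an ActiveRecord model.
class VirtualMachine
  extend RestSearch

  rest_search :on => :name, :rename => :vm, :operators => ['= ', '> ', '< ']
  rest_search :on => :status, :field_type => :integer,
              :complete_value => {:down => 0, :up => 1}
end

p VirtualMachine.rest_search_fields.keys
```

An auto-completer could then read rest_search_fields to offer field names, allowed operators, and values for a given model.
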


> 2) Related to above.. if I wanted to have an attribute be a set of
> values... would that be duplicated in the ruby code?

Yep, with the benefit that you can rename them as in:
  :complete_value => {:down => 0, :up => 1}
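
As a rough illustration of what that renaming buys, the sketch below uses the same hash both to offer completions and to translate the typed name into the stored value. The helper names stored_value and completions are assumptions made for this example only.

```ruby
# Hypothetical helpers around a :complete_value mapping.
STATUS_VALUES = {:down => 0, :up => 1}

# Translate what the user typed ("up") into the stored value (1).
def stored_value(complete_value, typed)
  complete_value.fetch(typed.to_sym) do
    raise ArgumentError, "unknown value: #{typed}"
  end
end

# Offer the user-facing names that start with the typed prefix.
def completions(complete_value, prefix)
  complete_value.keys.map(&:to_s).select { |name| name.start_with?(prefix) }
end

p stored_value(STATUS_VALUES, 'up')
p completions(STATUS_VALUES, 'd')
```
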

> 
> -- bk



