Sunday, April 19, 2009

Field Search: Searching on titles


The next scenario up is "Searching titles", which is described in Cucumber as:
    Scenario: Searching titles
      Given a "pancake" recipe
      And a "french toast" recipe with a "not a pancake" summary
      And a 0.25 second wait to allow the search index to be updated
      When I search titles for "pancake"
      Then I should see the "pancake" recipe in the search results
      And I should not see the "french toast" recipe in the search results
The Given a-recipe-with-summary step already has a step definition. The Given a-recipe-with-a-title step needs a definition:
    Given /^a "(.+)" recipe$/ do |title|
      date = Date.new(2009, 4, 19)
      permalink = "id-#{title.gsub(/\W/, '-')}"

      recipe = {
        :title => title,
        :date  => date,
      }

      RestClient.put "#{@@db}/#{permalink}",
        recipe.to_json,
        :content_type => 'application/json'
    end
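To make the permalink rule concrete, here is a minimal sketch of what the step stores for a given title. The helper names `recipe_permalink` and `recipe_document` are mine, for illustration only; the step definition inlines this logic and sends the result to CouchDB with RestClient.

```ruby
require 'date'
require 'json'

# Hypothetical helper: the same permalink rule the step definition uses.
# Every non-word character (anything outside [A-Za-z0-9_]) becomes a dash.
def recipe_permalink(title)
  "id-#{title.gsub(/\W/, '-')}"
end

# Hypothetical helper: the JSON body that would be PUT to CouchDB.
def recipe_document(title)
  { :title => title, :date => Date.new(2009, 4, 19) }.to_json
end

puts recipe_permalink("pancake")       # id-pancake
puts recipe_permalink("french toast")  # id-french-toast
puts recipe_document("pancake")
```

So the two recipes in the scenario should end up at `id-pancake` and `id-french-toast` in the test database.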
The next step is When I search titles for "pancake", which can be defined as:
    When /^I search titles for "(.+)"$/ do |keyword|
      visit("/recipes/search?q=title:#{keyword}")
    end
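As a quick illustration of how the capture group feeds the request, the sketch below applies the same regular expression to the step text by hand. Cucumber does this matching itself; the constant `SEARCH_TITLES_STEP` and the method `search_path_for` are hypothetical names used only here.

```ruby
# The pattern from the step definition; the quoted text is captured as group 1.
SEARCH_TITLES_STEP = /^I search titles for "(.+)"$/

# Hypothetical helper: returns the path the step would visit,
# or nil when the step text does not match the pattern.
def search_path_for(step_text)
  match = SEARCH_TITLES_STEP.match(step_text)
  return nil unless match
  "/recipes/search?q=title:#{match[1]}"
end

puts search_path_for('I search titles for "pancake"')
# /recipes/search?q=title:pancake
```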
The only difference between this step and the already defined When I search for "foo" is the addition of the title field to the query parameter. Attempting to run this query, however, results in a brutal RestClient failure:
    cstrom@jaynestown:~/repos/eee-code$ cucumber features/recipe_search.feature -n \
       -s "Searching titles"
    Feature: Search for recipes

      So that I can find one recipe among many
      As a web user
      I want to be able search recipes
      Scenario: Searching titles
        Given a "pancake" recipe
        And a "french toast" recipe with a "not a pancake" summary
        And a 0.25 second wait to allow the search index to be updated
        When I search titles for "pancake"
          HTTP status code 400 (RestClient::RequestFailed)
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:144:in `process_result'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:106:in `transmit'
          /usr/lib/ruby/1.8/net/http.rb:543:in `start'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:103:in `transmit'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:36:in `execute_inner'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:28:in `execute'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient/request.rb:12:in `execute'
          /home/cstrom/.gem/ruby/1.8/gems/rest-client-0.9.2/lib/restclient.rb:57:in `get'
          ./features/support/../../eee.rb:20:in `GET /recipes/search'
          /home/cstrom/.gem/ruby/1.8/gems/sinatra-0.9.1.1/lib/sinatra/base.rb:696:in `call'
          ...
          (continues for quite a while)
RestClient errors warrant a peek in the CouchDB log, where I find:
[info] [<0.3573.3>] 127.0.0.1 - - 'GET' /eee-test/_fti?q=all:title:pancake 400
We are getting an HTTP 400 / Bad Request response because the search itself is invalid. Lucene does fielded searches by prepending the field name to the search term, separated by a colon. Similar to how Google does it (e.g. "site:eeecooks.com spinach"), a Lucene search for a recipe with the word "pancake" in the title would be written as "title:pancake". It makes no sense to smush two fields together as we have in "all:title:pancake". Hence the 400 response.
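
The log line suggests that the search action was prefixing every query with "all:" before handing it to couchdb-lucene. A minimal sketch of that behavior, assuming the URL is built by plain string interpolation (the helper name `naive_fti_url` is hypothetical; the database URL is the local test database):

```ruby
DB = "http://localhost:5984/eee-test"

# Hypothetical helper: always search the "all" field,
# regardless of what the query already contains.
def naive_fti_url(q)
  "#{DB}/_fti?q=all:#{q}"
end

puts naive_fti_url("pancake")        # http://localhost:5984/eee-test/_fti?q=all:pancake
puts naive_fti_url("title:pancake")  # http://localhost:5984/eee-test/_fti?q=all:title:pancake
```

The second URL carries two field prefixes, which matches the request recorded in the log.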

It is probably a good thing that an invalid search returns a Bad Request (400) HTTP response code as opposed to some other code. Still, I should investigate a bit more later, so I make a note for myself in the form of a step-less scenario:
    Scenario: Invalid search parameters
Getting back to the current failing step, it is time to move inside the feature.

A second example for "/recipes/search" will describe the new, desired behavior:
    it "should not include the \"all\" field when performing fielded searches" do
      RestClient.should_receive(:get).
        with("#{@@db}/_fti?q=title:eggs").
        and_return('{"total_rows":1,"rows":[]}')

      get "/recipes/search?q=title:eggs"
    end
The original example is only slightly different, defaulting to the "all" field that we are using to index entire documents:
    it "should retrieve search results from couchdb-lucene" do
      RestClient.should_receive(:get).
        with("#{@@db}/_fti?q=all:eggs").
        and_return('{"total_rows":1,"rows":[]}')

      get "/recipes/search?q=eggs"
    end
The first time I run the spec, the new example fails:
    cstrom@jaynestown:~/repos/eee-code$ spec ./spec/eee_spec.rb
    ....F

    1)
    Spec::Mocks::MockExpectationError in 'eee GET /recipes/search should not include the "all" field when performing fielded searches'
    RestClient expected :get with ("http://localhost:5984/eee-test/_fti?q=title:eggs") but received it with ("http://localhost:5984/eee-test/_fti?q=all:title:eggs")
    ./eee.rb:20:in `GET /recipes/search'
    /home/cstrom/.gem/ruby/1.8/gems/sinatra-0.9.1.1/lib/sinatra/base.rb:696:in `call'
    ...
The easiest way to fix the error is to remove the double fields:
    get '/recipes/search' do
      query = "all:#{params[:q]}".sub(/(\w+):(\w+):/, "\\2:")
      data = RestClient.get "#{@@db}/_fti?q=#{query}"
      @results = JSON.parse(data)

      haml :search
    end
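To see what the substitution does in isolation, here is a small sketch with the same expression pulled into a method (`lucene_query` is a hypothetical name, not something in eee.rb). The regular expression only fires when two `word:` prefixes sit back to back, in which case the first one is dropped.

```ruby
# Hypothetical wrapper around the expression used in the search action.
# Prefix the query with "all:", then collapse "all:field:" down to "field:".
def lucene_query(q)
  "all:#{q}".sub(/(\w+):(\w+):/, "\\2:")
end

puts lucene_query("eggs")        # all:eggs
puts lucene_query("title:eggs")  # title:eggs
```

These are the two query strings that the examples in the spec expect RestClient to receive.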
Now the specification passes:
    cstrom@jaynestown:~/repos/eee-code$ spec ./spec/eee_spec.rb
    .....

    Finished in 0.079155 seconds

    5 examples, 0 failures
With the inside, detailed specification passing, I try the outside specification and it works:
      So that I can find one recipe among many
      As a web user
      I want to be able search recipes
      Scenario: Searching titles
        Given a "pancake" recipe
        And a "french toast" recipe with a "not a pancake" summary
        And a 0.25 second wait to allow the search index to be updated
        When I search titles for "pancake"
        Then I should see the "pancake" recipe in the search results
        And I should not see the "french toast" recipe in the search results

    1 scenario
    6 steps passed
I have some reservations about this particular simplest solution. The edge cases of parsing search queries are many. I will worry about that another day. Maybe even tomorrow.
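
A few probes hint at where the substitution runs out of road. This is only a sketch: `lucene_query` is a hypothetical wrapper around the expression from the search action, and how couchdb-lucene responds to each result is not verified here.

```ruby
# Hypothetical wrapper around the expression used in the search action.
def lucene_query(q)
  "all:#{q}".sub(/(\w+):(\w+):/, "\\2:")
end

# Print each probe alongside the query string it would produce.
["", "title:eggs:benedict", "10:30"].each do |q|
  puts "#{q.inspect} => #{lucene_query(q).inspect}"
end
# "" => "all:"
# "title:eggs:benedict" => "title:eggs:benedict"
# "10:30" => "10:30"
```

An empty query leaves a field name with no term, a term that itself contains a colon still ends up with two separators, and a time-like query such as "10:30" loses its "all" prefix, so it would presumably be treated as a fielded search rather than a search of the whole document.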
(commit)
