dee

Merge lp://qastaging/~kamstrup/dee/tree-index into lp://qastaging/dee

Proposed by Mikkel Kamstrup Erlandsen
Status: Merged
Approved by: Michal Hruby
Approved revision: 343
Merged at revision: 309
Proposed branch: lp://qastaging/~kamstrup/dee/tree-index
Merge into: lp://qastaging/dee
Diff against target: 4389 lines (+2926/-603)
32 files modified
bindings/python/Dee.py (+2/-2)
dee/Makefile.am (+8/-2)
dee/dee-analyzer.c (+504/-0)
dee/dee-analyzer.h (+174/-0)
dee/dee-glist-result-set.c (+4/-3)
dee/dee-glist-result-set.h (+3/-3)
dee/dee-hash-index.c (+30/-17)
dee/dee-hash-index.h (+3/-2)
dee/dee-index.c (+61/-19)
dee/dee-index.h (+6/-47)
dee/dee-model-reader.c (+157/-229)
dee/dee-model-reader.h (+60/-11)
dee/dee-term-list.c (+130/-27)
dee/dee-term-list.h (+8/-4)
dee/dee-text-analyzer.c (+192/-0)
dee/dee-text-analyzer.h (+90/-0)
dee/dee-tree-index.c (+713/-0)
dee/dee-tree-index.h (+83/-0)
dee/dee.h (+4/-1)
doc/reference/dee-1.0/dee-1.0-docs.sgml (+4/-1)
doc/reference/dee-1.0/dee-1.0.types (+3/-0)
tests/Makefile.am (+18/-3)
tests/test-analyzer.c (+137/-0)
tests/test-dee.c (+4/-2)
tests/test-glist-result-set.c (+3/-3)
tests/test-index.c (+346/-112)
tests/test-model-readers.c (+49/-83)
tests/test-python.py (+31/-0)
tests/test-term-list.c (+54/-0)
vapi/Dee-0.5-custom.vala (+2/-12)
vapi/Dee-0.5.metadata (+0/-1)
vapi/dee-1.0.vapi (+43/-19)
To merge this branch: bzr merge lp://qastaging/~kamstrup/dee/tree-index
Reviewer Review Type Date Requested Status
Michal Hruby (community) Approve
Review via email: mp+83141@code.qastaging.launchpad.net

Description of the change

WARNING API+ABI break: Big overhaul of the whole text analysis, DeeAnalyzer, and DeeIndex framework.

Primary intent is to implement a DeeTreeIndex that can do O(logN) prefix searching.

To post a comment you must log in.
Revision history for this message
Michal Hruby (mhr3) wrote :

Wow, this is huge...

There are (closure) annotations missing for DeeModelReaderFunc, DeeCollatorFunc, and all the *_reader helpers.

dee_term_list_clone_real: is trying to be insanely quick, doing memcpy of the ptrarr->data fields could squeeze up some extra performance.

find_term_real: shouldn't it use g_sequence_lookup when doing MATCH_EXACT?

2573 +//fixme fixme - is there something to fix?

Revision history for this message
Michal Hruby (mhr3) wrote :

Oh yea, and regenerate the Vala bindings please.

Revision history for this message
Michal Hruby (mhr3) :
review: Needs Fixing
Revision history for this message
Michal Hruby (mhr3) wrote :

Also please add a test that creates TreeIndex on a model that already has a few rows. Hint: it'll crash. ;)

Revision history for this message
Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Just pushed fixes for everything except the last crasher you pointed out. I added a failing unit test as well for the latter, but will wait until tomorrow to actually fix it because it'll require some work (the root cause is described in //FIXMEs).

Revision history for this message
Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Just pushed a fix for the last issue. Please review again.

Revision history for this message
Michal Hruby (mhr3) wrote :

I can't say I like the pointer property, and moreover it's copying the delegate which is *really* dangerous if the original was also used. That being said, this allows easy use of the existing readers, so it shouldn't be that bad. But please, make the "reader" property write-only and add one more ModelReader initializer that will have the (ModelReaderFunc, gpointer, GDestroyNotify) parameters.

As I mentioned on IRC, my test still crashes with SIGFPE, and after some digging, it seems it has something to do with the exact data I chose, so adding that here (the clear call crashes):

void lookup_rows (Dee.Index index, string term)
{
  var rs = index.lookup (term, Dee.TermMatchFlag.PREFIX);
  var m = rs.get_model ();
  foreach (var iter in rs)
  {
    debug ("resultset contains iter: %p", iter);
    debug (" data: %d, %s", m.get_int32 (iter, 0), m.get_string (iter, 1));
  }
}

void main ()
{
  var m = new Dee.SequenceModel ();
  m.set_schema ("i", "s");

  var reader = Dee.ModelReader.new_for_string_column (1);

  var i = new Dee.TreeIndex (m, new Dee.TextAnalyzer (), reader);
  m.append (44, "file:///local_uri");
  m.append (23, "http://foo.com");
  m.append (11, "ftp://bar.org");
  m.append (07, "file:///home/username/file@home");

  lookup_rows (i, "user");

  m.clear ();
}

Revision history for this message
Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Thanks for spotting this. Finally nailed it and pushed the fixes. See
the commit comments. Please review again.

Revision history for this message
Michal Hruby (mhr3) wrote :

There are just minor things: ModelReader shouldn't be marked as Compact class (+vapi regen), dee_model_reader_new annotations should have (scope notified) for the first param, the third should be just (allow-none).

Other than that I think we're good. Awesome job! :)

review: Approve
Revision history for this message
Michal Hruby (mhr3) wrote :

+ unitialized -> initialized :)

Revision history for this message
Michal Hruby (mhr3) wrote :

> + unitialized -> initialized

Actually, "uninitialized" :)

Revision history for this message
Unity Merger (unity-merger) wrote :

The Jenkins job https://jenkins.qa.ubuntu.com/job/automerge-dee/1/console reported an error when processing this lp:~kamstrup/dee/tree-index branch.
Not merging it.

343. By Mikkel Kamstrup Erlandsen

Fix test /Index/ResultSet/GList/Test3 by making dee_glist_result_set_new() a public symbol - although not part of the contractually "stable API". You'll need to include non-distributed headers to access it. The problem with the tests was that we compiled DeeGListResultSet into the test executable making it register the DeeGListResultSet GType twice which causes a g_error.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches