[Egothor-tech] Question on merging barrels

shef shef31 at yahoo.com
Thu Aug 3 22:10:20 BST 2006



--- Leo Galambos <leo.galambos at egothor.org> wrote:

> shef wrote:
> > Thanks. I'm guessing that the algorithm is
> implemented
> > in the optimize() method, or maybe the add()
> method,
> > but I can't make heads or tails of it. Can you
> > describe  it in a few words?
> >   
> 
> The algorithm tries to point the optimization step
> to barrels which
> contain most outdated values. This strategy gives
> you: (I do not
> remember my latest results, so read the numbers
> [+/-] several percents)
> 1) an index might be up to 20% larger than fully
> optimized index
> 2) searching is about 20-30% faster (comparing to
> fully optimized
> index), but you pay with (up to) 10% higher peeks on
> a disk bus
> 3) updating: number of I/O operations is
> proportional to size of the update
> 
> As a side effect, the algorithm uses less barrels
> (aka index parts or
> segments) and it also implies less file handles.
> 
> The basic algorithm is described here:
> http://www.enformatika.org/ijcs/v1/v1-2-20.pdf

Thanks. I'm still having trouble understanding the
algorithm, though. I don't follow the notation in Def
1 and Algorithm 1.

As far as I can tell, the size of the barrels is
targeted to be some power of 2, and that when a
certain number of documents is deleted from a barrel,
this triggers a merge. (Is it 1/8 of the docs in a
barrel? 2^3?)

And then you somehow select a subset of the barrels
for merging. But I could be wrong.

Can you describe the algorithm in words?



__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


More information about the Egothor-tech mailing list