[Egothor-tech] Few problems with Egothor 1.2.6

Leo Galambos Leo.Galambos at egothor.org
Wed Jan 19 11:30:39 GMT 2005


Filip Koczorowski wrote:

> First and foremost, I've been unable to search for a couple of words from
> my index. The index was created from 12 pages about the competitor of
> Egothor - Nutch :)  To put it straight, the word "nutch" couldn't be found
> using Egothor. It is odd as it occurs a number of times on every page. I
> also found that a few other words were missing during search. So my
> problem is - what is happening? What am I doing wrong?
>

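Words that carry little information are excluded by default. Here that
means words with a low inverse document frequency, i.e. words that occur
in most documents of the index. A word such as "nutch" that occurs on
every page falls into that group. For more details see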
http://www.egothor.org/pipermail/egothor-tech/2004-April/001023.html
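
To illustrate the idea (this is only a minimal sketch, not Egothor's
actual code; the function name, the whitespace tokenizer, the log(N/df)
formula and the 0.1 threshold are all assumptions made for the example),
a filter based on inverse document frequency could look like this:

```python
import math
from collections import Counter


def low_idf_terms(docs, min_idf=0.1):
    """Return the terms whose inverse document frequency is below min_idf.

    Hypothetical helper: IDF is computed as log(N / df), where N is the
    number of documents and df the number of documents containing the term.
    """
    n_docs = len(docs)
    df = Counter()
    for text in docs:
        # Count each term once per document (document frequency).
        df.update(set(text.lower().split()))
    return {term for term, d in df.items() if math.log(n_docs / d) < min_idf}


# Twelve toy pages that all mention "nutch" and "crawler".
pages = [f"nutch crawler page{i}" for i in range(12)]
excluded = low_idf_terms(pages)
```

With 12 pages that all contain "nutch", its IDF is log(12/12) = 0, so the
word ends up in the excluded set, while a word that appears on a single
page does not.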

> The second problem is with the links' rank mechanism. I seem to get nearly
> meaningless numbers - especially when direct phrase search is used. On the
> other hand, I sometimes get numbers like 2000 or more for common words.
> Again, please help - is there a way to get some valuable information using
> link ranks?
>

I am sorry, but I cannot say anything without seeing the queries and
the results.

> I've also got one more question - is Capek (the crawler) capable of
> incremental crawling (Michelangelo indexes incrementally)?
>

Yes, Capek supports "incremental crawling".

Cheers,
Leo

-- 
Leo Galambos
Faculty of Mathematics and Physics, DSE
Malostranske namesti 25
Prague 1
CZE

http://kocour.ms.mff.cuni.cz/~galambos/



