[Egothor-tech] keeping index uptodate

HM hm at hmLyons.com
Thu Nov 18 22:40:31 GMT 2004


Hello list,

I'm evaluating EgoThor for use at our company. I've been reading the docs and I'm trying to
figure out what the recommended approach is for maintaining an up-to-date index of a website.

It seems pretty straightforward to run Capek as a daemon:

java org.egothor.robot.Capek -daemon [your URLs to crawl]

I assume this means that it will crawl the entire site and then, when it's done, start over again.
I suppose this could be combined with the egothor.server.pause argument so that Capek crawls
the site slowly and continuously.

And for indexing, Michelangelo could be run at a fixed interval, say every 48 hours.
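
To make the interval idea concrete, here is a rough sketch of a small Java wrapper that
re-runs an external command every 48 hours. It is only an illustration: the class name
PeriodicReindex and its methods are made up for this example, and the actual indexer
command line is not shown here, so it has to be passed in as arguments. A cron job would
do the same thing.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Egothor.
class PeriodicReindex {

    // Runs an external command once, waits for it to finish, and returns its exit code.
    static int runOnce(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    // Runs the task immediately, then again one `period` after each run finishes.
    static ScheduledExecutorService every(Runnable task, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log and carry on.
                System.err.println("run failed: " + e);
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: java PeriodicReindex <indexer command and arguments>");
            return;
        }
        // The indexer command is an assumption supplied by the caller.
        List<String> command = Arrays.asList(args);
        every(() -> {
            try {
                int exitCode = runOnce(command);
                System.out.println("indexer finished with exit code " + exitCode);
            } catch (IOException e) {
                System.err.println("could not start indexer: " + e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 48, TimeUnit.HOURS);
    }
}
```

It would be started as "java PeriodicReindex" followed by whatever the real indexer command
turns out to be. The fixed delay is measured from the end of one run to the start of the
next, so a slow indexing run never overlaps with the following one.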

Am I understanding everything correctly? Is this the right approach for keeping an up-to-date
index of a website?

Thanks,
HM

