[Egothor-tech] Query results page improvement
Leo Galambos
leo.galambos at mff.cuni.cz
Wed Oct 5 13:09:15 BST 2005
Filip Koczorowski wrote:
| Is it possible for Egothor to display a fragment of an indexed web
| page in results, like Google does? As it is now, there are only a
| few tokens (words) surrounding the query string in the results. For
| example, a search on "egothor" at
| http://www.egothor.dundee.ac.uk/egothor/ results in: "students
| search centre egothor a java based search engine egothor dundee the
| 4 ..." instead of "Egothor A java based search engine Egothor @
| Dundee The 4th J of Dundee, Jam, Jute, Journalism and Java." which
| makes the results page less readable.
|
Hi,
This is not possible with 1.3.x (unless you write the code yourself).
On the other hand, 2.x comes with clustering support, which will also
offer much better snippets than today. I guess this code could be
ported to 1.3 as well.
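
For anyone who wants to write the code for 1.3.x in the meantime, here is a minimal sketch of a query-biased snippet. It assumes you still have the original page text (not just the lowercased tokens) available at result time. The class and method names are hypothetical and are not part of the Egothor API.

```java
/** Hypothetical helper, not part of Egothor: cuts a readable fragment around a query term. */
final class SnippetSketch {

    /**
     * Returns a fragment of the original text around the first case-insensitive
     * occurrence of term, keeping the original case and punctuation.
     * context is the approximate number of characters kept on each side of the hit.
     */
    static String snippet(String text, String term, int context) {
        // Case-insensitive search on the original text, so offsets stay valid.
        int hit = -1;
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                hit = i;
                break;
            }
        }
        if (hit < 0) {
            // No hit: fall back to the leading part of the text.
            int limit = Math.min(text.length(), 2 * context);
            return text.substring(0, limit) + (limit < text.length() ? " ..." : "");
        }

        int start = Math.max(0, hit - context);
        int end = Math.min(text.length(), hit + term.length() + context);

        // Widen both ends to whitespace so that words are not cut in half.
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }

        String body = text.substring(start, end).trim();
        return (start > 0 ? "... " : "") + body + (end < text.length() ? " ..." : "");
    }
}
```

Calling `SnippetSketch.snippet(pageText, "egothor", 60)` would return a fragment with the original capitalisation and punctuation instead of a run of bare tokens. It only looks at the first hit and a single term; ranking several candidate fragments is left out.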
| One more thing concerning query results - Egothor generates a "page
| summary" for each result. It happens to be the very first part of
| text on the page ("access keys | text only SOMiS Court Senate RAE
| Hermes All Dundee SOMiS Homepage Privacy Legal Intranet Departments
| and Offices hosted..." in the earlier example). It is not very
| useful when indexing a group of pages with static content as the
| first paragraph, e.g. a portal where each page has a navigational
| menu first. Any hints on how to make better use of "page
| summary"?
If you can use HTdig's tag "nohtdig" (see their documentation), then
you can specify what part of the page is processed and what part is
skipped over.
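
If you cannot rely on such a tag, the same idea can be sketched as a preprocessing step: drop everything between a pair of markers before the page is summarised. This is only an illustration under stated assumptions. The marker strings are passed in as parameters, the ones in the usage note are made up, and the class is not part of Egothor or ht://Dig; check the ht://Dig documentation for the real tag syntax.

```java
/** Hypothetical preprocessing helper, not part of Egothor or ht://Dig. */
final class SkipRegionSketch {

    /**
     * Returns html with every region between startMarker and endMarker removed
     * (markers included). An unterminated region is skipped to the end of input.
     */
    static String stripSkipped(String html, String startMarker, String endMarker) {
        if (startMarker.isEmpty() || endMarker.isEmpty()) {
            throw new IllegalArgumentException("markers must not be empty");
        }
        StringBuilder kept = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = html.indexOf(startMarker, pos);
            if (start < 0) {
                // No further skipped region: keep the rest of the page.
                kept.append(html, pos, html.length());
                break;
            }
            // Keep the text in front of the skipped region.
            kept.append(html, pos, start);
            int end = html.indexOf(endMarker, start + startMarker.length());
            if (end < 0) {
                // Start marker without an end marker: drop everything after it.
                break;
            }
            pos = end + endMarker.length();
        }
        return kept.toString();
    }
}
```

With made-up markers, `SkipRegionSketch.stripSkipped(page, "<!--skip-->", "<!--/skip-->")` would remove a navigation menu wrapped in those comments, so the summary starts at the real content.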
Cheers,
Leo
--
Leo Galambos
Faculty of Mathematics and Physics, DSE
Malostranske namesti 25
Prague 1
CZE
http://kocour.ms.mff.cuni.cz/~galambos/