-----BEGIN PGP SIGNED MESSAGE-----
The shell script wgetrel intelligently traverses the Internet
searching for web pages by determining the relevance of HTML documents
to search criteria. The criteria are specified with boolean
operators; the supported operators are logical or, logical and, and
logical not. These operators are represented by the symbols "|",
"&", and "!", respectively, and left and right parentheses, "(" and
")", are used as grouping operators. The relevance of the documents is
determined by the program htmlrel, which is a modification to the
program rel(1). The relevance is determined by comparing the phonetic
representation of the keywords with the phonetic representation of
every word in a document. Source modifications are provided in the
wgetrel distribution. Also required is Hrvoje Niksic's excellent
program, wget(1). The shell script, wgetrel, can reduce the effort of
web searching by about an order of magnitude.
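As an illustration only, the boolean criteria and phonetic comparison
described above might be sketched in Python as follows. This is not the
htmlrel source: the announcement does not name the phonetic algorithm,
so classic Soundex is assumed here, and the names soundex() and
matches() are hypothetical. The sketch also returns a plain yes/no
answer, whereas htmlrel determines a relevance for each document.

```python
import re

# Soundex digit for each consonant group (assumed phonetic scheme).
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code of a word."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    out = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":      # h and w do not separate equal codes
            prev = code
    return (out + "000")[:4]


def matches(query, text):
    """Evaluate an infix query using |, &, ! and parentheses against text."""
    doc_codes = {soundex(w) for w in re.findall(r"[A-Za-z]+", text)}
    tokens = re.findall(r"[A-Za-z]+|[|&!()]", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():             # lowest precedence
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()   # always parse the right side
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():            # highest precedence, plus grouping
        if peek() == "!":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            value = parse_or()
            if take() != ")":
                raise ValueError("missing ')'")
            return value
        tok = take()
        if tok is None or not tok.isalpha():
            raise ValueError("expected a keyword")
        return soundex(tok) in doc_codes

    result = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: " + peek())
    return result
```

With this sketch, a query such as "retrieval & !robot" still matches a
document that contains the misspelling "retreival", because both
spellings share the same Soundex code.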
John
Title: wgetrels
Version: 1.0
Entered-date: February, 1998
Description: Source modifications to the rels(1) program sources to
make a variant of the program, htmlrels(1), that
determines the relevance of HTML text documents to a
set of keywords expressed in boolean infix
notation. The relevance is determined by comparing the
phonetic representation of the keywords with the
phonetic representation of every word in a document.
(Phonetic searching has some degree of tolerance to
misspelled words.) The output file syntax is Netscape
level 1 bookmark compatible. The shell script,
wgetrels(1), executes the programs htmlrels(1) and
Hrvoje Niksic's excellent wget(1) to form an
intelligent Internet search engine. (The program
sources to wget(1) are available via anonymous ftp
from ftp://prep.ai.mit.edu/pub/gnu/wget.tar.gz. The
program sources to rels(1) are available via anonymous
ftp from
ftp://sunsite.unc.edu/pub/Linux/utils/text/rels.tar.gz.)
Installation requires virgin sources of the rels(1)
program; there is a shar file in the wgetrels
distribution that installs the modifications in the
rels program source directory. The program wgetrels(1)
controls search direction across the Internet through
determination of the relevance of the documents to
the search criteria.
Keywords: www robot infobot bot information retrieval Internet search phonetic
Author: john _at_ johncon.com (John Conover)
Maintained-by: john _at_ johncon.com (John Conover)
Primary-site: sunsite.unc.edu /pub/Linux/utils/text/wgetrels.tar.gz
Alternate-site:
Original-site: johncon.com
Platform: Linux, USG, BSD
Copying-policy: No limitations for non-commercial use
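The Description above says that wgetrels(1) controls search direction
through the relevance of each document. One way to picture that idea,
as a rough Python sketch rather than the script's actual logic, is a
traversal that follows links only from pages judged relevant. The
function crawl(), the in-memory pages table (standing in for fetching
with wget(1)), and the is_relevant callback are all hypothetical.

```python
from collections import deque


def crawl(start, pages, is_relevant, max_pages=100):
    """Breadth-first walk that expands links only from relevant pages.

    pages maps a URL to (text, [linked URLs]); it is a hypothetical
    stand-in for retrieving documents over the network.
    """
    seen = {start}
    queue = deque([start])
    hits = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        if url not in pages:
            continue                    # unreachable page, skip it
        fetched += 1
        text, links = pages[url]
        if not is_relevant(text):
            continue                    # prune: do not follow its links
        hits.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits
```

Pruning at irrelevant pages is the design choice being illustrated:
branches that drift off topic are abandoned early, so far fewer pages
need to be retrieved than in an exhaustive walk.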
- --
John Conover, 631 Lamont Ct., Campbell, CA., 95008, USA.
VOX 408.370.2688, FAX 408.379.9602
john _at_ johncon.com
- --
This article has been digitally signed by the moderator, using PGP.
http://www.iki.fi/mjr/cola-public-key.asc has PGP key for validating signature.
Send submissions for comp.os.linux.announce to: linux-announce _at_ news.ornl.gov
PLEASE remember a short description of the software and the LOCATION.
This group is archived at http://www.iki.fi/liw/linux/cola.html
-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: latin1
iQCVAgUBNPVph1rUI/eHXJZ5AQHeMgP+OojUNB2anQM/rlcS+uGuLqk/PfKjGW/h
bmQMO8GPw6gUpFIlfXHcrLmvcHryg7BK+efr9rJUJF4LWvnmHf1b5Jzx1SyMBU6z
//7omoeqt+HF4h9teGWB0wGMnRNI6CDeFCPoKyaQEWm2jksH7jdj6GPwa20MC3lf
p7OUOPSfhDE=
=nmNj
-----END PGP SIGNATURE-----