Mon Nov 29 19:19:49 1999  Loic Dachary  <loic@ceic.com>

	* webbase-5.4 release

	* check/*: find in $srcdir
	
	* crawler/Makefile.am: added html_parser.h

	* Makefile.am: added .version

	* check/index_test (samples): indexed text is unaccented

	* tools/isomap.c (unaccent): added string_length argument

Fri Nov 26 10:59:21 1999  Loic Dachary  <loic@ceic.com>

	* webbase-5.3 release

	* check/*: include apache detection, autodetect modules

Thu Nov 25 19:35:00 1999  Loic Dachary  <loic@ceic.com>

	* crawler/html_*: complete rewrite of the html parser

	* hooks/*: isolate hooks in separate library

	* check/*: more tests for html parser

Wed Nov 10 17:18:20 1999  Quiedeville Rodolphe  <rodo@banquise.ceic.com>

	* man/crawler.1: -create option is exclusive, no other option accepted.

Tue Nov 02 16:21:45 1999  Loic Dachary  <loic@ceic.com>

	* bin/furi2md5.c: convert FURI to FURI_MD5 (see uri(3))

Fri Oct 29 15:39:17 1999  Loic Dachary  <loic@ceic.com>

	* crawler/robots.c (robots_load_1): netloc is now a unique key, added rowid
	  to get a unique identifier per server. Handle the race condition when
	  two processes try to insert the same robots entry.
	
Fri Oct 29 11:13:46 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webbase_url.c (webbase_url_start_ok): only canonical
	  and absolute urls are valid starting points.

Thu Oct 28 16:39:42 1999  Loic Dachary  <loic@ceic.com>

	* crawler/crawl.c (mirror_schedule): if delay <= 0, default to 1 week.

Thu Oct 28 15:27:21 1999  Loic Dachary  <loic@ceic.com>

	* bin/consistentc.c (fix_keys): implement -keys_url, -keys_md5, -keys_normalize

Thu Oct 28 09:23:41 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webbase.c (webbase_unlock): uses md5 key instead of long
	  ascii names.

Wed Oct 27 16:31:54 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webbase.c (webbase_insert_url): fix big problem:
	  realloc(&p, &s, s + value) changed to realloc(&p, &s, value).

Tue Oct 26 18:48:00 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webtools.c: implement -webtools_limit to limit the maximum size
	  of a document.

Tue Oct 26 16:01:22 1999  Loic Dachary  <loic@ceic.com>

	* bin/consistentc.c: consistentc -key canonicalizes urls

Fri Oct 22 13:58:03 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webbase.c: use mysql_real_connect instead of deprecated
	  mysql_connect.

	* crawler/webbase.c: read defaults from ~/.my.cnf if options are missing

	* crawler/webbase.c: do not try to connect twice

	* crawler/webbase_create.c: add bz2 and wdz extensions to unknown mime
	  type

	* bin/crawler.c (init): added -schema to print default database schema

Thu Oct 21 18:57:34 1999  Loic Dachary  <loic@ceic.com>

	* port to freebsd-3.3
	
	* crawler/crawl.c,webtools.c: conditionally use ETIME, prefer ETIMEDOUT

Thu Oct 21 18:19:14 1999  Loic Dachary  <loic@ceic.com>

	* check/index_test: created

Tue Oct 19 19:00:14 1999  Loic Dachary  <loic@ceic.com>

	* crawler/hook_mifluz.cc: initial version

	* configure.in: --with-mifluz implementation

Mon Oct 18 17:55:46 1999  Loic Dachary  <loic@ceic.com>

	* test/webbase_test: feed url_md5 + call consistentc -key
	  when manually inserting urls in start.

Fri Oct 15 10:31:34 1999  Loic Dachary  <loic@ceic.com>

	* crawler/webbase_url.c: add webbase_url_free and call it
	  on context.webbase_url objects.

	* crawler/webbase.c: add webbase_start_free and call it
	  on start objects.

Thu Oct 14 17:24:52 1999  Loic Dachary  <loic@ceic.com>

	* DEBUGGING: create

	* tool/getopt*: Upgraded

	* fix various warnings reported by purify.

	* crawler/webbase*.c: fix memory leak: do not reset
	  w_*_length to 0 in *_reset.

	* added .cvsignore everywhere

Thu Oct 14 11:04:41 1999  Loic Dachary  <loic@ceic.com>

	* bin/consistentc: added -keys that rebuilds all the url_md5 keys in start and
	  url tables.

Wed Oct 13 09:36:23 1999  Loic Dachary  <loic@ceic.com>

	* crawler: add url_md5 field in start and url tables. Modify all
	  sources to fill and use this field instead of url.

	* tools/md5str.[ch]: create

	* configure.in: cleanup, add link to mifluz

1999-07-30  Bertrand Demiddelaer <bert@ceic.com>

	* crawler/webtools.c (webtools_open_1): timeout for connect() added

Mon Jul 19 14:33:25 1999  Loic Dachary  <loic@ceic.com>

	* webbase-5.2 release

1999-07-17  Loic Dachary  <loic@ceic.com>

	* crawler/webbase.c (webbase_alloc): break if connection successful

	* crawler/dirsel.c (hnode_free): strdup key to prevent unexpected
	  deallocation

	* check: test suite

1999-07-15  Loic Dachary  <loic@ceic.com>

	* tools/dirname.[ch]: rename to urldirname to prevent conflict

1999-07-13  Loic DACHARY  <loic@home.ceic.com>

	* webbase-5.1 release

1999-07-09  Loic Dachary  <loic@ceic.com>

	* Initial import
