ANET.at HomepageSearchEngine History
====================================

1. Chronology of version releases
---------------------------------

Date        Version
__________  ____________
2006-12-12  3.62 beta6+
2006-10-31  3.61
2005-04-14  3.6
2004-09-28  3.53+
2004-01-20  3.53
2004-01-11  3.52
2003-12-15  3.51
2003-10-24  3.5
2003-06-10  3.42
2003-03-27  3.41
2003-01-30  3.4+
2002-09-24  3.4
2002-06-03  3.37
2002-01-30  3.36
2001-12-06  3.35
2001-11-04  3.34
2001-10-23  3.33
2001-07-04  3.32
2001-05-27  3.31
2001-03-29  3.3
2001-02-01  3.21
2000-12-14  3.2
2000-10-27  3.1
2000-09-03  3.03
2000-08-08  3.02
2000-07-17  3.01
2000-06-18  3.0
2000-03-05  2.07
2000-01-31  2.06
2000-01-18  2.05
2000-01-10  2.04a
1999-12-27  2.04
1999-10-25  2.03
1999-10-04  2.02
1999-09-27  2.01
1999-09-14  2.0
1999-09-06  1.01
1999-08-02  1.0

2. History of version changes ("change log")
--------------------------------------------

____________________________________________________________________________

v3.62 beta6+ (December 12, 2006) (What's new)

+ support for searching PDF files (indexed search method only).
+ new shell command "pdfconvert" which converts supported PDF files into plain text format and determines unsupported PDF files.
+ improvements to the "spider" shell command:
  + new option '-pdf2txt', which adds *.pdf PDF URLs to the URL-list if corresponding *.pdf.txt URLs to text files (as created by the "pdfconvert" command) exist.
  + properly handle the "noindex" directive of the robots meta-tag. The manual now contains a table of all possible index/follow actions and their corresponding robots meta-tag directives.
  + the '-querystrings' option accepts arguments to specify only those variable names, optionally with (wildcarded) values, that should be kept (and re-ordered) within query strings.
  + if a URL to be verified contains a query string, it will first be requested by HEAD, instead of making a full GET request.
    A GET request will follow only if the URL has been successfully verified to be 'text/html' content without a restricting robots meta-tag.
  + if a HEAD request reveals that a URL is redirected (status code 302) and the returned Content-Type is not text/html, a GET request follows to determine the correct Content-Type.
  + Last-Modified header information will now always be printed once the URL has passed verification.
+ if the URL-list file contains *.pdf URLs, the "geturls" shell command automatically gets the corresponding *.pdf.txt URLs, in order to support URL-lists created by the "spider" command with the '-pdf2txt' option set.
+ the "makelist" shell command automatically adds *.pdf.txt files to the list of Non-HTML files, as long as the corresponding *.pdf files exist or if they have been caught by the "geturls" command.
+ the "index" shell command automatically adds *.pdf.txt files to the list of Non-HTML files, as long as the corresponding *.pdf files exist or if they have been caught by the "geturls" command.
+ both shipped cronjob sample scripts for Unix and Windows have been updated to include PDF support when building the search index.
+ the "Search text of Non-HTML files" checkbox area in the advanced built-in search box of Pro editions now also displays an icon to indicate support of PDF files.
+ a built-in icon to be displayed on the results page for PDF documents is available. The default icon image for a PDF document can be changed via the new "imgsrc_pdfdoc" directive of the hse.ini file.
+ the results details for a PDF document include the PDF version, displayed when mousing over the PDF icon and after the file size.
+ the built-in icon for RTF documents has been changed to clearly stand for RTF.
+ the help window available from the built-in search box now looks nicer and is made of valid XHTML. Its elements can be styled via the HomepageSearchEngine.css style sheet.
+ the shipped HomepageSearchEngine.css Style Sheet has been updated and now includes a definition for the help window's background-color.
+ when a result head should print the title, but there is no title available in the result file, the file's name (and not the file's entire URL) will be printed.
+ ensure that result URLs may only be passed through the "highlightmatches" filter when their Content-Type is either text/html or text/plain. Open all other URLs by redirecting them directly to the document.
+ when a result URL redirects directly to the document, don't do the redirection by sending a "Location:" header. Instead, create a new HTML document that does the redirection. This prevents Internet Explorer from showing a blank page in a new window instead of opening an RTF or PDF file with its associated application
+ add "#page=n" to the direct URLs of PDF files, to open supported PDF files at the specified page number n (currently, n always equals 1).
+ new hse.ini directive "notarget_list" to specify a list of URLs where the "target" directive should be ignored
+ glitch corrected that went into HSE 3.61: query links to global search engines (set via hse.ini's "query-links" directive) contained the entire query string instead of only the actual terms to be queried
+ the distributed packages for Unix platforms are now BZip2 compressed TAR archives (.tbz2 files), rather than GNU-Zip compressed TAR archives (.tgz files), to reduce file sizes. Under Windows, they can be unpacked with WinZip as of version 11.0

____________________________________________________________________________

v3.61 (October 31, 2006)

+ access to certain configuration sets can be disallowed by using a dynamic HTML template in those conf sets, having a "HSE-Disallow-Conf" meta-tag set to "true". Thus, adding this meta-tag to a dynamic template prevents HSE from accessing the conf set that calls that template.
  Both the shipped PHP and SSI enabled sample template files have been updated to include this meta-tag, set to "false" by default (allowing access).
+ when a dynamic HTML template is used, the number of HSE's configuration set (as defined via the "conf" query variable) is available as the "HTTP_HSE_CONF" HTTP Header Environment Variable. Both the shipped PHP and SSI enabled sample template files have been updated to show how to obtain this value. This may be useful to conditionally disallow access to certain conf sets (as explained above), especially within a PHP based user access system.
+ user properties (like the user's IP address and the value of a specified cookie) can be provided to the environment of the dynamic HTML template. This is useful in a cookie based user access system written in PHP.
+ the "geturls" command now works properly with URLs containing Authorization data (such as 'http://username:password@www.somesite.tld/' or - without password - 'http://username@www.somesite.tld/')
+ when the "changeurls" command restores URLs in index files and the URL found in the URL-list file contains Authorization data, the Authorization data will not be copied into the index files
+ when HomepageSearchEngine prints a date whose format is not configurable (as output from the Shell Executable), the format is now always the ISO 8601 compliant international standard date notation (YYYY-MM-DD)
+ support for less used platforms has been discontinued. The platforms Win32, Linux, FreeBSD, Solaris and MacOSX are still supported and will remain so in future releases. This allows us to focus on much more important things rather than wasting time with support of platforms that probably no one needs. If you really need a version for a previously supported platform, contact us.
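
The v3.61 note on "changeurls" says that Authorization data found in a URL-list entry is not copied into the index files. The following is only a rough sketch of that idea in Python, not HSE's actual code: the function name strip_userinfo is hypothetical, and it simply drops everything up to the last "@" in the URL's authority part.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(url: str) -> str:
    """Return url without the 'username[:password]@' part (hypothetical helper)."""
    parts = urlsplit(url)
    # rpartition yields ('', '', netloc) when no '@' is present,
    # so URLs without Authorization data pass through unchanged.
    host_and_port = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=host_and_port))


if __name__ == "__main__":
    for example in (
        "http://username:password@www.somesite.tld/",
        "http://username@www.somesite.tld/",
        "http://www.somesite.tld/",
    ):
        print(strip_userinfo(example))
```

With the two example URLs from the note above, both forms reduce to 'http://www.somesite.tld/', which is the form one would expect to see in the index files.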

____________________________________________________________________________

v3.6 (April 14, 2005)

+ the "hse.ini" configuration file has been reorganized
+ "protection_time" hse.ini directive added to protect the CGI application against overloading.
+ the directives "formtable_width", "formtable_border-color", "formtable_background-color" and "formtable_background-image" have been removed. Instead, the search box's general appearance is now defined by the HomepageSearchEngine.css Style Sheet's ".HSE-searchbox" properties.
+ the "formtable_input-size" directive has been removed. Instead, the style of the search box's input text field is now defined by the Style Sheet's ".HSE-inputtext" properties.
+ where to place the search box is now controlled by the "searchbox_place" directive, allowing it at the top and/or at the bottom, or nowhere.
+ the "formtable_alignment" directive has been replaced by "searchbox_align" which defaults to the value "auto"
+ the "results_details" directive's keyword regarding the icon has been simplified: "icon:custom16x16" (to show it in the size of 16 x 16 pixels) has been removed because its style is defined by the Style Sheet (".HSE-icon" properties). The default keyword is now just "icon", which will show a custom icon if present, or otherwise print a default icon.
+ "imgsrc_webpage", "imgsrc_rtfdoc" and "imgsrc_textfile" directives have been added to specify source URLs of default icon images
+ the directive "results_descriptions" has been renamed to "description"
+ the directives "results_previous_img" and "results_next_img" have been removed. Instead, the source URLs of the "previous" and "next" images within the navigation panel below the results are now specified by the directives "imgsrc_previous" and "imgsrc_next". The style of these images is defined by the Style Sheet's ".HSE-nav-image" properties.
+ the style of the currently displayed results range (within the navigation panel below the results) is now defined by the Style Sheet's ".HSE-current-range" properties.
+ number of possible categories increased to 99
+ Queries file has been updated: 'A9.com' added, less important search engines removed, some URLs corrected
+ the OpenSSL packages for the platforms that support accessing https URLs (Windows 32bit, GNU/Linux, FreeBSD, Sun Solaris and WindRiver BSD/OS) have been updated to the latest OpenSSL version (0.9.7e)

____________________________________________________________________________

v3.53+ (September 28, 2004)

+ when a results URL is shown using the "highlightmatches" feature, the UTF-8 flag is only turned on if the restrictive search options are set to 'matchcase=off' or 'noparts=on' and hse.ini's "utf8" directive is set to "on" and active.
+ If a custom form contains a "uft8" parameter (to override the usual rules of applying the UTF-8 flag), it can be set to "off" (to never apply the UTF-8 flag), to "on" (to always apply it) or to "auto" (to only apply it on 'matchcase=off' or 'noparts=on' searches)
+ URLs with an unsupported Character Encoding that are shown using the "highlightmatches" feature or the "URL Header inspector" fall back to the default Character Encoding
+ fixed: on some rare webservers such as Rapidsite/Apa an Error 500 may have occurred
+ fixed: no results were printed when the "results_global" directive did not contain the "summary" keyword and more results were found than specified with the "max_found_files" directive
+ the '-cat' option is now mandatory for the "spider" and "geturls" commands
+ the "spider" command now gives an error when the directory to be prepared for the "geturls" command is not below the "basepath" directory.
+ HomepageSearchEngine used as spider or HTTP client also recognizes "Last-Modified" headers in uncommon formats such as "Wed Sep 15 08:38:21 2004 GMT" or "Wednesday, 15-Sep-04 08:38:21 GMT".
  The "Last-Modified" value will be printed along with the corresponding time in ISO standardized format, such as "2004-09-15, 08:38:21"
+ the "geturls" command now skips getting a URL when the associated file name to be saved as exceeds the length of 250 characters
+ the style of the icon printed before each result can be customized more effectively (using the Style Sheet's additional ".HSE-icon-div" definition)

____________________________________________________________________________

v3.53 (January 20, 2004)

+ fixed: determination of the Executable's directory works properly on all Windows platforms (there may have been problems with version 3.52 on some Windows systems)
+ fixed: forcing the UTF-8 flag to be on by delivering a "uft8" parameter now also works on search methods other than the indexed search.
+ fixed: the 'WindRiver BSD/OS' package now works on all BSD/OS 4.x versions
+ some Style Sheet improvements for nice dynamic CSS2 features on form elements

____________________________________________________________________________

v3.52 (January 11, 2004)

+ further spider improvements:
  + the directory containing the previously grabbed files will now be cleaned up, so the cronjob script does not need to use system commands for that anymore. This behaviour can be unset by applying the new '-nocleanup' option.
  + already existing files with the same Last-Modified date as the remote URLs will now be provided to be used as cache.
  + the '-noquerystrings' option has been replaced by the '-querystrings' option. The former '-noquerystrings' behaviour is now set by default and can be unset by applying '-querystrings'.
+ the "geturls" command now uses the cache, unless the new '-nocache' option is set.
+ the default Character Encoding is not limited to byte based Encodings (previously called "character sets"), now supporting Unicode Encodings (such as "utf-8") as well. Consequently, the name of the "charset" directive has been changed to "encoding".
+ the default Character Encoding is now checked to be W3C compliant. If the check fails, a message will be printed, containing a list of all supported W3C approved Encoding names
+ the search page's Encoding is now sent directly as HTTP Header instead of as a meta-tag within the HTML content
+ the "URL Header inspector" now also prints the URL's Character Encoding
+ when a results URL is shown using the "highlightmatches" feature, the original Character Encoding is preserved
+ enabling case-insensitive searches on Non-ASCII characters is now done via the new "utf8" directive. It replaces the previously used "locale" directive. If enabled, a so-called "UTF-8 flagged search" takes place.
+ whether a UTF-8 flagged search took place can be identified by mousing over the "Required time:" value on top of the results page (unless displaying of time has been disabled).
+ for optimised speed, searches with default restrictive options ('matchcase=on' and 'noparts=off') automatically turn off the UTF-8 flag, since it is not needed in those cases
+ within the found files (with the highlighted matches), the search will always take place UTF-8 flagged, since the overhead is not significant
+ the UTF-8 flag can always be forced on or off, overriding the usual rules, by manually delivering a "uft8" parameter with the value "on" or "off". This may be useful for testing the performance difference.
+ UTF-8 flagged case-insensitive searches now perform better
+ restoring the search terms into a custom input form via the "fill_form()" JavaScript function (included in 'hse_customform.js') can now be forced to always work properly, also with Character Encodings that previously may have produced garbage on some characters. To do so, deliver a parameter called "encterms". That will add the terms in the "encodeURIComponent" format, as "enc" delivery parameter, to the query string.
+ support for https URLs on the WindRiver BSD/OS platform

____________________________________________________________________________

v3.51 (December 15, 2003)

+ additional license models introduced: "Wildcard Site license" and "Host license" (a global, machine-based license key) - see "license.txt" for details
+ all commands accept different values given to the '-debug' option, to specify different verbose levels
+ spider improvements:
  + supports URLs that require Authorization
  + added '-prerobotsfile[=FILE]' option to add custom robot rules to those defined in the site's /robots.txt file. To be verbose (only) about robot rules related behaviour, you can set the '-debug=robotrules' option.
  + accepts incorrect syntax within the robot rules, when the command doesn't end with ":" (such as "User-agent " instead of "User-agent: ")
  + accepts (incorrect) "\" characters in found links to be "/", acting as directory separator
  + cleans up found links containing "/./" or "/../"
+ the "geturls" command now stores each grabbed file with the Last-Modified timestamp determined from the remote URL
+ lines in the "hse.ini" file can be continued on the following one by ending them with ' \'
+ fixed: "Autocomplete indexing" could not work properly when used without a '-cat' option
+ fixed: in some circumstances, a custom value of the "results_details" directive was not determined properly
+ rewritten, enhanced cronjob scripts ("hse_cronjob.sh" and "hse_cronjob.bat"), containing a well documented example of how to spider some sites
+ Queries file ("hse_queries.ini") is now pre-configured to enable a query to Google in several languages
+ updated documentation, both the Manual's one and within "hse.ini"
+ WindRiver (formerly BSDi) BSD/OS platform is supported

____________________________________________________________________________

v3.5 (October 24, 2003)

+ new libraries bring these advantages:
  + support for https URLs (SSL enabled pages) on FreeBSD, GNU/Linux, Sun Solaris and
    Windows 32bit platforms
  + converting character sets does not require iconv (GNU libiconv / GNU libc) anymore
+ shared object files now reside in the executable's "lib" sub directory, to make things easier to survey
+ shared object files have the "so" extension on all platforms (including Windows and Mac), to make things more uniform
+ no "locale-enabled HSE Executable" is required anymore to solve the "always-case-sensitive" bug affecting Non-ASCII characters. Now, the search string and the searched text are treated directly as encoded in the character set specified by the "charset" directive. To avoid speed decreases on case-insensitive searches, this feature only applies on indexed searches and must be enabled using the "locale" directive (which now behaves differently than previously).
+ "totalmatches" keyword added to the "results_global" directive
+ a new, more flexible directive "query-links" replaces the "engine-links" part of the "results_global" directive
+ global search engines to be queried can be fully customized via a central Queries file ("hse_queries.ini")
+ "rank", "head:title (T)" or "head:description"; "description", "nobr", "print:'...'" and "link:'...'" keywords added to the "results_details" directive
+ the style of the ranking number before each result can be customized via the Style Sheet (using the ".HSE-rank" definition)
+ to get the most relevant results in the first search, the pre-built input form now defaults to "matchcase=on"
+ design of the pre-built input form slightly simplified
+ fixed: under IIS 6 (on Windows 2003), determination of the Executable's directory may not have worked properly
+ fixed: when the highlightmatches feature is turned on to highlight the matches in the result files, newlines within and
...
  tags will be preserved properly (that may have broken some JavaScript functionality)
+ fixed: wildcard symbols at word boundaries have not been treated properly
+ fixed: the result description for a RichTextFormat document may have included some garbage characters
+ the spider now only accepts a /robots.txt file when its MIME type is set properly (to 'text/plain')
+ it is now possible to spider an unlimited number of URLs (by running the "spider" command with the '-max=-1' option)
+ restoring Japanese terms into a custom input form via the "fill_form()" JavaScript function (included in 'hse_customform.js') now also works properly using Apple's Safari browser
+ language files slightly revised
+ language support for Croatian (language code "hr")
+ updated Czech language files (language code "cs") to work properly with the "iso-8859-2" character set
+ discontinued support for the platform BSDi BSD/OS

____________________________________________________________________________

v3.42 (June 10, 2003)

+ File-list files are now sorted by date of last modification, to ensure that the most recently modified files are searched first. The sorting method can be changed by the new '-sort=date|name|none' option of the "makelist" command.
+ Incremental indexing (using the "index" command's new '-part=PART[/TOTALPARTS]' option) allows indexing a large number of files via the web based console, which would probably fail if attempted in one step.
+ "Autocomplete indexing" feature allows easy index creation using the web based Admin Area, with just one click. All required actions will be performed automatically, such as switching to incremental indexing if necessary.
+ the style of the icon printed before each result can be customized via the Style Sheet (using the ".HSE-icon" definition)
+ When the search source for an indexed search is listed (by searching for "list:files"), the day of last modification of each file is printed (in addition to its URL and file size).
  This makes it possible to easily check whether the file-list has been sorted by date of last modification.
+ When the search source for an on-the-fly search is listed (by searching for "list:files"), the files appear in the order the real search takes place. That list can be displayed in alphabetical order instead by selecting "the path name" from the "Show ... hits ... sorted by" drop down menu of the pre-built input form.
+ all commands can be used with the '-debug' option, to enable verbose mode
+ HTML files will not be searched and indexed if they contain a "robots" meta tag with the content "noindex" or "none" (unless it contains "search"). These skipped files can be viewed when the "index" command is used with the '-debug' option.
+ improved Admin Area (with a Session ID based, server-sided User authentication system)

____________________________________________________________________________

v3.41 (March 27, 2003)

+ spider improvements:
  + supports the /robots.txt Robots Exclusion Protocol (with "HomepageSearchEngine" as the robot name)
  + supports "robots" meta tags (content="noindex, nofollow, none")
  + '-noquerystrings' option added to cut query strings from links
  + directory URLs will only be recognized once, regardless of whether they have a trailing slash or not
+ added '-prefix' option for the "changeurls" command to directly set the local start URL to be prefixed
+ to be more flexible, the strings in the "ban_list", "search_always" and "categories_sourceNR" directives are now *not* wildcarded at the beginning and at the end. To do so, the "*" wildcard symbol must be added at the desired positions.
+ ASP (Active Server Page) pages are also recognized with the .aspx extension (as created by ASP.NET)
+ when a required shared object is missing, the error message should point to the object in question instead of just printing an "Unknown error"
+ the design of results output has been changed slightly to be similar to that currently known from Google and AltaVista
+ the Free version (formerly Light edition) is now more configurable (see http://free.HomepageSearchEngine.com)

____________________________________________________________________________

v3.4+ (January 30, 2003)

+ libraries replaced by newer versions in order to support new UTF-8 related features
+ a custom input form that preserves the previous form settings now also restores Japanese terms properly using IE and Opera
+ automatic determination of the CGI executable's URL now also works properly on special server environments (e.g. with sbox)
+ language code for Japanese changed from mistakenly used "jp" to the proper ISO 639-1 code "ja"
+ problem solved that may have occurred when trying to access the admin area via the https protocol
+ added IANA character set information to each "WhatsThis.txt" file residing in the language directories
+ if platform detection by the shell script ("platform.cgi") fails, the Perl script "platform.pl" should do its job
+ discontinued support for the platform GNU/Linux-mips

____________________________________________________________________________

v3.4 (September 24, 2002)

+ new shell command "spider" which can spider an entire site and creates the URL-list file used to grab URLs automatically
+ flat (file-list based) search method introduced which combines a semi-on-the-fly and a semi-indexed search method
+ indexing process split into two steps in order to consume fewer resources in each process: making of the file-list (by the new "makelist" command) and creating of the index files (by the "index" command)
+ Unicode support for all ASCII and Latin-1 characters, in both
  hexadecimal and decimal notations
+ alternative "locale-enabled" HSE executable added that supports the new "locale" directive to solve the "always-case-sensitive" bug affecting Non-US-ASCII characters on English based systems
+ added "title (T)" key word for the "results_details" directive to disable or customize printing of the title of a web page before each listing of a found file on the results page
+ added "maxsize:SIZE" key word for the "results_href" directive to automatically disable the highlightmatches/gotofirstmatch feature on target files with a size higher than specified, in order to reduce memory consumption
+ Extensible Markup Language (XML) files (.xml) are recognized as web pages
+ Wireless Markup Language (WML) files (.wml) are recognized as web pages
+ enhanced support for named entities: now 101 forms are supported, including the ones for:
  + all the 96 Latin-1 characters
  + the HTML ASCII characters (", &, < and >) and the Euro sign (€)
+ '&' in URLs to be printed will be converted into the '&amp;' entity to prevent strings from being interpreted as entities
+ fixed bug regarding '<' and '>' in titles
+ fixed bug regarding highlighting of more than one search term in the "Google-like" style of the results' descriptions
+ improvements when the highlightmatches feature is turned on to highlight the matches in the result files:
  + a newline that separates two words of a matching phrase will be ignored
  + documents with the text/xml MIME type will be printed directly by the browser (and not parsed through HSE)
  + all '&nbsp;' named entities are replaced by their numeric character reference in order to be XML compliant
  + all '@' characters are replaced by their numeric character reference to make eMail addresses harder to find by spam robots
+ Helper Application Perl CGI script "passurl.cgi" included to redirect the result URLs to a custom application
+ "hse_customform.js" JavaScript library now works properly with Mozilla 1 and Opera 6
+ in the FreeBSD package, a wrapper script has been
  added to achieve compatibility with all FreeBSD 4.x installations
+ when the "lang" delivery parameter is set to "de", the mouseover titles on the pre-built input form are printed in German
+ character set for Thai (language code "th") changed from "windows-874" to "tis-620"
+ language code for simplified Chinese (Chinese/China) changed from "zh" to "zh-cn"

____________________________________________________________________________

v3.37 (June 3, 2002)

+ added "icon:default|custom16x16|custom" key words for the "results_details" configuration directive to show an icon image with each result
+ the "sort" parameter can now also be "name" to sort the search results by the name of the file path
+ the value of an "append" delivery parameter will be appended to the result URLs in order to support dynamic shopping carts
+ HTML sections to be excluded from being searched can now also be spanned using the
  tag, additionally to
+ on the results page, each listing of a found file is now anchored (with the name of its number)
+ a "debug_level" of "2" now prevents insecure access to the admin area; a value of "3" also disables printing of the file list
+ a stand-alone asterisk character ("*") as search term will no longer be treated as a wildcard
+ the pre-built advanced input form contains images
+ the file list prints all files with their associated icon images
+ updated language files
+ deprecated names "HomepageSearchEngine..." for all the config files are no longer supported, so only "hse..." will work

____________________________________________________________________________

v3.36 (January 30, 2002)

+ CGI application can be used to inspect any HTTP(S) URL:
  + to highlight all matches within the URL's content, supporting one or more search terms
  + to show the URL's Header information returned by the webserver ("URL Header inspector")
+ HTML version used for all output to the browser updated from HTML 4.01 to XHTML 1.0 Transitional
+ in the results list, the number of matches from each file is also given for each term separately
+ added configuration directive "allowed_referer_sites" to restrict sites that may call the CGI application
+ a "debug_level" of "1" now prevents paths from being printed to the browser in all messages
+ added sample HTML template containing a custom input form including JavaScript code to preserve the previous form settings
+ SSI and PHP templates now contain a custom input form including "URL Inspector" and preserving the previous form settings
+ directive "results_descriptions" also allows setting how many characters surround the match of a list item
+ deprecated key name "hse.key" is no longer supported, so now only "hse_key.cgi" will work

____________________________________________________________________________

v3.35 (December 6, 2001)

+ when the first match is part of a link, the "gotofirstmatch" feature no longer breaks the link
+ support of finding characters as entities re-enabled (was broken since v3.33)
+ to minimize problems on MacOS and with Non-Unix users, all shipped configuration files begin with "hse" instead of "HomepageSearchEngine" so a filename is not longer than 31 characters and in all lowercase
+ new directive "cgiurl" can be used if the SCRIPT_NAME environment variable is not set properly in some rare cases
+ the "results_previous_img" and "results_next_img" directives now default to specify a built-in image set
+ added "matches" key word for the "results_details" directive to be able to prevent printing the number of matches
+ Netscape 4 issue on non iso-8859-1 character sets solved
+ language support for Romanian

____________________________________________________________________________

v3.34 (November 4, 2001)

+ added "gotofirstmatch" keyword for the "results_href" directive to locate matches within the found pages even more easily
+ the "results_href" directive now defaults to "highlightmatches + gotofirstmatch"
+ searches with wildcards within search terms are now possible (by applying the "*" wildcard character)
+ added configuration directive "exclude_dirs" to prevent directories from ever being looked into
+ the "*" wildcard character can be used for all configuration directives that specify file sources (exclude_dirs, ban_list, search_always and categories_sourceNR)
+ searching for "list:files" prints additional useful information, to improve the configuration for on-the-fly searches
+ updated language help files to reflect the possibility to use wildcards
+ language support for Japanese
+ improved highlighting method avoids problems with Japanese or any other non Latin characters

____________________________________________________________________________

v3.33 (October 23, 2001)

+ added '-nononhtml' and '-nohtml' options for the "index" command to enable faster index processes
+ additional syntax to be used in the URL-list file to enable searching different locally
  hosted sites as different categories
+ dynamic progress bars now work with the browsers Internet Explorer 4+, all Gecko based browsers including Netscape 6+ (enabled by default) and optionally Netscape 4 and Opera 4+ (disabled by default)
+ progress bar now also appears while determining search source for creating the file list (by searching for "list:files")
+ text corresponding to the progress bars is in the proper language
+ delivery parameter "case" changed to "matchcase" to avoid problems with custom JavaScript
+ enhanced search routine for RichTextFormat documents
+ URL values of configuration directives (baseurl and template_url) beginning with "/" now also work with ports other than 80
+ Help Window now also works with Opera browsers
+ distributed packages have a new, more intuitive directory structure
+ Apple MacOS X platform is supported

____________________________________________________________________________

v3.32 (July 4, 2001)

+ index format changed from character separated .csv to tabstop separated .txt files
+ terms containing the "|" character can now be searched which enables full support for the big5 character set (used for zh-tw)
+ results contain only design-neutral HTML tags with class identifiers which enhances StyleSheet support
+ redesigned list that contains the links to further results pages
+ sections within HTML files can be spanned to be excluded from being searched
+ added configuration directive "debug_level" to control debugging output
+ added directive "max_found_files" to limit the search process to a given number of found files
+ added directive "cgi_timeout" to control the maximal CPU time the CGI application may require
+ more options for the "results_global" directive
+ dynamic progress bars added to show the current status of the search
+ password protected directories can now be searched without the need to turn off the "highlightmatches" feature
+ the value of the "lang" delivery parameter will also be sent to the server
  as the accepted language header (HTTP_ACCEPT_LANGUAGE)
+ added '-lang' option for the "geturls" command for sending its value to
  the server as the accepted language header
+ the result links for viewing the found files with highlighted matches are
  now in the proper format (the QUERY_STRING is
  application/x-www-form-urlencoded)
+ separator between CGI arguments changed from '&' to ';' to produce
  W3C-compliant, valid HTML 4.01
+ the default HTML templates now include a button to verify that the
  document is valid HTML 4.01
+ when a search is submitted without a term, the user is now prompted for
  the missing input instead of nothing happening
+ the Shell Executable commands now also print the date and time of their
  actions to their log
+ cronjob scripts for both Unix and Windows, as well as a detailed How-To
  ("hse_cronjob_ReadMe.txt"), are now included
+ language support for Norwegian and Thai

____________________________________________________________________________
v3.31 (May 27, 2001)

+ redesigned source code structure to consume fewer system resources
+ added '-cat' and '-nocheck' options for the "index" command
+ CGI application can be run in debug mode for easier troubleshooting
+ viewing files with highlighted matches now also works with Non-HTML files
+ submitting the search string to a worldwide search engine from the input
  form now opens a new window
+ MIVA Script files (.mv) are recognized as web pages
+ for improved security, the key file can (alternatively) be named
  "HomepageSearchEngine_key.cgi"
+ Swedish language help file added

____________________________________________________________________________
v3.3 (March 29, 2001)

+ found files can be viewed with all matches highlighted thanks to the
  "highlightmatches" feature
+ highlighting of matches can now be done using any desired style
+ support for dynamic HTML templates (using any script language supported
  by the server)
+ the search string can now be submitted to several worldwide search
  engines
+ the restrictive search
  options ("Match case" and "Find only whole words") are now turned off by
  default
+ the "hse.ini" configuration file can now be saved in either DOS, Unix or
  Mac format
+ Arabic and Finnish language help files added

____________________________________________________________________________
v3.21 (February 1, 2001)

+ customization of results pages is now more configurable (handled by the
  directives "results_global" and "results_details")
+ changeurls functionality is now also available for on-the-fly searches
+ Sun's JavaServer Pages files (.jsp) are recognized as web pages
+ all configuration files can now alternatively have a short name
  (beginning with "hse" instead of "HomepageSearchEngine")
+ language support for Finnish and (Latin) Serbian
+ NetBSD platform is supported
+ some Portable Network Graphics (.png) images added
+ fixed bug: values of advanced search options may have been lost when the
  advanced search box was switched off

____________________________________________________________________________
v3.2 (December 14, 2000)

+ indexed search now performs approximately twice as fast as before
+ the executable file has been split into a smaller one and some libraries
+ each index file has been split into an index file pair consisting of an
  HTML index and a Non-HTML index
+ the template files "*_header.txt" and "*_footer.txt" have been merged
  into one "*_template.html" file
+ improved Shell Executable handling
+ language support for Hungarian and Italian

____________________________________________________________________________
v3.1 (October 27, 2000)

+ shell command "geturls" added, which allows fetching any URLs to make
  them searchable on your site
+ improved descriptions shown for each hit: the configuration directive
  "results_descriptions" allows "AltaVista-" and/or "Google-style" output
  with specified quantities
+ AltaVista.com or Google.com can be queried via links on the results list
  in addition to from the input form
+ added configuration directives "results_previous_img" and
  "results_next_img" to use images as an alternative to "[ << previous ]"
  and "[ next >> ]" in the link list to further results at the bottom of
  each result page
+ ColdFusion Markup Language files (.cfm) are recognized as web pages
+ handling of the Admin Area is more user-friendly and secure
+ users file renamed to "HomepageSearchEngine_users.cgi" to improve
  security on badly configured webservers
+ upgraded Light edition: phrases and logical operators are now available
  in all version types
+ updated language help files
+ language support for Greek and Polish

____________________________________________________________________________
v3.03 (September 3, 2000)

+ shell command "changeurls" added for modifying the index files
+ Web-based Admin Area added, which allows authenticated users to execute
  the HomepageSearchEngine Shell Executable in their browser to create and
  modify the index files
+ the color of the help window headlines can be changed (by editing the
  "*_help.txt" file)
+ language support for French
+ fixed bug that affected categories and was caused by some license keys

____________________________________________________________________________
v3.02 (August 8, 2000)

+ the Light edition is now freeware (it can be used completely free of
  charge)
+ added the option to suppress the pre-built input form
+ changed the class identifier name of the results' links to meet the W3C
  CSS2 recommendation
+ language support for Danish and Swedish

____________________________________________________________________________
v3.01 (July 17, 2000)

+ several different configuration sets can now be used alternatively (by
  delivering the "conf=DIR" parameter)
+ the "index" shell command has a new option '-conf=DIR'
+ language support for traditional Chinese, Portuguese and Russian
+ fixed bug regarding the index files that may have occurred under IIS
+ fixed bug regarding switching between simple and advanced input form in the
  default language
+ fixed default international settings for Turkish

____________________________________________________________________________
v3.0 (June 18, 2000)

+ the indexed search method can now be used (in addition to the on-the-fly
  search method) to reduce search time
+ the CGI executable file can also be executed on the shell (command
  prompt) for creating the index files
+ it is possible to switch between all available languages and their
  associated international settings (by delivering the "lang=LANG"
  parameter)
+ added configuration directive "date_format" to support international
  date formats
+ added directive "decimal_sep" to support international decimal separators
+ added directive "chars_alignment" to support languages read from right to
  left (such as Arabic)
+ added directives "helpwindow_width" and "helpwindow_height" for adjusting
  the help window's size
+ added directive "target" for specifying a target frame name where the
  result URLs should link to
+ added directive "results_href" to pass the result URLs through an
  external script
+ the results' links have been assigned a specific class so that they are
  accessible via StyleSheets
+ all files and directories beginning with "_" or "."
  are now always in the ban_list
+ better support for IIS webservers (only affects Windows 32bit packages)
+ language file split into a core file for general output ("*_lang.txt")
  and a file for output of the help window content ("*_help.txt") for more
  flexibility and easier editing
+ language support for Arabic, simplified Chinese, Czech, Dutch and Turkish
+ DEC OSF/1 platform is supported
+ improved licensing mechanism supports Multi Site license keys
+ stripping of script code and comments improved
+ distinction between HTML and Non-HTML files now works properly
+ workaround to solve problems with the help window's JavaScript code when
  using Netscape on Macintosh

____________________________________________________________________________
v2.07 (March 5, 2000)

+ added user-selectable search option to search Non-HTML text files,
  including RichTextFormat (.rtf) documents
+ added support for ASP (Active Server Pages) pages (with the .asp
  extension)
+ added configuration directive "formtable_input-size" to customize the
  input form
+ number of possible categories increased to 25
+ input terms are now checked more effectively and then sorted before
  being applied to the search
+ enhanced method for sorting files that are displayed in the results pages
+ search string "list:pages" generalized to "list:files"
+ language files revised

____________________________________________________________________________
v2.06 (January 31, 2000)

+ added configuration directive "highlight-color" to customize the
  appearance of the results pages
+ input form can be configured to have no background
+ HTML code for the results page tweaked to work properly with Mozilla/5.0
  (Netscape 6) pre-releases
+ in the results pages, the URL of a file will be printed instead of its
  title if it has no title

____________________________________________________________________________
v2.05 (January 18, 2000)

+ enhanced categories features
+ option to submit your search terms to a worldwide search engine
+ configuration file ("HomepageSearchEngine.ini") is more syntax tolerant
+ there is now one package compatible with all GNU/Linux distributions for
  x86
+ GNU/Linux-mips platform is supported
+ fixed bug that caused bad performance when searching in some rarely
  occurring files

____________________________________________________________________________
v2.04a (January 10, 2000)

+ all special characters with corresponding HTML entities (concerning
  German, Spanish, Italian, ...) will now be found both as direct
  characters (e.g. ä ñ à) and as HTML entities (e.g. &auml; &ntilde;
  &agrave;)
+ OpenBSD platform is supported

____________________________________________________________________________
v2.04 (December 27, 1999)

+ added an editable language file "HomepageSearchEngine_lang.txt" to enable
  support for all languages. Language files for English, German and Spanish
  are already included.
+ file naming scheme changed (now all files begin with
  "HomepageSearchEngine")
+ FreeBSD platform is supported
+ IBM AIX platform is supported

____________________________________________________________________________
v2.03 (October 25, 1999)

+ added ability to search in several categories
+ added configuration directive "results_details"
+ all matches within the information shown in the results pages can now be
  highlighted.
+ a list of all searchable files can now be viewed in the browser
+ BSDi BSD/OS platform is supported
+ HP HP-UX platform is supported
+ SGI IRIX64 platform is supported

____________________________________________________________________________
v2.02 (October 4, 1999)

+ added configuration directive "charset" for supporting international
  character sets
+ fixed bug concerning links on the results page when the number of hits
  per page has been changed from the default value
+ fixed bug: in some cases the "search_always" directive was ignored
+ Sun Solaris platform is supported

____________________________________________________________________________
v2.01 (September 27, 1999)

+ added configuration directives "formtable_border-color" and
  "formtable_alignment" to customize the pre-built input form
+ the input form defaults to a simple search and can be expanded to an
  advanced search mode

____________________________________________________________________________
v2.0 (September 14, 1999)

+ the search string can now include phrases (enclosed in double quotes),
  together with single words
+ each term of the search string can now be marked with a plus or minus
  sign to force or forbid its presence in found files
+ a checkbox has been added to automatically mark all signless terms with
  the plus sign
+ restrictive search options arranged as a checkbox group
+ added restrictive search option to find only whole words
+ you can now choose the parts of the web pages to be searched (each part
  can be turned on or off): title, description and keywords, text of the
  file body and alternative texts of the images
+ it is now possible to search for terms including special characters
  (e.g. for "C:\boot.ini")
+ online help integrated in the pre-built input form
+ the number of matches of all terms can be shown in the results page
+ the time required for the search can be shown in the results page
+ configuration file is now shorter and simpler
+ text and links in the search form are now customizable via a StyleSheet

____________________________________________________________________________
v1.01 (September 6, 1999)

+ ranking list can now be sorted by the time of the last update, as an
  alternative to the number of matches
+ Windows 32bit platform is supported

____________________________________________________________________________
v1.0 (August 2, 1999)

+ first public release for GNU/Linux platform