Sphider-plus version 3.2017a - The PHP Search Engine





All required information.

[ Change Log Summary ]

- Current release: 3.2017a


- Former versions:

Version 2.9

Version 2.8

Version 2.7

Version 2.6

Version 2.5

Version 2.4

Version 2.3

Version 2.2

Version 2.1

Version 2.0

 

 

- Older versions:

Version 1.9

Version 1.8

Version 1.7

Version 1.6

Version 1.5

Version 1.4

Version 1.3

Version 1.2

Version 1.1

Version 1.0




[ Older versions ]

Version 1.9

Not published; internal development version only.

 

Version 1.8

Release date: February 26, 2009

 

New feature: Search for media content. If activated in Admin settings, media files like

- Images

- Audios

- Videos

will be indexed and become searchable. The result listing is separated into 4 sections: found text, found images, found audio streams and found videos. Thumbnails are presented for the image results. All media results are linked to the source, so that the files can be opened with the appropriate media player. Because ID3 and EXIF data are also indexed, it is possible not only to query for a media title, part of a title or a suffix, but also to search for e.g. all songs of a specific artist, or for all images taken with 'f/2.0' or with the flash setting 'red-eye'.

For more details, see the documentation chapter 'Media Search'.

 

New feature: Index RSS and Atom feeds.

If activated in Admin settings, RSS (v.0.93 - v.2.0) and Atom (v.1.0) feeds are indexed and the content becomes searchable.

For more details, see the documentation chapter 'RSS and Atom feeds'.

 

New feature: Result cache for text and media queries. If activated in Admin settings, this item offers:

- Extremely reduced response time for queries already cached.

- Controller to keep the 'Most Popular Queries' always in cache.

- Separate caches for text and media results, configurable in Admin settings.

- Automatic cleaning of caches during 'Erase & Re-index' procedure.

- If debug mode is enabled, activity/status of cache is presented in Result listing.

For more details, see the chapter 'Result cache for text and media queries'.
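
The cache behaviour listed above can be illustrated with a minimal Python sketch. It is not the Sphider-plus implementation (which is written in PHP); the class name `ResultCache`, the in-memory dictionaries and the eviction rule are assumptions made purely for illustration.

```python
class ResultCache:
    """Hypothetical in-memory result cache keyed by the query string."""

    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self.results = {}  # query -> cached result list
        self.hits = {}     # query -> number of times it was requested

    def get(self, query):
        # Count every request so that popular queries can be kept in cache.
        self.hits[query] = self.hits.get(query, 0) + 1
        return self.results.get(query)

    def put(self, query, result):
        if query not in self.results and len(self.results) >= self.max_entries:
            # Assumed policy: evict the least requested cached query.
            least = min(self.results, key=lambda q: self.hits.get(q, 0))
            del self.results[least]
        self.results[query] = result

    def clear(self):
        # Corresponds to cleaning the cache during 'Erase & Re-index'.
        self.results.clear()
        self.hits.clear()
```

A query that is already cached is answered from `results` without touching the database, which is where the reduced response time would come from.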

 

Enlarged Admin statistics. In table 'Search Log' the following items are additionally presented:

- User IP

- User's country code

- User's hostname

 

New item in Admin Settings (Section: Index Log Settings):

Suppress browser output of logging data during index / re-index.

This item will speed up the index / re-index procedure and prevent browser overflow when a huge number of sites is to be indexed. If activated, this setting also disables the real-time output of logging data.

 

New feature: Use the blacklist to reject queries. To be activated in Admin settings.

If the query input contains a word of the blacklist, the complete query will be discarded.

For more details, see the documentation chapter 'Use of Blacklist'.

 

If 'Convert all to UTF-8' is activated, the files

- common_xyz.txt

- whitelist.txt

- blacklist.txt

are also converted. The conversion is performed each time the script is started, so it is valid only for the current session.

 

If 'Enable distinct results for upper- and lower-case queries' is not selected in Admin settings, the words placed in

- common_xyz.txt

- whitelist.txt

- blacklist.txt

are converted to lower case characters, so they will match independent of their spelling in the .txt files. The conversion is performed each time the script is started, so it is valid only for the current session.

 

New feature in Admin statistics. In table 'Image functions', details about the installed GD library as part of the PHP environment will be presented.

 

New feature in Admin 'Clean' section: Clean text and media cache (separate items). Additionally, the count of results in cache and the currently used memory space are presented.

 

The status of last search request (done 'in category xyz only' or in 'all sites') is cached for next query input.

 

Improved Log output if file mode is set to 'text'.

 

Additional common file for French language. Thanks to Florian Vugier.

 

Updated French language file. Thanks to Manuel Pardo, Florian Vugier and Marie-Cécile.

 

Updated Portuguese language file. Thanks to Humberto Branco.

 

Involved files that have been modified / added for this release:

.../addurl.php

.../php.ini

.../search.php

.../admin/admin.php

.../admin/admin_header.php

.../admin/configset.php

.../admin/db_main.php

.../admin/GeoIp.dat

.../admin/geoip.php

.../admin/index_media.php

.../admin/install_all.php

.../admin/install_sphider-plus.php

.../admin/install_v.1.8.php

.../admin/messages.php

.../admin/php.ini

.../admin/spider.php

.../admin/spiderfuncs.php

.../admin/thumbs/ (new empty folder)

.../converter/rss2html.php

.../converter/rss.html

.../converter/rss_parser.php

.../include/commonfuncs.php

.../include/searchfuncs.php

.../include/search_media.php

.../include/media_counter.php

.../include/search_links.php

.../include/common/audio.txt

.../include/common/image.txt

.../include/common/suffix.txt

.../include/common/video.txt

.../include/images/ all files

.../include/mediacache/ (new empty folder)

.../include/textcache/ (new empty folder)

.../languages/ all files

 

Attention: This release requires additional database tables and additional table rows in already existing tables. If you update from a former version of Sphider-plus, please run the .../admin/install_v.1.8.php script. If you upgrade from original Sphider or install from scratch, you don't need to run this script. Its features are also included in the other installation scripts.



Version 1.7a

Release date: November 27, 2008

 

New item in Admin / Settings / Spider Settings:

Delete special characters such as dots, commas, quotes, exclamation marks and question marks attached to words. If activated, only the 'pure' words are indexed: secondary characters at the beginning and end of words are deleted. For more details, see the chapter 'Delete secondary characters'.
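
A minimal Python sketch of this kind of word cleaning is shown below. It only illustrates the idea; the function names are hypothetical, and the set of 'secondary' characters is assumed to be the ASCII punctuation from Python's `string.punctuation`, which may differ from the set Sphider-plus uses.

```python
import string


def strip_secondary(word):
    # Remove punctuation only at the beginning and end of the word;
    # characters inside the word (e.g. the apostrophe in "don't") stay.
    return word.strip(string.punctuation)


def pure_words(text):
    # Split on whitespace, clean each token and drop tokens that
    # consisted of punctuation only.
    cleaned = (strip_secondary(token) for token in text.split())
    return [word for word in cleaned if word]
```

For example, `pure_words('Wait... "really"?!')` yields the two words `Wait` and `really`.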

Improved behaviour if charset of page to be indexed can't be detected.

Bug fixed that prevented correct link to search result.

Additional translation table to convert upper to lower case characters for Cyrillic charset.

Updated Russian language file, thanks to vipraskrutka.

 

Version 1.7

Release date: November 20, 2008

 

New item in Admin / Settings / General Settings:

- Enable Debug mode.

If selected, the following information will be presented individually for each page during the index / re-index procedure:

- New links found here

- New keywords found here

For more details, see the chapter 'Error messages and Debug mode'.

 

New item in Admin / Settings / General Settings:

- Enable / Disable MySQL and PHP error messages.

It is recommended to disable the output of these messages for production systems, as they could reveal sensitive information.

For more details, see the chapter 'Error messages and Debug mode'.

 

New item in Admin / Statistics / Server Info:

- PHP security Info.

Some basic info about current server configuration, presenting the security information status of the PHP environment.

 

Completely rewritten Suggest framework.

Based on 'script.aculo.us' and 'prototype' scripts, now suggestions for non-Latin symbol and accent characters are also presented in IE browser.

Additional items in Admin settings:

- Define minimum count of query letters in order to get a suggestion.

- Show / Hide the amount of found keywords in suggestion table.

 

New capability to prepare language specific common files.

If multilingual sites, or sites in different languages, are to be indexed, this feature improves the overview. Common words to be ignored during the index / re-index procedure can be placed in individual files. The common word files should not be used if 'phrase search' is the standard type of search, because Sphider-plus will then have problems finding complete phrases. Therefore, the use of the common word files can be activated / deactivated by a checkbox in Admin settings. For more details, see the chapter 'Ignored words'.

 

New feature: Use a blacklist.

If the content of a page to be indexed / re-indexed contains one word of the blacklist, it will not be indexed / re-indexed. To be activated / deactivated in Admin settings.

For more details, see the chapter 'Use of Blacklist'.

 

New feature: Use a whitelist.

The content of a page to be indexed / re-indexed must contain at least one word of the whitelist to be indexed / re-indexed. To be activated / deactivated in Admin settings.

For more details, see the chapter 'Use of Whitelist'.
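
The blacklist and whitelist rules described above can be sketched in Python as follows. This is an illustration under assumptions, not the Sphider-plus code: the function name `should_index` is hypothetical, words are split on whitespace only, and matching is done case-insensitively.

```python
def should_index(page_text, blacklist=(), whitelist=()):
    """Decide whether a page may be indexed (hypothetical helper).

    An empty list means that the corresponding check is deactivated.
    """
    words = set(page_text.lower().split())
    # Blacklist: a single blacklisted word rejects the whole page.
    if blacklist and words & {w.lower() for w in blacklist}:
        return False
    # Whitelist: at least one whitelisted word must be present.
    if whitelist and not words & {w.lower() for w in whitelist}:
        return False
    return True
```

With both lists empty, every page passes, which corresponds to both features being deactivated in Admin settings.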

 

New feature: If available, show multiple hits of search result (per page) in result listing.

To be defined (1 - 9) in Admin / Search Settings.

 

Improved URL import / export function:

- The names of URL files now include the date and timestamp of the export procedure.

- This enables the Admin to import selected URL files.

- A delete function for individual files was also included.

- Delimiter in URL file changed from "," to "|". As suggested by Ranbir.

 

Improved Admin / Settings section:

- Included directory with links to the different Setting blocks.

 

New item in Admin / Settings section:

- Backup current configuration settings. Individual files are created with date and timestamp.

- Restore configuration settings from a previously created backup file.

- Individual delete of backup files.

- A delete-protected backup file holds the default settings.

 

New item in Admin / Settings / Spider settings:

- Use a unique name (sitemap.xml) for all created sitemap files.

Can be selected if only a single site is to be indexed.

To be used in conjunction with selecting the destination folder for the sitemap files.

../ is the root folder of the Sphider-plus installation.

 

If the charset of a page to be indexed / re-indexed is not detectable, the home charset as defined in Admin settings is used.

 

Improved search function for non-Latin symbols.

 

Search function enabled for queries containing an apostrophe.

 

Included query input protection against Directory Traversals.

 

Bug fixed in index/re-index procedure that prevented indexing of last word in full text that should be stored as new keyword.

 

Improved storage of keywords in index/re-index procedure.

 

Updated Romanian language file, thanks to CyBerNet.

 

Some file types added to exclusion list in order not to be indexed / re-indexed. Thanks to clubmaster3.

 

Improved Admin Log-in for Microsoft IIS. Thanks to bobyn.

 

Involved files that have been modified / added for this release:

.../addurl.php

.../search.php

.../admin/admin.php

.../admin/admin_header.php

.../admin/auth.php

.../admin/configset.php

.../admin/confirm.js

.../admin/dbase.js (file no longer required)

.../admin/db_backup.php

.../admin/db_main.php

.../admin/ext.txt

.../admin/messages.php

.../admin/real_get.php

.../admin/real_log.php

.../admin/spider.php

.../admin/spiderfuncs.php

.../admin/url_manage.php

.../admin/phpSecInfo/ (all files)

.../converter/ConvertCharset.Class.php

.../include/categoryfuncs.php

.../include/commonfuncs.php

.../include/searchfuncs.php

.../include/search_links.php

.../include/suggest.php

.../include/ajax/ (all files)

.../include/common/ (all files)

.../include/js_suggest/ (folder no longer required)

.../languages/ro-language.php

.../settings/conf.php

.../templates/all folder/thisstyle.css


Version 1.6

Release date: September 06, 2008

Built on Sphider v.1.3.4

 

Additional item in Admin settings to select:

- Instead of weighting %, show count of query hits in full text.

Selecting this item will also influence the order of result listing. Now only the number of keyword hits in full text will define the position of a page in result listing.
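
Ordering by hit count instead of weighting can be sketched in Python like this. The function name and the data layout (a dictionary mapping a page URL to its full text) are assumptions for illustration; the actual ranking is done inside the PHP search scripts.

```python
def order_by_hits(pages, keyword):
    """Return (url, hit count) pairs, pages with the most hits first.

    pages: hypothetical mapping of page URL -> indexed full text.
    """
    needle = keyword.lower()
    counts = {url: text.lower().split().count(needle)
              for url, text in pages.items()}
    # Pages without any hit are not part of the result listing.
    found = [(url, hits) for url, hits in counts.items() if hits > 0]
    return sorted(found, key=lambda pair: pair[1], reverse=True)
```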

Additional item in Admin settings to select the sort order of the result listing:

- 'Most Popular Links' on top.

If this item is activated, Sphider-plus presents the result listing ordered by previously learned link popularity, defined as those links with the best user acceptance (clicks).

Additional items in Statistics overview:

- Queries total

- Link clicks total

Additional item in Admin / Statistics:

- Most Popular Links.

Presents the number of clicks individually for each link, with date and time of the last click. The latest query entered before clicking that link is also presented.

Additional item in Admin / Clean:

- Clear 'Most Popular Links' log.

Additional item for re-index procedure:

- Temporarily ignore 'robots.txt'.

If utf-8 support is activated, the result listing is now independent of upper- or lowercase letters in the query. Alternatively, if selected in Admin settings, distinct results for case-sensitive queries can be delivered.

Improved utf-8 support for non-Latin characters.

Improved suggest framework for utf-8 support. Now offering suggestions

- for phrases

- for accented letters

- for non-Latin characters

Known issue: Works well in Firefox and Opera, but IE does not cooperate for non-Latin characters. The Suggest Framework needs to be rewritten completely for a browser-independent presentation of the suggestions.

Improved search functionality for queries with accent letters without selecting the utf-8 support.

Phrase search improved, so that common words and too short (min_word_length) words could be used as part of the query phrase and are no longer marked as ignored.

Improved functionality for 'Most popular searches'. Now also

- Advanced search settings

- Categories

- Mode of highlighting

- Results per page

will be taken into account when clicking a 'Most popular searches' suggestion.

Bug fixed that caused Sphider to follow links placed in HTML comments.

Bug fixed that created a wrong weighting calculation for keywords placed

- behind a word that did not match 'min_word_length'

- behind a 'common' word

- first found in full text

Bug fixed in 'Strict search' that caused invalid highlighting in result listing.

 

Involved files that have been modified / added for this release:

.../search.php

.../admin/admin.php

.../admin/configset.php

.../admin/install_all.php

.../admin/install_bestclick.php

.../admin/install_sphider-plus.php

.../admin/spider.php

.../admin/spiderfuncs.php

.../include/click_counter.php

.../include/commonfuncs.php

.../include/searchfuncs.php

.../include/js_suggest/suggest.php

.../include/js_suggest/SuggestFramework.js

.../languages/ all files

.../settings/conf.php

Attention: Starting with version 1.6, Sphider-plus supports logging of 'Most popular links'. This item requires additional rows in 'links' table of the database. If you update from a former version of Sphider-plus, please run the .../admin/install_bestclick.php script. If you upgrade from original Sphider or install from scratch, you don't need to run this script. Its features are also included in the other installation scripts.



Version 1.5

Release date: July 14, 2008

Built on Sphider v.1.3.4

 

Improved Suggest Framework. Now suggestions are presented also for queries with accented letters.

Enable real-time output of logging data. Selectable in Admin setting together with the update interval (1 - 10 seconds).

In order to prevent performance problems and memory overflow for a large number of URLs, Sphider-plus may clean resources during index / re-index. Selectable in Admin settings, this item will periodically:

- Free memory that is allocated to unused MySQL resources.

- Unset PHP variables, which are no longer required.

Define max. length of URL presented in result pages.

An additional input field in Admin "Search Settings" is presented to define this value.

For 'Maximum length of page title displayed in search results', the title is now broken at the end of the word that exceeds the defined length, not inside a word at the character count limit defined in Admin settings.
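
The title handling described above can be sketched in Python as follows. The function name `truncate_title` is hypothetical, and words are assumed to be separated by single spaces.

```python
def truncate_title(title, max_length):
    """Cut a title at the end of the word that crosses max_length."""
    if len(title) <= max_length:
        return title
    # Look for the first space at or behind the limit, so that the word
    # crossing the limit is kept complete instead of being cut inside.
    end = title.find(" ", max_length)
    if end == -1:
        return title  # the limit falls into the last word
    return title[:end]
```

With a limit of 15, the title 'Sphider-plus search engine manual' is shortened to 'Sphider-plus search' (19 characters) rather than being cut in the middle of 'search'.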

PDF converter for LINUX/UNIX Operating Systems included.

Needs to be customized according to the readme.pdf documentation, chapter 'PDF converter for Linux server'. Thanks to rasc.

Additional item in Admin section: Server Info

To be found in submenu 'Statistics', important information is presented for:

- Server

- Environment

- MySQL

- PDF converter

- php.ini file

- PHP integration

Enlarged Admin interface if database is empty.

Improved printout for database connection problems. Now MySQL error message is included.

Improved printout if text converter could not extract words from PDF, DOC, XLS etc. files.

Improved printout for Database Backup Management.

Modified installation script. Thanks to Flemp.

Font file renamed to captcha.tff (former: captcha.TTF). Thanks to ethix.

All style sheets now are centralized in .../templates/all_folders/thisstyle.css

Consequently the file .../include/js_suggest/SuggestFramework.css is no longer required.

Function 'create sitemap()' improved for XML conformity and moved from script .../admin/spider.php to script .../admin/spiderfuncs.php.

Bug fixed in 'Phrase Search' if UTF-8 support is not selected.

Bug fixed in highlighting of found keywords on result page.

Some small bugs fixed in MySQL queries.

 

Involved files that have been modified / added for this release:

.../search.php

.../admin/admin.php

.../admin/admin_footer.php

.../admin/configset.php

.../admin/db_backup.php

.../admin/db_main.php

.../admin/install_all.php

.../admin/install_reallog.php

.../admin/install_sphider-plus.php

.../admin/messages.php

.../admin/real_get.php

.../admin/real_log.php

.../admin/real_ping.js

.../admin/spider.php

.../admin/spiderfuncs.php

.../converter/pdftotext

.../converter/pdftotext.script

.../include/captcha.tff

.../include/commonfuncs.php

.../include/searchfuncs.php

.../include/js_suggest/suggest.php

.../settings/conf.php

.../settings/database.php

.../templates/all_folders/navdown.jpg

.../templates/all_folders/thisstyle.css

Attention: Starting with version 1.5, Sphider-plus supports real-time output of logging info during index / re-index procedure. This item requires an additional table for the database. If you update from a former version of Sphider-plus, please run the .../admin/install_reallog.php script. If you upgrade from original Sphider or install from scratch, you don't need to run this script. Its features are also included in the other installation scripts.



Version 1.4

Release date: May 28, 2008

Built on Sphider v.1.3.4

 

In Admin settings the sort order for the result listing can be defined.

Results ordered by:

- Relevance (weight)

- Main URLs (domains) on top

- First URL names and then weight

- Only top 2 per URL

The sort order of the result listing is shown as an additional headline on top of the result pages. To be activated in Admin settings.
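
As an illustration of the option 'Only top 2 per URL', the following Python sketch keeps the two best weighted results per host. It assumes that 'URL' refers to the host name of a result and that each result is a (url, weight) pair; both are assumptions, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def top_per_host(results, limit=2):
    """Keep only the `limit` best weighted results for each host."""
    kept = []
    seen = {}  # host -> number of results already kept
    for url, weight in sorted(results, key=lambda r: r[1], reverse=True):
        host = urlsplit(url).netloc
        if seen.get(host, 0) < limit:
            kept.append((url, weight))
            seen[host] = seen.get(host, 0) + 1
    return kept
```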

Select method of highlighting for found keywords in result listing.

If 'Advanced search' is activated, the user may select:

- bold text

- marked yellow

- marked green

- marked blue

The default highlighting can be defined in Admin settings.

If the option 'Index words in Domain Name and URL path' is activated in Admin settings, found keywords are now also highlighted in the result listing (row URL).

If JavaScript is disabled in the user's browser, a warning message is displayed on top of the search form that the full functionality of Sphider-plus will not be available (required for the suggest framework).

Improved printout for 'Show sites in category'. If the content for 'title' was not included in the Admin's 'Site options', title and short description will now be fetched from the HTML header (of the indexed sites). If this information is not available either, a warning message will be displayed.

Enable index and re-index for pages with duplicate content.

Additional item in Admin settings:

- If selected, pages with content that was already indexed by another page will also be indexed/re-indexed. A warning message together with the URL that also holds the duplicate content will be presented in spider log output.

- If not selected, the link (page) will be ignored. Nevertheless, the message and URL info will be presented.

Improved function 'If available follow sitemap.xml' in order to prevent 'Page is duplicate' messages.

Improved printout if PDF files cause indexing problems.

If 'Follow sitemap.xml' is activated and a valid sitemap was found, the log output

Links found: 0 - New links: 0

is no longer shown, because all links are delivered from the sitemap file and new links are not searched for during index / re-index.

A missing log folder will be created automatically during the index / re-index process. This prevents the message 'Logging option is set, but cannot open a file for logging.'

If JavaScript is disabled in the Admin's browser, a warning message is displayed on top of the Admin page that the full functionality of Sphider-plus administration will not be available (required for warning messages).

Updated Romanian language file by CyBerNet.

Corrected Spanish language file by Willy.

Bug fixed in index / re-index function that caused problems to index words which consist only of upper case characters.

Bug fixed in index / re-index function that caused problems to index words containing the ' à ' character.

Some small improvements for result printout.

Length of words to be indexed is increased to 255 characters per word.

 

Involved files that have been modified / added for this release:

.../search.php

.../admin/admin_header.php

.../admin/auth.php

.../admin/configset.php

.../admin/install_all.php

.../admin/messages.php

.../admin/spider.php

.../admin/spiderfuncs.php

.../include/searchfuncs.php

.../include/search_links.php

.../languages/ all files

.../settings/conf.php

.../templates/all_folders/thisstyle.css



Version 1.3.a

Built on Sphider v.1.3.4.b

 

Individual 'Erase & Re-index' function for single sites.

- Additional item in Admin sites submenu 'Manage Site Indexing Options'

- 'Erase & Re-index' functionality for selected site

Translated Spanish and Dutch language files. Thanks to Willy

 

Involved files that have been modified / added for this release:

.../admin/admin.php

.../languages/es-language.php

.../languages/nl-language.php

 

Version 1.3

Release date: March 31, 2008

Built on Sphider v.1.3.4.b

 

Tolerant search

- Selectable in search-box like AND/OR/Phrase and as new item: 'Tolerant search'

- Presents results that are 'like' the query as an integrated 'Did you mean'

- Presents search results for queries with e=é=è=ê, ä=a, Ü=U etc.

- Results are independent whether the user enters e or é or ê in the search query
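
The accent-insensitive part of 'Tolerant search' can be illustrated with a short Python sketch based on Unicode decomposition. This is one possible technique and not necessarily the one used by Sphider-plus; the function names are hypothetical.

```python
import unicodedata


def fold_accents(text):
    # NFD splits an accented letter into base letter + combining mark;
    # dropping the combining marks leaves the plain base letters.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tolerant_match(query, word):
    # Compare query and keyword without accents and without case.
    return fold_accents(query).lower() == fold_accents(word).lower()
```

With this folding, the queries 'cafe', 'café' and 'Café' all match the same keyword.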

Clear Category table

- Additional item in Admin section 'Database & Log Cleaning Options'

- Deletes all categories not associated with any valid site

Fixed charset to UTF-8 for User Suggestion Form (addurl).

 

Involved files that have been modified / added for this release:

.../addurl.php

.../search.php

.../admin/admin.php

.../admin/spider.php

.../include/searchfuncs.php

.../languages/ all files



Version 1.2

Built on Sphider v.1.3.4.b

 

UTF-8 support for (nearly) all charsets.

- The translation into UTF-8 charset can be enabled in Admin settings.

- Index and search functionality for Unicode.

- Please note the important information and details to be found in chapter: UTF-8 Support and 'Preferred Charset'

Individual preferred charset.

- Charset for result page can be defined in Admin settings.

- This option will be overridden by the UTF-8 option.

Use of 'Default results per page' (10, 20, 30, 50) also for Sites table in Admin section.

Use of 'Default results per page' (10, 20, 30, 50) also for Link search (site:).

Included PHP version check before admin.php can be used.

Translated Danish language file. Thanks to Brian Jorgensen

Media files excluded from index/re-index procedure.

Enlarged file list in .../admin/ext.txt

Improvements and bug fixes in:

- 'Admin settings' dialog

- 'Did you mean' option

- !strict search

- Converter for non-HTML files

- Site search (site:)

- Addurl suggest form

 

Involved files that have been modified / added for this release:

.../addurl.php

.../search.php

.../admin/admin.php

.../admin/admin_header.php

.../admin/auth.php

.../admin/configset.php

.../admin/db_main.php

.../admin/spider.php

.../admin/spiderfuncs.php

.../admin/ext.txt

.../converter/ConvertCharset.class.php

.../converter/charsets/ all files

.../include/searchfuncs.php

.../include/search_links.php

.../include/js_suggest/suggest.php

.../languages/ all files

.../settings/conf.php



Version 1.1

Built on Sphider v.1.3.4.b

 

Included converters for indexing PDF, DOC, RTF, XLS and PPT files. To be activated individually in Admin settings.

Warning message during the index process when a deactivated file was found.

Captcha protection for Submission Form 'Suggest a new Site'. Use of Captcha to be activated in Admin settings.

Automatically adapt Sphider's dialog to user language.

Improved version by ^demon

Bug fixed in language depending user dialog.

Bug fixed in function check_robot_txt.

 

Involved files that have been modified / added for this release:

.../addurl.php

.../search.php

.../converter/ all files

.../converter/charsets/ all files

.../admin/configset.php

.../admin/ext.txt

.../admin/messages.php

.../admin/spiderfuncs.php

.../include/captcha.TTF

.../include/make_captcha.php

.../languages/ all files

.../settings/conf.php



Version 1.0.a

Built on Sphider v.1.3.4.b

Bug fixed in function validate_email

Involved file that has been modified / added for this release:

.../include/commonfuncs.php

 

Version 1.0

Release date: February 15, 2008

Based on the original Sphider v.1.3.4.a by Ando Saabas, the following items are modified:

 

Define min. relevance level (weight %) for results to be presented at result pages. To be defined in Admin settings.

Enable user suggestion for new Url to become part of Sphider-plus database (addurl by user). To be activated in Admin settings, the user is enabled to suggest sites.

- If enabled, a link at the footer of the result page leads to the suggestion form.

- The user will have to fill in 'Url', 'Title', 'Description' and 'Dispatchers e-mail account'.

- Checked for valid input, DNS availability and MX-RR validation of the dispatcher's account.

Suggested Url will be stored in the Sphider-plus database until Admin decision.

- Suggested sites are presented in Admin submenu 'Approve sites' so that the admin may decide to

- accept

- reject

- ban

- Result of decision will be mailed to the dispatcher (if selected in Admin settings).

- Included is also the submenu 'Banned domains' to refuse all sites not welcome for this search-engine.

Create a sitemap during index/re-index.

- Compatible with http://www.sitemaps.org/schemas/sitemap/0.9 this module automatically creates a sitemap.xml file.

- In Admin settings the folder name for the sitemaps can be defined.

- The xml files will be individually named like 'sitemap_www.abc.de.xml'

- When running a 'Re-index', 'Re-index all' or 'Erase & Re-index', existing sitemaps will be overwritten with the current data set.

Follow sitemap.xml for index / re-index (to be activated in Admin settings).

If a sitemap is available, Sphider-plus will use it to follow all links of that domain. This significantly increases the speed of index and re-index. The mod will also force Sphider-plus to re-index only links that are:

- New and not yet known in Sphider's link table

and

- Links whose 'last modified' date is newer than Sphider's 'last indexed' date.
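
The two selection rules above can be sketched in Python as follows. The data structures are assumptions for illustration: the sitemap is represented as a mapping of URL to 'last modified' date, and the link table as a mapping of URL to 'last indexed' date. The function name is hypothetical.

```python
from datetime import datetime


def links_to_reindex(sitemap_entries, link_table):
    """Select the sitemap links that need to be (re-)indexed.

    sitemap_entries: {url: last modified datetime} (assumed layout)
    link_table:      {url: last indexed datetime}  (assumed layout)
    """
    selected = []
    for url, last_modified in sitemap_entries.items():
        last_indexed = link_table.get(url)
        # Rule 1: the link is new and not yet known in the link table.
        # Rule 2: the page was modified after it was last indexed.
        if last_indexed is None or last_modified > last_indexed:
            selected.append(url)
    return selected
```

All other links of the sitemap are skipped, which is where the gain in speed would come from.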

Search for part of a word by means of * wildcards.

This mod enhances the Sphider-plus capabilities to search also for parts of a word. Invoke this mod by entering a * as first character of your search query. You may use * wildcards like:

*searchme

*searchall*

*search*more*
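
How such wildcard queries could be matched against stored keywords is sketched below in Python. The exact matching rules of Sphider-plus are not described here, so this sketch assumes that each * stands for any run of characters (including none) and that the pattern has to cover the complete keyword. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_search(query, keywords):
    """Return the keywords that match a query starting with '*'."""
    if not query.startswith("*"):
        raise ValueError("a wildcard query has to start with '*'")
    pattern = query.lower()
    # Note: fnmatch also gives '?' and '[...]' a special meaning, which
    # goes beyond the '*' wildcard described in the text.
    return [word for word in keywords if fnmatchcase(word.lower(), pattern)]
```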

Search !strictly for the search query.

Invoke this variant by entering a ! as first character of your search query. If you search for '!plus', only results for the word 'plus' will be presented in the result pages. No results will be shown for words that merely contain it, such as 'spider-plus' or 'spiderplustec'. This is the reverse function of 'Search for part of a word by means of * wildcards'.
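
A minimal Python sketch of the strict variant: only keywords that are identical to the term behind the ! are returned. The function name is hypothetical and case-insensitive comparison is an assumption.

```python
def strict_search(query, keywords):
    """Return only the keywords identical to the term behind '!'."""
    if not query.startswith("!"):
        raise ValueError("a strict query has to start with '!'")
    term = query[1:].lower()
    # No partial matches: 'spider-plus' does not match the term 'plus'.
    return [word for word in keywords if word.lower() == term]
```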

Search for all pages of a site.

This utility searches for all pages that belong to a domain. Start your search query with 'site:' followed by the domain you want to check. Parts of domain names like 'site:www.abc.de' or 'site:abc.de' are also valid search queries. The mod searches all links in Sphider's link table but not the stored keywords. The search output has the same look and feel as usual in Sphider-plus search results.

Enabled search for dates like 2008-11-03 , 03/11/2008 or 03.11.2008

Enabled suggestions also for search queries that contain upper case characters.

Automatically adapt Sphider's dialog to user language.

This mod detects the language of the visitor's client and selects the corresponding language from Sphider's language folder. If not available, Sphider will use the language as defined in Admin settings. Auto-detection may be enabled by checkbox in Admin settings.
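
One common way to detect the client language is the HTTP 'Accept-Language' header. Whether Sphider-plus uses exactly this header is an assumption; the following Python sketch (with a hypothetical function name) only shows the selection logic with a fallback to the language defined in Admin settings.

```python
def pick_language(accept_language, available, default):
    """Choose a dialog language from an Accept-Language header value."""
    candidates = []
    for item in accept_language.split(","):
        piece = item.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a missing q value means highest preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Use only the primary language subtag, e.g. 'de' from 'de-DE'.
        candidates.append((quality, tag.strip().lower().split("-")[0]))
    # Stable sort: equal preferences keep the order sent by the client.
    for _, lang in sorted(candidates, key=lambda c: c[0], reverse=True):
        if lang in available:
            return lang
    return default
```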

Show 'Most popular searches' table at the bottom of result pages. Selectable in Admin settings, the most popular queries are presented on the bottom of each result page. Count of rows for 'Most popular searches' is also to be defined in Admin settings.

Warning message if search string is only found in Url or <title> tag.

If the search string is found only in the title or Url, but not in the HTML body or meta tags, there is no short description for that Url and no possibility to highlight the search string. A warning message will be displayed instead: "Search string was found only in page title or Url." This mod is Admin selectable.

Index only new sites.

Additional item in Admin Sites submenu for bulk indexing of all the new sites that were added since last index/re-index.

Erase & Re-index.

Additional item in Admin Sites submenu that will clear the database and perform a re-index. Clearing the database before the re-index leaves the following untouched:

- Categories

- Query log

- Sites and all options: spider-depth, last indexed, can leave domain, title, description, url must include, url must not include.

Limit max. link count to be indexed for each Url.

In Admin settings the count of links to be followed per Url is selectable. The limit is applied by:

- Index

- Index only the new

Perform a link-check instead of re-index.

Selectable in Admin settings, a fast running link-check can be performed. Unreachable links are automatically deleted from Sphider's database.

Define max. length of title presented in result pages.

An additional input field in Admin "Search Settings" is presented to define this value.

Dynamic adaptation of <title> and <h1> tags.

In order to create an individual title for the result pages, a new input field in Admin settings 'Search Settings' is presented. Additionally the result page <title> in HTML-header is provided with

- User defined title

- Category (if selected)

- Search query

- Page number of results

New Admin Sites Option menu design with additional utilities. Based on the XHTML valid Admin by Peter__LT

3 new template designs selectable in Admin settings. Based on preparatory work by Peter__LT

The template folder contains only those files that are responsible for the design

Additional Admin Sites submenu: List all pages that belong to the selected site. To be found in Sites / Options / Pages a list is shown with:

- Page Url

- Last indexed date

- Page size

Validate all user input for security acceptance. All entries are checked

- Delete quotes

- Place backslash in front of special characters

- Shell commands, XSS attacks and SQL injections are blocked

Additional .htaccess security file. Prepared for:

- Prevent listing of folder content (files)

- Redirect client queries to search.php

- Prevent delivery of internal files

Sort Admin's Site table in alphabetic order. Selectable by checkbox, the table is presented in alphabetic order or by index date.

Export all current Url's from Admin section.

A file 'url.txt' will be created with all existing Url's in folder .../admin/urls/

Import url.txt file from folder .../admin/urls/

The content of file 'url.txt' will be copied into the Sphider-plus database. Existing Url's will be lost and overwritten. The following rules are valid for the url.txt file:

- Url's must be in format: url,spider-depth,category

like:

http://www.abc.de

http://www.abc.de,2

http://www.abc.de,-1,Info

http://www.abc.de,3,Funny things

- Rows must be separated with 'LF'

- Url, spider-depth and category must be separated by commas.

- If you don't specify spider-depth it is automatically set to '-1'.

- Also category is optional. If not specified the new site will be stored without category.

- Not specifying spider-depth but category requires: url,,category-name
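
The rules above can be illustrated with a small Python parser for the comma-separated url.txt format of this version. The function names are hypothetical, and the sketch assumes that the Url itself does not contain a comma.

```python
def parse_url_line(line):
    """Split one row 'url,spider-depth,category' into its three parts."""
    parts = [part.strip() for part in line.split(",", 2)]
    url = parts[0]
    # A missing or empty spider-depth is automatically set to -1.
    depth = int(parts[1]) if len(parts) > 1 and parts[1] else -1
    # The category is optional; None means 'stored without category'.
    category = parts[2] if len(parts) > 2 and parts[2] else None
    return url, depth, category


def parse_url_file(text):
    # Rows are separated by 'LF'; empty rows are skipped.
    return [parse_url_line(row) for row in text.split("\n") if row.strip()]
```

For example, the row `http://www.abc.de,,Info` is read as spider-depth -1 with category 'Info'.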

Delete Spider log. Spider log files now can be deleted separately or as bulk delete.

Added in the submenus:

- Admin / Clean / Clean Spider log

- Admin / Statistics / Spider logs

Search in categories: Four bugs fixed.

The following items are modified for proper function of Sphider's Category search:

- The 'Search' button now also sends the variable to the search script.

- Selecting 'Next' or the other page selections (on bottom of the result page), now transfers also the variable and to the search script.

- The check boxes 'Search only in category . . ' and 'All sites' are no longer pre-selected. So, once selected 'Search only in category . . ', you may now select search result page 2, 3, 'Next' and 'Previous' together with the category search.

- If 'Search only in category . . ' is selected, an additional headline is presented. So the user is informed about the actual situation.

Database Backup and Restore. Bug fix by re-writing the complete Database Management. Before backup, the 'Optimize Database' function is automatically performed.

- Separated folders for each backup task.

- Backups now are stored in individual files for each table.

- Backup utility selectable for: 'Structure only' or 'structure plus data'

- Unlimited file size for restore function is ensured.

- Backup files compatible to phpMyAdmin.

- Optimize Database. Bug fix by re-writing the complete Database Management.

Links that do not contain a page name are now correctly followed (bug fix by BenRosey). Original Sphider does not accept links like <a href="?id=3">link text</a>. Thanks to the bug fix by BenRosey, Sphider-plus follows them correctly.

Links that do not contain a slash at the end of the Url are now correctly followed (bug fix). Original Sphider does not accept links like http://www.abc.de; Sphider-plus adds the required slash automatically, as in http://www.abc.de/
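
Both link cases can be reproduced with Python's standard `urllib.parse` module. The sketch below is not the PHP fix itself; the function name `resolve_link` is hypothetical and only shows the intended result of the two bug fixes.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve_link(base_url, href):
    """Resolve a link found on base_url into an absolute Url."""
    # A query-only link like '?id=3' keeps the page name of base_url.
    absolute = urljoin(base_url, href)
    parts = urlsplit(absolute)
    if parts.path == "":
        # A bare domain gets the required slash at the end.
        parts = parts._replace(path="/")
    return urlunsplit(parts)
```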

Correct template selection for different css files in different template folders (Bug fix).

