Sphider-plus version 3.2017a - The PHP Search Engine





All required information.

[ Change Log Summary ]

- Current release: 3.2017a


- Former versions:

Version 2.9

Version 2.8

Version 2.7

Version 2.6

Version 2.5

Version 2.4

Version 2.3

Version 2.2

Version 2.1

Version 2.0

 

 

- Older versions:

Version 1.9

Version 1.8

Version 1.7

Version 1.6

Version 1.5

Version 1.4

Version 1.3

Version 1.2

Version 1.1

Version 1.0




Version 1.0.a

Built on Sphider v.1.3.4.b

Bug fixed in function validate_email

File modified for this release:

.../include/commonfuncs.php

 

Version 1.0

Release date: February 15, 2008

Based on the original Sphider v.1.3.4.a by Ando Saabas, the following items are modified:

 

Define a minimum relevance level (weight %) that results must reach to be presented on the result pages. To be defined in Admin settings.

Enable users to suggest a new Url for the Sphider-plus database (addurl by user). To be activated in Admin settings.

- If enabled, a link at the footer of the result page leads to the suggestion form.

- The user has to fill in 'Url', 'Title', 'Description' and 'Dispatchers e-mail account'.

- The input is checked for validity, DNS availability and MX-RR validation of the dispatcher's account.

A suggested Url is stored in the Sphider-plus database until the Admin has decided on it.

- Suggested sites are presented in the Admin submenu 'Approve sites' so that the admin may decide to

- accept

- reject

- ban

- The result of the decision will be mailed to the dispatcher (if selected in Admin settings).

- Also included is the submenu 'Banned domains' to refuse all sites that are not welcome in this search engine.

Create a sitemap during index/re-index.

- Compatible with http://www.sitemaps.org/schemas/sitemap/0.9 this module automatically creates a sitemap.xml file.

- In Admin settings the folder name for the sitemaps can be defined.

- The xml files will be individually named like 'sitemap_www.abc.de.xml'

- When running a 'Re-index', 'Re-index all' or 'Erase & Re-index', existing sitemaps will be overwritten with the current data set.

Follow sitemap.xml for index/re-index (to be activated in Admin settings).

If a sitemap is available, Sphider-plus will use it to follow all links of that domain. This significantly increases the speed of index and re-index. The mod will also force Sphider-plus to re-index only links that are:

- New and not yet known in Sphider's link table

and

- Modified, meaning their 'last modified' date is newer than Sphider's 'last indexed' date.
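
The selection rule above can be illustrated with a short Python sketch. It is not the Sphider-plus PHP code: the function name links_to_reindex and the last_indexed dictionary (a stand-in for Sphider's link table) are assumptions made for this example, and only the date part of 'lastmod' is compared.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

# Namespace of the sitemap protocol named in the change log.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def links_to_reindex(sitemap_xml, last_indexed):
    """Return sitemap URLs that are new or modified since the last index.

    last_indexed maps a known URL to the date it was last indexed
    (hypothetical stand-in for Sphider's link table).
    """
    root = ET.fromstring(sitemap_xml)
    selected = []
    for entry in root.findall("sm:url", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not loc:
            continue
        if loc not in last_indexed:
            selected.append(loc)  # new link, not yet in the link table
        elif lastmod:
            # Compare only the YYYY-MM-DD part of the lastmod value.
            modified = datetime.strptime(lastmod[:10], "%Y-%m-%d").date()
            if modified > last_indexed[loc]:
                selected.append(loc)  # changed since the last index
    return selected


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.abc.de/</loc><lastmod>2008-02-01</lastmod></url>
  <url><loc>http://www.abc.de/old.html</loc><lastmod>2008-01-01</lastmod></url>
  <url><loc>http://www.abc.de/new.html</loc></url>
</urlset>"""

KNOWN_LINKS = {
    "http://www.abc.de/": date(2008, 1, 15),
    "http://www.abc.de/old.html": date(2008, 2, 10),
}

if __name__ == "__main__":
    print(links_to_reindex(SAMPLE_SITEMAP, KNOWN_LINKS))
```

In this sample, the start page is selected because it was modified after its last index date, new.html because it is unknown, and old.html is skipped.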

Search for part of a word by means of * wildcards.

This mod extends Sphider-plus so that it can also search for parts of a word. Invoke this mod by entering a * as the first character of your search query. You may use * wildcards like:

*searchme

*searchall*

*search*more*
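
One plausible reading of these wildcard queries, sketched in Python: each * stands for any run of characters, and the whole keyword has to match. How Sphider-plus actually evaluates the wildcards is not stated here, so the matching rule and the function names wildcard_to_regex and match_keywords are assumptions for illustration only.

```python
import re


def wildcard_to_regex(query):
    """Translate a query such as '*search*more*' into a compiled regex.

    Assumption: '*' matches any run of characters (including none);
    everything else is taken literally.
    """
    literal_parts = [re.escape(part) for part in query.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE)


def match_keywords(query, keywords):
    """Return the keywords that match the wildcard query as a whole."""
    pattern = wildcard_to_regex(query)
    return [word for word in keywords if pattern.fullmatch(word)]


if __name__ == "__main__":
    words = ["search", "searchme", "researchme", "searchall", "searchformore"]
    print(match_keywords("*searchme", words))
    print(match_keywords("*search*more*", words))
```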

Search !strictly for the search query.

Invoke this variant by entering a ! as the first character of your search query. If you search for '!plus', only results for the word 'plus' will be presented on the result pages. Results for words that merely contain it, such as 'spider-plus' or 'spiderplustec', will not be shown. This is the reverse of 'Search for part of a word by means of * wildcards'.
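
A minimal Python sketch of the strict variant, assuming that a "word" is a whitespace-separated token with surrounding punctuation removed. The tokenising rule and the function name strict_match are assumptions for this example, not the Sphider-plus implementation.

```python
def strict_match(query, text):
    """Return True if the text contains the query as a complete word.

    The leading '!' only switches on strict mode and is not searched for.
    Assumption: words are whitespace-separated tokens, so 'spider-plus'
    counts as one word and does not match '!plus'.
    """
    word = query.lstrip("!").lower()
    tokens = [token.strip(".,;:!?()\"'").lower() for token in text.split()]
    return word in tokens


if __name__ == "__main__":
    print(strict_match("!plus", "Two plus two"))
    print(strict_match("!plus", "The spider-plus crawler"))
```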

Search for all pages of a site.

This utility searches for all pages that belong to a domain. Start your search query with 'site:' followed by the domain you want to check. Parts of domain names are also valid, so both 'site:www.abc.de' and 'site:abc.de' are valid search queries. The mod searches all links in Sphider's link table, but not the stored keywords. The search output has the same look and feel as the usual Sphider-plus search results.
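
The 'site:' lookup can be sketched in Python as a substring match on the host part of each stored link. The list of links stands in for Sphider's link table, and the function name pages_for_site is hypothetical.

```python
from urllib.parse import urlsplit


def pages_for_site(query, links):
    """Return all links whose host contains the domain given after 'site:'.

    Returns None when the query is not a 'site:' query, so the caller
    can fall back to the normal keyword search.
    """
    prefix = "site:"
    if not query.lower().startswith(prefix):
        return None
    domain = query[len(prefix):].strip().lower()
    if not domain:
        return []
    # Only the host is compared, never the path or the page keywords.
    return [link for link in links if domain in urlsplit(link).netloc.lower()]


if __name__ == "__main__":
    table = [
        "http://www.abc.de/index.html",
        "http://abc.de/contact.html",
        "http://www.xyz.com/abc.de",
    ]
    print(pages_for_site("site:abc.de", table))
```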

Enabled search for dates like 2008-11-03, 03/11/2008 or 03.11.2008

Enabled suggestions also for search queries that contain upper case characters.

Automatically adapt Sphider's dialog to user language.

This mod detects the language of the visitor's client and selects the corresponding language from Sphider's language folder. If that language is not available, Sphider will use the language defined in Admin settings. Auto-detection may be enabled by a checkbox in Admin settings.
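
As an illustration, the sketch below assumes that the client language is taken from the HTTP Accept-Language header, which the change log does not state explicitly. The function name pick_language and the set of available languages are likewise assumptions.

```python
def pick_language(accept_language, available, default):
    """Pick the best available language for an Accept-Language header.

    available: language codes present in the language folder (assumed).
    default:   the language defined in Admin settings.
    """
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q value is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            candidates.append((-quality, position, tag))
    # Highest weight first; header order breaks ties.
    for _, _, tag in sorted(candidates):
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # 'de-de' falls back to 'de'
        if primary in available:
            return primary
    return default


if __name__ == "__main__":
    print(pick_language("de-DE,de;q=0.9,en;q=0.8", {"en", "de", "fr"}, "en"))
```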

Show 'Most popular searches' table at the bottom of result pages. Selectable in Admin settings, the most popular queries are presented on the bottom of each result page. Count of rows for 'Most popular searches' is also to be defined in Admin settings.

Warning message if search string is only found in Url or <title> tag.

If the search string is found only in the title or Url, but not in the HTML body or meta tags, there is no short description for that Url and therefore nothing in which to highlight the search string. A warning message will be displayed instead: "Search string was found only in page title or Url." This mod is Admin selectable.

Index only new sites.

Additional item in Admin Sites submenu for bulk indexing of all the new sites that were added since last index/re-index.

Erase & Re-index.

Additional item in Admin Sites submenu that will clear the database and perform a re-index. Clearing the database before the re-index leaves the following untouched:

- Categories

- Query log

- Sites and all options: spider-depth, last indexed, can leave domain, title, description, url must include, url must not include.

Limit max. link count to be indexed for each Url.

In Admin settings the count of links to be followed per Url is selectable. The limit is respected by:

- Index

- Index only the new

Perform a link-check instead of re-index.

Selectable in Admin settings, a fast running link-check can be performed. Unreachable links are automatically deleted from Sphider's database.

Define max. length of title presented in result pages.

An additional input field in Admin "Search Settings" lets the Admin set this length.

Dynamic adaptation of <title> and <h1> tags.

In order to create an individual title for the result pages, a new input field in Admin settings 'Search Settings' is presented. Additionally the result page <title> in HTML-header is provided with

- User defined title

- Category (if selected)

- Search query

- Page number of results
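
A small Python sketch of how such a title could be assembled from the four parts listed above. The order follows the list; the separator, the "Page" wording and the function name build_result_title are assumptions for illustration.

```python
def build_result_title(user_title, query, page, category=None):
    """Compose the result page <title> from its parts.

    The category is only included when one has been selected.
    """
    parts = [user_title]
    if category:
        parts.append(category)
    parts.append(query)
    parts.append(f"Page {page}")
    return " - ".join(parts)


if __name__ == "__main__":
    print(build_result_title("My Search", "honeybee", 2, category="Info"))
```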

New Admin Sites Option menu design with additional utilities. Based on the XHTML valid Admin by Peter__LT

3 new template designs selectable in Admin settings. Based on preparatory work by Peter__LT

The template folder contains only those files that are responsible for the design

Additional Admin Sites submenu: List all pages that belong to the selected site. Found under Sites / Options / Pages, the list shows:

- Page Url

- Last indexed date

- Page size

Validate all user input for security. All entries are checked:

- Delete quotes

- Place backslash in front of special characters

- Shell commands, XSS attacks and SQL injections are blocked

Additional .htaccess security file. Prepared for:

- Prevent listing of folder content (files)

- Redirect client queries to search.php

- Prevent delivery of internal files

Sort Admin's Site table in alphabetic order. Selectable by checkbox, the table is presented in alphabetic order or by index date.

Export all current Urls from the Admin section.

A file 'url.txt' containing all existing Urls will be created in folder .../admin/urls/

Import the url.txt file from folder .../admin/urls/

The content of the file 'url.txt' will be copied into the Sphider-plus database. Existing Urls will be lost and overwritten. The following rules apply to the url.txt file:

- Each Url must be in the format: url,spider-depth,category

like:

http://www.abc.de

http://www.abc.de,2

http://www.abc.de,-1,Info

http://www.abc.de,3,Funny things

- Rows must be separated with 'LF'

- Url, spider-depth and category must be separated by commas.

- If you don't specify spider-depth it is automatically set to '-1'.

- Also category is optional. If not specified the new site will be stored without category.

- Not specifying spider-depth but category requires: url,,category-name
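
The rules above can be summarised in a short Python sketch of a parser for url.txt. It is an illustration only, not the Sphider-plus import code; the function names are hypothetical, and the sketch assumes that the Url itself contains no comma.

```python
def parse_url_line(line):
    """Parse one 'url,spider-depth,category' row into a tuple.

    A missing spider-depth becomes -1, a missing category becomes None.
    """
    fields = [field.strip() for field in line.split(",", 2)]
    url = fields[0]
    depth_text = fields[1] if len(fields) > 1 else ""
    category = fields[2] if len(fields) > 2 else ""
    depth = int(depth_text) if depth_text else -1
    return url, depth, category or None


def parse_url_file(text):
    """Parse the whole file; rows are separated by LF."""
    return [parse_url_line(row) for row in text.split("\n") if row.strip()]


if __name__ == "__main__":
    sample = "http://www.abc.de,2\nhttp://www.abc.de,,Info\n"
    for row in parse_url_file(sample):
        print(row)
```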

Delete Spider log. Spider log files now can be deleted separately or as bulk delete.

Added in the submenus:

- Admin / Clean / Clean Spider log

- Admin / Statistics / Spider logs

Search in categories: Four bugs fixed.

The following items are modified for proper function of Sphider's Category search:

- The 'Search' button now also sends the variable to the search script.

- Selecting 'Next' or the other page selections (on bottom of the result page), now transfers also the variable and to the search script.

- The check boxes 'Search only in category . . ' and 'All sites' are no longer pre-selected. Once 'Search only in category . . ' has been selected, you may now move to result page 2, 3, 'Next' and 'Previous' while staying in the category search.

- If 'Search only in category . . ' is selected, an additional headline is presented to inform the user that the search is restricted to that category.

Database Backup and Restore. Bug fix by re-writing the complete Database Management. Before backup, the 'Optimize Database' function is automatically performed.

- Separated folders for each backup task.

- Backups now are stored in individual files for each table.

- Backup utility selectable for: 'Structure only' or 'structure plus data'

- Unlimited file size for restore function is ensured.

- Backup files are compatible with phpMyAdmin.

- Optimize Database. Bug fix by re-writing the complete Database Management.

Links that do not contain a page name are now followed correctly (bug fix by BenRosey). The original Sphider does not accept links like <a href="?id=3">link text</a>. Thanks to BenRosey's bug fix, Sphider-plus follows them correctly.

Links that do not contain a slash at the end of the Url are now followed correctly (bug fix). The original Sphider does not accept links like: http://www.abc.de Sphider-plus adds the required slash automatically, like: http://www.abc.de/
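
Both link fixes can be illustrated with Python's standard URL helpers. This is a sketch of the expected behaviour, not the actual PHP fix; the function names add_root_slash and resolve_link are hypothetical.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def add_root_slash(url):
    """Add the missing '/' to a Url that consists of a host only."""
    parts = urlsplit(url)
    if parts.netloc and not parts.path:
        parts = parts._replace(path="/")
    return urlunsplit(parts)


def resolve_link(page_url, href):
    """Resolve an href found on page_url, including query-only links."""
    return add_root_slash(urljoin(page_url, href))


if __name__ == "__main__":
    print(add_root_slash("http://www.abc.de"))
    print(resolve_link("http://www.abc.de/index.php", "?id=3"))
```

A query-only link such as ?id=3 keeps the path of the page it was found on, which matches standard relative reference resolution.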

Correct template selection for different css files in different template folders (Bug fix).

