Anth's Computer Cave

New search system for the Cave

21st September, 2018

Anth's Computer Cave is deploying AAIMI SiteSearch boxes across the site to help visitors find the articles and tutorials they need.

AAIMI SiteSearch is an open-source drop-in search system built on Python, PHP and JavaScript. It indexes every word on each page of your website and provides search results based on word repetition.

This program is part of the AAIMI SiteMod platform. You can try it out by clicking the Site Search button in the main menu of this page.

The new search system using AAIMI SiteSearch

Why the change?

Until now we have used an older program from the AAIMI Project, AaimiClip.

The old search system using AaimiClip

This worked okay, but it required manually compiling lists of keywords for each page, which is not practical for large sites.

Manually predicting search terms also means that people can only find the terms you thought to suggest. Often visitors will be looking for something you have not expected, and they are out of luck if your search terms are loaded towards what you think is important.

Using AAIMI SiteSearch, which is based purely on word-repetition, visitors can get more spontaneous results.

How it works

AAIMI SiteSearch will be available for download on the 25th of September, so you'll be able to embed it in your own site.

We'll feature a comprehensive setup and usage tutorial then, but for now I'll just give you a brief overview of the system's capabilities.

There are two parts to AAIMI SiteSearch: the Python crawler that scans the server and creates wordlists, and the Python/PHP/JavaScript programs that parse the wordlists and display search results.

The crawler is easy to use: just run the program and it moves recursively through your web directories. It opens each HTML file and separates the content from the HTML tags. It then reads every word of this content and notes the number of times each word occurs in the file.
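To give a rough idea of that crawl-and-count step, here is a minimal sketch using only the Python standard library. It is not the AAIMI SiteSearch source. The names `TextExtractor`, `count_words` and `index_site` are made up for illustration, and the choices to skip script and style content and to store the index as a dictionary of page paths are assumptions.

```python
import os
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between HTML tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Assumption: script and style content should not be indexed.
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words(html_text):
    """Return a Counter of lower-cased words found in the page content."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return Counter(re.findall(r"[a-z0-9']+", text))


def index_site(root):
    """Walk root recursively and count the words in every HTML file."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    index[os.path.relpath(path, root)] = count_words(handle.read())
    return index
```

Calling `index_site("/var/www/html")` (the path is only an example) would return one word-count table per page, keyed by the page's path relative to the web root.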

Once you have your site indexed, you embed the AAIMI SiteSearch HTML into the sidebar of your pages. You'll also need to add the AAIMI SiteSearch JavaScript to your own JavaScript file.

When visitors use the search box, their search terms are sent to the AAIMI SiteSearch Python program, which finds matches and returns results to the browser as pre-formatted HTML.
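The server-side step might look something like the sketch below. This is an illustration only, not the actual AAIMI SiteSearch code. The function names `find_matches` and `render_results`, the index layout (page path mapped to word counts) and the HTML markup of the results are all assumptions.

```python
from html import escape


def find_matches(index, query):
    """Map each page to the search terms it contains (hypothetical helper)."""
    terms = query.lower().split()
    matches = {}
    for page, counts in index.items():
        found = [term for term in terms if counts.get(term, 0) > 0]
        if found:
            matches[page] = found
    return matches


def render_results(matches):
    """Build an HTML fragment the browser can drop straight into the page."""
    if not matches:
        return "<p>No results found.</p>"
    items = []
    for page, found in matches.items():
        link = escape(page, quote=True)  # escape so paths cannot break the markup
        items.append('<li><a href="{0}">{0}</a> ({1})</li>'.format(
            link, escape(", ".join(found))))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Returning ready-made HTML keeps the browser side simple: the JavaScript only has to insert the fragment into the results area.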

Future improvements

AAIMI SiteSearch will be released as a nightly build with a continuous update and upgrade cycle. There are several new options on the bench now.

The results are currently ranked by the number of matching search terms. Using more search terms will produce more results, but it also brings the most relevant results to the top.
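Here is a small sketch of that kind of ranking. Pages are ordered by how many distinct search terms they contain. Breaking ties by total word repetitions, and then by page name, is an assumption for the sake of a stable order, and `rank_pages` is a hypothetical name rather than part of AAIMI SiteSearch.

```python
def rank_pages(index, query):
    """Order pages by the number of distinct search terms they match."""
    terms = set(query.lower().split())
    scored = []
    for page, counts in index.items():
        matched = [term for term in terms if counts.get(term, 0) > 0]
        if matched:
            repetitions = sum(counts[term] for term in matched)
            scored.append((len(matched), repetitions, page))
    # Most matching terms first, then most repetitions, then page name.
    scored.sort(key=lambda entry: (-entry[0], -entry[1], entry[2]))
    return [page for _, _, page in scored]


# Example index: page path mapped to word counts (made-up data).
example_index = {
    "pi.html": {"raspberry": 5, "pi": 9},
    "python.html": {"python": 4, "pi": 1},
    "both.html": {"raspberry": 1, "pi": 1, "python": 1},
}
print(rank_pages(example_index, "raspberry pi python"))
# prints ['both.html', 'pi.html', 'python.html']
```

With a single term such as "pi" all three pages match equally, so only the tie-breakers decide the order. Adding "raspberry" and "python" lets the page that covers every term rise to the top.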

In future builds the word-list method will be just one of the criteria. The program will also look at entire sentences and how their context relates to the current search.

There will also be more exclude options to avoid indexing unwanted pages. At the moment you can exclude entire directories but not single files.
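Directory exclusion of this sort is commonly done by pruning the directory list during the walk. The sketch below shows one way to do it with `os.walk`. The function name `html_files` and the `excluded_dirs` parameter are hypothetical, and matching on the directory name alone (anywhere in the tree) is an assumption, not a description of how AAIMI SiteSearch handles its exclude options.

```python
import os


def html_files(root, excluded_dirs=()):
    """Yield HTML file paths under root, skipping excluded directory names."""
    excluded = set(excluded_dirs)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into
        # the excluded directories at all.
        dirnames[:] = sorted(d for d in dirnames if d not in excluded)
        for name in sorted(filenames):
            if name.lower().endswith((".html", ".htm")):
                yield os.path.join(dirpath, name)
```

For example, `list(html_files("/var/www/html", excluded_dirs=["drafts"]))` would skip every directory named drafts, while the files in all other directories are still listed.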

Stay tuned for more. Leave a comment below if you have any ideas.




