Introducing The Walhello Search Engine

Walhello is a spider-based search engine for the whole web. Its index is one of the largest in the world and forms the basis of the Walhello search service. In addition to basic search functionality, which matches keywords against the most relevant web pages, Walhello provides the following integrated functionality:
• News Search in news resources
• Picture Search
• A categorised Web Directory
• Product search of on-line shops
• An integrated reference with knowledge, articles and discussion boards related to keywords
• An answering engine that answers specific questions
The size of the index is continuously growing, and the quality of the services is improved through research on mathematical ranking algorithms and knowledge technology.

The beginning

The World Wide Web contains billions of documents with publicly accessible, useful information and knowledge. The problem, however, is that data on web pages is not well structured, which makes it difficult to find relevant information and to use the information on the Internet effectively. Walhello.com started developing the Walhello (Valhalla + Web + Hello) website in March 2000 as a research & development project. The objective of this project was to structure the Internet and to provide services to Internet users by granting access to this "structured Internet".

Downloading data (Appie spider)

As a first step, the Appie spider was developed. It automatically downloads data from the World Wide Web by extracting links from web pages and then downloading the pages those extracted URLs point to. Currently millions of web pages, including PDF and Word documents, are downloaded and indexed daily.
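The link-following loop described above can be illustrated with a minimal Python sketch. It is not the Appie spider itself: the names `LinkExtractor`, `extract_links` and `crawl` are hypothetical, and `fetch` is an assumed callable that returns the HTML for a URL (or `None` on failure), so that the download mechanism stays out of the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs linked from one page, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. `fetch` is an assumed callable: url -> HTML or None."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be downloaded; skip it
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

A production spider would also need politeness delays, robots.txt handling and handlers for non-HTML formats; those are omitted here.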

Building World Wide Web Index (Classical Search Engine)

To structure the downloaded data, software was developed that parses it and extracts the following information from each page:
• Words and Locations of words on web pages
• Languages (about 40 languages are supported)
• Links between web pages
This information was stored in a large database, which was introduced on the Internet in June 2000 as a classical search engine to help users find web sites matching a search query. Mathematically advanced ranking algorithms ensure that the most relevant results are shown first. At present the index contains about 2 billion web pages and is continually growing. The database runs on an efficient architecture consisting of a cluster of inexpensive Linux servers. This architecture provides short response times through parallelized processing, as well as scalability and fault tolerance. At present all application software is developed by Walhello in C/C++. Many years of research in compiler optimisation have resulted in efficient, high-performing and reliable software on cheap hardware.
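To make the idea of storing "words and locations of words" concrete, here is a small sketch of a positional inverted index in Python. It is only an illustration under simple assumptions (in-memory dictionaries, character offsets as locations, no ranking); the function names `build_index` and `search` are hypothetical and say nothing about how the Walhello database is actually organised.

```python
import re
from collections import defaultdict

WORD = re.compile(r"\w+")


def build_index(pages):
    """pages: dict url -> plain text. Returns word -> {url: [character offsets]}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for match in WORD.finditer(text.lower()):
            index[match.group()].setdefault(url, []).append(match.start())
    return index


def search(index, query):
    """Return the set of URLs that contain every word of the query (unranked)."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], {}))
    for word in words[1:]:
        result &= set(index.get(word, {}))
    return result
```

For example, indexing two short pages and calling `search(index, "linux servers")` returns the URLs of the pages that contain both words.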

Ranking based on clustering and distance (Advanced Ranking Algorithms)

Based on research, we concluded that information about a given topic is clustered on adjacent web pages. Research also showed that these clusters are unique to each search query. Walhello developed technology that dynamically identifies clusters, and then determines the size and relevance of each cluster, for every search query performed. The ranking of a web page is based, in addition to characteristics of the page itself, on the distance of the page to clusters and on the relevance of these clusters. Walhello is researching computational challenges to improve the ranking of web pages based on this clustering technology.
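The actual Walhello algorithms are not described here, so the following Python sketch only illustrates the general idea under loud assumptions: a "cluster" is taken to be a connected group of pages that match the query, its relevance is taken to be its size, and "distance" is the number of links between a page and the nearest page of the cluster. All function names and the scoring formula `size / (1 + distance)` are hypothetical.

```python
from collections import defaultdict, deque


def undirected(links):
    """links: iterable of (source, target) URL pairs; direction is ignored."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_clusters(adj, matching):
    """Connected groups of matching pages (assumed stand-in for query clusters)."""
    clusters, seen = [], set()
    for start in sorted(matching):
        if start in seen:
            continue
        cluster, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            page = queue.popleft()
            for nxt in adj.get(page, ()):
                if nxt in matching and nxt not in seen:
                    seen.add(nxt)
                    cluster.add(nxt)
                    queue.append(nxt)
        clusters.append(cluster)
    return clusters


def distances_from(adj, sources):
    """Link distance from the nearest page in `sources` to each reachable page."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        page = queue.popleft()
        for nxt in adj.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def cluster_scores(links, matching):
    """Score each page by summing cluster size / (1 + distance) over all clusters."""
    adj = undirected(links)
    scores = defaultdict(float)
    for cluster in find_clusters(adj, matching):
        for page, d in distances_from(adj, cluster).items():
            scores[page] += len(cluster) / (1 + d)
    return dict(scores)
```

In this toy model a page close to a large cluster of matching pages scores higher than a page near a small, isolated match, which is the behaviour the paragraph above describes in words.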

Integrating additional Search Services

To extend its search services, Walhello has integrated the DMOZ Open Directory and products sold by several leading on-line shops, including Amazon.com and Allposters.com, into the Walhello search engine. There are plans to integrate other external information sources as well.

Knowledge Engineering

Current search engines rely mainly on mathematical algorithms to determine relevancy. Humans, however, use knowledge to determine the relevancy of web pages. Walhello therefore started building an object-oriented knowledge base containing knowledge that can be used to better understand the semantic meaning of the content of web pages. Combined with technology for syntactic and semantic parsing of natural language, this knowledge base can be used to retrieve new knowledge and to detect potential data inconsistencies. As a new service, some knowledge objects from the Walhello knowledge base have been integrated into the Walhello search engine and made available on the Internet.

Advanced Search Option

Walhello has added proximity search as an advanced search option. Proximity search lets users define the maximum distance (in number of characters) between the search terms, and is a mixture of standard keyword search and string search.
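A proximity check of this kind can be sketched in a few lines of Python. The sketch makes assumptions the text does not state: matching is case-insensitive, the terms must appear in the order given, and the distance is measured from the end of one term to the start of the next. The names `occurrences` and `proximity_match` are hypothetical.

```python
def occurrences(text, term):
    """Start offsets of every case-insensitive occurrence of term in text."""
    text, term = text.lower(), term.lower()
    starts, pos = [], text.find(term)
    while pos != -1:
        starts.append(pos)
        pos = text.find(term, pos + 1)
    return starts


def proximity_match(text, terms, max_gap):
    """True if the terms occur in order with at most max_gap characters
    between the end of each term and the start of the next one."""

    def extend(i, prev_end):
        if i == len(terms):
            return True
        for start in occurrences(text, terms[i]):
            if prev_end is None:
                fits = True  # the first term may appear anywhere
            else:
                fits = 0 <= start - prev_end <= max_gap
            if fits and extend(i + 1, start + len(terms[i])):
                return True
        return False

    return bool(terms) and extend(0, None)
```

With a gap of 0 and a separating space included in the terms this behaves like a string search, and with a very large gap it approaches an ordered keyword search, which matches the "mixture" described above.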

Maintaining Reference Information for search queries

Walhello has started to build and maintain reference information related to each search query in order to improve the search experience. This information consists of:
• References to news articles that match the search query
• References to products that match the search query
• Knowledge, articles and feedback on search queries obtained from users


More information

If you want to know more about the Walhello Services you can send an e-mail to walhello@walhello.com.



 
Copyright (c) 2000-2005 Walhello.com, All rights reserved