~ Main search engines ~
Updated February 2004
    
SEARCH ESSAYS OF CHOICE
effective queries ~ Search Engines Anti-Optimization ~ Fishing for troubles ~ Music searching ~ Catching the rabbit's ears ~ When your search fails ~ Follow Links in the Underground ~ Google's wild side ~ Using Fuzzy Logic ~ A Re-ranking trilogy ~ Searching scarcity ~
Instructions   ~   Ok:flange of myth ~   pda searches

Fravia's searching MAPA (masks and pages)
Best s.e.
Google  
 ¤[GOOGLE]¤
Fast  
 ¤[FAST]¤
Teoma  
 ¤[TEOMA]¤
Best/Main s.e.
Inktomi
¤[INKTOMI]¤
Hotbot
¤[HOTBOT]¤
Northernlight (dead...? :-)
Lycos
Yahoo!
Alta!
Adva  Simple
Useful s.e.
Openfind stale :-(
Gigablast stale :-(
Yuntis (a next-generation engine)
Auxil s.e.
Kart00 (graph)
Touch (graph)
MSNsearch
Ouverture
Looksmart
Excite (ill)
Other
Wayback
Wisenut
[FTPSEARCH]
@ PHP
¤[Our searching scrolls!]¤
[600 engines for next to nothing]
@ fravia's
Targets
Local
Regional
Compound
Usenet
Accmail
@ fravia's
Live searches
Page Providers
Combing
Details
Databases
Allinones
@ fravia's
Images
Books
Laws
Files
Filez
Passwords

WFR: Width, Freshness and Relevance
Good Width: Google / Fast / Openfind / Wisenut
Narrow: Altavista / Northernlight / Yahoo
Freshy and Crispy: Fast / Teoma / Gigablast / Yuntis
Stale: Openfind / Wisenut / Gigablast
Relevant results: Teoma / Fast / Google / Hotbot / Inktomi / Yuntis
Spammed heavily: Altavista / MSNsearch / Google
Rare results: Hotbot / Inktomi
NB: There are few main search engines without crap paid results in the SERPS


 Always 100 rez
    fastsearching for:

Instructions & Caveats

Just copy this page onto your hard disk as c:\main.htm (or whatever), and then use it (after having edited anything you fancy) to perform EFFECTIVE searches on the web (and elsewhere) using the main search engines.

Note that just because one, a hundred, or a thousand pages from a site have been crawled and made searchable through one of the main search engines, this does not guarantee that every page of that site has really been crawled and indexed. This shortcoming hits not only 'new' pages, which can take MONTHS to be indexed: beehives of spiders harvesting a site often MISS whole subdirectories, old and new. Useful material may be all but invisible to those who use only 'main' search tools. Moreover, anyone who regularly uses Google (for instance, but other search engines are not that different) will have noticed how DISTURBING commercial sites' results are nowadays. Should a search engine offer a "please eliminate all commercial sites from the SERPs" (Search Engine Result Pages) option, or switch, it would become king of the hill in a couple of months.

Anyway, you would be well advised to use regional engines, usenet and other specialized or targeted search tools, combing techniques and your own bots as well when searching your various targets.

Note that you can also easily search and find targets that do not exist any more :-)

Google, or Teoma or Fast? That is the question.
The only correct answer is: "both... (troth?) and many more".
Go forth, explore the main search engines, and remember the differences:

One Google to rule them all with broad might, and americanocentric censorship, one Teoma to refine and refine and refine the queries and find them, one Fast for the obscure deep web, to bring them all in the depth, and in the darkness bind them...

old seekers' lore, does not take into account good ole hotbot, btw.


SEARCH ENGINES FORMS
(Use the MAPA to navigate)


ALTAVISTA ADVANCED SEARCH [Only 400 results viewable]
AND,OR,(),NOT,NEAR,",*
link:text (search for links to 'text')
anchor:text (search for links with the description 'text')
url:text (search for given text in the url)
domain:targetdomain (search files within 'targetdomain')
host:hostname (search files on 'hostname')
title:text (search 'text' inside the title tags)
applet:text (search Java applets named 'text')
image:filename (search images with such 'filename')

Read the Altavista in depth page!
Spammed as if there were no tomorrow & very badly commercialized.
The idiots behind altavista's marketing managed to ruin the best search engine of the middle nineties.
It is still THE ONLY search engine which is TRULY BOOLEAN, hence offering truly amazing opportunities to real seekers... once you have taken care yourself of the spam.

Altavista algos' main drawback is that they are very easy to spam, so you'll get mostly useless results in the first 20-30 positions: "hic alta, hic salta" (a seekers' proverb)... experienced searchers mostly jump directly into the middle of altavista's results lists.
Altavista is the 'dead links champion' among the 'main' search engines. Use the Simple search (which defaults to OR) ONLY if you really know what you are doing :-)

Mask fields: Boolean query ~ Sort by ~ Language ~ Show one result per Web site ~ From / To (e.g. 31/12/99)

Simple search - Graphic Version
ALTAVISTA SIMPLE SEARCH [Only 400 results viewable]
For boolean operators, and more info, use Advanced Altavista instead!


Search - Advanced



KartOO
A "graphical" search engine with rather interesting result clusters.
Here follows the text search form, but by all means try the cartographic interface.

Worldwide web ~ English web ~ more options


Try Openfind
Truly staggering results...
"MySearch" lets users register the terms they are interested in. The system automatically searches the new-page database every day and notifies the user whenever any of the registered queries is matched. A What's New system is also provided.



What's New Daily Search Preference
All Pages English Pages


Looksmart ~ For instance: searchlores
Quite commercially oriented... powered by Inktomi... but uses its own databases!
Search for    

Yuntis

(a next-generation engine)
Yuntis computes three kinds of global linkage-based web page scores (reputation, credibility, and portality), serves them to the user, and applies them to the ordering of query results. These page scores are derived from the properties of the web linkage graph for the set of crawled pages as a whole. The score computation algorithm used is more general and accurate than the published description of Google's PageRank method.
Reputation and credibility scores provide good estimates of, respectively, the overall importance of a web page and the level to which the (meta) information on a page can be trusted, as implicitly expressed through hyperlinks by the web site authors on the pages examined by Yuntis.


Look for          Help
Show  results using the  order in  mode

For instance: http://yuntis-wrld.ecsl.cs.sunysb.edu/search?q=fravia&s=r&e=15&o=r&m=c

The Wayback machine
This is not only a -powerful- search engine, but also an incredible stalking tool! Explore the Net as it was!


YAHOO [Only 677 results viewable]
",*

Yahoo recognized the tragic mistake of going commercial and went 'back to basics' in late 2002 (better late than never). Now, combined with Google, it seems to be gaining momentum.

EXCITE [Only 4011 results viewable]
AND,OR,(),NOT,,",
Excite is a classical example of just another 'ignoble corporate merger'. Just click on the link above and look at it! See? Idiotic & useless, obsolete (late-nineties) 'portal' approach. As a consequence it ceased to be a major player in January 2002, when Infospace killed it by injecting tons of paid search results. This applies to all mergers btw: attempts to escape the fate of all pyramid schemes, which always forebode catastrophes. Recently the Italians and Germans at Tiscali have tried to revamp this engine on the sunset boulevard. It is still full of pay-per-click crap, so no one in his right mind uses it.


 Web Search 
exclude words 
search in 

excite image search (powered by fast)

 Image Search 
Format  ALL  JPEG  GIF  BMP 
Type  ALL  COLOR  B/W  LINE ART 

Visit the ad hoc GOOGLE section
WARNING: Google has been moved to its specific page, where you will find a wealth of information. Here only the mask:

Simple Google


        
Advanced GOOGLE
(only 3% of the users take advantage of it, poor zombies :-)

G. Univ search  ~  G. Classical :-)

and a nice "GoogleRanking" bookmarklet: internet+searching


Googlette:


LYCOS [As many results viewable as you get!]
AND,OR,(),NOT,NEAR,",

"Part Man, Part Machine" ~ Open Directory & DMOZ used. Relies mainly on Fast's index, though updated at longer intervals than FAST itself. Major sin: it has closed the VERY useful Trondheim ftp-search facility.

Lycos advanced: fields    Lycos advanced: language    Lycos advanced: link referrals
Lycos help page
Gigablast
Most recent search engine, quite good, it seems. HEY! It has a cache, like Google!
Search for...
all of these words
this exact phrase
and this exact phrase
any of these words
none of these words
Sort by date
Restrict to this Site
Restrict to this URL
Pages that link to this URL
Site Clustering yes   no
Number of summary excerpts 0   1   2   3   4
Results per Page 10   20   30   40   50


TOUCHGRAPH

A graphical map of incoming and outgoing links, still in beta, uses google.
http://www.touchgraph.com/TGGoogleBrowser.html

INKTOMI RAW

Mighty 'raw' access to Inktomi's data

(pointed out by Sally)

Search @ http://169.207.238.189/search.cfm...



Visit the ad hoc HOTBOT section

WARNING: Hotbot has been moved to its specific page, where you will find a wealth of information. Here only the mask:

or use some...
...ADVANCED WEB FILTERS FOR HOTBOT
Language: Limit results to a specific language.
Domain/Site (Include ~ Exclude): Return results in a specific domain (e.g. wired.com) or top-level domain (e.g. .gov). Multiple domains/sites may be specified, separated by commas.
Region: Limit results to a specific continent or country.
Word Filter: Limit results to pages containing/excluding the words specified. Limit your query to specific parts of pages.
Date: Limit results to pages published within a specified period of time.
Page Content: Return only pages containing the specified media types or technologies.
Audio ~ Image ~ Java ~ MP3 ~ MS Excel ~ MS PowerPoint ~ MS Word ~ PDF (Acrobat) ~ RealAudio/Video ~ Script ~ Shockwave/Flash ~ Video ~ WinMedia ~ Specific Extension (e.g. .gif)
Block Offensive Content: Prevent pages containing offensive content from being returned.
Always ~ Sometimes (for non-ambiguous, offensive content) ~ Never


NORTHERN LIGHT [You can (awkwardly) view its deep results]
AND,OR,(),NOT,",

Now defunct for the mass-public, alas, because the search service has been taken offline :-(
Yet we can still use it, see below.
Had a unique folders feature (dynamically generated by the search results!) to refine your query (very useful & powerful). Note that this engine automatically recognized and searched variants and plurals of your query.

   
Select:
All Sources: Search the World Wide Web & Special Collection
World Wide Web: Search the entire World Wide Web
Special Collection: 1 million articles not on other search engines

northern light power search ...and other commercial problems...
Northernlight's powersearch has been recently limited to the 'premium collection' crap
Should Northernlight not work, (or rather not seem to work), try using the following query, and then substitute 'searchlore' with whatever you are searching IN THE ADDRESS FIELD of your browser (not in the html page form)...
http://www.northernlight.com/nlquery.fcg?qr=searchlore
But the substitution MUST BE MADE 'by hand' on the string you use, not on the resulting mask 'offered' by northernlight (and you had better use a cookie killer à la proxomitron as well).
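
The 'by hand' substitution described above can also be scripted. What follows is only a minimal sketch: the helper name northernlight_url is made up for illustration, and it assumes (this is not confirmed anywhere on this page) that the engine accepts ordinary form encoding, with spaces turned into '+', in the qr value.

```python
from urllib.parse import quote_plus

# Base of the query string quoted above; only the value after "qr=" changes.
NL_BASE = "http://www.northernlight.com/nlquery.fcg?qr="


def northernlight_url(query: str) -> str:
    """Return the string to paste into the browser's address field.

    Hypothetical helper: form-style encoding of the query (spaces as '+')
    is an assumption about what the engine accepts.
    """
    return NL_BASE + quote_plus(query)


if __name__ == "__main__":
    print(northernlight_url("searchlore"))
```

The result is meant to be pasted into the address field of the browser, exactly as the substitution 'by hand' would be.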
Visit the ad hoc FAST section
WARNING: Fast has been moved to its specific page, where you will find a wealth of information. Here only the mask:
Search for:  
After having chosen the "boolean expression" option, you can use AND, OR, ANDNOT, and parentheses (!!) for nesting

FAST's [Advanced Search] is truly amazing and second to none.

FAST's [help] FAST's [news] FAST's [pictures]

OUVERTURE

This used to be "Go To", the commercial clowns changed the name because this "reinforces our leadership in performance-based search", haha :-) Uses Inktomi, like Hotbot. Ranks results by how much a company is willing to pay for listings and is heavily akamai infested :-(
The mask below is relatively "clean": you should NEVER use ouverture's own site mask to perform your searches (it has visitor-tracking, sniffing and annoying logging options aplenty).
 

WEBTOP
Winter 2001: For some nutty reason webtop, a 'sleeping giant' that had very good search algos and a huge database, has been discontinued (or, more probably, privatized). Click this link and ask them why...
http://www.smartlogik.com/commercial_bastards_why_did_you_discontinue_webtop?.htm

Example string:
http://www.webtop.com/search/vanilla/results.htm?WEBSITE_SEARCH=1&QUERY=fravia&EXPANDED=web&Search.x=40&Search.y=10
help  powersearch
European search engine developed at Cambridge uni. Runs on Linux (of course :-). Probability and Bayesian inference applied to the search process. Hence no booleans: beware! Its results may be utterly weird because, instead of the traditional method of searching for a matched keyword in a document, the 'probabilistic techniques' focus on the relative value of a word, either in the search expression or in the document being indexed.
   "within the Web Zone"

WISENUT [Only 300 results viewable]
default to AND     phrase searching: use ""     use - for NOT    
no truncation     use + to force stopwords

Example string:
http://www.wisenut.com/search/query.dll?q=%22advanced+searching+techniques%22
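
The syntax rules above (default AND, "" for phrases, - for NOT, + to force stopwords) are enough to compose such strings mechanically. This is only a sketch: wisenut_query and wisenut_url are hypothetical helper names, and the only parameter assumed is the q= seen in the example string.

```python
from urllib.parse import quote_plus

# Base taken from the example string above.
WISENUT_BASE = "http://www.wisenut.com/search/query.dll?q="


def wisenut_query(words=(), phrase=None, exclude=(), stopwords=()):
    """Compose a query using the syntax listed above (hypothetical helper)."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')               # phrase searching: use ""
    parts.extend(words)                           # plain words default to AND
    parts.extend(f"+{w}" for w in stopwords)      # + forces a stopword
    parts.extend(f"-{w}" for w in exclude)        # - means NOT
    return " ".join(parts)


def wisenut_url(query: str) -> str:
    """Form-encode the query and append it to the base string."""
    return WISENUT_BASE + quote_plus(query)


if __name__ == "__main__":
    print(wisenut_url(wisenut_query(phrase="advanced searching techniques")))
```

With only a phrase given, the sketch reproduces the example string above character for character.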

WiseNut is a new "Korean/Japanese" 'main' search engine. It has good customization features and one single huge database of indexed Web pages. It lacks almost all advanced search capabilities, yet it seems useful because it gives results that you will not find elsewhere.
Search for Web pages... 
... WITH ALL of these words
... WITHOUT ANY of these words
... WITH this EXACT PHRASE


Visit the ad hoc TEOMA section
WARNING: Teoma has been moved to its specific page, where you will find a wealth of information. Here only the mask:

 
Find this Phrase
advanced teoma!! (& advanced search tips)
MSNsearch

Example string:
http://search.msn.com/results.asp?q=%22advanced+searching%22&FORM=SMCA&cfg=SMCINK&v=1&ba=0&f=any&sort=&rgn=&lng=&dom=&depth=&d0=&d1=&cf=
Note however that, as usual with Microsoft's misbehaviour, the PREVIOUS QUERY you made is carried inside the new query string (the origq parameter)...
http://search.msn.com/results.asp?q=%22Microsoft+sniffing+practices%22&origq=%22advanced+searching%22&RS=CHECKED&FORM=SMCRT&v=1&nosp=0&cfg=SMCINITIAL
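
To see the leak for yourself, you can pull the previous query out of the string, or drop it before sending the string anywhere. A minimal sketch follows: both function names are made up for illustration, and nothing here says the engine will still answer once origq is removed.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit

# The second example string from above, the one that carries the previous query.
LEAKY = ("http://search.msn.com/results.asp?q=%22Microsoft+sniffing+practices%22"
         "&origq=%22advanced+searching%22&RS=CHECKED&FORM=SMCRT&v=1&nosp=0&cfg=SMCINITIAL")


def previous_query(url: str):
    """Return the decoded origq value, or None when it is absent."""
    values = parse_qs(urlsplit(url).query).get("origq")
    return values[0] if values else None


def strip_previous_query(url: str) -> str:
    """Rebuild the string without origq, keeping the other parameters in order."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != "origq"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(previous_query(LEAKY))
    print(strip_previous_query(LEAKY))
```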

Actually this search engine is not that bad, not as crappy as you would expect from Microsoft programmers... but it is indeed quite commercially infested. It is targeted, basically, at "AOL type" lusers and commercial zombies, and therefore uses the Inktomi indexing services (infamous for their PPC, Pay per click, schemes).
Note moreover that this is a 'puritan' s.e. and will not retrieve adult content every time someone uses 'banal' adult search words. But it will nevertheless fetch any sort of filth if it is given an 'unusual' search query. A doomed attempt of course, as all censorship attempts are :-)
Quote: "For research, they are useless, but honestly, how many people that need to do research on the net would really use MSN? AOL? IWON?"

(c) III Millennium: [fravia+], all rights reserved, reversed and revealed