Search Engines
How do I get my website listed on Search Engines?
This FAQ advises you about changes in the criteria that search engines, whether run by robots or by humans, use to rank sites. By next year these criteria will be fairly universal.
The changes do not seriously affect meta keywords and descriptions, but they add two concepts, theming and linkage, which will carry more weight and are better addressed now than later.
The Internet is growing up and search engines are being programmed to look for theming and in so doing will penalise sites which have content ranging from rose growing to car rentals (unless your site is a portal, ISP, a search engine, or a directory like Yahoo!). So if your site covers a whole load of disparate features you will be penalised by search engines. That means relegation to a lower rating or, if your site is new, it could be ignored and not listed at all. As 80% of all sites are found via search engines, non-listing and ratings below top twenty will defeat your objectives.
Be aware that theming covers linkage as well as site content.
Linkage is the second priority assessment. Without links, your web site will either be relegated to a lower rating or ignored entirely. Links which do not fit the site's 'theme' from a search engine point of view (if robots have any), will penalise the site and more than likely the site will be ignored (not listed). Links placed by other sites that do not relate to your theme will be ineffective and ignored.
The point is that you need to find and exchange links (for your site) selectively and not through an online scattergun service. Links have to be with similar sites, that is, sites with related themes. If your site is about art, exchanging links with the Ford Motor Corp. will penalise your site, whereas exchanging links with the Guggenheim or the Chelsea Art College or a supplier of art materials or other art sites would boost your ratings.
Therefore it is suggested that you start making lists of sites you have visited or visit which fit your theme, so that once your site is launched an exchange of links can be requested by email. To avoid sprinkling each page of your site with links, it is advisable to set aside a page or two for the 20 most important links and to create a drop-down menu for all links. The demands of search engines are such that sites which are not linked to at least 300 other sites are not considered 'popular', and this factor will account for around 11% of the rating score.
If you have to add links to non-themed sites, we suggest that, to avoid being penalised, you add a footnote or page about Sponsors. This is where the Ford Motor Corp., as in the above example, would be listed as a bona fide sponsor, but not shown on the Links pages.
Do remember that search engines have variable schedules and it takes anything from 2 days to 8 months to get listed. The sooner you submit your site, the better. If you are doing your own submissions, don't be beguiled by offers to submit your site to thousands of search engines. There are not thousands of search engines out there. There are, however, several thousand FFA (free for all) sites, which are like rolling billboards: they put your site name at the top and, depending on the FFA site's popularity, roll your name down and off the list in anything between 1 hour and 3 months. The former means high traffic, the latter means almost none. The true downside is that you will get inundated with automated junk mail at the rate of 700 or more messages per day for months.
If you want to do this to get massive exposure, we suggest that you sign up for an online email service (e.g. Hotmail) and give that address to the FFAs and any service offering to submit your website to thousands of search engines.
Overall, search engine registration can be a tedious task but it is extremely important. With a little bit of research, you can cut down the amount of time you spend on it and increase the chances of a good listing with the top search engines in the world. Basically there are two kinds of search service: engines and directories. The difference is that directories are resources maintained by humans, so the more information you can give them, the easier it will be for them to put your site in its correct category. The one golden rule to remember when registering a site is that it is better to do it yourself; that way you know who you have registered with. There are a lot of schemes out there that claim to register your site with all the top ten search engines in one hit for a fee, but these are not a sure-fire way of getting on the lists, let alone on top of them. The whole process should take about a day and it is tedious, but you will get results after about a month when all the registrations come through.
Once the site is registered it is often a lot easier and much preferred by search services to update the site if there are changes to the URL rather than re-register the site. They also don't like mirror sites, redirections and multiple registrations of the same site.
Usually when you register you will be asked for anything from just the URL to the URL, your email address, a description, a title, keywords, and a preferred category within the search site. The description and keywords are often used to generate metadata tags; however, you can make life easier by writing your own, as not all search facilities incorporate this service within their site.
Metadata is "data about data" and is basically just an index or table of contents at the top of the front page. It is not the key element in gaining a good place on a search engine but it certainly helps. The HTTP-Equiv is used for redirections to other sites; the meta tags are the most important bits, as they set the keywords and a short 'description' phrase that search engines look at. The description is actually more important than the keywords in most cases.
Other things that may help (info from a posting by "The.Burt" from our newsgroups):
The most important part of getting your site a good listing in the search engines is a mixture of:
meta keywords
phrases and key words that the surfer will put into the search engine to locate a site.
meta description
a brief description of your site which again helps the search engine to classify your site.
title
try to use a title for your site that is descriptive but also uses 2 or 3 of the most popular keywords/phrases.
domain name
a snappy domain like: www.how-to-cheat-on-playstation-games.com will rank a LOT higher than www.psxcheats.com in the search engines.
Reciprocal links
links to other sites of the same nature. These must be quality links and definitely not the free for all page links. This is getting especially important recently as this is how Google ranks sites.
The main engines I would try to get into are as follows (in order of importance): dmoz, yahoo, google, altavista, msn, lycos, excite, northernlight, inktomi, infoseek. These are the sites that will bring 97% of traffic. The remaining engines are simply not worth the effort of submission for the traffic they will bring.
Under no circumstances use any form of proprietary submission program, most of the search engines will at worst ignore it and at best give you a very low ranking. dmoz and yahoo actually send a person to your site to review it - so it is of vital importance to get these 2 submissions first. The rest will spider within days (sometimes hours). Submit to Inktomi through Canada.com. Once you are in, resubmit approximately every 6-8 weeks for the spiders, in this way you will not lose your ranking.
Here's an example for you:
www.how-to-cheat-on-playstation-games.com
<head>
<title>playstation hints and tips - how to cheat on psx games</title>
<meta name="description" content="The best source of playstation hints and
tips. Includes a question and answer cheats section and patches for all
the upcoming playstation games including the new playstation 2.">
<meta name="keywords" content="playstation cheats, hints and tips, games, psx
games, how do I beat, how do I win at, Nintendo, console games, new
releases for the playstation, I want patches for all the new playstation
games, level by level walk through of all the major games, cracks">
<meta name="robots" content="XXXXXXXXXXX, XXXXXXXXX"> <!-- see below -->
</head>
Robots are what search engines send out to spider your site for inclusion in their database. The 4 allowed values for the meta robots tag are index or noindex, and follow or nofollow. They combine as follows:
Index, follow - spider will index this page and follow all the links on the page.
Noindex, follow - spider will not index the page but will follow all links on the page.
Index, nofollow - spider will index the page and then leave your site.
Noindex, nofollow - spider does nothing.
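To check what a spider would actually see in a page's head section, the tags can be read back with a short script. The following is a minimal sketch in Python using only the standard library; the class name HeadMetaParser, the helper robots_directives, and the shortened sample head are made up for illustration, and treating a missing robots tag as "index, follow" is an assumption about typical spider behaviour rather than a guarantee.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collects the <title> text and the named <meta> tags of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Meta names are case-insensitive, so store them in lower case.
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def robots_directives(parser):
    """Return (index, follow) as booleans.

    Assumption: a missing value is treated as index / follow.
    """
    raw = parser.meta.get("robots", "")
    values = {value.strip().lower() for value in raw.split(",")}
    return ("noindex" not in values, "nofollow" not in values)


# Shortened, hypothetical version of the example head shown earlier.
SAMPLE_HEAD = """
<head>
<title>playstation hints and tips - how to cheat on psx games</title>
<meta name="description" content="The best source of playstation hints and tips.">
<meta name="keywords" content="playstation cheats, hints and tips, games">
<meta name="robots" content="index, nofollow">
</head>
"""

parser = HeadMetaParser()
parser.feed(SAMPLE_HEAD)
print("title:", parser.title)
print("description:", parser.meta["description"])
print("index, follow:", robots_directives(parser))
```

Running this against your own pages before submitting them is a quick way to spot a missing description or a robots value that blocks indexing by mistake.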
However, not all spiders follow this standard, so you will also need a robots.txt file. With this file the robots can be told where they can go and where they cannot.
Example robots.txt file:
User-agent: *
Disallow: /cgi-bin
Disallow: /private
In this example, we are telling the robot (aka user-agent) that it cannot access our cgi-bin and it cannot access our folder called "private".
If you want to disallow a certain robot, e.g. the altavista robot (scooter IIRC), then the robots.txt would look like this:
User-agent: *
Disallow: /cgi-bin
Disallow: /private

# altavista disallowed
User-agent: scooter
Disallow: /
So we are saying that all agents except scooter may access everything except cgi-bin and private, while scooter may not access anything. The # marks the rest of the line as a comment, which robots ignore. If you can do all of the above your site should get more traffic.
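Before uploading a robots.txt file it is worth checking that it says what you think it says. The sketch below uses Python's standard urllib.robotparser module to test the rules from the example above; the robot name "examplebot" and the page paths are made up for illustration, and the check only shows how a well-behaved robot should read the file, not how any particular search engine actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, with a blank line between the two records.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /private

# altavista disallowed
User-agent: scooter
Disallow: /
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """True if the named robot may fetch the given path under these rules."""
    return rules.can_fetch(agent, path)


# "examplebot" is a hypothetical robot that falls under the "*" record.
for agent, path in [
    ("examplebot", "/index.html"),
    ("examplebot", "/private/notes.html"),
    ("examplebot", "/cgi-bin/form.pl"),
    ("scooter", "/index.html"),
]:
    print(agent, path, "allowed" if allowed(agent, path) else "blocked")
```

Only the first of the four checks should come back as allowed: the general record blocks cgi-bin and private for everyone, and the scooter record blocks the whole site for that one robot.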
The following is a list of the sites we at blueyonder have found useful, and a quick run-down of the top search engines and directories and how they work.
AOL Search : http://search.aol.com/
AOL Search allows its members to search across the web and AOL's own content from one place. The "external" version, listed above, does not list AOL content. The main listings for categories and web sites come from the Open Directory (see below). Inktomi (see below) also provides crawler-based results, as backup to the directory information. Before the launch of AOL Search in October 1999, the AOL search service was Excite-powered AOL NetFind.
AltaVista : http://www.altavista.com/
AltaVista is consistently one of the largest search engines on the web, in terms of pages indexed. Its comprehensive coverage and wide range of power searching commands make it a particular favorite among researchers. It also offers a number of features designed to appeal to basic users, such as "Ask AltaVista" results, which come from Ask Jeeves (see below), and directory listings from the Open Directory and LookSmart. AltaVista opened in December 1995. It was owned by Digital, then run by Compaq (which purchased Digital in 1998), then spun off into a separate company which is now controlled by CMGI.
Ask Jeeves : http://www.ask.com/
Ask Jeeves is a human-powered search service that aims to direct you to the exact page that answers your question. If it fails to find a match within its own database, then it will provide matching web pages from various search engines. The service went into beta in mid-April 1997 and opened fully on June 1, 1997. Some results from Ask Jeeves also appear within AltaVista.
Direct Hit : http://www.directhit.com/
Direct Hit measures what people click on in the search results presented at its own site and at its partner sites, such as HotBot. Sites that get clicked on more than others rise higher in Direct Hit's rankings. Thus, the service dubs itself a "popularity engine." Aside from running its own web site, Direct Hit provides the main results which appear at HotBot (see below) and is available as an option to searchers at MSN Search. Direct Hit is owned by Ask Jeeves (above). See the Using Direct Hit Results page to learn more about Direct Hit.
Excite : http://www.excite.com/
Excite is one of the more popular search services on the web. It offers a fairly large index and integrates non-web material such as company information and sports scores into its results, when appropriate. Excite was launched in late 1995. It grew quickly in prominence and consumed two of its competitors, Magellan in July 1996, and WebCrawler in November 1996. These continue to run as separate services.
FAST Search : http://www.alltheweb.com/
Formerly called All The Web, FAST Search aims to index the entire web. It was the first search engine to break the 200 million web page index milestone and consistently has one of the largest indexes of the web. The Norwegian company behind FAST Search also powers some of the results that appear at Lycos (see below). FAST Search launched in May 1999.
Go / Infoseek : http://www.go.com/
Go is a portal site produced by Infoseek and Disney. It offers portal features such as personalization and free e-mail, plus the search capabilities of the former Infoseek search service, which has now been folded into Go. Searchers will find that Go consistently provides quality results in response to many general and broad searches, thanks to its ESP search algorithm. It also has an impressive human compiled directory of web sites. Go officially launched in January 1999. It is not related to GoTo, below. The former Infoseek service launched in early 1995.
GoTo : http://www.goto.com/
Unlike the other major search engines, GoTo sells its main listings. Companies can pay money to be placed higher in the search results, which GoTo feels improves relevancy. Non-paid results come from Inktomi. GoTo launched in 1997 and incorporated the former University of Colorado-based World Wide Web Worm. In February 1998, it shifted to its current pay-for-placement model and soon after replaced the WWW Worm with Inktomi for its non-paid listings. GoTo is not related to Go (Infoseek).
Google : http://www.google.com/
Google is a search engine that makes heavy use of link popularity as a primary way to rank web sites. This can be especially helpful in finding good sites in response to general searches such as "cars" and "travel," because users across the web have in essence voted for good sites by linking to them. The system works so well that Google has gained wide-spread praise for its high relevancy. Google also has a huge index of the web and provides some results to Yahoo and Netscape Search.
HotBot : http://www.hotbot.com/
HotBot is a favorite among researchers due to its many power searching features. In most cases, HotBot's first page of results comes from the Direct Hit service (see above), and then secondary results come from the Inktomi search engine, which is also used by other services. It gets its directory information from the Open Directory project (see below). HotBot launched in May 1996 as Wired Digital's entry into the search engine market. Lycos purchased Wired Digital in October 1998 and continues to run HotBot as a separate search service.
iWon : http://www.iwon.com/
Backed by US television network CBS, iWon has a directory of web sites generated automatically by Inktomi, which also provides its more traditional crawler-based results. iWon gives away daily, weekly and monthly prizes in a marketing model unique among the major services. It launched in Autumn 1999.
Inktomi : http://www.inktomi.com/
Originally, there was an Inktomi search engine at UC Berkeley. The creators then formed their own company with the same name and created a new Inktomi index, which was first used to power HotBot. Now the Inktomi index also powers several other services. All of them tap into the same index, though results may be slightly different. This is because Inktomi provides ways for its partners to use a common index yet distinguish themselves. There is no way to query the Inktomi index directly, as it is only made available through Inktomi's partners with whatever filters and ranking tweaks they may apply.
LookSmart : http://www.looksmart.com/
LookSmart is a human-compiled directory of web sites. In addition to being a stand-alone service, LookSmart provides directory results to MSN Search, Excite and many other partners. Inktomi provides LookSmart with search results when a search fails to find a match from among LookSmart's reviews. LookSmart launched independently in October 1996, was backed by Reader's Digest for about a year, and then company executives bought back control of the service.
Lycos : http://www.lycos.com/
Lycos started out as a search engine, depending on listings that came from spidering the web. In April 1999, it shifted to a directory model similar to Yahoo. Its main listings come from the Open Directory project, and then secondary results come from the FAST Search engine. Some Direct Hit results are also used. In October 1998, Lycos acquired the competing HotBot search service, which continues to be run separately.
MSN Search : http://search.msn.com/
Microsoft's MSN Search service is a LookSmart-powered directory of web sites, with secondary results that come from Inktomi. RealNames and Direct Hit data is also made available. MSN Search also offers a unique way for Internet Explorer 5 users to save past searches.
Netscape Search : http://search.netscape.com/
Netscape Search's results come primarily from the Open Directory and Netscape's own "Smart Browsing" database, which does an excellent job of listing "official" web sites. Secondary results come from Google. At the Netscape Netcenter portal site, other search engines are also featured.
Northern Light : http://www.northernlight.com/
Northern Light is another favorite search engine among researchers. It features a large index of the web, along with the ability to cluster documents by topic. Northern Light also has a set of "special collection" documents that are not readily accessible to search engine spiders. There are documents from thousands of sources, including newswires, magazines and databases. Searching these documents is free, but there is a charge of up to $4 to view them. There is no charge to view documents on the public web, only for those within the special collection. Northern Light opened to general use in August 1997.
Open Directory : http://dmoz.org/
The Open Directory uses volunteer editors to catalog the web. Formerly known as NewHoo, it was launched in June 1998. It was acquired by Netscape in November 1998, and the company pledged that anyone would be able to use information from the directory through an open license arrangement. Netscape itself was the first licensee. Lycos and AOL Search also make heavy use of Open Directory data, while AltaVista and HotBot prominently feature Open Directory categories within their results pages.
Snap : http://www.snap.com/
Snap is a human-compiled directory of web sites, supplemented by search results from Inktomi. Like LookSmart, it aims to challenge Yahoo as the champion of categorizing the web. Snap launched in late 1997 and is backed by Cnet and NBC.
WebCrawler : http://www.webcrawler.com/
WebCrawler has the smallest index of any major search engine on the web; think of it as Excite Lite. The small index means WebCrawler is not the place to go when seeking obscure or unusual material. However, some people may feel that by having indexed fewer pages, WebCrawler provides less overwhelming results in response to general searches. WebCrawler opened to the public on April 20, 1994. It was started as a research project at the University of Washington. America Online purchased it in March 1995 and it was the online service's preferred search engine until Nov. 1996. That was when Excite, a WebCrawler competitor, acquired the service. Excite continues to run WebCrawler as an independent search engine.
Yahoo : http://www.yahoo.com/
Yahoo is the web's most popular search service and has a well-deserved reputation for helping people find information easily. The secret to Yahoo's success is human beings. It is the largest human-compiled guide to the web, employing about 150 editors in an effort to categorize the web. Yahoo has over 1 million sites listed. Yahoo also supplements its results with those from Google. If a search fails to find a match within Yahoo's own listings, then matches from Google are displayed. Google matches also appear after all Yahoo matches have first been shown. Yahoo is the oldest major web site directory, having launched in late 1994.
Additional Information from newsgroup posting by "The Burt" in blueyonder.discussion.general and from Search Engine Watch [www.searchenginewatch.com].
Last Amended : 2004-08-07 by elfin
Original Author : Fiona Goscombe