Monday, April 23, 2007

Spamdexing

Spamdexing is any of various methods to manipulate the relevancy or prominence of resources indexed by a search engine, usually in a manner inconsistent with the purpose of the indexing system. Search engines use a variety of algorithms to determine relevancy ranking. Some of these include determining whether the search term appears in the META keywords tag, others whether the search term appears in the body text or URL of a web page. Many search engines check for instances of spamdexing and will remove suspect pages from their indices.
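To see why repeating a term across the META keywords tag, body text, and URL could sway early relevancy ranking, the Python sketch below implements a naive, hypothetical scoring function. The weights are invented, and real engines used far more elaborate, undisclosed formulas; the point is only that simple term counting rewards keyword repetition.

# Hypothetical, naive keyword-based relevance score; the weights are made
# up, and real engines used far more elaborate, undisclosed formulas.
def naive_relevance(query, url, meta_keywords, body_text):
    q = query.lower()
    score = 0.0
    score += 3.0 * meta_keywords.lower().count(q)  # META keywords tag
    score += 2.0 * url.lower().count(q)            # URL
    score += 1.0 * body_text.lower().count(q)      # body text
    return score

# A keyword-stuffed page trivially outscores an honest one for "widgets".
honest = naive_relevance("widgets", "http://example.com/widgets",
                         "widgets, tools", "We sell quality widgets.")
stuffed = naive_relevance("widgets", "http://example.com/widgets",
                          "widgets " * 50, "widgets " * 200)
print(honest, stuffed)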
The rise of spamdexing in the mid-1990s made the leading search engines of the time less useful, and the success of Google at both producing better search results and combating keyword spamming, through its reputation-based PageRank link analysis system, helped it become the dominant search engine late in the decade, a position it retains. Although Google has not been rendered useless by spamdexing, it has not been immune to more sophisticated methods either. Google bombing is another form of search engine result manipulation, which involves placing hyperlinks that directly affect the rank of other sites[1]. Google first algorithmically combated Google bombing on January 25, 2007.
The earliest known reference to the term spamdexing is by Eric Convey in his article "Porn sneaks way back on Web," The Boston Herald, May 22, 1996, where he said:
The problem arises when site operators load their Web pages with hundreds of extraneous terms so search engines will list them among legitimate addresses. The process is called "spamdexing," a combination of spamming — the Internet term for sending users unsolicited information — and "indexing."[2]

Google Bombing

The first Google bombs were probably accidental: users would discover that a particular search term brought up an interesting result, leading many to believe that Google's results could be manipulated intentionally. The first Google bomb known to a significant number of people caused the search term "more evil than Satan himself" to bring up the Microsoft homepage as the top result. Numerous people have claimed responsibility for the Microsoft Google bomb, though none of these claims have been verified.[4]
In September 2000, Hugedisk Men's Magazine, a now-defunct online humor magazine, created the first Google bomb with a verifiable creator when it linked the text "dumb motherfucker" to a site selling George W. Bush-related merchandise. A Google search for this term would return the pro-Bush online store as its top result.[5] Hugedisk had also unsuccessfully attempted to Google bomb an equally derogatory term to bring up an Al Gore-related site. After a fair amount of publicity the George W. Bush-related merchandise site retained lawyers who sent a cease and desist letter to Hugedisk, thereby ending the Google bomb.[6]
On April 6, 2001, in an article in the online zine uber.nu, Adam Mathes was credited with coining the term "Google Bombing." In the article, Mathes details how he connected the search term "talentless hack" to the website of his friend Andy Pressman by recruiting fellow webloggers to link to his friend's page with the desired term.

Wednesday, April 18, 2007

Search Engine Optimization

Search engine optimization (SEO), a subset of search engine marketing, is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results. SEO can also target specialized searches such as image search, local search, and industry-specific vertical search engines.

A typical Search Engine Results Page (SERP)
SEO is marketing by understanding how search algorithms work and what human visitors might search for, to help match those visitors with sites offering what they are interested in finding. Some SEO efforts may involve optimizing a site's coding, presentation, and structure without making changes that are very noticeable to human visitors, such as incorporating a clear hierarchical structure into a site and avoiding or fixing problems that might keep search engine indexing programs from fully spidering a site. Other, more noticeable efforts involve including unique content on pages that can be easily indexed and extracted from those pages by search engines while also appealing to human visitors.
The term SEO can also refer to "search engine optimizers," a term adopted by an industry of consultants who carry out optimization projects on behalf of clients, and by employees of site owners who may perform SEO services in-house. Search engine optimizers often offer SEO as a stand-alone service or as a part of a larger marketing campaign. Because effective SEO can require making changes to the source code of a site, it is often very helpful when incorporated into the initial development and design of a site, leading to the use of the term "Search Engine Friendly" to describe designs, menus, content management systems and shopping carts that can be optimized easily and effectively.
Contents
1 History
1.1 Origin: Early search engines
1.2 Second stage: Link analysis
1.3 Current technology: Search engines consider many signals
2 Optimizing for traffic quality
3 Relationship between SEO and search engines
3.1 Getting into search engines' databases
3.2 Preventing search indexing
4 Types of SEO
4.1 "White hat"
4.2 Spamdexing / "Black hat"
5 SEO and marketing
6 Legal precedents
7 References
8 See also

History

Origin: Early search engines
Webmasters and content providers began optimizing sites for search engines in the mid-1990s, as the first search engines were cataloging the early Web.
Initially, all a webmaster needed to do was submit a page, or URL, to the various engines, which would send a spider to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed.[1] The process involves a search engine spider downloading a page and storing it on the search engine's own server, where a second program, known as an indexer, extracts information about the page: the words it contains, where they are located, any weight given to specific words, and all the links the page contains. These links are then placed into a scheduler for crawling at a later date.
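As an illustration of that crawl, index, and schedule loop, the sketch below is a deliberately minimal, hypothetical crawler written in Python using only the standard library. A production engine distributes this work across many machines and stores far richer data than word positions and outgoing links.

# Minimal, hypothetical sketch of the crawl -> index -> schedule loop.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkAndTextParser(HTMLParser):
    """Pulls outgoing links and visible words out of an HTML page."""
    def __init__(self):
        super().__init__()
        self.links, self.words = [], []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
    def handle_data(self, data):
        self.words.extend(data.lower().split())

def crawl(seed_url, max_pages=5):
    index = {}                     # word -> set of (URL, position) pairs
    scheduler = deque([seed_url])  # URLs queued to be crawled later
    seen = set()
    while scheduler and len(seen) < max_pages:
        url = scheduler.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            # the "spider": download the page and keep a copy in memory
            html = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        except OSError:
            continue               # skip pages that fail to download
        parser = LinkAndTextParser()
        parser.feed(html)
        # the "indexer": record each word and where it occurs on the page
        for position, word in enumerate(parser.words):
            index.setdefault(word, set()).add((url, position))
        # every extracted link goes back into the scheduler
        for link in parser.links:
            scheduler.append(urljoin(url, link))
    return index

if __name__ == "__main__":
    idx = crawl("http://example.com/")
    print(len(idx), "distinct words indexed")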
Site owners started to recognize the value of having their sites highly ranked and visible in search engine results, creating an opportunity for both "white hat" and "black hat" SEO practitioners. Indeed, by 1996, email spam could be found on Usenet touting SEO services.[2] The earliest known use of the phrase "search engine optimization" was a spam message posted on Usenet on July 26, 1997.[3]
Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag, or index files in engines like ALIWEB. Meta-tags provided a guide to each page's content. But indexing pages based upon meta data was found to be less than reliable, because some webmasters abused meta tags by including irrelevant keywords to artificially increase page impressions for their website and to increase their ad revenue. Cost per thousand impressions was at the time the common means of monetizing content websites. Inaccurate, incomplete, and inconsistent meta data in meta tags caused pages to rank for irrelevant searches, and fail to rank for relevant searches.[4] Web content providers also manipulated a number of attributes within the HTML source of a page in an attempt to rank well in search engines.[5]
By relying so much upon factors exclusively within a webmaster's control, early search engines suffered from abuse and ranking manipulation. To provide better results to their users, search engines had to adapt to ensure their results pages showed the most relevant search results, rather than unrelated pages stuffed with numerous keywords by unscrupulous webmasters. Search engines responded by developing more complex ranking algorithms, taking into account additional factors that were more difficult for webmasters to manipulate.

Second stage: Link analysis
Larry Page and Sergey Brin, while graduate students at Stanford University, developed a search engine called "Backrub" that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm is called PageRank and is based upon the quantity and prominence of incoming links.[6] PageRank estimates the likelihood that a given page will be reached by a web user who randomly surfs the web, following links from one page to another. In effect, this means that some links are stronger than others, as a page with higher PageRank is more likely to be reached by the random surfer.
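The random-surfer idea can be made concrete with the simplified PageRank recurrence PR(p) = (1 - d)/N + d * sum over pages q linking to p of PR(q)/outdegree(q), where d is a damping factor commonly quoted as 0.85. The Python sketch below is a toy power-iteration version on an invented three-page graph, not Google's actual implementation.

# Toy power-iteration sketch of simplified PageRank; the graph is invented.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A and C both link to B, so the "random surfer" reaches B most often.
toy_graph = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
print(pagerank(toy_graph))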
Page and Brin founded Google in 1998. Strong word of mouth among programmers helped Google become a popular search engine. Off-page factors such as PageRank and hyperlink analysis were considered alongside on-page factors, enabling Google to avoid the kind of manipulation seen in search engines that focused primarily upon on-page factors for their rankings. Although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence the Inktomi search engine, and these methods proved similarly applicable to gaining PageRank. Many sites focused on exchanging, buying, and selling links, often on a massive scale. Some of these schemes, or link farms, involved the creation of thousands of sites for the sole purpose of link spamming.[7]

Current technology: Search engines consider many signals
To reduce the impact of link schemes, search engines have developed a wider range of undisclosed off-site factors they use in their algorithms. As a search engine may use hundreds of factors in ranking the listings on its SERPs, the factors themselves and the weight each carries can change continually, and algorithms can differ widely. The four leading search engines, Google, Yahoo, Microsoft and Ask.com, do not disclose the algorithms they use to rank pages. Some SEOs have carried out controlled experiments to gauge the effects of different approaches to search optimization, and share results through online forums and blogs.[8] SEO practitioners may also study patents held by various search engines to gain insight into the algorithms.[9]

Optimizing for traffic quality
In addition to seeking better rankings, search engine optimization is also concerned with traffic quality. Traffic quality is measured by how often a visitor using a specific keyword phrase leads to a desired conversion action, such as making a purchase, viewing or downloading a certain page, requesting further information, signing up for a newsletter, or taking some other specific action.
By improving the quality of a page's search listings, more searchers may select that page, and those searchers may be more likely to convert. Examples of SEO tactics to improve traffic quality include writing attention-grabbing titles, adding accurate meta descriptions, and choosing a domain and URL that improve the site's branding.
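As a rough illustration, traffic quality per keyword phrase can be expressed as a simple conversion rate, conversions divided by visits. The keyword phrases and figures in the short Python sketch below are invented purely for illustration.

# Sketch of measuring traffic quality per keyword phrase as a conversion
# rate (conversions / visits). All figures are made up for illustration.
visits = {"red widgets": 400, "cheap widgets": 900, "widget repair": 150}
conversions = {"red widgets": 28, "cheap widgets": 9, "widget repair": 12}

for phrase in visits:
    rate = conversions.get(phrase, 0) / visits[phrase]
    print(f"{phrase:15s} {rate:5.1%} conversion rate")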

Relationship between SEO and search engines
By 1997 search engines recognized that some webmasters were making efforts to rank well in their search engines, and even manipulating the page rankings in search results. In some early search engines, such as Infoseek, ranking first was as easy as grabbing the source code of the top-ranked page, placing it on your website, and submitting a URL to instantly index and rank that page.
Due to the high value and targeting of search results, there is potential for an adversarial relationship between search engines and SEOs. In 2005, an annual conference named AirWeb[10] was created to discuss bridging the gap and minimizing the sometimes damaging effects of aggressive web content providers.
Some more aggressive site owners and SEOs generate automated sites or employ techniques that eventually get domains banned from the search engines. Many search engine optimization companies, which sell services, employ long-term, low-risk strategies, and most SEO firms that do employ high-risk strategies do so on their own affiliate, lead-generation, or content sites, instead of risking client websites.
Some SEO companies employ aggressive techniques that get their client websites banned from the search results. The Wall Street Journal profiled a company, Traffic Power, that allegedly used high-risk techniques and failed to disclose those risks to its clients.[11] Wired reported the same company sued a blogger for mentioning that they were banned.[12] Google's Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.[13]
Some search engines have also reached out to the SEO industry, and are frequent sponsors and guests at SEO conferences and seminars. In fact, with the advent of paid inclusion, some search engines now have a vested interest in the health of the optimization community. All of the main search engines provide information/guidelines to help with site optimization: Google's, Yahoo!'s, MSN's and Ask.com's. Google has a Sitemaps program[14] to help webmasters learn if Google is having any problems indexing their website and also provides data on Google traffic to the website. Yahoo! has Site Explorer that provides a way to submit your URLs for free (like MSN/Google), determine how many pages are in the Yahoo! index and drill down on inlinks to deep pages. Yahoo! has an Ambassador Program[15] and Google has a program for qualifying Google Advertising Professionals.[16]

Getting into search engines' databases
As of 2007 the leading contextual search engines do not require submission. They discover new sites and pages automatically. Google and Yahoo offer submission programs, such as Google Sitemaps, for which an XML type feed can be created and submitted. These programs are designed to assist sites that may have pages that aren't discoverable by automatically following links.[17]
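The "XML type feed" mentioned above follows the Sitemaps XML format: a urlset of url entries, each containing at least a loc element. The Python sketch below builds a minimal feed of this kind; the URLs are placeholders and the optional elements (lastmod, changefreq, priority) are omitted.

# Sketch of building a minimal Sitemaps-format XML feed; URLs are
# placeholders and optional elements are left out.
from xml.sax.saxutils import escape

def build_sitemap(urls):
    entries = "\n".join(
        "  <url>\n    <loc>{}</loc>\n  </url>".format(escape(u)) for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries + "\n</urlset>\n"
    )

print(build_sitemap(["http://example.com/", "http://example.com/deep/page"]))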
Search engine crawlers may look at a number of different factors when crawling a site, and many pages from a site may not be indexed by the search engines until they gain more PageRank, links or traffic. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled, as well as other importance metrics. Cho et al.[18] described some standards for those decisions as to which pages are visited and sent by a crawler to be included in a search engine's index.
Some search engines, notably Yahoo!, operate a paid submission service that guarantees crawling for either a set fee or cost per click. Such programs usually guarantee inclusion in the database, but do not guarantee specific ranking within the search results.

Preventing search indexing
Main article: robots.txt
To avoid undesirable search listings, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots. When a search engine visits a site, the robots.txt located in the root directory is the first file crawled. The robots.txt file is then parsed, and will instruct the robot as to which pages are not to be crawled. As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled.
Pages typically prevented from being crawled include login specific pages such as shopping carts and user-specific content such as search results from internal searches.
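Python's standard library includes urllib.robotparser for reading these exclusion rules, and the sketch below shows how a well-behaved crawler might consult a robots.txt before fetching a URL. The rules and URLs are illustrative.

# A well-behaved crawler consults robots.txt before fetching a URL.
from urllib import robotparser

rules = """
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

for url in ("http://example.com/products/widget",
            "http://example.com/cart/checkout",
            "http://example.com/search?q=widgets"):
    allowed = parser.can_fetch("ExampleBot", url)
    print(url, "->", "crawl" if allowed else "skip")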

Types of SEO
SEO techniques are classified by some into two broad categories: techniques that search engines recommend as part of good design, and those techniques that search engines do not approve of and attempt to minimize the effect of, referred to as spamdexing. Most professional SEO consultants do not offer spamming and spamdexing techniques amongst the services that they provide to clients. Some industry commentators classify these methods, and the practitioners who utilize them, as either "white hat SEO", or "black hat SEO".[19] Many SEO consultants reject the black and white hat dichotomy as a convenient but unfortunate and misleading over-simplification that makes the industry look bad as a whole.

"White hat"
An SEO tactic, technique or method is considered "White hat" if it conforms to the search engines' guidelines and/or involves no deception. As the search engine guidelines[20][21][22][23][24] are not written as a series of rules or commandments, this is an important distinction to note. White Hat SEO is not just about following guidelines, but is about ensuring that the content a search engine indexes and subsequently ranks is the same content a user will see.
White Hat advice is generally summed up as creating content for users, not for search engines, and then making that content easily accessible to their spiders, rather than gaming the system. White hat SEO is in many ways similar to web development that promotes accessibility,[25] although the two are not identical.

Spamdexing / "Black hat"
Main article: Spamdexing
"Black hat" SEO are methods to try to improve rankings that are disapproved of by the search engines and/or involve deception. This can range from text that is "hidden", either as text colored similar to the background or in an invisible or left of visible div, or by redirecting users from a page that is built for search engines to one that is more human friendly. A method that sends a user to a page that was different from the page the search engined ranked is Black hat as a rule. One well known example is Cloaking, the practice of serving one version of a page to search engine spiders/bots and another version to human visitors.
Search engines may penalize sites they discover using black hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines' algorithms or by a manual review of a site.
One infamous example was the February 2006 Google removal of both BMW Germany and Ricoh Germany for use of deceptive practices.[26] Both companies, however, quickly apologized, fixed the offending pages, and were restored to Google's index.[3][4]

Pay Per Click

Pay per click (PPC) is an advertising technique used on websites, advertising networks, and search engines.
Advertisers bid on "keywords" that they believe their target market (people they think would be interested in their offer) would type in the search bar when they are looking for their type of product or service. For example, if an advertiser sells red widgets, he/she would bid on the keyword "red widgets", hoping a user would type those words in the search bar, see their ad, click on it and buy. These ads are called "sponsored links" or "sponsored ads" and appear next to and sometimes above the natural or organic results on the page. The advertiser pays only when the user clicks on the ad.
While many companies exist in this space, Google AdWords and Yahoo! Search Marketing (formerly Overture) are the largest network operators as of 2006. In the spring of 2006, MSN started beta testing its own in-house service, MSN adCenter. In recent years agencies have also arisen to facilitate the use of pay-per-click advertising, such as Latitude White in the UK, leading to refinements in the PPC keyword-matching system. Depending on the search engine, minimum prices per click start at US$0.01 (up to US$0.50). Prices can reach £18 or more per click[1] for services such as unsecured personal loans, and very popular search terms can cost much more on popular engines. Arguably this advertising model may be open to abuse through click fraud, although Google and other search engines have recently implemented automated systems to guard against this.
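To make the bidding model concrete, the Python sketch below lists the ads for a keyword in descending bid order, as described in the Keyword PPCs section below, and charges an advertiser only when its ad is clicked. The advertisers, bids, and pricing rule are invented; real auctions also factor in ad quality and use pricing formulas not shown here.

# Toy sketch of keyword PPC: ads are listed by bid, and an advertiser is
# charged only when its ad is clicked. All names and bids are made up.
bids = {"red widgets": [("WidgetMart", 0.45), ("AcmeWidgets", 0.30),
                        ("BargainBin", 0.05)]}
spend = {}

def ads_for(keyword):
    # list the ads for a keyword in descending bid order
    return sorted(bids.get(keyword, []), key=lambda ad: ad[1], reverse=True)

def record_click(advertiser, bid):
    # the advertiser pays only when its ad is actually clicked
    spend[advertiser] = spend.get(advertiser, 0.0) + bid

listing = ads_for("red widgets")
print("Sponsored links:", [name for name, _ in listing])
record_click(*listing[0])  # simulate a user clicking the top ad
print("Advertiser spend:", spend)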
Contents
1 Categories
1.1 Keyword PPCs
1.2 Product PPCs
1.3 Service PPCs
1.4 Pay per call
2 See also
3 External links

Categories
PPC engines can be categorized into "Keyword", "Product", and "Service" engines, though a number of companies may fall into two or more categories. More models are continually evolving. Currently, pay per click programs do not generate any revenue solely from traffic for sites that display the ads; revenue is generated only when a user clicks on the ad itself.

Keyword PPCs
Advertisers using these bid on "keywords", which can be words or phrases, and can include product model numbers. When a user searches for a particular word or phrase, the list of advertiser links appears in order of the amount bid. Keywords, also referred to as search terms, are the very heart of pay per click advertising. The terms are guarded as highly valued trade secrets by the advertisers, and many firms offer software or services to help advertisers develop keyword strategies.
As of 2005, notable PPC Keyword search engines include: Google AdWords, Yahoo! Search Marketing (formerly Overture Services), Microsoft adCenter, LookSmart, Miva (formerly FindWhat), Ask (formerly Ask Jeeves), 7Search, Kanoodle, and Baidu.

Product PPCs
"Product" engines let advertisers provide "feeds" of their product databases and when users search for a product, the links to the different advertisers for that particular product appear, giving more prominence to advertisers who pay more, but letting the user sort by price to see the lowest priced product and then click on it to buy. These engines are also called Product comparison engines or Price comparison engines.
Noteworthy PPC Product search engines are: BizRate.com, Shopzilla.com, NexTag, PriceGrabber.com, and Shopping.com.
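A product engine's listing logic can be sketched very simply: merchants submit feed entries for a product, the default view can favor merchants paying a higher cost per click, and shoppers can re-sort by price. The feed data in the Python sketch below is invented.

# Invented product feed entries for one product; the default listing favors
# higher-paying merchants, while shoppers can re-sort by price.
feed = [
    {"merchant": "ShopA", "price": 9.99, "cpc": 0.20},
    {"merchant": "ShopB", "price": 8.49, "cpc": 0.35},
    {"merchant": "ShopC", "price": 11.00, "cpc": 0.10},
]

by_prominence = sorted(feed, key=lambda entry: entry["cpc"], reverse=True)
by_price = sorted(feed, key=lambda entry: entry["price"])
print("default order:", [e["merchant"] for e in by_prominence])
print("sorted by price:", [e["merchant"] for e in by_price])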

Service PPCs
"Service" engines let advertisers provide feeds of their service databases and when users search for a service offering links to advertisers for that particular service appear, giving prominence to advertisers who pay more, but letting users sort their results by price or other methods. Some Product PPCs have expanded into the service space while other service engines operate in specific verticals.
Noteworthy PPC services include NexTag, SideStep, and TripAdvisor.

Pay per call
Similar to pay per click, pay per call is a business model for ad listings in search engines and directories that allows publishers to charge local advertisers on a per-call basis for each lead (call) they generate. The term "pay per call" is sometimes confused with "click to call"[1]. Click-to-call, along with call tracking, is a technology that enables the “pay-per-call” business model.
Pay-per-call is not restricted to local advertisers. Many of the pay-per-call search engines allow advertisers with a national presence to create ads with local telephone numbers.
According to the Kelsey Group, the pay-per-phone-call market is expected to reach US$3.7 billion by 2010.