
The Problem with Automated Content


Gordon Goodfellow
October 18, 2006


Gordon Goodfellow is an Internet marketer and technologist, writer and researcher. His Content Artist website explores the issues and problems surrounding automated content, and provides some software solutions.

Cast your mind back about three years.

Shortly after the dawning of the Google Adsense Age, webmasters learned that their sites were effectively little gold mines, or "virtual real estate" as one expert put it. The more cyber-property you had, the more virtual billboards (also called Adsense blocks) you were able to put up. And so if you made $n by owning one web page with an Adsense ad (or any ad) on it, it was reasonable to assume that you would make $n x 10,000 if you had 10,000 pages with similar ads on them.

Similarly, reason suggested that 1 million such pages would make you $n x 1,000,000.

Webmasters were eager to rise to this Gold Rush challenge, and so were those present-day providers of picks and shovels, the software developers. Applications were developed which could produce thousands of web pages in less than an hour from a keyword list. All you had to do was a little research using Overture's keyword tool or its many free derivatives - the more sophisticated practitioner of this art would have added Wordtracker into the mix - and you had your keyword list.

Add some adjectival modifiers such as "better" or "best" or "latest" before each keyword and you had an even bigger list. Then after each keyword add "in New York" or "in London" or even all the place names in the English-speaking world (there are over 30,000 of them) and you had a massive list. The software available at the time could, and still can, produce whole websites consisting of tens of thousands of pages from such bloated keyword handiwork. Each page of such a site would be highly optimized for one keyword phrase, so that you could more or less guarantee that your page would be in number one position on all the search engines, simply because the phrase was so specific. Such websites could be cranked out and uploaded to your server all in the same day. You could produce 50 such websites, each with thousands of pages, in a single month, all of them with Adsense blocks on every page.
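To make the scale of that list-bloating concrete, here is a minimal sketch of the keyword-expansion step in Python. It is not the software of the era (those tools were proprietary, and this is only an illustration); the keyword, modifier and place-name lists are invented placeholders standing in for files with thousands of entries.

    # A toy sketch of the keyword-expansion trick described above.
    # The lists below are tiny illustrative stand-ins for the real thing.
    from itertools import product

    keywords = ["web hosting", "car insurance"]    # from Overture/Wordtracker research
    modifiers = ["", "best", "cheap", "latest"]    # adjectival padding
    places = ["", "in New York", "in London"]      # place-name suffixes

    expanded = []
    for mod, kw, place in product(modifiers, keywords, places):
        phrase = " ".join(part for part in (mod, kw, place) if part)
        expanded.append(phrase)

    # 4 modifiers x 2 keywords x 3 places = 24 phrases here; swap in a
    # 30,000-entry place list and the same loop yields hundreds of
    # thousands of phrases, each destined to become its own "optimized" page.
    for phrase in expanded:
        print(phrase)

Feed each resulting phrase into a page template and you have, in effect, the production line described above.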

The problem was, they were all unreadable.

Pages manufactured at that speed could hardly rely on human dexterity in creating their content. So the software which produced them - and it was ingenious software - had to resort to other means. These largely fell into two groups: RSS feeds and what came to be called "scraped" content. The problem with RSS feeds was that lots of other people were using the same feed. The problem with scraped content was that it belonged to someone else. In both cases, the obligatory attribution hyperlink (which could be turned off in the case of scraped content) bled PageRank away and in other ways compromised the integrity of your site. Both practices also had the habit of leaving footprints for the search engines to spot. Lawyers' purses bulged a bit as well.
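For what it is worth, the RSS half of that arrangement required nothing more exotic than fetching a feed and pasting its items into a page template. The sketch below is a hedged illustration only: the feed URL is a placeholder, and the actual tools of the day were built quite differently (and mostly not in Python).

    # A minimal sketch of turning an RSS feed into page "content".
    # FEED_URL is hypothetical; real page generators polled many feeds.
    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "https://example.com/feed.xml"

    with urllib.request.urlopen(FEED_URL) as response:
        tree = ET.parse(response)

    snippets = []
    for item in tree.findall(".//item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        link = item.findtext("link", default="")
        # The link back to the source is the obligatory hyperlink that
        # bled PageRank away from the host page.
        snippets.append('<h3><a href="%s">%s</a></h3>\n<p>%s</p>' % (link, title, description))

    page_body = "\n".join(snippets)

Every site quoting the same feed ended up with essentially the same page body, which is exactly the duplication problem described above.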

At about the same time, people searching the Internet complained of seeing bland web pages with content that was either non-existent, meaningless or repetitive (even, heaven forbid, duplicate). The search engines addressed this by punishing web sites that displayed those tendencies, and so raised the informational quality of their listings for a while. This punishment consisted of altering their algorithms so that sites or pages which demonstrated such blandness were either pushed so far down the listings that they effectively could not be seen, or delisted altogether (banned).

Along came a flurry of remedies. You could pay ghost-writers at Elance or Rentacoder to produce the content for you to a specified keyword density (but even at $3 an hour it was expensive if you wanted to replace all those thousands of pages which had just been banned by Google). Then a whole mini-industry of private label rights (PLR) membership sites came along, charging you a monthly fee to use their thousands of stock articles with no copyright questions asked. (But those articles seldom contained the specific keyword phrases you wanted, you could never control the keyword density, and you just knew that lots of other people were using the same articles from the same membership sites.)

Other software came along and inserted random text at the top and bottom of each article, so that each page became unique in its own way. Still more software substituted stock synonyms for common words in existing PLR articles (word was going round that if a page was at least 28 percent different from another page, you were okay). The problem was that if the page was read as a whole, it made no sense at all. But this could still fool the search engines. Just.
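As a rough illustration of what that synonym-swapping amounted to, here is a small sketch. The synonym table below is a toy stand-in for the large stock thesauri those tools shipped with, and the similarity measure is only an assumption: nobody outside the search engines knew how "different" was actually measured.

    # A toy sketch of the synonym-swapping trick, plus a crude difference
    # score in the spirit of the "28 percent" rumour. Both the synonym
    # table and the measure are illustrative assumptions, not the real thing.
    import re
    from difflib import SequenceMatcher

    SYNONYMS = {
        "big": "large",
        "buy": "purchase",
        "cheap": "inexpensive",
        "house": "dwelling",
    }

    def spin(text):
        """Replace each known word with its stock synonym."""
        def swap(match):
            word = match.group(0)
            return SYNONYMS.get(word.lower(), word)
        return re.sub(r"[A-Za-z]+", swap, text)

    original = "Buy a big house cheap. A big house is a big buy."
    spun = spin(original)

    # SequenceMatcher.ratio() is 1.0 for identical strings, so (1 - ratio)
    # serves here as a crude "how different" percentage.
    difference = 1 - SequenceMatcher(None, original, spun).ratio()
    print(spun)
    print("approx. difference: %.0f%%" % (difference * 100))

The spun text passes a crude difference test while reading noticeably worse than the original, which is exactly the point made above.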

The search engines were reported to have recruited thousands of student "editors" to manually weed out such aberrations from their indices. More emphasis was placed on non-reciprocal inbound links with the appropriate keywords in the anchor text (or within ten words left or right of the anchor text), and other "off-page" considerations. And so it went on. And on.

There were all sorts of "solutions" offered to those webmasters who had known the heady days of the big-figure Google checks for doing very little, and were willing to pay almost any price to return to them. Accordingly, the software became more ambitious. In turn, the search engines became more demanding, and there were increasing signs that perfectly legitimate sites were being punished as well as the spam pages.

We seem to have reached a point where something has to give. The browsing public deserves better than the scraped content, RSS feeds and abundance of proto-plagiarism it still gets. The need is for content that makes sense, is readable by real people and is of genuine value, as well as ticking all the boxes of the search engine bots' latest algorithms. Webmasters need such content too, yet they also have an understandable need to produce it on demand for their increasingly information-hungry readers. To satisfy such demands it is unlikely that one piece of software alone will suffice. Instead, it seems clear that a system of content delivery is needed which is sophisticated enough to produce content of value to all concerned.


