Gordon Goodfellow
Gordon Goodfellow is an Internet marketer and technologist, writer and researcher. His Content Artist website explores the issues and problems faced by automated content, and provides some software solutions.
Gordon Goodfellow has written 1 article for WebKnowHow.
Cast your mind back about three years.
Shortly after the dawning of the Google Adsense Age, webmasters
learned that their sites were effectively little gold mines or "virtual
real estate" as one expert put it. The more cyber-property you had, the
more virtual billboards you were able to put up (also called Adsense
blocks). And so if you made $n by owning one web page with an
Adsense ad (or any ad) on it, then it was reasonable to assume that you
would make $n x 10,000 if you had 10,000 pages with similar ads on them.
Similarly, reason suggested that 1 million such pages would make you $n x 1,000,000.
Webmasters were eager to rise to this Gold Rush challenge, and so
were those present-day providers of picks and shovels, the software
developers. Applications were developed which could produce thousands
of web pages in less than an hour from a keyword list. All you had to
do was a little research using Overture's keyword tool or its many free
derivatives - the more sophisticated practitioner of this art would
have added Wordtracker into the mix - and you had your keyword list.
Add some adjectival modifiers such as "better" or "best" or
"latest" before each keyword and you had an even bigger list. Then
after each keyword add "in New York" or "in London" or even all the
place names in the English speaking world (there are over 30,000 of
them) and you had a massive list. The software which was available at
the time could, and still can, produce whole websites consisting of
tens of thousands of pages from your own such bloated keyword
handiwork. Each page of that site would be highly optimized for one
keyword phrase, so that you could more or less guarantee that your page
would be in number one position on all the search engines, simply
because it was so specific. Such websites could be cranked out and
uploaded to your server all in the same day. You could produce 50 such
websites, each with thousands of pages, in a single month; all of them
with Adsense blocks on each page.
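The keyword-multiplication step described above is easy to sketch. The seed words, modifiers and place names below are invented for illustration; the real lists came from tools like Overture's suggestion tool or Wordtracker, and the place-name list ran to tens of thousands of entries.

```python
from itertools import product

# Hypothetical seed data, purely for illustration.
keywords = ["plumber", "locksmith"]
modifiers = ["", "best", "cheap"]          # adjectival prefixes ("" = none)
places = ["", "in New York", "in London"]  # geographic suffixes ("" = none)

phrases = []
for mod, kw, place in product(modifiers, keywords, places):
    # Join only the non-empty parts into one target phrase.
    phrases.append(" ".join(part for part in (mod, kw, place) if part))

# 3 modifiers x 2 seeds x 3 places = 18 phrases. Swap in 30,000 place
# names and the same loop yields tens of thousands of target pages.
print(len(phrases))  # 18
```

Each resulting phrase became the sole optimization target of one generated page, which is why the page counts exploded so quickly.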
The problem was, they were all unreadable.
Pages that were manufactured at that speed could hardly rely on
human dexterity in creating their content. So the software which
produced them - and it was ingenious software - had to resort to other
means. These largely fell into two groups: RSS feeds and what came to
be called "scraped" content. The problem with RSS feeds was that lots
of other people were using the same feed. The problem with scraped
content was that it belonged to someone else. In both cases, the
hyperlink which was obligatory (but which could be turned off in the
case of the scraped content) bled PageRank away and in other ways
compromised the integrity of your site. Both practices also had the
habit of leaving footprints for the search engines to spot. Lawyers'
purses bulged a bit as well.
At about the same time, people searching the Internet complained of
seeing bland web pages with content that was either non-existent,
meaningless or repetitive (even, heaven forbid, duplicate). The search
engines addressed this by punishing web sites that displayed those
tendencies, and so raised the informational quality of their listings
for a while. This punishment consisted of altering their algorithms so
that sites or pages which demonstrated such blandness were either
pushed so far down the listings that they effectively could not be
seen, or delisted altogether (banned).
Along came a flurry of remedies. You could pay ghost-writers at
Elance or Rentacoder to produce the content for you according to a
specified keyword density (but even at $3 an hour it was expensive if
you wanted to replace all those thousands of pages which had just been
banned by Google). Then a huge mini industry of private label
membership sites came along, charging you a monthly fee to use its
thousands of stock articles without any copyright questions being
asked. (But there were seldom the specific keyword phrases you wanted
in those articles, and you could never control the keyword density;
also you just knew that lots of other people were using the same
articles from the same membership sites.)
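"Keyword density" had no single agreed definition, but a common convention was the number of words accounted for by occurrences of the target phrase, divided by the total word count. The sketch below uses that convention; the function name and sample text are invented for illustration, not taken from any real tool.

```python
def keyword_density(text: str, phrase: str) -> float:
    """Fraction of the text's words taken up by occurrences of `phrase`.

    One common convention: (occurrences * words_in_phrase) / total_words.
    Real SEO tools varied in exactly how they counted.
    """
    words = text.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    # Count every position where the phrase's words appear in sequence.
    occurrences = sum(
        words[i:i + n] == phrase_words for i in range(len(words) - n + 1)
    )
    return (occurrences * n) / len(words) if words else 0.0

sample = "best plumber tips: call a plumber when your best plumber is away"
print(round(keyword_density(sample, "best plumber"), 2))  # 0.33
```

A ghost-writer brief might have demanded, say, a 3 percent density for the target phrase; the stock PLR articles rarely hit the specific phrase or density a webmaster wanted.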
Other software came along and inserted random text at the top and
bottom of each article, so that each page became unique in its own way.
Still more software was produced which substituted common words in
existing private label rights (PLR) articles with stock synonyms (word
was going round that if a page was 28 percent different from another
page then you were okay). The problem was that if the page was read as
a whole, it made no sense at all. But this could still fool the search
engines. Just.
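That synonym-substitution ("spinning") approach can be sketched as follows. The synonym table and the crude "fraction of words changed" measure are illustrative assumptions, not the actual dictionary or scoring of any real tool.

```python
import re

# Hypothetical synonym table; the real tools shipped large stock dictionaries.
SYNONYMS = {"big": "large", "fast": "quick", "buy": "purchase", "good": "fine"}

def spin(text: str) -> tuple[str, float]:
    """Replace known words with synonyms; return the new text and the
    fraction of words changed (a crude 'percent different' measure)."""
    tokens = re.findall(r"\w+|\W+", text)  # words and the gaps between them
    changed = 0
    out = []
    for token in tokens:
        repl = SYNONYMS.get(token.lower())
        if repl:
            out.append(repl)  # note: capitalization is lost, as it often was
            changed += 1
        else:
            out.append(token)
    total = sum(1 for t in tokens if t.strip() and t[0].isalnum())
    return "".join(out), changed / total if total else 0.0

new_text, diff = spin("Buy a big fast car at a good price")
print(new_text)          # purchase a large quick car at a fine price
print(round(diff, 2))    # 0.44
```

Swapping words one at a time with no grammatical awareness is exactly why the output cleared a percent-difference threshold while reading as nonsense to a human.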
The search engines were reported to have recruited thousands of
student "editors" to manually weed out such aberrations from their
indices. More emphasis was placed on non-reciprocal inbound links with
the appropriate keywords in the anchor text (or within ten words left
or right of the anchor text), and other "off-page" considerations. And
so it went on. And on.
There were all sorts of "solutions" offered to those webmasters who
had known the heady days of the big-figure Google checks for doing very
little, and were willing to pay almost any price to return to them.
Accordingly, the software became more ambitious. In turn, the search
engines became more demanding, and there were increasing signs that
perfectly legitimate sites were being punished as well as the spam
pages.
We seem to have reached a point where something has to give. The
browsing public does deserve better than scraped content, RSS feeds and
the abundance of proto-plagiarism that it still gets. The need is for
content that makes sense and is readable by real people and also of
value, as well as ticking all the boxes of the search engine bots'
latest algorithm. Equally, webmasters need such content themselves, and
they have an understandable desire to produce it on demand for their
increasingly information-hungry readers. To satisfy such demands it is
unlikely that one piece of
software alone will suffice. Instead, it seems clear that a system of
content delivery needs to exist which is actually sophisticated enough
to produce content which is of value to all concerned.