The argument over how duplicate content is defined, and whether it matters at all, continues to rage, and there is no sign that it is going away. So just how do you define duplicate content, and does it matter?
The generally accepted view is that duplicate content does matter. One well-known and highly respected search engine optimization expert recently expressed the opposing view in one of his regular articles, but even a quick inspection of the mountain of material written on the topic recently shows that this is a minority opinion.
But if we agree that duplicate content matters, how do we go about defining it? For example, if I write an article for submission to an article directory and then rework that same article for submission to a second directory, how will the search engines check my two articles and decide whether they contain duplicate content? The simple answer is that we don't know. However, here are this webmaster's thoughts.
When the major search engines first began checking for duplicate content, it was very much a matter of comparing one web page as a whole with another; no attempt was made to cut the pages apart and compare their individual elements. In those days it was possible to take identical content, simply add an introductory and a concluding paragraph to one of the pages, and that would be enough to fly under the duplicate content radar. Sadly for many publishers, those days are long gone. The major search engines now cut up the two pages so that they can compare individual elements, and it is how they do this that lies at the core of today's disagreement. Most people agree that attention is concentrated on the central content of a web page rather than on its structure.
A great many website owners use templates for their pages, which define the structure of each page, including such things as menus, headers and footers. This is widely believed to be acceptable, and the major search engines do not view it as duplicate content. What they examine is the main content contained in the body of the page.
But precisely how do they examine this content? Some people believe that the comparison is done at 'block' level (comparing individual paragraphs or sentences), while others think that the filters look for matching phrases or possibly even individual words. Nobody really knows, of course, but it seems reasonable to assume that the likeliest basis for checking is either sentence or phrase matching.
Sentence matching is fairly straightforward: it simply means cutting both pages up into chunks determined by the page's punctuation. Take, for example, this sentence: "It is reasonably simple to find a good deal on a package holiday, provided you know where to look." This would be treated as either one sentence or two, depending on whether you use the traditional definition, in which a full stop marks the end of a sentence, or adopt a more flexible approach that also splits on other punctuation marks, such as commas.
Matching at the phrase level is a little more difficult. What is the definition of a phrase? Should it be made up of 2 words, or 3 words, or 4 words, or ...? Let's say for now that a phrase is defined as 3 words. If that were the case, the following phrases would be seen as duplicate content if they appeared on both of the pages being examined: "At that time", "Did you know", "Take a look", "Day to day", "You can get", "In the end", "The answer is", "In those days". All of these are normal, everyday phrases that could be used on pages about dog training, cycling, making money online or anything else you care to mention.
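To make the two approaches concrete, here is a minimal Python sketch of sentence chunking and 3-word phrase extraction. It is only an illustration of the ideas described above, not how any search engine actually works; the function names (`split_sentences`, `word_phrases`) and the two sample pages are made up for this example.

```python
import re

def split_sentences(text, flexible=False):
    """Cut text into chunks at punctuation.

    Strict mode ends a chunk only at '.', '!' or '?'.
    Flexible mode (an assumption for illustration) also splits on
    commas, semicolons and colons.
    """
    pattern = r"[.!?,;:]+" if flexible else r"[.!?]+"
    return [chunk.strip() for chunk in re.split(pattern, text) if chunk.strip()]

def word_phrases(text, size=3):
    """Return the set of consecutive `size`-word phrases, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

example = ("It is reasonably simple to find a good deal on a package holiday, "
           "provided you know where to look.")

strict_chunks = split_sentences(example)                   # one chunk
flexible_chunks = split_sentences(example, flexible=True)  # two chunks

# Two hypothetical pages on unrelated topics that share everyday phrases.
page_a = "In those days you can get a puppy trained in a week."
page_b = "You can get a good bike cheaply, but in those days it was harder."
shared = word_phrases(page_a) & word_phrases(page_b)

print(len(strict_chunks), len(flexible_chunks))
print(sorted(shared))
```

With these sample pages, the shared set contains "in those days" and "you can get", even though one page is about puppies and the other about bikes, which is exactly the problem with matching at the 3-word level.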
Now, there are a few people who contend that the major search engines do check pages down to this level. For example, when I asked the staff of one particular content checker (DupeCop) about the basis on which they examine duplicate content, they replied: "DupeCop compares both individual words and 3-word phrases. It also ignores all punctuation and scans across sentences." It was no surprise, therefore, that when I ran a number of articles through this program (comparing articles on the subject of Christmas décor against articles about Labrador Retrievers), I found that they showed an average of 25% duplicate content! Bearing this in mind, I think it would be ridiculous to believe that the major search engines have their filters set to this level.
But at what level would the filters be set? Should they be at 4 words, or 5 words, or ...? Quite honestly, your guess is as good as mine. Over the years I have published literally hundreds of articles and have monitored the results for signs of duplicate content penalties, as far as anyone can do this. On the basis of my own experience, I believe that filtering is not carried out right down to the level of 3- or 4-word phrases but probably stops at sentence level. Thus, provided you alter articles down to sentence level, you ought to have no problem avoiding the duplicate content filters. In fact, even if one or two sentences are duplicated, you should still be okay.
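The gap between phrase-level and sentence-level checking can be sketched in a few lines of Python. This is a rough illustration only: it is not DupeCop's actual algorithm or any search engine's filter, and the helper names (`phrase_set`, `sentence_set`, `percent_shared`) and the two short sample articles are invented for the example. Like the checker described above, the phrase comparison ignores punctuation and runs across sentence boundaries.

```python
import re

def tokens(text):
    """Lowercase words with punctuation stripped."""
    return re.findall(r"[a-z0-9']+", text.lower())

def phrase_set(text, size):
    """All consecutive `size`-word phrases, scanning across sentence boundaries."""
    words = tokens(text)
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def sentence_set(text):
    """Whole sentences (split at '.', '!' or '?'), normalised to word tuples."""
    return {tuple(tokens(s)) for s in re.split(r"[.!?]+", text) if tokens(s)}

def percent_shared(units_a, units_b):
    """Percentage of the units in A that also appear in B."""
    if not units_a:
        return 0.0
    return 100.0 * len(units_a & units_b) / len(units_a)

# Hypothetical articles on different subjects that reuse everyday wording.
article_a = "At that time we had a dog. Take a look at the training plan."
article_b = "At that time we had a tree. Take a look at the decorations."

phrase_overlap = percent_shared(phrase_set(article_a, 3), phrase_set(article_b, 3))
sentence_overlap = percent_shared(sentence_set(article_a), sentence_set(article_b))

print(f"3-word phrase overlap: {phrase_overlap:.1f}%")
print(f"sentence overlap: {sentence_overlap:.1f}%")
```

On these two samples, more than half of the 3-word phrases in the first article also appear in the second, while no complete sentence is shared. Real articles are far longer and the numbers will differ, but the direction of the effect is the same: the smaller the unit being compared, the more "duplicate content" unrelated articles appear to contain.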
Article Source: http://blisspublisher.com
WebMarketingCentre.com provides information on article writing and article submission and is also an article directory where you can pick up a free online article for your website or ezine and to which you can submit articles on a wide variety of topics including article marketing and much more.