Computing Magazine

Can a Search Engine Trace Your Doorway Pages?

Posted on 12 February 2014 by Yacysh


Welcome aboard! If you’ve already begun your adventure with SEO, you know that link building is essential to improving a website’s position in SERPs. Once you get the hang of backlink strategy, though, you’ll probably find it’s not as easy as it seems. Say you’ve got one, ten or a hundred inbound links leading to your website and now you’re puzzled about what to do next. At some point, you’ll probably think the following: instead of getting links on websites owned by others, why not build your own themed link pages and use them for backlinks to your website? OK, so you’ve come up with your own hosted content and … now you’ll find out why it won’t work. Can Google trace your hosted content back to you? Yes, it can and it will. How does Google do it? And where did you go wrong?

1. Google Analytics

Imagine that you’ve already prepared a few presell pages. Let’s assume you’ve got ten of them and they all differ in AlexaRank. To get the most data on their metrics, you’ll probably want a Google Analytics account to monitor all of them. Unless you create a new Google Analytics account for each and every one of those pages, you can be sure that Google will easily trace them back to you and realize that all the sites belong to one person.

2. Google AdSense

Some of your themed link pages actually become popular, so you might think: why not make some money off them? If you decide to use Google AdSense for a few of them, just like in the case of Google Analytics, you can be sure that Google will notice that they all have the same owner.

3. Google Webmaster Tools

You’ve got a few pages on one account? It couldn’t be more obvious that they are owned and managed by one person. In this situation, you might quickly join the ranks of spam sites, so beware and learn more about this subject.

4. Phone Verification – Text Message

So you got yourself a few Google accounts for several of your doorway pages. If you think all will work fine now, you’re in for a surprise – at some point you’ll be asked for account verification via a text message. Google will easily spot the same phone number for all your accounts, so you’d better fix yourself up with a selection of additional phone numbers.

5. Google+ Pages

There’s no denying that authorship is important. If not now, you can be sure that it will become a crucial factor in the future. If you keep all Google+ profiles of your pages on one account, you will surely be traced back by Google.

6. Cookies

Imagine you work on one computer: even if you set up separate accounts, get a few phone numbers and create several e-mail accounts, you might still forget about one thing – cookies. Google stores text or Flash cookies on your computer and can use them to connect your accounts instantly.

7. Computer Fingerprint

To quote Forbes magazine: ‘This technique allows a web site to look at the characteristics of a computer such as what plugins and software you have installed, the size of the screen, the time zone, fonts and other features of any particular machine. These form a unique signature just like random skin patterns on a finger.’ Even though you cleared your cookies, changed your login IP and took all necessary precautions not to be traced back, you might still get caught because of your browser or computer fingerprint.
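As a rough sketch of the idea described in that quote – not how any search engine actually implements it – a handful of such characteristics can be collapsed into a single stable identifier. The attribute names and values below are made up for illustration; real fingerprinting scripts run in the browser and collect many more signals:

```python
import hashlib

def device_fingerprint(attributes: dict) -> str:
    """Combine device characteristics into one stable hash.

    The attribute names are illustrative only; a real fingerprint
    would include canvas rendering, audio stack, fonts and more.
    """
    # Sort the keys so the same set of attributes always hashes identically.
    canonical = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

fp = device_fingerprint({
    "user_agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64)",
    "screen": "1920x1080",
    "timezone": "Europe/Warsaw",
    "fonts": "Arial,DejaVu Sans,Ubuntu",
    "plugins": "pdf-viewer",
})
print(fp[:16])  # same attribute set -> same hash prefix every time
```

The point is that none of these attributes is identifying on its own, but their combination usually is – so two “separate” accounts used from one machine end up with the same signature.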

8. IP

Let’s consider for a moment the problem of detecting the servers that host your doorway pages. Your server’s basic identifier is obviously the domain, which resolves to a numeric address – the IP. Once your multiple pages link to one website from one and the same IP, you can be sure that Google won’t transfer the link juice from all the linking domains.

Moreover, there are several IP categories:

[Image: IP subnets and SEO – source: http://osr600doc.sco.com/en/NET_tcp/tcpT.creating_subnets.html]

When building a website’s authority, remember that links or signals coming from the same IP class may carry a lower value. The Internet holds millions of C classes, so discounting the strength of signals coming from the neighbourhood of your page’s IP class is a cheap and efficient way to curb link-building abuse. The chance of accidentally bringing down the ranking of a random page is extremely low – about 1 in 16 million – and it’s obvious that a search engine needs to assess signal strength with as little effort as possible; what counts is global efficiency, not the particular characteristics of your service. Discounting signals within the same C class or from the same operator is not as drastic as in the case of signals sent from the same server and the same IP number – then you can be 99% sure the signal comes from the same owner, so its strength can be diminished or it can be ignored entirely. If an unusually large number of links comes from pages located within the same C class, you can be sure that the search algorithms will pick it up as something potentially fishy.
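A quick way to see the “C class” clustering described above is to group backlink IPs by their /24 network. A minimal sketch using Python’s standard `ipaddress` module; the addresses are example values, not real backlinks:

```python
import ipaddress
from collections import Counter

def c_class(ip: str) -> str:
    """Return the /24 ('C class') network an IPv4 address belongs to."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

# Hypothetical backlink IPs -- three of the four share one /24.
backlink_ips = [
    "203.0.113.10",
    "203.0.113.77",
    "203.0.113.200",
    "198.51.100.5",
]

counts = Counter(c_class(ip) for ip in backlink_ips)
for network, n in counts.most_common():
    print(network, n)
# 203.0.113.0/24 accounts for three of four links --
# exactly the clustering a search engine can flag.
```

The same grouping can of course be done at /16 (“B class”) or per network operator; the principle is identical.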

9. SEO Hosting

Let’s return now to the question of how easy it is for Google to actually trace doorway pages to the same owner. First comes what I already mentioned above – the same IP address or a close location within the same C class. There are several solutions to this issue, and one of them is SEO hosting, where you simply buy a hosting account to which hundreds of IP addresses are assigned. Unfortunately, you’ll share those IPs with other SEOs, who might be doing things much worse than you, like spamming or creating low-quality link pages. You can be pretty sure that when using SEO hosting, you risk your service or your hosted content being located in an unwanted neighbourhood that will hurt its reputation with the search algorithm.

10. Hosted Content CMS

It matters what tools you bring in to power your search optimization. You can count on search engines to have ‘fingerprints’ of plenty of Internet tools, CMSs, directory templates and so on. If you build your service on a directory template that in 95% of cases is used to create spam content, you can be sure that search engines will notice. In general, be very careful when choosing your tools – it’s often best to use your own solutions and steer clear of tools and templates that promise great effects at a small cost; that’s not going to happen. These kinds of tools can be traced by the label name in their meta tags, by specific mechanisms inherent to them, or by repeating patterns in their HTML code.
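One of the simplest such labels is the generator meta tag that many CMSs emit by default. A minimal sketch of detecting it – real engines combine many more signals (URL patterns, asset paths, markup schemes), and the regex below only covers one common tag layout:

```python
import re

# Matches e.g. <meta name="generator" content="WordPress 3.8">
# Only handles the name-before-content attribute order, for brevity.
GENERATOR_RE = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

def detect_cms(html):
    """Return the CMS label advertised in the page's generator meta tag."""
    match = GENERATOR_RE.search(html)
    return match.group(1) if match else None

page = '<html><head><meta name="generator" content="WordPress 3.8"></head></html>'
print(detect_cms(page))  # WordPress 3.8
```

If thousands of your link pages all advertise the same obscure directory script, that shared label alone ties them together.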


11. Server Footprint

A different approach is buying servers with multiple IP addresses. Sometimes you’ll be given IP addresses that belong to the same C class and that, as I pointed out above, is not going to work. Other times you’ll be allowed to choose one IP address from a specific country and a different C class (or even B class). You should know that all this can still pose a problem: since here, as with SEO hosting, all the addresses are registered to the same operator, search engines will be able to connect them with no problem. After all, there are thousands of operators, and if your service derives 70% of its most effective links from services belonging to the same operator, the search algorithm is bound to notice.

Another thing is device identification. Your server reveals information about itself as it responds to connections, and it’s not going to look good if 30% of your best links come from devices running Apache 2.2.22 on Ubuntu at provider X, with all IP addresses belonging to the same C class. Here’s what it looks like:

telnet 127.0.0.1 80
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 18 Dec 2013 16:23:47 GMT
Server: Apache/2.2.22 (Ubuntu)

Getting a server fingerprint is an easy job – look at the Apache settings: whether compression is on, which modules are loaded, what the expiry times are for images, CSS files and other custom settings. All of these allow the identification of your server and so let your hosted content be detected as belonging to the same owner.
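As a sketch of that idea: given a raw HTTP response like the one in the transcript above, you can pull out the headers that identify the server stack. The choice of “interesting” headers below is my own illustrative selection, not a definitive list:

```python
def server_fingerprint(raw_response: str) -> dict:
    """Extract fingerprint-relevant headers from a raw HTTP response."""
    # Headers that tend to expose the stack and its configuration.
    interesting = {"server", "x-powered-by", "etag", "expires", "vary"}
    fingerprint = {}
    for line in raw_response.splitlines()[1:]:  # skip the status line
        if not line.strip():
            break  # headers end at the first blank line
        name, _, value = line.partition(":")
        if name.strip().lower() in interesting:
            fingerprint[name.strip().lower()] = value.strip()
    return fingerprint

# The response captured via telnet above, as one string.
response = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Wed, 18 Dec 2013 16:23:47 GMT\r\n"
    "Server: Apache/2.2.22 (Ubuntu)\r\n"
    "Vary: Accept-Encoding\r\n"
    "\r\n"
    "<html>...</html>"
)
print(server_fingerprint(response))
```

Run across a set of linking sites, identical fingerprints are one more hint that the sites share an owner or at least a provider.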

12. Domain Data

Once you decide to buy several servers to avoid the problems I mentioned above, you must also think about buying a few domains. You might think the smart thing to do is to buy them from different operators and hide the name of the domain owner. Will that be enough? Maybe, or maybe not. In 2006, Google itself became a domain registrar, thanks to which it can now access the WHOIS API, which grants unlimited access to public WHOIS data. According to the access rules, that data can be used solely for solving domain problems by contacting the right domain operators, so using it for search engine optimization purposes would be against those rules. Still, it’s worth remembering that search engines can access public data, such as a record’s creation date, its expiration date or its DNS servers, and collect and analyze it.
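To illustrate, here is a minimal sketch of extracting those public fields from a raw WHOIS record. Field labels vary between registries, so the patterns below cover just one common format, and the sample record is abbreviated:

```python
import re

def parse_whois(record: str) -> dict:
    """Pull creation date, expiry date and name servers from a WHOIS record.

    Illustrative only: registries use differing field labels, so a real
    parser needs one pattern set per registry format.
    """
    fields = {
        "created": r"Creation Date:\s*(\S+)",
        "expires": r"Registry Expiry Date:\s*(\S+)",
        "nameservers": r"Name Server:\s*(\S+)",
    }
    result = {}
    for key, pattern in fields.items():
        matches = re.findall(pattern, record, re.IGNORECASE)
        if matches:
            result[key] = matches if key == "nameservers" else matches[0]
    return result

sample = """\
Domain Name: EXAMPLE.COM
Creation Date: 1995-08-14T04:00:00Z
Registry Expiry Date: 2024-08-13T04:00:00Z
Name Server: A.IANA-SERVERS.NET
Name Server: B.IANA-SERVERS.NET
"""
print(parse_whois(sample))
```

Domains registered on the same day, expiring on the same day and sharing DNS servers form exactly the kind of cluster this section warns about.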

13. TCP/IP Behaviour

That one is a bit tricky, since I have no idea whether any of the search engines actually use this method. One thing I’m sure about is that in order to get the best search results possible, search engines will keep enhancing their mechanisms for detecting the services and sources behind their positive signals. It might then be worthwhile to think about behaviours at the level of the TCP/IP protocol and how they might be used to your benefit. Another question that pops up here is: what if search engines recognize our intentions in meddling with TCP/IP protocols? A few years ago, I’d have just assumed that the links would be weakened or ignored. Now I suppose that something like that might be assessed as a negative signal from your service, and the accumulation of these signals of varying strength might at some point result in a filter named after a harmless black-and-white animal ;). You’ll find more information about TCP protocol identification here.

14. Link Pages Content

We all know that writing a fine article is costly and time-consuming. Automated content generation is a subject for another article, but you should know that if you automatically republish RSS feeds, Google is going to spot it.

[Image: RSS-based texts]

A few years ago, it was enough to write one long article, use lots of synonyms and simply generate ten articles from that material by spinning. Nowadays, in the Hummingbird era, publishing articles like that on your presell pages sends a clear message to Google – that all of this belongs to one person – which might result in bans and filters.
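One simple family of techniques for spotting spun copies is shingle overlap: split each text into k-word shingles and compare the resulting sets. A minimal sketch using Jaccard similarity – the sample sentences are my own, and real duplicate detection works at much larger scale with hashed shingles:

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles in a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 = disjoint, 1.0 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

original = "link building is essential in enhancing a site position in search results"
spun = "link building is crucial in enhancing a page position in search results"

score = jaccard(shingles(original), shingles(spun))
print(f"{score:.2f}")
# The shingles around the swapped synonyms differ, but the
# untouched runs of words still overlap and betray the common source.
```

Spinning changes individual words, not word order, so enough shared shingles survive to link the copies together.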

[Image: detecting link networks]

15. Link Schemes

Not so long ago, a successful link-building solution was offered by sites like ezinearticles.com or via presell pages, where, after writing an article, you could easily get cheap links out of it. Already back then, Google was able to detect link schemes. If your clients’ websites are all linked to in the same way, expect Google to take a closer look.

Conclusion


All of the above brings us to one conclusion: long-term scheming and black-hat linking is not worth it. It’s relatively easy to detect such patterns, and even easier to punish them. A number of the detection systems work with a certain delay, so some websites that use unethical positioning techniques might still be visible in SERPs, though their rankings are getting more and more volatile. Those short-term effects probably pay off in some niches, especially the most competitive ones, if you consider the low cost of black-hat SEO. On the other hand, I think gray-hat SEO is an activity doomed to failure – if you’re willing to work hard to rank your website high on Google, it’s not worth using tricks for better rankings that might jeopardize everything you’ve built before. Whatever method you try, remember you are not smarter than a search engine with the computational power of tens of thousands of computers. Black-hat link building will take its heavy toll in the long run. In the past, you only risked the weakening of your link juice, but now there’s a real threat in the form of filters that can cost you even 90% of your search engine traffic. It’s definitely better to choose one side and consistently stick to it. I chose the white-hat one – which will you choose?

