How to Bypass SCMP.com Paywall and How SCMP Should Resolve It

Update: SCMP.com is now free for everyone to view, rendering this tip unnecessary.

I used to be a subscriber to the South China Morning Post. At a little over HK$300, it was a huge discount for the stories I read every day.

But times change and the subscription rates have gone way above what I used to pay. For full access, subscribers need to pay over HK$1,000 a year while registered users “receive 8 free articles a month.” In fairness, SCMP.com has undergone a massive revamp and the current version is likewise a massive upgrade over the last one.

However, those who wish to read its news but are not active subscribers, or are registered users who have exceeded their very limited quota, see this:

scmp_paywall

Unlike other websites that invite visitors to come and read the news in exchange for advertiser exposure, SCMP.com only allows paid subscribers… and those who make use of Google’s cache feature, a very simple and straightforward thing to do.

1. Go to SCMP.com and choose which article you would like to read.

scmp_story

2. Right-click on the headline link and choose ‘Copy link address’ in the Google Chrome browser.

3. Go to google.com. In the search box, type site: and paste the URL you just copied right after it, so it looks like this:

scmp_google_result

4. Click on that tiny triangle to open the “Cached” link, which is the snapshot Google took when its robots/crawlers accessed and scanned the page. By then you should have bypassed SCMP.com’s registration/login roadblock. That’s because you are now accessing Google’s copy of the page.

scmp_cache
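
For reference, the query typed in step 3 can be expressed as a URL. This is only a rough Python sketch: the function name cache_lookup_url and the example article address are made up for illustration, and it simply builds the search link without fetching anything.

```python
from urllib.parse import urlencode


def cache_lookup_url(article_url: str) -> str:
    """Build the Google search URL for a `site:` query on a single article."""
    query = f"site:{article_url}"  # same text you would type into the search box
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical article address, used only to show the shape of the result.
print(cache_lookup_url("https://www.scmp.com/news/example-story"))
```

Opening the printed link in a browser is equivalent to typing the site: query by hand.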

I seldom use this method as I find time to read my building clubhouse’s subscription copy. But for others who take advantage of this feature, it’s like downloading movies, software or music without compensating those who produce them.

As a business whose revenues are predicated on paid subscribers (and rate card), SCMP should do more to ensure that only authorized visitors can view its content, rather than rely on the ubiquitous login page alone.

Fortunately there is also a simple solution to this: disable Google cache. Just add the following code to every news article page that’s supposed to be subscriber-access only, as Google explains in its blog post about the Robots Exclusion Protocol:

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

To apply it across all search engines (at least those that respect this tag):

<META NAME="ROBOTS" CONTENT="NOARCHIVE">
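
To confirm that a page really carries one of these tags, its HTML can be scanned for them. Below is a minimal Python sketch using only the standard library; RobotsMetaParser and blocks_google_cache are hypothetical names chosen for this example, and the check covers only meta tags in the page source, nothing sent via HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"googlebot": {"noarchive"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values keep their original case.
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            values = {
                part.strip().lower()
                for part in attr.get("content", "").split(",")
                if part.strip()
            }
            self.directives.setdefault(name, set()).update(values)


def blocks_google_cache(html: str) -> bool:
    """Return True if the page asks robots (or Googlebot) not to archive it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noarchive" in found for found in parser.directives.values())


page = '<html><head><META NAME="GOOGLEBOT" CONTENT="NOARCHIVE"></head></html>'
print(blocks_google_cache(page))
```

The match is case-insensitive, so both the uppercase form shown above and a lowercase variant are detected.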

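To make a sweeping rule across the entire website, get help from robots.txt. Note that Noarchive is not part of the standard Robots Exclusion Protocol, so search engines are not obliged to honor it: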

User-agent: *
Noarchive: /

The noarchive rule, though, does not seem to prevent images from being displayed in Image search results.
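
Since robots.txt support for Noarchive is not guaranteed, another site-wide option is the X-Robots-Tag HTTP response header, which Google documents as accepting the same noarchive value as the meta tag. The following is a minimal Python sketch, assuming the site runs as a WSGI application; noarchive_middleware, demo_app and collect_headers are hypothetical names for illustration, not anything SCMP or the Wall Street Journal is known to use.

```python
def noarchive_middleware(app):
    """Wrap a WSGI app so every response also carries X-Robots-Tag: noarchive."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Append the header to whatever the wrapped app already set.
            headers = list(headers) + [("X-Robots-Tag", "noarchive")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Stand-in for a subscriber-only article page (assumption for the demo)."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>Subscriber-only story</p>"]


def collect_headers(app):
    """Call a WSGI app once with a minimal environ and return its headers."""
    seen = []

    def start_response(status, headers, exc_info=None):
        seen.extend(headers)

    app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, start_response)
    return seen


protected_app = noarchive_middleware(demo_app)
print(collect_headers(protected_app))
```

Sending the rule as a header means no article template has to change, and it also covers non-HTML files where a meta tag cannot be added.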

The Wall Street Journal, another website that requires a subscription, already prevents Google from showing a cached copy of its stories.

wsj_story

While you can copy the URL and apply the same method described above, you will notice there is no link to a cached version available.

wsj_nocache