Hacker News

I'm hoping that Google will penalize them someday. They download all the content in the page so that they'll get indexed for it, but they don't offer it up without an account to the user.


How would Google determine that the content being served to other users differs from what the crawlers see?


The big fish (Experts Exchange, etc.) are big enough to be caught manually. Programmatically, they can crawl with a typical browser user agent from a netblock that is neither registered to Google nor announced by its ASes. They'd need to account for the dynamic nature of sites to determine whether the difference between crawls is a sign of trouble, but as luck would have it, they're already in the business of algorithms and archiving websites.
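The crawl-and-compare idea above can be sketched in a few lines. This is a minimal illustration, not how Google actually does it: it assumes a hypothetical `cloaking_score` helper, fetches the same URL once presenting a Googlebot user agent and once presenting an ordinary browser one, and diffs the two responses. (It ignores the IP/netblock side of the problem entirely, as well as legitimately dynamic content.)

```python
import urllib.request
from difflib import SequenceMatcher

# Real Googlebot UA string vs. a typical desktop browser UA.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/120.0 Safari/537.36")

def fetch(url, user_agent):
    """Fetch a page while presenting the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def cloaking_score(url):
    """Similarity of the page as seen by a crawler vs. a browser.

    Returns a ratio in [0, 1]; values far below 1.0 suggest the site
    serves crawlers substantially different content.
    """
    as_bot = fetch(url, GOOGLEBOT_UA)
    as_user = fetch(url, BROWSER_UA)
    return SequenceMatcher(None, as_bot, as_user).ratio()
```

A single low score proves nothing, of course; as noted above you'd want repeated crawls to separate cloaking from ordinary page dynamism.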


I'm sure they'd do it algorithmically, both for elegance and for legal-liability reasons. It'd be done in the same fashion as the Panda update: use the big fish to identify techniques you can build rules around, then make sure your penalty doesn't hit too many innocents too hard before you push it live.


Probably if enough people report them for cloaking: https://www.google.com/webmasters/tools/spamreportform?hl=en


Well, I did my part and reported Quora. I wish there were a way to track the progress of the report; the process makes it feel as if the message were lost in the ether.


Thanks for the tip, I just filed a report.


Most likely by someone at Google hearing about it if it gets negative coverage like this.


If it's simply divs masking the content, there must be a way to load the page in an iOS emulator, screencap the entire length of the page, and compare the amount of "content" in the source to what shows up in the screencap.

I'm not saying it's worth the trouble for Google, and Quora is probably the worst example of this bullshit. But it's certainly possible to automatically catch devilry like this, isn't it?
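Short of full screenshot comparison, a cruder version of this check needs no rendering at all: parse the HTML and compare the text a user would see against the text in the source. A toy sketch, assuming hiding is done with inline `display:none` (a real check would need full CSS resolution and an actual renderer):

```python
from html.parser import HTMLParser

class VisibleTextParser(HTMLParser):
    """Collects page text, skipping subtrees hidden via inline display:none.

    Only inline styles are considered; stylesheet-based hiding would
    require a real renderer. Void elements are ignored so the depth
    counter stays balanced.
    """
    VOID = {"br", "img", "hr", "input", "meta", "link", "area", "base",
            "col", "embed", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # >0 while inside a hidden subtree
        self.visible = []
        self.total = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        style = dict(attrs).get("style", "").replace(" ", "").lower()
        if self.hidden_depth or "display:none" in style:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        self.total.append(data)
        if not self.hidden_depth:
            self.visible.append(data)

def visible_fraction(html):
    """Fraction of the page's text characters a user would actually see."""
    p = VisibleTextParser()
    p.feed(html)
    total = len("".join(p.total).strip())
    return len("".join(p.visible).strip()) / total if total else 1.0
```

A page where most of the source text never shows up on screen would score low here, which is exactly the source-vs-screencap mismatch described above.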


Your proposal would also catch quite a few sites that stuff pages with phrases in tiny fonts to boost their PageRank.
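The tiny-font variant is even easier to flag heuristically: scan inline styles for font sizes below a visibility threshold. A rough sketch (the threshold, the unit conversions, and the inline-style-only assumption are all simplifications):

```python
import re

# Matches inline declarations like font-size:2px or font-size: 0.1em
FONT_RE = re.compile(r"font-size\s*:\s*([\d.]+)(px|pt|em)", re.I)

def tiny_font_styles(html, min_px=6):
    """Return inline font-size declarations smaller than min_px pixels.

    Rough unit conversion: 1pt = 96/72 px, and em assumes a 16px base.
    Stylesheet-defined sizes are not seen at all.
    """
    hits = []
    for m in FONT_RE.finditer(html):
        size, unit = float(m.group(1)), m.group(2).lower()
        px = size * {"px": 1, "pt": 96 / 72, "em": 16}[unit]
        if px < min_px:
            hits.append(m.group(0))
    return hits
```

Anything this turns up on a content page is text the author wanted indexed but not read, which is the behavior being discussed.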


With the proper image analysis technology it should work, right?


Just penalize any content from quora.com in PageRank.


I was referring to automated processes, not manual intervention.


Change user agent to crawl as a mobile browser.


That wouldn't help if a site served different content to the Google block of IP addresses.


This is cloaking and will get you in big trouble.


I suspect Google could get ahold of a block of non-Google AS addresses if it really wanted to.

