For more official policy you'd have to ask either Amit or Matt. I can't speak for the company here.
Speaking only for myself, and only on the ethics: I generally feel that any site that allows itself to be indexed is pretty happy with Google (or Bing) doing whatever they can to rank it better. Even for the links a site does publish, it can add rel=nofollow if it doesn't want search engines to use them but still wants its pages indexed (Yelp has done this, for instance).
For me that's the ethical boundary. Sites have various ways of indicating their wishes, and that ought to be respected in spirit beyond the technical details.
Legally, the technologies that make the internet work all rely on the idea of fair use, so it is very important whether something is "fair."
I've seen no statement that Google throws out Toolbar (and other) clickstream data for sites/pages that Googlebot can't visit (which includes not just robots-excluded pages but also login-required ones). Not that I think such data should be thrown out; that's not what robots.txt was meant for, and the user arguably has more claim to that interaction trail than the site does. But that seems to be the standard you're suggesting.
If Google doesn't want IE features or the Bing Toolbar observing its site interactions, it can disallow such visitors. A steep price to pay, at too coarse a level of control? Yes, just like a site deciding to bar Googlebot.
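To make the "disallow such visitors" point concrete, here is a minimal sketch of how robots.txt-level exclusion works, using Python's standard-library urllib.robotparser. The robots.txt contents, the crawler names, and the example.com URL are all illustrative, not taken from any real site's policy:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: bars one named crawler, allows everyone else.
robots_txt = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The barred crawler may not fetch; any other agent may.
print(rp.can_fetch("Googlebot", "https://example.com/page"))  # False
print(rp.can_fetch("Bingbot", "https://example.com/page"))    # True
```

This is exactly the coarse, all-or-nothing lever the comment describes: a site can shut a crawler (or, by user-agent sniffing, a toolbar's fetches) out entirely, but there's no finer-grained way to say "index me, just don't learn from my visitors."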
I would agree that a 'fair use'-like analysis makes sense.
I would further agree that any site solely, or predominantly, powered by indirect observations of Google users would be an unfair taking. You'd crush such a site in court.
Meanwhile, a site that tallies incoming clicks with Google referrers, for itself or for a network of participating sites (as with analytics inserts), even republishing summaries of Google source URLs and search terms as public data, is almost certainly fair use. It's taking data Google drops freely into third-party site logs and making a transformative report of it.
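The "data dropped into third-party logs" is just the Referer header, which at the time carried the search query. A sketch of the tally described above, using only the standard library; the referrer URL and the q= parameter layout are illustrative of how Google search referrers looked then:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical Referer header as it would appear in a site's access log.
referrer = "https://www.google.com/search?q=best+pizza+nyc"

# Pull the search terms out of the referrer's query string.
params = parse_qs(urlparse(referrer).query)
search_terms = params.get("q", [""])[0]
print(search_terms)  # → "best pizza nyc"
```

Aggregating these per-page is all a "Google source URLs and search terms" report amounts to: no crawling of Google, just reading what visitors' browsers already handed over.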
What Bing is doing seems to me somewhere in between. The mechanism avoids literal copying of specific artifacts, but the net effect in some cases approaches the same result. As with other fair-use analysis, it's rarely black-and-white: the magnitude of the information used, its effect on the market, and the value-added transformation afterward all matter. I don't know how a court would rule in such a suit, but the discovery process would surely be fun for spectators like myself!