Before we get to the heart of the legal analysis, here are some additional facts which may be legally significant. They were provided in the Comments to yesterday’s post by Janice Brown of Cow Hampshire. Janice first called my attention to this issue in late August.
Ancestry also provided an option (to subscribers only, and even after IBC became “free”) to click and save the cached page to their “Shoebox”–a holding area of documents that subscribers are interested in.
Also, the initial Ancestry.com source description calls the IBC a “database-online,” not a search engine . . . .
Janice is correct about these additional facts and we will analyze their legal significance.
Janice also writes:
Also, there were several people who argued in commentary on various blogs and message boards that we, as bloggers and web sites owners, should have known that Ancestry would be doing this, due to various announcements and press releases they made, and the burden was on each of us to place a robots.txt file or some sort of HTML coding to prevent Ancestry.com from caching our sites. Is the burden truly on the blogger or web site owner, even if they are not commercial (i.e., the “mom and pop” web sites and blogs).
We’ll explore what the courts have to say about this issue as well. At the end of the series, I’ll have some suggestions for copyright owners.
I should point out to all readers that this remains an unsettled and evolving area of law; this ride may prove a bit frustrating at times. Now on with the show . . . .
Field v. Google, Inc., 412 FSupp2d 1106 (D.Nev. 2006) [the link is to a PDF version of the court’s Order], is the case that was cited by most commentators and bloggers concerning the Ancestry IBC issue. They opined that the outcome of that case likely would dictate the rule of law applicable to the IBC issue. My preliminary reaction was twofold: first, since Field is a decision of a trial court, the lowest level of the federal judiciary, no other court is obligated to follow it; and second, there are some unique facts in this case that may have had an influence on the outcome.
Blake Field is a lawyer in Nevada. He’s also a poet. Field was familiar with Google’s search and caching processes. With this knowledge, according to the court, “Field decided to manufacture a claim for copyright infringement against Google in the hopes of making money from Google’s standard practice.” [412 FSupp2d at 1113]. In January 2004, Field created fifty-one works and put them on a website, accessible for free. He also created a “robots.txt” file for his site because he wanted search engines to visit his site and include the site within their search results. The court notes that “Field knew that if he used the ‘no-archive’ meta-tag on the pages of his site, Google would not provide “Cached” links for the pages containing his works.” [412 FSupp2d at 1114] So, he consciously chose not to use the “no-archive” meta-tag on his Website.
As Field intended and expected, the “Googlebot” visited his site, and indexed and cached its pages. Thereafter, each of Field’s pages was retrieved from Google’s cache by some individual or individuals.
Field sued Google for copyright infringement. “Field allege[d] that Google directly infringed his copyrights when a Google user clicked on a “Cached” link to the Web pages containing Field’s copyrighted works and downloaded a copy of those pages from Google’s computers.” [412 FSupp2d at 1115; emphasis added] Field did not allege that Google infringed his copyrights when the Googlebot initially copied his pages and stored them in the system cache.
Following established legal precedent, the court pointed out that for copyright infringement, a plaintiff must show ownership by the plaintiff, and copying by the defendant. Furthermore, the copying must be the result of a volitional act on the part of the defendant. [CoStar Group, Inc. v. LoopNet, Inc., 373 F.3d 544, 555 (4th Cir.2004)].
Applying the law to the facts, the court ruled in favor of Google. The court said, “[W]hen a user requests a Web page contained in the Google cache by clicking on a ‘Cached’ link, it is the user, not Google, who creates and downloads a copy of the cached Web page. Google is passive in this process.” [412 FSupp2d at 1115] In other words, the court found no volitional act on the part of Google when a user accesses its system cache.
There’s more to the Field case, certainly. And certainly, it doesn’t answer questions such as whether the user can be sued for copyright infringement; whether Google is liable for infringement for the actions of its bot; and others. But let’s stop here for a moment and examine how the law would apply to Ancestry.
Presumably, the path leads in the same direction. That is, when a user clicked on the relevant link in the IBC, Ancestry would be “passive” in that process and thus there would be no infringement by Ancestry when users requested information from the IBC.
But a couple of facts seemed important to the court in reaching this conclusion. First, the court pointed out that pages retrieved from Google’s cache contain a “conspicuous” disclaimer that the cached page is not the “original” and that there are two separate links to the current page. It is not clear, or certainly was not clear at the outset, that Ancestry’s IBC would operate in that manner. Second, the court examined the purposes of Google’s cache. For example, “Google’s ‘Cached’ links allow users to view pages that the user cannot, for whatever reason, access directly.” As to the IBC, while it was behind Ancestry’s paid subscription wall, this was true only for paid subscribers. Additionally, Google’s cache enables users to determine how a Web page may have been altered over time as well as to determine more quickly whether and where a search query appears and thus whether the page is germane to the user’s query. It is not at all clear that Ancestry’s IBC would operate in this manner. Recall that Ancestry began calling it a “search engine” only after the negative initial response. We do not now know Ancestry’s true intent at the outset of this project or what would have happened had they chosen to press ahead despite the negative reaction. [These are matters that we might be able to discover through various procedures if litigation had been commenced].
Back to the Field Case: Google asserted several defenses to Field’s claim. First, Google asserted that Field had granted it an implied license to use his content. The law on this matter is that a copyright owner may grant a nonexclusive license expressly or impliedly through conduct. Melville B. Nimmer & David Nimmer, Nimmer On Copyright, vol. 3, section 10.03[A] (1989). An implied license can be found where the copyright holder engages in conduct from which the other party may properly infer that the owner consents to his use. The United States Supreme Court endorsed this rule in the 1927 patent case of De Forest Radio Telegraph & Telephone Co. v. United States, 273 U.S. 236. Consent to use a copyrighted work may be based on the copyright holder’s silence where the copyright holder knows of the use and encourages it.
Recall that Field knew that had he placed a “no archive” meta-tag on the pages of his Web site, Google would have known not to display “Cached” links to his pages. Nonetheless, Field specifically chose not to include the no-archive meta-tag on his site, knowing that Google would interpret this absence as permission to allow access to the pages via “Cached” links. The court said: “Thus, with knowledge of how Google would use the copyrighted works he placed on those pages, and with knowledge that he could prevent such use, Field instead made a conscious decision to permit it. His conduct is reasonably interpreted as the grant of a license to Google for that use.” [412 FSupp2d at 1116]
Does this ruling in the Field case mean the burden is always on the copyright holder to preemptively fend off those crawling or scavenging the Web for copyrighted material? Consider that the inclusion of a “no archive” meta-tag or the appropriate “robots.txt” file is relatively simple for the content owner, while, as the court said, “Given the breadth of the Internet, it is not possible for Google (or other search engines) to personally contact every Web site owner to determine whether the owner wants the pages in its site listed in search results or accessible through ‘Cached’ links.” [412 FSupp2d at 1112]
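For readers wondering what a “robots.txt” file looks like in practice, here is a minimal sketch in Python, using only the standard library. The file contents and the crawler name “ExampleArchiveBot” are my own inventions for illustration; I do not know what name Ancestry’s crawler actually announced, and nothing here comes from the court’s opinion. The sketch shows a robots.txt that turns away one named crawler while leaving the site open to all others, and a small helper that answers the question a well-behaved crawler is supposed to ask before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler, allow everyone else.
# "ExampleArchiveBot" is an invented name used for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow:
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/poems/one.html"
    for agent in ("ExampleArchiveBot", "Googlebot"):
        print(agent, "may crawl:", may_crawl(ROBOTS_TXT, agent, page))
```

Keep in mind that a robots.txt file is a request, not a lock. It works only against crawlers that choose to honor it, and it lets an owner single out a crawler only if the owner knows that crawler’s name.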
On the other hand, a copyright owner should have the right to choose the “distributors” or search engines to which the copyright owner wishes to grant a license. This would require knowledge of the use that the other party intended to make of the copyright holder’s content, as the Field court said. In the case of Ancestry’s IBC, no content owner knew in advance that Ancestry would make such use of their content.
On this last point, some have referred to The Generations Network’s Terms and Conditions, specifically this provision:
User provided content
Portions of the Service will contain user provided content, to which you may contribute appropriate content. For this content, Ancestry is a distributor only. By submitting content to Ancestry, you grant MyFamily.com, Inc., the corporate host of the Service, a license to the content to use, host, distribute that Content and allow hosting and distribution of that Content, to the extent and in that form or context we deem appropriate. Should you contribute content to the site, you understand that it will be seen and used by others under the license described herein. You should submit only content which belongs to you and will not violate the property or other rights of other people or organizations. MyFamily.com, Inc. is sensitive to the copyright of others.
In my view, nothing in that provision puts one on notice that Ancestry.com would use robots to crawl the Web in a manner similar to Google or other search engines. Indeed, the choice of the verbs “submit” and “contribute” suggests more than a passive or silent consent to use content.
Recall that Mr. Field set out to get Google to use his content so he could sue them for infringement!
But, one additional point on the responsibility of content owners to protect their content: the court points out that the use of meta-tags has been an industry standard “for years.” I can see a court in a future case using this fact to hold Web publishers responsible to protect their content by communicating their preferences to Web crawlers.
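To make the meta-tag mechanism concrete, here is a minimal sketch in Python of how a crawler might check a page for the owner’s preference before offering a cached copy. The court’s opinion writes “no-archive”; on actual Web pages the directive is commonly written as `noarchive` inside a robots meta-tag, and that is the form assumed here. The class and function names are hypothetical, and this is not a description of how Google or Ancestry actually implemented anything.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for item in attr.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def allows_cached_copy(html: str) -> bool:
    """Return False if the page asks crawlers not to serve a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


if __name__ == "__main__":
    opted_out = '<html><head><meta name="robots" content="noarchive"></head></html>'
    silent = "<html><head><title>Poems</title></head></html>"
    print(allows_cached_copy(opted_out))
    print(allows_cached_copy(silent))
```

Notice the default in this sketch: a page that says nothing is treated as permitting the cached copy. That is the same assumption the Field court attributed to Google, and it is exactly the assumption that places the burden on the Web publisher.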
The “Estoppel” Defense: Google put forth (successfully) a defense to copyright infringement known as “estoppel.” This means that: (1) the content owner knew of the allegedly infringing conduct; (2) the content owner intended that the alleged infringer should rely on the content owner’s conduct or acted in such a way that the alleged infringer had a right to believe it was so intended; (3) the alleged infringer was ignorant of the true facts; and (4) the alleged infringer relied on the content owner’s conduct to its detriment.
Put plainly, this means, for example, that the content owner acted in a manner to lead the alleged infringer to believe that the content owner did not object to the alleged infringing conduct and in reliance on that, the alleged infringer went ahead with the conduct.
In the Field case, the success of this defense has much to do with Mr. Field’s (dishonest) conduct. But this defense could succeed where there is no dishonest conduct. For example, this morning, I discovered a rather new site called Blogoholix. It purports to be a “blog search engine.” There is a note on the main page which says “es.blogoholix.com is a blog search engine in development. The tech and design work is still in progress, so please send an e-mail to firstname.lastname@example.org if you have any suggestions on how to improve the site.” I found GeneaBlogie on that site. Suppose with that knowledge and the knowledge that I can prevent my blog from showing there, I do nothing, and the owner of that site continues to crawl my blog. I think a court following the Field reasoning would say that my silence is conduct that they are entitled to rely upon.
Well, that may be enough law for today. Tomorrow, in Part 3, we’ll explain fair use and the Digital Millennium Copyright Act and take a very specific look at Ancestry’s IBC. After that, we’ll wrap this up with Part 4 and some conclusions and suggestions.
Part 1 can be found here.
TOMORROW: Fair Use and The Digital Millennium Copyright Act Meet Ancestry.com
Notice: The information in this writing is intended for educational use only and is not intended nor should it be construed as legal advice. If you have a legal problem, consult a lawyer admitted to practice in your state of residence. I am an active member of the bar of the State of California and am admitted to practice before the United States Supreme Court and various other federal courts. I am not licensed to practice in any other state. I am not presently soliciting or accepting new clients in the matters discussed above.
September 9, 2007 Sunday at 5:30 pm