Hardcore Surveillance Law Nerding: On Yahoo scanning, “facilities,” and Stellarwind

Over at Emptywheel, Marcy Wheeler’s latest post about legal issues raised by the Yahoo scan controversy/mystery is worth reading. I agree with much but not all of it. This post will explain.

First, some context. The Yahoo scan issue, first revealed in an important but in places murky October 4 scoop by Reuters, has gradually become clearer through follow-up reporting. At this point I think we know that:

  • Operational: The government was hunting for a data signature linked to a communications method used by a state-sponsored terrorist organization, and this unusual step was deemed necessary because the government knew that the organization’s agents were transmitting their messages (which further used this tell-tale communications method) via Yahoo emails, but not which specific Yahoo accounts they were using to do it.
  • Legal: It was an individualized Title I FISA court order, not a FISA Amendments Act/702 warrantless surveillance directive.
  • Technical: Yahoo embedded the scanning program used for this classified purpose within its operating system kernel, a different place than where it runs programs that already scan all incoming e-mail traffic for other, unclassified purposes like tagging spam, deciding which targeted ads to display, and detecting child porn. This created systemic risks and alarmed Yahoo’s security team, which hadn’t been told about it, when they discovered it.

So. Yesterday, a coalition of civil-liberties groups sent a letter to the Office of the Director of National Intelligence asking the government to declassify more details. The letter includes the line, “If reports are true, this authority to conduct a particularized search has apparently been secretly construed to authorize a mass scan.” A Reuters story discussing this letter notes the two elements necessary for a traditional FISA Court order (that a target is probably a foreign power or its agent, and that the facility at which surveillance will be directed is probably used by that target) and adds this gloss: “An entire service, such as Yahoo, has never publicly been considered to be a ‘facility’ in such a case: instead, the word usually refers to a phone number or an email account.”

Marcy criticizes this gloss. Back in 2007, leading up to the enactment of the Protect America Act, the FISA Court briefly interpreted traditional FISA in a way that permitted the court to (purport to) authorize upstream collection of e-mails using traditional FISA orders to telecommunications companies. The court did so in part by determining that the “facility” could mean an entire telecommunications switch connecting the U.S. network to the world, since Qaeda agents surely used those switches to communicate (along with everybody else). Equipment at the telecoms (which had already been operating without court permission as part of Stellarwind) would scan the data packets passing through that facility and send messages containing an e-mail address “tasked” for collection (as likely used by Al Qaeda) to Fort Meade. This theory was summarized in the famous draft NSA Inspector General report about Stellarwind leaked by Snowden, and most of the important FISA Court documents related to this came out via one of our subsequent FOIA cases. I did my best to synthesize and explain all this in “Power Wars” chapter five, sections 10 (“The Path to Legalizing Warrantless Surveillance”), 11 (“A Hidden Fight”), and 12 (“Upstream Internet Surveillance”).

So Marcy’s first or main point, with which I entirely agree, is that it is not unprecedented for the FISA Court to define “facility” in a way that is more stretched than a particular phone number or e-mail account. Indeed, I would add, the original 1978 House Intelligence Committee report explaining the congressional intent behind FISA explicitly envisioned that sometimes a FISA order would require wiretapping an organization’s entire switchboard to get at one individual target within that organization, which was acceptable as long as minimization was used to discard innocent people’s conversations picked up as a result; this was to be lawful even in cases where “it may not be possible or reasonable to avoid acquiring all conversations.” So what happened in 2007 was a huge stretch but it didn’t start from nothing, and what happened at Yahoo likely flows from the same rationale.

Marcy’s second argument is that over time, “facility” and “selector” have become synonymous in FISA jurisprudence. If she’s right as applied to this case, at least, then it complicates the matter, as it would mean that the 2007 precedent is beside the point, because all of Yahoo’s system is not the “facility”; rather the specialized communications method linked to this data signature selector is the facility. (This would still leave open the Fourth Amendment issue of whether Yahoo’s act of scanning unrelated e-mails in search of that selector is an impermissible search, the same unresolved dispute that is central to legal debates about upstream-style surveillance at network switches, but that’s different from the statutory question of what a “facility” can be.) This would be something to look for if we ever get to see the FISA Court litigation documents.

Moving on, one of the interesting things about the latter idea, that the “facility” was the communications method being used on these messages before they were transmitted through Yahoo e-mail, is that it would essentially be one kind of content contained within another kind of content: The content of the actual message is contained in the special technique, which is in turn contained in an ordinary e-mail; as a result, the contents of lots of e-mails must be scanned to find the targeted ones. Marcy draws a link between that concept and another unresolved legal problem with surveillance that collects packet-based communications: data packets come in layers, like a series of envelopes wrapped in other envelopes, so what is technically content at one point in transmission becomes metadata at another. For example, one layer may say “send this packet from a Gmail server to a Yahoo server,” and the next layer (not normally examined until it gets to Yahoo) may say “place this message in Charlie’s Yahoo account inbox.” This may be important because in constitutional law, stuff that is considered the content of communications is protected by the Fourth Amendment, whereas mere addressing metadata, because it is shown to a third party, does not implicate privacy rights. So if you intercepted the packet at an AT&T switch as it is in transit from Gmail to Yahoo, only the fact that it is routing to Yahoo in general might be deemed the addressing metadata because that layer is all that anyone intended for AT&T to look at; the e-mail header envelope/layer is still sealed inside and is not intended for AT&T’s consumption. If you intercepted the packet at Yahoo, however, the fact that the message is addressed to Charlie’s Yahoo account, specifically, becomes merely metadata because it was intended for Yahoo to look at that envelope/layer.
The best write-up I’ve seen about how this technical issue might have constitutional implications is by Julian Sanchez here, but as far as I can tell the FISA Court has never addressed it. (It flicks at this problem in footnote 7 of a ruling declassified in August which was about using a pen register to collect digits dialed after a phone call is already connected, but only to say it is not addressing legal issues raised by using a pen register with Internet technology in that opinion.)
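
For readers who think better in code, here is a toy sketch of the envelope problem described above. It is only an illustration under my own simplifying assumptions: the names `Envelope`, `Packet`, and `classify` are hypothetical, real network packets have more layers than two, and the rule that “a layer counts as metadata only for the party it was addressed to” is one way of framing the legal argument, not anything a court has adopted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Envelope:
    """One layer of addressing information, meant to be read by one party."""
    intended_reader: str   # hypothetical label for who is supposed to read it
    address: str           # what the layer says


@dataclass(frozen=True)
class Packet:
    """A message body wrapped in envelopes, listed outermost first."""
    envelopes: Tuple[Envelope, ...]
    body: str


def classify(packet: Packet, observer: str) -> Dict[str, List[str]]:
    """Sort a packet's parts by how they look from one observer's vantage point.

    Assumption for illustration: the layer addressed to the observer is
    "metadata"; everything still sealed inside it is "content"; outer layers
    that earlier handlers already read are listed separately.
    """
    for i, env in enumerate(packet.envelopes):
        if env.intended_reader == observer:
            return {
                "already_opened": [e.address for e in packet.envelopes[:i]],
                "metadata": [env.address],
                "content": [e.address for e in packet.envelopes[i + 1:]]
                + [packet.body],
            }
    # No layer was meant for this observer, so nothing counts as its metadata.
    return {
        "already_opened": [],
        "metadata": [],
        "content": [e.address for e in packet.envelopes] + [packet.body],
    }


packet = Packet(
    envelopes=(
        Envelope("AT&T", "send this packet from a Gmail server to a Yahoo server"),
        Envelope("Yahoo", "place this message in Charlie's Yahoo account inbox"),
    ),
    body="(the message itself)",
)

at_switch = classify(packet, "AT&T")
at_yahoo = classify(packet, "Yahoo")
```

In this sketch the line about Charlie’s inbox lands in `at_switch["content"]` but in `at_yahoo["metadata"]`: the same bytes get a different label depending on where the interception happens.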

That leads me to my last point in reaction to Marcy’s post. In wrapping up her (in my view correct) point that there is public precedent for traditional FISA (content) orders to involve an elongated understanding of “facility” and bulk scanning systems, she makes a passing reference to the famous 2004 hospital crisis over Stellarwind: “These scans likely replicate the problem identified in 2004, in that the initial scan is not of things that count as metadata to the provider doing the scan.” I understand her to be saying that the 2004 crisis — one feature of which was a dispute over whether the bulk e-mail metadata collection bucket was lawful — involved the problem of this blurry line between technical content and technical metadata described in the above paragraph: collecting e-mail headers from AT&T switches was gathering content.

I think that her premise here is incorrect: I have found no evidence that anyone involved in the 2004 crisis was thinking about the technical content/metadata issue and the Fourth Amendment issues it may raise. Rather the problem identified in 2004 was statutory. Under FISA, installing a device on a network to collect metadata related to communications counted as “electronic surveillance” (i.e. the thing regulated by FISA’s court order requirement isn’t always about “content”) and therefore needed a court order, and the new leadership at the Office of Legal Counsel in 2004 didn’t think a president’s wartime powers could trump or displace what FISA said was necessary here because the collection was bulk domestic rather than targeted at individual foreign wartime enemies. See “Power Wars” chapter five, sections 6 (“Stellarwind”), 8 (“The Hospital Room Crisis”), and 9 (“Legalizing Bulk Data Collection”).