Hundreds of thousands of emails are being stored every day, for months at a time.
Image: Basheer Tome/Flickr
Unsurprisingly, Canadian spies want to know who is trying to attack the country's computer networks, and part of that effort involves monitoring emails sent to Government of Canada email addresses for malware and suspicious links.
But what's concerning is the breadth and scope of that surveillance. Spies monitor hundreds of thousands of emails as they travel to government inboxes each day, and store details about those messages for months, and even years.
A newly released document obtained by Edward Snowden, and published by CBC in collaboration with The Intercept, describes these efforts as part of a Communications Security Establishment (CSE) email analysis tool called PonyExpress that has been in use since at least 2010.
CSE can extract and log metadata from emails that have been sent to government email addresses, and flag attachments for further analysis.
Scanning emails for malicious links and attachments isn't all that unusual, but retaining those messages for indeterminate amounts of time is, and it is not clear exactly how long this data is kept. According to the document, retention periods range from days to months in the case of packet capture (in other words, storing all data travelling across a section of network under surveillance) and years in the case of metadata.
It's no surprise, then, that analysts describe the massive amount of data they collect as a "haystack."
According to one slide, in 2010 the agency captured up to tens of terabytes of "passively tapped network traffic each day," from which emails and attachments can be pulled, along with hundreds of gigabytes of network metadata, such as the path an email takes to reach a government email inbox and the IP address from which it originated.
As such, tools such as PonyExpress are designed to address a problem that is common among intelligence agencies these days: "information overload" and having "too much data."
Still, PonyExpress alone was processing 400,000 emails a day in 2010, "emails that were previously filtered by Anti-SPAM [and] Anti-Virus softwares," the document reads. Of those emails, about 400 were deemed suspicious enough to require further human analysis, and of those, only four per day were deemed serious enough to warrant alerting government departments.
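To put that funnel in proportion, here is a quick back-of-the-envelope calculation. The three figures come from the document as quoted above; the variable and function names are purely illustrative.

```python
# Daily figures quoted from the 2010 document; names are illustrative only.
emails_per_day = 400_000
flagged_for_analysts = 400
alerts_sent = 4


def share(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


flagged_pct = share(flagged_for_analysts, emails_per_day)
alert_pct_of_flagged = share(alerts_sent, flagged_for_analysts)
alert_pct_of_all = share(alerts_sent, emails_per_day)

print(f"Flagged for human analysis: {flagged_pct:.3f}% of all emails")
print(f"Alerts sent to departments: {alert_pct_of_flagged:.1f}% of flagged emails")
print(f"Alerts sent to departments: {alert_pct_of_all:.4f}% of all emails")
```

In other words, roughly one email in a thousand reached a human analyst, and about one in a hundred thousand led to an alert.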
Storing all of the email data destined for just 68 government departments was expected to require 150TB of capacity per month, and that figure was expected to grow as the agency expanded its scope to include just under 100 departments by 2012.
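For a rough sense of scale, the storage figure can be broken down per department. This is a sketch only: the document does not say how storage was expected to grow, so the linear scaling below, and the use of 100 as an upper bound for "just under 100" departments, are assumptions made for illustration.

```python
# Figures from the document: 150TB per month across 68 departments.
TB_PER_MONTH = 150
DEPARTMENTS_COVERED = 68

# Assumption: 100 stands in as an upper bound for "just under 100" departments.
DEPARTMENTS_PLANNED = 100

# Average storage per department per month.
tb_per_department = TB_PER_MONTH / DEPARTMENTS_COVERED

# Assumption: storage grows linearly with the number of departments.
projected_tb_per_month = tb_per_department * DEPARTMENTS_PLANNED

print(f"About {tb_per_department:.1f}TB per department per month")
print(f"Up to roughly {projected_tb_per_month:.0f}TB per month at {DEPARTMENTS_PLANNED} departments")
```

Under those assumptions, each department accounts for a little over 2TB a month on average, and the expanded program would need somewhere around 220TB a month.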
Notably, the data analyzed by PonyExpress can be obtained using deep packet inspection, or DPI. Such technology works by observing small portions of internet traffic known as packets, and matching the information describing each packet against a library of signatures covering known applications, protocols, network activity, and more.
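As a very rough illustration of what signature matching means in practice, the sketch below checks a packet payload against a small library of patterns. Everything in it is hypothetical: the signature names, the patterns, and the `match_signatures` function are invented for this example and say nothing about how CSE's systems or any commercial DPI product actually work.

```python
import re
from typing import List

# Hypothetical signature library: a label mapped to a byte pattern.
SIGNATURES = {
    "smtp-command": re.compile(rb"^(HELO|EHLO|MAIL FROM:|RCPT TO:)", re.IGNORECASE),
    "http-request": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]"),
    "exe-attachment": re.compile(rb'filename="[^"]+\.exe"', re.IGNORECASE),
}


def match_signatures(payload: bytes) -> List[str]:
    """Return the labels of every signature found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]


# Two made-up payloads: the start of an SMTP session and an attachment header.
print(match_signatures(b"EHLO mail.example.org\r\n"))
print(match_signatures(b'Content-Disposition: attachment; filename="invoice.exe"'))
```

Real DPI equipment does this at line speed across many protocols and with far larger signature sets, but the basic idea of comparing traffic against known patterns is the same.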
DPI hardware can also flag all internet traffic destined for a particular IP address, or range of IP addresses, such as those belonging to the Government of Canada. It's possible that CSE's EONBLUE program—which is believed to be based on DPI technology—could be the first step in flagging email traffic for further analysis by PonyExpress.
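Flagging traffic by destination address is conceptually simple. The sketch below uses Python's standard `ipaddress` module to test whether a destination falls inside a monitored range. The ranges shown are reserved documentation addresses used as placeholders, not real Government of Canada networks, and the `is_monitored` function is a hypothetical name chosen for this example.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), not real government networks.
MONITORED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_monitored(dest_ip: str) -> bool:
    """Return True if the destination address falls inside a monitored range."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in network for network in MONITORED_NETWORKS)


print(is_monitored("192.0.2.25"))   # inside the first placeholder range
print(is_monitored("203.0.113.9"))  # outside both placeholder ranges
```

Traffic that passes a check like this could then be handed to a second-stage tool for closer inspection, which is the role the article suggests PonyExpress may play.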
PonyExpress appears to be the product of a CSE IT Security team called N2E. The team was formally established in 2010 to better focus the government's discovery of "new threats, techniques, tradecraft" against government networks. But PonyExpress is not the only analysis tool in the agency's arsenal. CSE also detailed efforts to more accurately flag suspicious URLs contained in emails, and to predict which emails were most likely to contain suspicious links or attachments based on the metadata of the email.
CSE is usually forbidden from intercepting the communications of Canadians, but it is allowed to do so when conducting cyber defence of Government of Canada infrastructure.
According to a CSE statement supplied to the CBC, "Data and metadata are deleted according to established data retention schedules that are documented in internal policies and procedures. To provide more detail could assist those who want to conduct malicious cyber activity against government networks."
Motherboard has reached out to CSE for further comment, and will update this article with the agency's response.