Expanding on Unfettered

Recently, Joe Weiss posted on his ‘Unfettered’ blog about an experience he had with IT scans in a utility environment. To summarize, the IT department of a utility port scanned a series of NERC CIP related substations and ended up crashing a service responsible for protection mechanisms. There is some confusion about whether or not the OT group had knowledge of the scans (Joe says “no notification was given”, but the next sentence says “The OT Team was notified”). Either way, the action was clearly taken, and the result was that the relays began misbehaving. The operations team had to reboot and clear the relays in order to restore normal operation.

From a technical perspective, Joe highlights that the problem is the GOOSE (IEC 61850) protocol on the relays. This protocol is used to communicate between relays at a substation, exchanging the data needed for protection and control. That information is mostly local to the substation, but GOOSE can also travel between substations. Protection and Control (P&C) is a distinct responsibility from EMS/SCADA, which in this case was using DNP3. Basically, P&C is responsible for the substation itself, and EMS/SCADA is responsible for monitoring the grid via data from multiple substations.

A good analogy is that you, the driver of your car, are the EMS/SCADA. You are responsible for direction, acceleration, deceleration, turn signals, and a few other variables that are exposed to you. But the car itself is made up of subsystems that require automation, such as airbags, engine control, transmission, and brakes. Those subsystems are not directly viewable by you as the driver, but you depend on them working in order to drive safely and efficiently. Had Joe’s scenario happened to a car, it would mean that subsystems were failing or not operating correctly, and you weren’t getting a check engine light to notify you. Not a good place to be.

I’m going to agree with Joe, with some clarifications. Joe makes a point that “IT Security should NEVER be left alone in industrial operations”. I agree, but this is a simplistic argument that doesn’t get to the core issue: personnel who aren’t trained for industrial work, and who aren’t under the command of an Operations structure, require a formal process for performing work in industrial environments, and a clear consequence for violating that process. And I’m not talking just about safety requirements. This includes all the various things that linemen, foremen, mechanics, electricians, technicians, and others go through prior to performing work. They have a work order, they follow an energy control process, the work is planned with the operators and engineers, and approval is given at the top. All of this is necessary because what they do affects lives and livelihoods, and it’s magnified in electric power because we ALL rely on it.

Without this formal process in place, it’s left to each individual’s own experience and capability to judge how their work may affect the system. Some of us are good at it: we take our time, we talk with our peers in engineering and operations, we get ad-hoc (or even formal!) approvals because we are concerned about the effects. Some of us are not good: some of us fire and forget, some of us assume the responsibility lies elsewhere, and some of us are basically being reckless, maybe without realizing it. This lack of responsibility and accountability isn’t wholly an individual problem. It’s an organizational problem, and it needs to be addressed by the organization’s management. This is risk, pure and simple, and it sounds like it’s not being managed adequately in this instance.
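To make the idea of a formal process a little more concrete, here is a minimal sketch of a pre-scan check that refuses to touch OT address space without an operations-approved work order. Everything in it is an assumption for illustration: the `OT_NETWORKS` ranges, the `WorkOrder` fields, and the `scan_allowed` function are hypothetical and are not taken from Joe’s post or from any real scanning tool.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical OT address ranges; a real list would come from the asset inventory.
OT_NETWORKS = [ip_network("10.20.0.0/16"), ip_network("192.168.50.0/24")]


@dataclass(frozen=True)
class WorkOrder:
    """Hypothetical record of an approved piece of work."""
    order_id: str
    approved_by_operations: bool
    networks: tuple  # CIDR strings the work order covers


def ot_targets(targets):
    """Return the targets that fall inside any OT network."""
    return [t for t in targets
            if any(ip_address(t) in net for net in OT_NETWORKS)]


def scan_allowed(targets, work_order=None):
    """Allow a scan only if every OT target is covered by an approved work order."""
    needing_approval = ot_targets(targets)
    if not needing_approval:
        return True  # nothing in OT space, no work order needed
    if work_order is None or not work_order.approved_by_operations:
        return False
    approved = [ip_network(n) for n in work_order.networks]
    return all(any(ip_address(t) in net for net in approved)
               for t in needing_approval)
```

A check like this does not replace planning the work with operators and engineers. It only makes the default answer "no" when nobody in Operations has signed off.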

And lastly, I have to question the engineering rigor and CIP compliance analysis that went into the design of this system. Clearly, the scanning tool (which is not generally a standard part of an EMS/SCADA installation) has considerable access to the unauthenticated protocols (DNP3 and GOOSE) that run this system. With access via those protocols, crashing the relays was an accident, but a savvy attacker could simply tell the relays to open, and they would open. This is equivalent to the level of access that the licensed operators and the protection engineers have, but without the tools, processes, and training those professionals are required to use. I hope this prompted some uncomfortable conversations between the SCADA team, their networking team, and upper operations management. Back in the CIPv3 days, a design like this would have prompted lots of questions from NERC CIP auditors. I hope we haven’t moved backward by going to CIPv5.

photo by tonyglen14

One Comment on “Expanding on Unfettered”

  1. Unfortunately, I believe the audit regimen with current versions of NERC CIP has gone dreadfully backward. In my observations, far too many Regional Entity auditors simply do not possess the necessary OT skills to ask the right questions. The audits have become more and more of a paper exercise than in years of yore. I would suggest that many CIP audits are more concerned about the format of the evidence submittals than the actual evidence artifacts. But I caution utilities against building their CIP programs on the likelihood of a surface-style audit. FERC auditors are not as kind, and they have been poking around recently. I feel there is a possible shift in audit approach in the future! Not building good security practices and compliance disciplines now will only hurt you double or triple as much in the future.
