Challenges in securing legacy OT systems
And why a comprehensive risk-based approach is needed to secure today's converged IT and OT environments
What are OT systems?
Operational Technology (OT) refers to the hardware and software that monitor, control, automate and manage an organisation’s industrial operations. This could include controlling devices, infrastructure and processes in an industrial setting. Sectors such as energy and gas, manufacturing, utilities, and transportation tend to rely heavily on OT systems.
Examples of OT systems:
Industrial Control Systems (ICS)
Supervisory Control and Data Acquisition (SCADA) Systems
Computer Numerical Control (CNC)
Programmable Logic Controllers (PLCs)
These systems can be found within power plants, water treatment facilities, manufacturing facilities, energy and gas distribution systems, and transportation systems. OT systems can be either modular or monolithic, and many of them support processes, tasks and infrastructure that form part of critical national infrastructure.
Some of the tasks OT systems might perform include:
Providing a means for human operators to interact with complex physical machinery
Collection of information
Automation and/or sequencing of tasks within operations
A real-world example of a legacy OT system can be found detailed here. In essence, a Programmable Logic Controller (PLC) is used to control and carry out self-cleaning tasks (i.e., clearing debris from municipal wastewater prior to treatment) as part of wastewater/utilities operations.
Exploring the history of OT and its design limitations
Many OT systems consist of legacy components and equipment, some of which are decades old. They were developed during a time when IT networks didn’t exist to the extent that they do today, so they didn’t prioritise connectivity and interoperability with external systems (like IT networks). This means that the majority of legacy OT systems don’t have the same in-built functionality as IT systems, which presents challenges when it comes to applying one-size-fits-all security controls.
Below are some common legacy OT design limitations and their security implications:
OT devices tend to be built to serve a specific purpose. They generally run specialist software and industrial communication protocols (such as Modbus and DNP3 - which enable data exchange between devices), requiring a niche skillset for operation and maintenance
Many industrial sites are designed to operate reliably for years, decades even. The OT assets within these sites were designed to prioritise longevity without considering the need for connectivity or future changes. This point-in-time, purpose-built system approach has resulted in patchy visibility of OT environments and assets - you can’t secure what you don’t know exists!
Upgrading or patching legacy OT systems, such as ICS, isn’t always an option. These systems are often critical to local and national infrastructure operations, and have been designed to operate 24/7 without failure. Downtime for patching or upgrading (that is, if patches and/or upgrades are even still available or feasible) can be extremely costly, both financially and in terms of local/national safety and security. Even in cases where patching or upgrading is theoretically possible, the risk of downtime could be intolerable. Additionally, to ensure high availability, OT assets often make use of redundant components, including backup servers, each of which is a further asset to track and keep up to date. Consequently, many legacy OT systems are plagued with vulnerabilities, which are often unseen and unmitigated
Some OT systems that are decades old are serial-based, rather than IP-based. This presents obvious challenges when attempting to integrate and connect these systems into wider IT networks (e.g., integrating them into IT-focused centralised security monitoring and management tooling)
Legacy OT assets were developed during a time when mature secure-by-design thinking didn’t exist, with many having no in-built encryption capability
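To make the lack of in-built security concrete, the sketch below builds a Modbus/TCP 'Read Holding Registers' request by hand. The function name and the example values are illustrative assumptions; the frame layout follows the publicly documented Modbus/TCP format. Notice that the frame has no field for credentials, a session token or a signature, and that nothing in it is encrypted.

```python
import struct


def build_read_holding_registers(transaction_id: int, unit_id: int,
                                 start_address: int, quantity: int) -> bytes:
    """Build a Modbus/TCP 'Read Holding Registers' (function 0x03) request.

    Hypothetical helper for illustration only; it sends nothing on the wire.
    """
    function_code = 0x03
    # PDU: function code (1 byte), start address (2 bytes), register count (2 bytes)
    pdu = struct.pack(">BHH", function_code, start_address, quantity)
    # MBAP header: transaction id, protocol id (0 for Modbus),
    # length of the bytes that follow (unit id + PDU), unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


# Example values are assumptions, not taken from a real device.
frame = build_read_holding_registers(transaction_id=1, unit_id=1,
                                     start_address=0, quantity=2)
print(frame.hex())  # 000100000006010300000002
```

Any host that can reach the device on the network could construct the same twelve bytes, which is one reason why compensating measures such as segmentation and monitoring matter so much for legacy assets.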
How do OT and IT differ in their approach to security?
The IT and OT worlds differ in their approach to prioritising risk:
OT prioritises the availability, safety and reliability of industrial operations and processes
IT prioritises connectivity, interoperability and (most frequently) confidentiality and integrity of data
Both IT and OT environments are susceptible to cyberattacks. The consequences of an attack can be particularly severe in OT environments supporting critical national infrastructure. The impact from such attacks has the potential to be devastating, with far-reaching implications for public safety, wider society, and the economy. History has already shown us that cyberattacks on converged estates can often originate on the IT network, before traversing into OT production environments due to inadequate or missing controls. Securing the whole environment end-to-end is a complex, but absolutely necessary task.
IT and OT systems have traditionally operated in isolation, with separate technology stacks, standards, protocols, and governance models. But the world is changing. Today, widespread digitalisation (including a push to upgrade legacy infrastructure) is fuelling the continuing convergence of IT and OT. There is rapid acceleration across the OT industry towards connected production floors, driven by the focus on remote operations and supply chain management. The convergence is a double-edged sword; whilst it can enhance efficiency, the increasing reliance on connectivity inevitably expands the attack surface, introducing new vectors for compromise.
Bridging the gap between OT and IT: The golden thread
Building a robust security strategy for a converged environment is a daunting and complex task requiring a comprehensive approach. OT and IT environments have their differences, but there is also something they share. If we take a moment to zoom out and look at the bigger picture, we can see that, at a high level, they are working towards a similar goal: enhancing security maturity and resilience to protect people and/or information.
This goal is a golden thread linking the two environments on a shared understanding of the ‘why’ - i.e., why they need to prioritise security. To achieve the why, we need some shared guardrails, and this is where agreeing on some universal security design principles becomes incredibly important. We’ll explore what these can look like in the next section.
So, it’s not the why that is the issue - it’s the how, i.e., how we assess and implement security measures aligned to the overarching design principles. This is where contextual consideration dependent on each unique asset and/or environment is important. It’s here that a one-size-fits-all approach fails due to its inflexibility. To factor in these necessary contextual considerations, organisations should consider adopting a risk-based approach aligned to secure-by-design thinking.
Universal security design principles
Listed below are some universal security design principles, along with examples of how and where the practical implementation of related security measures can differ for IT and OT environments:
1. Visibility
We can’t secure what we don’t know exists. Develop your ability to visualise your devices, the network and any associated interconnectivity.
Build and maintain an accurate inventory of your hardware (including its components and any backups), software, and networks encompassing IT, OT and Internet of Things (IoT, or IIoT) assets. Make sure you factor in any supply chain management considerations.
You should also develop a robust approach to Identity and Access Management (IDAM) - develop and maintain an accurate view of the identities across your estate, encompassing all digital credentials, users and accounts (shared, service, individual and third party), as well as physical access (e.g., ID cards, visitor logs, etc.). Access should be continuously monitored and managed, with differing levels of access control being applied to assets, networks and environments, based on risk. The principle of least privilege should be applied, as well as separation of duties.
Building visibility - whether it’s of assets, network architecture, or identities and credentials - can be complex and extremely time-consuming. Consider using automated tools to help with the task (or lean on specialist service providers for assistance) where appropriate.
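As a minimal sketch of what an inventory record might hold, the example below models an asset and reports the gaps that most often undermine visibility. The field names, zone names and asset IDs are all hypothetical; a real inventory would typically be populated by discovery tooling rather than by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    kind: str                  # "IT", "OT", "IoT" or "IIoT"
    network: str = ""          # empty string: network location not yet recorded
    owner: str = ""            # empty string: no accountable owner recorded
    components: list[str] = field(default_factory=list)
    backups: list[str] = field(default_factory=list)


def inventory_gaps(assets: list[Asset]) -> dict[str, list[str]]:
    """Return asset IDs grouped by the piece of information that is missing."""
    gaps: dict[str, list[str]] = {"no_owner": [], "no_network": []}
    for asset in assets:
        if not asset.owner:
            gaps["no_owner"].append(asset.asset_id)
        if not asset.network:
            gaps["no_network"].append(asset.asset_id)
    return gaps


# Hypothetical records for illustration.
inventory = [
    Asset("plc-01", "OT", network="ot_control", owner="plant-ops",
          components=["cpu", "io-module"]),
    Asset("hmi-02", "OT", network="ot_supervisory"),
    Asset("cam-07", "IoT", owner="facilities"),
]
print(inventory_gaps(inventory))
```

Here the report would flag hmi-02 as having no owner and cam-07 as having no recorded network, which are the kinds of gaps worth closing first.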
2. Criticality
Organisations should seek to develop a context-driven understanding of the criticality of their assets to inform more effective risk-based prioritisation. For example, it’s important to identify your crown jewels in order to prioritise and develop proportionate security controls. Don’t forget to include interconnected assets, e.g., IoT devices, and to account for third party dependencies/connectivity throughout the supply chain.
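One way to turn that understanding into a ranked list is a simple scoring function, sketched below. The 1-5 scales, the weighting (safety counted twice) and the cap on dependents are assumptions chosen purely for illustration; each organisation would need to agree its own criteria.

```python
from dataclasses import dataclass


@dataclass
class AssetRisk:
    name: str
    safety_impact: int       # 1 (negligible) to 5 (severe) - hypothetical scale
    operational_impact: int  # 1 (negligible) to 5 (severe) - hypothetical scale
    dependents: int          # other assets or third parties relying on this one


def criticality_score(asset: AssetRisk) -> int:
    """Illustrative weighting: safety counts double, dependents are capped at 5."""
    return (2 * asset.safety_impact
            + asset.operational_impact
            + min(asset.dependents, 5))


def rank_by_criticality(assets: list[AssetRisk]) -> list[str]:
    """Return asset names, most critical first."""
    ordered = sorted(assets, key=criticality_score, reverse=True)
    return [a.name for a in ordered]


# Hypothetical estate for illustration.
estate = [
    AssetRisk("office-printer", safety_impact=1, operational_impact=1, dependents=0),
    AssetRisk("wastewater-plc", safety_impact=4, operational_impact=5, dependents=3),
    AssetRisk("data-historian", safety_impact=2, operational_impact=3, dependents=6),
]
print(rank_by_criticality(estate))
```

The numbers matter less than the conversation they force: agreeing the scales requires OT and IT teams to state explicitly which assets are the crown jewels and why.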
3. Segmentation
This one is particularly important. Organisations should look to use their risk-informed view of criticality to apply proportionate segmentation across the estate. Doing this effectively can reduce your attack surface and help to minimise the impact of breaches - when an attacker finds their way into a compromised estate, effective segmentation makes it much harder for them to move laterally into other environments, and can limit the blast radius.
Crown jewel assets should be isolated in their own security domains or even zones. Access to, and communication between, data, devices, networks and applications, should be restricted on a risk and need-to-know basis.
Visualisation is key in ensuring your segmentation (and other controls) are working as intended.
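The need-to-know restriction described above can be sketched as a default-deny allow-list between zones. The zone names and permitted ports below are hypothetical (502 is the standard Modbus/TCP port); in practice this policy would live in firewall or network access control configuration rather than in application code.

```python
# Hypothetical zone-to-zone policy: anything not listed is denied.
ALLOWED_FLOWS: dict[tuple[str, str], set[int]] = {
    ("it_corporate", "dmz"): {443},
    ("dmz", "ot_supervisory"): {443},
    ("ot_supervisory", "ot_control"): {502},  # Modbus/TCP
}


def is_flow_allowed(src_zone: str, dst_zone: str, port: int) -> bool:
    """Default deny: a flow is permitted only if it is explicitly listed."""
    return port in ALLOWED_FLOWS.get((src_zone, dst_zone), set())


# The supervisory layer may poll controllers over Modbus/TCP...
print(is_flow_allowed("ot_supervisory", "ot_control", 502))  # True
# ...but the corporate IT network has no direct path to the controllers.
print(is_flow_allowed("it_corporate", "ot_control", 502))    # False
```

Because there is no direct rule from the corporate zone to the control zone, an attacker who lands on the IT network would have to compromise each intermediate zone in turn, which is the lateral-movement friction segmentation is meant to provide.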
4. Monitoring and logging
We can’t manage, mitigate or respond to security issues we don’t know exist. Continuous real-time monitoring is needed to effectively detect and respond to threats.
For standard IT environments, implementing consolidated monitoring and logging is likely to be reasonable and technically feasible. It can be effectively achieved, for example, through deploying, fine-tuning, and managing a SIEM solution.
For OT environments, the picture gets more complex. Some OT devices, such as those that are Linux-based, may have the connectivity required to enable security monitoring. Discovering them might present a further challenge if they’re not already visible. On the other hand, legacy OT assets might not have any in-built connectivity whatsoever.
However, this doesn’t mean we should give up here and accept that this is just the way things are. In these cases, there are alternative options to be explored - for example, edge devices or 'sensors' (i.e., IPCs - Industrial PCs) can be deployed either on or around critical physical legacy OT assets, into which traffic is mirrored. These sensors can monitor traffic flows, protocols, and connection types, and can report metadata to a centralised server. This example demonstrates that, even where the original ask seems impossible, there are often alternative proportionate measures to be explored.
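The sensor approach can be sketched as follows: the sensor reduces mirrored traffic to flow metadata and flags anything outside an expected baseline before reporting it. The record format, the baseline set and the addresses are hypothetical, and packet capture itself is out of scope here; the input is assumed to be already decoded into (source, destination, protocol) tuples.

```python
from collections import Counter

# Assumption: protocols we expect to see around this particular asset.
EXPECTED_PROTOCOLS = {"modbus", "dnp3"}


def summarise_flows(observed: list[tuple[str, str, str]]) -> list[dict]:
    """Collapse (src, dst, protocol) observations into per-flow metadata."""
    counts = Counter(observed)
    report = []
    for (src, dst, protocol), packets in sorted(counts.items()):
        report.append({
            "src": src,
            "dst": dst,
            "protocol": protocol,
            "packets": packets,
            "expected": protocol in EXPECTED_PROTOCOLS,
        })
    return report


# Hypothetical mirrored traffic, already decoded by the sensor.
observed = [
    ("10.0.5.10", "10.0.5.20", "modbus"),
    ("10.0.5.10", "10.0.5.20", "modbus"),
    ("10.0.9.99", "10.0.5.20", "telnet"),
]
for flow in summarise_flows(observed):
    print(flow)
```

Only the summary leaves the sensor, so the legacy asset itself needs no agent, no extra connectivity and no downtime.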
5. Patch and update where feasible (and look at other options when not feasible)
In general, at least in the IT world, it’s widely accepted that regular and timely patching is a necessary part of security protection. But when we look at legacy, high-availability OT environments, patches and upgrades can often present an unacceptable risk due to downtime intolerance. Unfortunately, the view that patching or upgrading is too difficult to do for OT assets has become something of a default stance across the OT industry, with many organisations consequently ignoring the risk. To be clear, ignoring the risk is not an option, nor is holding archaic, outdated views about the air gap between OT and IT. The risk still exists, and it needs to be addressed in one way or another.
For legacy OT assets that are unsupported and incapable of receiving patches, organisations should assess the feasibility of an upgrade (factoring in cost and risk). If an upgrade is deemed unfeasible, consider alternative compensating controls proportionate to the risk. Some examples might be: isolating the asset(s) via physical or logical segmentation, strengthening and limiting access controls, or implementing additional monitoring and/or resilience testing.
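The order of preference described above can be written down as a small decision helper. This is a sketch under simplifying assumptions: the three yes/no inputs and the function name are hypothetical, and a real assessment would weigh cost and risk rather than use booleans.

```python
# Compensating controls named in the text above.
COMPENSATING_CONTROLS = [
    "isolate via physical or logical segmentation",
    "strengthen and limit access controls",
    "add monitoring and/or resilience testing",
]


def remediation_option(patch_available: bool,
                       downtime_tolerable: bool,
                       upgrade_feasible: bool) -> str:
    """Pick the first workable option: patch, then upgrade, then compensate."""
    if patch_available and downtime_tolerable:
        return "patch"
    if upgrade_feasible:
        return "upgrade"
    return "compensating controls"


# An unsupported asset that cannot be patched or upgraded still gets an answer.
choice = remediation_option(patch_available=False,
                            downtime_tolerable=False,
                            upgrade_feasible=False)
print(choice)
if choice == "compensating controls":
    for control in COMPENSATING_CONTROLS:
        print(" -", control)
```

The point of the final branch is that the function never returns "do nothing": every asset ends up with some proportionate treatment.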
6. Security awareness
Security education is a vital, yet often neglected, component of any comprehensive security strategy. Humans have the capacity to be your strongest link IF you give them the right tools, education, and environment to do their job efficiently and securely.
Successful security awareness programmes are engaging, relatable, and regular in cadence, albeit not overwhelmingly so. They empower employees to understand their security responsibilities. Consider developing specialised and targeted awareness materials relevant to specific teams and departments.
Most companies focus on raising awareness on cybersecurity best practices and the importance of adhering to security policies and procedures (which is undoubtedly important). However, they often neglect another vital aspect of awareness: developing effective communication, collaboration and trust between their security team and other staff. Other members of staff need to trust your security team and feel able to approach them when they have any relevant concerns. This ensures security issues and incidents can be dealt with swiftly and efficiently.
7. Governance
A successful governance approach within a converged OT and IT business relies on effective collaboration between OT and IT teams. This should be one of your number one priorities. Everybody should be on the same page - with a shared understanding of the overarching risk-based secure-by-design approach. Clarity should be provided on ownership and roles and responsibilities when it comes to safeguarding assets and systems (whether OT, IT, IoT or IIoT), with efficient feedback loops in place to enable effective communication between teams.
A multi-faceted approach to compliance is likely to be required for complex organisations. What this looks like varies for each organisation, dependent on the unique needs of the business, the sector, their tech stack(s), and any applicable legal, contractual, and/or regulatory requirements. Multiple standards and frameworks are also likely to be needed to help the organisation meet its compliance obligations.
Last thoughts…
Building a robust security strategy for a converged environment is a daunting and complex task requiring a comprehensive, risk-based approach.
Visibility of your entire estate and any linked interconnectivity is vital, as is effective governance and communication between teams.
When organisations invest effort in understanding their attack surface and getting security right, they stand a better chance of developing greater maturity and resilience.