Validating Process Safety Assumptions using Operations Data

Taylor W. Schuler & Jim Garrison
aesolutions
250 Commonwealth Drive, Suite 200
Greenville, SC 29615

Taylor's Bio
Taylor Schuler has more than 15 years of experience in software product management for the Oil and Gas industry. Currently, Taylor is the Product Manager for the aesolutions aefacilitator and aeshield process safety applications. Taylor's experience with numerous customers provides a unique foundation for gathering and prioritizing requirements, converting them into consumable and testable features for software development professionals, and ultimately deploying them to customers once complete. Drawing on his experience from hundreds of facilities across five continents makes Taylor an effective product manager for aesolutions. Taylor holds BS degrees in Nuclear Engineering from the University of Tennessee and Physics from Roanoke College. In addition, he holds a Certification in Maintenance and Reliability from the University of Tennessee.

Jim's Bio
Jim Garrison, a recent addition to aesolutions, is a key member of the process safety engineering team in Greenville, SC. He is a graduate of Georgia Tech with a BS in Electrical Engineering. He has over 8 years of experience designing instrumentation systems for use in hazardous areas and performing HAZOP studies and SIL selection and verification. Jim is a licensed PE in four states and is an ISA Certified Automation Professional (CAP) and ISA 84 SIS Fundamentals Specialist (ISA84 SFS).

Abstract
As facilities assess risk, make recommendations for gap closure, and design safety instrumented functions (SIFs), assumptions are made to facilitate calculations in the design phase of the protection layers used to reduce the likelihood of hazards occurring. Each of these assumptions is based on design standards, process safety experience, and data supplied by manufacturers concerning operability and reliability.
The purpose of this white paper is to identify key assumptions and replace them with real-world operations data to show that the risk may be greater than perceptions based on design. This case study focuses on real functional test intervals versus those applied in the safety integrity level (SIL) calculations. It also compares unsafe bypasses versus probability of failure on demand (PFD), and the count of initiating causes versus the frequencies documented in the layer of protection analysis (LOPA).
Overview
As stated in the abstract, the purpose of this white paper is to use real-world data to replace assumptions made during the safety instrumented system (SIS) lifecycle. Real-world daily operations data can be extracted from applications such as historians, asset management systems, and/or other tooling that captures relevant data regarding a SIF's performance. This paper focuses on three assumptions that are made either during a risk assessment or while designing a SIF:
1. Test Intervals: the frequency at which the safety devices need to be tested in order to achieve the risk reduction factor (RRF) established in design.
2. Cause Tracking: the LOPA team identifies an expected frequency of occurrence for each initiating cause.
3. Unsafe Bypass: periods in which the SIF is in bypass while the process continues to operate.
Placing this information in the hands of subject matter experts enables better decisions, resulting in a safer facility, whether in a risk assessment, SIS design, or during operations and maintenance (O&M).

Case Study and Assumptions
A key assumption made in this paper is that the SIS engineers have a data map that relates the operations data to the SIS model, from the hazard down to the tagnames required to minimize the risk. The tools used in this case study are the software products offered by aesolutions 1, an MS Excel spreadsheet containing data from a common historian, and a spreadsheet containing the testing dates of critical safety devices as stored in a common asset management system. The case study was based on data from a common SIF (Case Study SIF-01) from an unnamed company and facility. The SIF has an IL Rating = 2 and has been added to a reactor to ensure the vessel returns to a safe state in the event its pressure becomes too high.

1 aesolutions offers a PRM tool, aefacilitator, and an SIS design and monitoring tool, aeshield.
When the tools are used in conjunction, the safety data model is connected from the node to the hazard, to the protection layer, to the device groupings, and to the tagnames. In addition, the tooling can perform the SIL calculations required for this type of analysis.
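The node-to-tagname linkage described above can be pictured as a nested mapping. The sketch below is illustrative only: the node name and final-element tagnames are hypothetical, while the sensor tags and voting schemes come from the case study.

```python
# Illustrative sketch of the safety data model linkage.
# The node name and valve tagnames are hypothetical; the sensor tags
# and voting (2oo3 / 1oo2) are taken from Case Study SIF-01.
safety_model = {
    "node": "Reactor R-100",                     # hypothetical node name
    "hazard": "High reactor pressure",
    "protection_layer": "Case Study SIF-01",
    "device_groups": {
        "sensors (2oo3)": ["PT 123", "PT 456", "PT 789"],
        "final elements (1oo2)": ["XV 101", "XV 102"],  # hypothetical tagnames
    },
}

# Walking the model from the protection layer down to tagnames:
for group, tags in safety_model["device_groups"].items():
    print(group, "->", tags)
```

With a structure like this in place, operations data keyed by tagname can be rolled up to the SIF, hazard, and node levels automatically.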
Figure 001 Reactor Protected by Case Study SIF-01

The operations data reviewed covered a 5-year period, September 1, 2009 to August 31, 2014. The SIF has three pressure transmitters as sensors with 2oo3 voting and two ball valves as final elements with 1oo2 voting, as seen in the figure below.

Figure 002 Case Study SIF-01 Architecture

When reviewing the operating data, the historian events and test plans were based on the sensors only. All naming conventions were generalized to mask the identity of the equipment and simplify the analysis performed in this white paper.

Extending Test Intervals
As hazards with unacceptable risks are identified, the LOPA team may recommend designing a SIF to close the gap to an acceptable level. As SIS engineers design and investigate multiple
what-if scenarios, the test interval, in months, for each safety device is established to achieve the desired RRF. If that test interval is extended, the RRF calculated during design is no longer valid. In this case study, the devices on Case Study SIF-01 require testing every 18 months. Based on prioritization issues, the facility decided to wait until the next turnaround on the equipment under control, which resulted in doubling the assumed test interval on each device (see Figure 003).

Figure 003 Sensor PT-123 Design vs Actual Test Interval

Each of the three sensors was tested at the same time, as were the final elements. By adjusting the test intervals and re-running the SIL calculation, the results show:

Figure 004 Sensor PT-123 Design vs Actual Test Interval

To narrate the results displayed in Figure 004, the SIF was required to have an IL Rating = 2 and was slightly overdesigned (RRF = 119). However, after updating the SIL calculation with the real-world test intervals, the RRF dropped to 90 (IL Rating = 1), introducing ~10% additional risk, which represents a gap. Is 10% acceptable? Of course, the answer may vary depending on the organization and the severity of the hazard the SIF is protecting against; however, the example is evidence of how things can change over time as difficult decisions are made. To recap the workflow:
1. LOPA recommendation following the identification of a gap
2. SIF designed with a required test interval and SIL calculation finalized
3. Data retrieved from the asset management system with timestamps to identify real-world test intervals
4. SIL calculation performed with actual test intervals
5. Analysis to determine tolerance of the change in risk level
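The sensitivity of the RRF to the test interval can be illustrated with the common simplified approximation for a single (1oo1) device, PFDavg ≈ λDU × TI / 2. The failure rate below is assumed for illustration; the case study itself used a full SIL calculation for the 2oo3/1oo2 architecture, so these numbers intentionally do not reproduce Figure 004.

```python
# Simplified illustration only: a 1oo1 device with an assumed dangerous
# undetected failure rate, NOT the paper's actual SIL calculation.
lambda_du = 1.0e-6        # dangerous undetected failures per hour (assumed)
HOURS_PER_MONTH = 730

def pfd_avg(ti_months: float) -> float:
    """Average probability of failure on demand for a 1oo1 device,
    using PFDavg ~= lambda_DU * TI / 2."""
    return lambda_du * ti_months * HOURS_PER_MONTH / 2

design = pfd_avg(18)      # 18-month design test interval
actual = pfd_avg(36)      # interval doubled by waiting for the turnaround

print(f"design PFDavg = {design:.2e}, RRF = {1 / design:.0f}")
print(f"actual PFDavg = {actual:.2e}, RRF = {1 / actual:.0f}")
```

Under this simple model, doubling the test interval doubles PFDavg and halves the RRF; the redundant voting in the real SIF softens the effect, which is why the case study's RRF fell from 119 to 90 rather than halving.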
Periodic Review of Historian Data
Moving on to the other assumption replacements discussed in this white paper, data from a common historian was required. In order to effectively analyze and annotate historized events, the following workflow is required. Many of the steps can be automated; however, manual steps are required to validate the data and classify it so it can be associated with the appropriate parts of the process safety data model. The manual steps may vary depending on the tooling available.
1. Identify the types of events that need to be tracked. When reviewing data from the historian, it will be in the soft-tag format of [tagname]&[suffix]. For simplification purposes, this paper focuses on two generic types:
   Cause Tracking: suffix = _TRIP
   Unsafe Bypass: suffix = _BYP
2. Create a data map between soft-tags and the sensor tags in the SIF (see Figure 005)

   Tagname  Event Name  Dangerous  Demand  Starting Soft-tag  Ending Soft-tag
                                           Suffix   Value     Suffix   Value
   PT 123   Bypass      Yes        No      _BYP     TRUE      _BYP     FALSE
   PT 456   Bypass      Yes        No      _BYP     TRUE      _BYP     FALSE
   PT 789   Bypass      Yes        No      _BYP     TRUE      _BYP     FALSE
   PT 123   Trip        No         Yes     _TRIP    TRUE      _TRIP    FALSE
   PT 456   Trip        No         Yes     _TRIP    TRUE      _TRIP    FALSE
   PT 789   Trip        No         Yes     _TRIP    TRUE      _TRIP    FALSE
   Data applicable to case study sensors only

   Figure 005 Data map from safety model to historian soft-tags

3. Retrieve the data for the distinct list of soft-tags in the data map over a time period
4. SIS engineer reviews the results (shorter intervals of review are recommended to minimize the level of effort) and documents events against architecture and voting; identifies initiating causes on SIF demands; groups events and focuses on unsafe bypasses and their durations
5. Aggregate the data and perform analysis to determine tolerance levels

The following table represents data that has been pulled from a historian and annotated by the SIS engineer.
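The duration bookkeeping in step 4 of the workflow above, pairing each TRUE (event start) with the next FALSE (event end) per soft-tag, can be sketched as follows. This is a minimal sketch under assumed row formats; the sample rows reuse one bypass event from the case study.

```python
from datetime import datetime

# Hypothetical subset of historian rows: (soft-tag, timestamp, value).
rows = [
    ("PT 456_BYP", "Oct 31, 2013 11:28:00", "TRUE"),
    ("PT 456_BYP", "Nov 01, 2013 19:06:00", "FALSE"),
]

FMT = "%b %d, %Y %H:%M:%S"

def event_durations(rows):
    """Pair each TRUE (event start) with the next FALSE (event end)
    for the same soft-tag and return the duration in hours."""
    open_events, durations = {}, []
    for tag, stamp, value in rows:
        t = datetime.strptime(stamp, FMT)
        if value == "TRUE":
            open_events[tag] = t
        elif tag in open_events:
            hours = (t - open_events.pop(tag)).total_seconds() / 3600
            durations.append((tag, round(hours, 2)))
    return durations

print(event_durations(rows))  # -> [('PT 456_BYP', 31.63)]
```

The 31.63-hour result matches the duration column for this bypass event in the annotated table below; the same pairing applies to _TRIP events.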
The data was limited to soft-tags associated with the sensors on Case Study SIF-01 over a 5-year period, a typical duration between revalidations.
Historian Events

Event        Time                   Value   Duration, hr
PT 123_BYP   Sep 12, 2009 05:00:00  TRUE    N/A
PT 456_BYP   Sep 12, 2009 05:00:00  TRUE    N/A
PT 789_BYP   Sep 12, 2009 05:00:00  TRUE    N/A
PT 123_BYP   Sep 16, 2009 05:00:00  FALSE   96.00
PT 456_BYP   Sep 16, 2009 05:00:00  FALSE   96.00
PT 789_BYP   Sep 16, 2009 05:00:00  FALSE   96.00
  Initiating Cause: N/A, bypass not a demand. Remarks: Bypass due to repairs required identified in functional testing. DSD process authorized. MTTR VIOLATOR >72 HR.

PT 789_TRIP  Feb 12, 2010 07:17:00  TRUE    N/A
PT 789_TRIP  Feb 14, 2010 07:17:00  FALSE   48.00
  Initiating Cause: N/A, 2oo3 voting. Remarks: N/A, 2oo3 voting.

PT 123_TRIP  Jun 01, 2011 01:32:00  TRUE    N/A
PT 456_TRIP  Jun 01, 2011 01:32:00  TRUE    N/A
PT 123_TRIP  Jun 02, 2011 09:58:11  FALSE   32.44
PT 456_TRIP  Jun 02, 2011 09:58:11  FALSE   32.44
  Initiating Cause: Control valve failure results in increased heating, thus resulting in increased pressure; FREQUENCY = 0.5. Remarks: This represents a demand event which, according to the LOPA, should only occur once every two years.

PT 123_TRIP  Dec 12, 2012 08:30:00  TRUE    N/A
PT 456_TRIP  Dec 12, 2012 08:30:00  TRUE    N/A
PT 789_TRIP  Dec 12, 2012 08:30:00  TRUE    N/A
PT 123_TRIP  Dec 17, 2012 08:30:00  FALSE   120.00
PT 456_TRIP  Dec 17, 2012 08:30:00  FALSE   120.00
PT 789_TRIP  Dec 17, 2012 08:30:00  FALSE   120.00
  Initiating Cause: Blockage of overhead line leads to increased temperature, thus increased pressure; FREQUENCY = 0.1. Remarks: This represents a demand event which, according to the LOPA, should only occur once every ten years.

PT 456_BYP   Oct 31, 2013 11:28:00  TRUE    N/A
PT 456_BYP   Nov 01, 2013 19:06:00  FALSE   31.63
  Initiating Cause: N/A, 2oo3 voting. Remarks: N/A, 2oo3 voting.

PT 123_TRIP  Nov 17, 2013 23:35:00  TRUE    N/A
PT 123_TRIP  Nov 17, 2013 23:52:00  FALSE   0.28
  Initiating Cause: N/A, 2oo3 voting. Remarks: N/A, 2oo3 voting.

PT 123_BYP   May 12, 2014 02:00:00  TRUE    N/A
PT 456_BYP   May 12, 2014 02:00:00  TRUE    N/A
PT 789_BYP   May 12, 2014 02:00:00  TRUE    N/A
PT 123_BYP   May 13, 2014 19:47:00  FALSE   41.78
PT 456_BYP   May 13, 2014 19:47:00  FALSE   41.78
PT 789_BYP   May 13, 2014 19:47:00  FALSE   41.78
  Initiating Cause: N/A, bypass not a demand. Remarks: Bypass due to repairs required identified by online diagnostics. DSD process authorized.

LEGEND & DEFINITIONS
Types of Events:
  _BYP   Bypass in a dangerous state
  _TRIP  I/O trip implying a demand
Historian Values:
  TRUE   Puts the device into the event type
  FALSE  Puts the device back into its normal state
Column Shading:
  Interface into historian
  Manually entered by SIS engineer
The data above represents a subset from a historian interface, focused on I/O associated with Case Study SIF-01 over a 5-year period.

Figure 007 Historian data used to analyze cause tracking assumptions and unsafe bypasses

Again, the SIS engineer's review and documentation is less demanding if tooling is available to relate the protection layer to the safeguard to the cause-consequence pair, creating a refined pick-list for the initiating cause column.

Cause Tracking
The data in Figure 007 enables the counting of events related to an individual initiating cause. The data shows that there are two initiating causes creating a demand on Case Study SIF-01 over the 5-year period. The LOPA team identified anticipated frequencies of the causes occurring on an annual basis. Figure 008 shows the results of the analysis.
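The cause tracking comparison amounts to checking the observed demand count against the LOPA frequency multiplied by the review period. A minimal sketch, using the frequencies and counts from the annotated historian data above (the `cause_status` helper and its labels are illustrative, not part of any tool):

```python
YEARS = 5  # length of the review period

def cause_status(lopa_freq_per_year: float, observed_demands: int) -> str:
    """Green/red style check: compare observed demands over the period
    to the count the LOPA frequency would predict."""
    expected = lopa_freq_per_year * YEARS
    return "OK" if observed_demands <= expected else "REVIEW"

# Frequencies and demand counts taken from the case study's historian data:
causes = {
    "Control valve failure (increased heating/pressure)": (0.5, 1),
    "Blockage of overhead line (increased temp/pressure)": (0.1, 1),
}

for cause, (freq, observed) in causes.items():
    print(f"{cause}: LOPA expects <= {freq * YEARS:.1f} in {YEARS} yr, "
          f"observed {observed} -> {cause_status(freq, observed)}")
```

The blockage cause trips the check: one demand was observed where the LOPA frequency of 0.1/yr predicts at most 0.5 over five years, matching the red flag in the analysis below.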
Figure 008 Cause Tracking Analysis on Case Study SIF-01

The green symbol indicates that the historian captured a demand count less than the assumed frequency, while the red symbol indicates that the demand count is higher than the assumed frequency. Is this tolerable? Again, the answer is dependent on the organization and circumstances, but the data can certainly be useful in a cause review/assessment session.

Unsafe Bypass
The data in Figure 007 enables the aggregation of the durations during which a SIF is in an unsafe bypass state. The total duration can then be compared to the number of acceptable hours, calculated by multiplying the PFD by the number of hours in the period. Figure 009 shows the comparison against the SIF's target and achieved PFDAVG as well as the real-world value.

Figure 009 Unsafe Bypass Analysis on Case Study SIF-01

Case Study SIF-01      PFDAVG    Acceptable Hours Over 5-year Period
Target                 0.01000   438.24
Achieved               0.00839   367.68
Real World             0.01110   486.45
From Historian Data              137.78
Hours in 5-year period: 43,824

The real-world PFDAVG values mimic those in Figure 004. The green text indicates that Case Study SIF-01 did not exceed the acceptable hours in any of the scenarios; therefore, no warning is needed.

Summary
In closing, the assumptions made at the front end of the process safety lifecycle are educated, but they are still assumptions. Facilities already collect a large amount of data that can ultimately be tied to safety functions. Using tooling and managing data mappings enables facilities to place more emphasis on exposures to risk and to save money in areas where process safety professionals are overly conservative. In this white paper, one SIF was explored as a case study. On this SIF, operations data replaced assumed test intervals in SIL calculations, actual frequencies of initiating causes were compared to LOPA figures, and unsafe bypasses were compared to PFDAVG. Figure 010 starts to show the power of expanding this analysis to an entire facility, assessing all initiating causes and all SIFs.
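The acceptable-hours figures in the unsafe bypass analysis follow directly from PFDAVG multiplied by the hours in the review period. A quick check of the numbers, using the values from the case study (the `acceptable_bypass_hours` helper and status labels are illustrative only):

```python
HOURS_IN_PERIOD = 43_824  # hours in the 5-year review period, from the case study

def acceptable_bypass_hours(pfd_avg: float) -> float:
    """Unsafe-bypass hour budget implied by a PFDavg over the review period."""
    return pfd_avg * HOURS_IN_PERIOD

observed_bypass_hours = 137.78  # aggregated from the historian data

for label, pfd in [("Target", 0.01000), ("Achieved", 0.00839), ("Real World", 0.01110)]:
    budget = acceptable_bypass_hours(pfd)
    status = "within budget" if observed_bypass_hours <= budget else "EXCEEDED"
    print(f"{label:>10}: {budget:.2f} h allowed, {observed_bypass_hours} h observed -> {status}")
```

With 137.78 observed unsafe-bypass hours against budgets of 438.24, 367.68, and 486.45 hours, the SIF stays within all three allowances, consistent with the green result reported above.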
Test Plan Intervals:
  Actual test plan intervals validate the achieved RRF
  Actual test plan intervals invalidate the achieved RRF
Cause Tracking:
  SIF demands less than the initiating cause frequency
  SIF demands greater than the initiating cause frequency
Unsafe Bypass:
  % where unsafe bypass hours are less than the PFD allowance
  % where unsafe bypass hours are greater than the PFD allowance

Figure 010 Facility scorecard regarding process safety assumptions

To reiterate, placing this information in the hands of the subject matter experts enables better decisions, resulting in a safer facility, whether in a risk assessment, SIS design, or during operations and maintenance (O&M).