National Railway Day 2019
Edmonton, AB
06 November 2019
Faye Ackermans — speaking notes
Check against delivery.
Slide 1: Title page
Good morning.
The TSB issues its Watchlist every two years to highlight persistent safety issues in Canada’s transportation system that require action. November 2019 marks the half-way point for the 2018 Watchlist, with the next one due in 2020. We are already looking at how the current edition might change. Over the next few months, we are encouraging the rail industry to let us know what they think about the current Watchlist, as well as what issues they think should make it onto the next edition, and what data they have to support their opinion.
The purpose of my being here today is to spark some of that same conversation with the railway supply industry, and to share some of our current concerns with you.
Slide 2: Outline
Slide 3: TSB 101
Slide 4: One of these … is not like the others
These are the microwaves in the lunchroom at TSB headquarters. Two of them have the same operating “system,” but the third is completely different. When given a choice, some staff choose one of the two that operate in the same way. Moreover, trying to operate the third often takes several attempts, a lot of muttering, and sometimes advice from others in the room.
So here is the question: why isn’t the design – the human/machine interface – a little more intuitive?
Slide 5: Train accidents in 2018 (by cause)
Speaking of that interface, the “human” part shows up a lot when looking at the causes of train accidents in Canada.
Onscreen are three graphs, each showing causal categories of train accidents in 2018: main track train accidents (collisions and derailments), non-main track train collisions, and non-main track train derailments.
There are five main causal groups: equipment, track, environment, “other” and … yes, “human actions.”
Guess which one occurs most frequently? It’s “human actions” — of all sorts — from operator actions, to train marshalling errors, to loads being inappropriately tied down. In fact, “human actions” were causal in 31% of main track train accidents, over 90% of non-main track train collisions, and 50% of non-main track train derailments.
So many human actions immediately precede an accident that the rail industry, including the supply industry, needs to take a step back and look at these events with an understanding of why people take the actions they do. The “proximate” causes include things like mishandling a derail or switch, not protecting the point, not maintaining equipment, improper speeds, not using equipment properly, and so on. But it’s not good enough to identify the “proximate” cause; the underlying issues must also be understood, whether they are fatigue, distraction, poor communication, experience, training, oversight, system design, work pressures, safety culture, and so on.
And finally, safeguards need to be designed and implemented to mitigate those underlying issues.
So … what can the rail supply industry do to help? Let’s look at some possibilities.
To start, we will take a look at an old accident, one that occurred more than twenty years ago and that, by the way, has echoes that are still being heard today.
Slide 6: Design and training … and why they go together
When systems are designed or redesigned, what training is required to ensure that the human operator understands what is going on and reacts appropriately?
On 02 December 1997, a CP train derailed 66 cars during an uncontrolled high-speed descent on a steep portion of the Laggan Subdivision known as "Field Hill." The three crew members were not injured. The locomotive engineer (LE) was relatively inexperienced – 14 years as a conductor, but only 1 year as a “spare” LE. In the previous 6 months, he had operated westward on the Laggan sub 25 times, but only one trip was with a GE locomotive. He had had only 2.5 hours of sleep in the previous 29 hours. There were several inappropriate train handling decisions which resulted in the depletion of the train air brake system. But the equipment being used and the interaction between man and machine also played a role.
This discussion will focus on the use of a pneumatic control recovery procedure to engage the dynamic brakes.
Slide 7: Design and training (continued)
The train was stopped in emergency.
The decision was made to recover the air while descending the hill using dynamic brake.
The LE used a common practice—and not the company operating instructions—to recover pneumatic control, namely:
- moved the combined controller (throttle/DB handle) from IDLE to DB APPLIED
- waited 60 seconds
- moved the train automatic brake handle from EMERGENCY to RELEASE.
But the company instructions stated:
- ensure combined controller is in IDLE and automatic brake valve handle in EMERGENCY
- wait 60 seconds
- turn automatic brake valve handle to RELEASE, pausing briefly in the HANDLE OFF position.
The common practice did not affect pneumatic control recovery on GM locomotives … but it did on the GE locomotive, which was on this train. As a result, dynamic braking was not operating.
What indicators did the LE have that dynamic brakes were not working? The PCS open light was illuminated on a white background. But this display was not compelling enough for the LE to “see” the problem.
Slide 8: Design and training (continued)
Onscreen you can see 3 images:
Top left: GE AC 4400 control console (Note Integrated Function Display screen in upper left corner)
Top right: illustration of AC 4400 IFD screen immediately before emergency brakes are released. Note the "PCS Open" light illuminated
Bottom: Figure 6: Illustration of AC 4400 IFD screen after emergency brakes are released. Note speed increase and "PCS Open" light still illuminated indicating that control of power and dynamic braking has not been regained.
Here are some of the TSB’s findings from that report:
- The control system for PC recovery was built to a GE specification that was accepted by CPR without knowing the consequences if an unapproved PC recovery procedure was used.
- These systems were not designed to make it readily apparent to the locomotive engineer that an error had been made, nor did they indicate clearly how to reverse the error.
- The training provided to locomotive engineers did not ensure that they had sufficient understanding and proficiency in the use of the IFD screen on … locomotives or other system differences between GM and GE locomotives.
The Board concluded that the locomotive pneumatic control recovery feature and the Integrated Function Display were not designed with sufficient regard to error tolerance. Further, railway training and supervision did not ensure that the locomotive engineer had an adequate knowledge and understanding of all aspects of the operation of the GE AC 4400 locomotive.
Or, put another way, the supplier has a responsibility to ensure the operator understands the functionality of the design and to recommend essential training and operating instructions.
Today, the fall-out from the Boeing 737 Max software design continues. I thought the following quote from a 28 October 2019 U.S. Senate Commerce Committee hearing worth repeating. Chairman Roger Wicker said: “…it is imperative for the industry to ensure the interface between human operators and technologies is seamless going forward.”
Slide 9: Human-machine interface
An alternate title for this slide could have been “Why flashing lights and sounds are good.”
Why is that?
Because humans make very poor system “monitors.” As a result, we need to have information brought to our attention – for instance via flashing lights or colour changes or sounds.
The phrase “cognitive conspicuity” refers to the importance and relevancy of information to an operator’s context. To ensure the most important visual cues for a specific scenario are detected by the operator, the cues need to be easily discriminated as the most relevant. This safety-significant information should not be masked, or weakened, by the presence of other more noticeable cues.
So today’s question is this: what parts of locomotive software control systems need to be re-engineered – to do more than just display information (but to also provide a warning or alarm if something is not normal)?
The role of the system designer is not an easy one. While the TSB believes that locomotive display systems may require some enhancement to bring important information to the operator’s attention, sometimes, too many warning sounds and flashing lights can be distracting—particularly in an emergency.
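To make that trade-off concrete, here is a minimal sketch in Python of how a display might decide when to escalate from a passive indicator to an alarm. It is purely illustrative and does not describe any actual locomotive software: the names CabSnapshot, AlertLevel and alert_level, the field names, and the escalation rule itself are all assumptions made for this example. The idea is that a lit "PCS Open" indicator alone stays a passive display, and an alarm is raised only when it coincides with the operator requesting dynamic braking while speed is still increasing.

```python
from dataclasses import dataclass
from enum import Enum


class AlertLevel(Enum):
    NONE = 0      # nothing abnormal to show
    DISPLAY = 1   # passive indicator only
    ALARM = 2     # flashing light or sound that demands attention


@dataclass
class CabSnapshot:
    """Hypothetical sample of cab state; all field names are assumptions."""
    pcs_open: bool                 # "PCS Open" indicator is lit
    speed_mph: float               # speed at this sample
    previous_speed_mph: float      # speed at the previous sample
    dynamic_brake_requested: bool  # operator has asked for dynamic braking


def alert_level(s: CabSnapshot) -> AlertLevel:
    """Escalate only when the abnormal state matters in the current context."""
    if not s.pcs_open:
        return AlertLevel.NONE
    accelerating = s.speed_mph > s.previous_speed_mph
    # Assumed rule: braking was requested, yet the train is still speeding up
    # while pneumatic control has not been recovered.
    if s.dynamic_brake_requested and accelerating:
        return AlertLevel.ALARM
    # Otherwise keep the cue passive to avoid adding to alarm clutter.
    return AlertLevel.DISPLAY
```

For example, alert_level(CabSnapshot(True, 20.0, 18.0, True)) would escalate to an alarm, while the same indicator with steady speed would remain a passive display. Where to draw that line in a real system is exactly the kind of design question raised above.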
System design and the human interface are one aspect of human-action-caused train accidents. What are some others? For example, how can we help employees maintain situational awareness? Let’s look at the next slide.
Slide 10: Switches
A task that is done thousands of times, such as lining a switch, may not always be done right. Since I have been on the Board, the TSB has released two reports where an employee lined a switch on top of himself, walked away from the switch, and was struck from behind and killed by the movement. One was a trainee. The other was a 20+ year veteran.
Slide 11: Switches (continued)
This simple solution, which makes it easier to understand which way a track is lined, is being implemented by some railroads.
Slide 12: Clearance point
Or, here is another example of a simple solution. CN is marking the fouling point in yards to aid employees’ understanding of where to position equipment in order to prevent sideswipes and collisions.
Slide 13: Watchlist issue: Following signal indications
Think of this Watchlist issue: following signal indications.
As I said earlier, humans are not very good at monitoring systems. Yet that is what locomotive engineers are expected to do: see, understand and take the appropriate action as their train approaches every automated signal on a route. Moreover, there are often operating restrictions en route (e.g., slow orders) that require action to slow the speed of a train.
So how can we expect humans – who are bad at this – to get it right every single time?
A better question is: why do we expect humans to suddenly get better at something they’re bad at? And how can we help them improve?
Slide 14: Looking ahead …
Onscreen is a photo of the operating screen of Trip Optimizer (TO).
New systems—such as TO—are starting to address some of these issues, but the rail industry in Canada has not embraced the types of train control systems being used elsewhere in the world to assist the locomotive engineer in his tasks.
When introducing new technology, it is important to assess the inherent risks and the risks associated with integrating it with existing operations. Assessing such risks enables an organization to manage any operational safety implications by implementing mitigations accordingly. These may include changes to the design or updates to the training, procedures, and tasks.
Slide 15: Conclusions
For many years, the rail industry has concentrated on reducing train accidents through technology-driven solutions primarily targeted at infrastructure and equipment failures, which were the predominant causes of main track train accidents. Today, the single largest cause of all train accidents is the “human” element. The increase in accidents caused by human actions may be due in part to the recent, fairly significant increase in new operating employees. But, rather than think of these “newbies” as the cause, use the accidents to understand how to make the system safer from the mistakes people – even experienced people – can make.
Slide 16: Contact us
Slide 17: Questions
Slide 18: Canada wordmark