PACS downtime comes under growing scrutiny

Jul 7, 2014

As more and more core care processes move from paper to electronic support, system or service uptime now plays a critical role in provider organizations, and planned or unplanned PACS downtime can be a serious problem, according to German expert Joachim Zaers.

"For most patient cases in acute care, imaging information is vital," said Zaers, an engineer for Munich-based medtech consultants Birkholz und Partner. "Whereas physicians and nursing staff may be rather creative in surviving downtime of the administrative and documentation systems, a PACS lights-out scenario tends to be challenging."

Zaers has worked as an IT systems architect for IBM Global Services, and has been in the PACS business for more than 15 years, his largest PACS installation generating more than a million studies annually. He has been working for Birkholz und Partner since 2004, and since 2008 has headed its IT project implementation team.

With high availability nowadays being sold as part of a server virtualization deal, the need for continuity plans may start to fade. In some provider organizations, high PACS uptime is safeguarded by a second backup or "shadow PACS." While this approach may be appropriate for larger organizations, smaller providers will wish to avoid the additional cost, including human resources, he told delegates at the 16th DICOM Meeting and HIS-RIS-PACS workshop, held at the end of last month in Mainz, Germany.

"The annual 'HIT and Miss' sessions at the HIMSS congress outline that we will keep experiencing ever new threats, and that our risk inventory is a tool of limited efficiency," Zaers said, adding there is no 100% uptime guarantee. The human factor will always contribute errors that may lead to unplanned downtimes, and everybody knows spectacular stories about what can go wrong in the complex and highly integrated IT landscape. "Even with an effective risk management, we will only reduce that risk -- this is what we should be aiming at."

Cut risks, increase lights-out tolerance

However, experience from organizations that have implemented continuity plans and execute them regularly shows that users can learn to tolerate unplanned downtime. In addition, robust, well-rehearsed downtime procedures can lower service-level requirements, which translates directly into lower operating costs.

Sample continuity plan for small department

For a smaller radiology department with around four physicians, here's a possible downtime procedure:

Diagnose the emergency case "PACS down," together with the IT service staff when they are available.
Communicate "PACS down" to all referring departments.
In a "PACS down" state, only urgent studies are conducted and the service catalog of the radiology department is reduced to CT and standard x-ray.
Radiology falls back to a small, isolated, and highly controlled network containing only the modalities, a workstation with a DICOM stack, and film and paper printers. The workstation takes over the IP address of the PACS server, so there are no process changes on the modalities. The isolation procedure is trained, and the department runs on two redundant switches. All network devices are marked with red patch cables, so that the technician on duty can move all the "red" devices to "run."
When the normal state is recovered, "PACS up" is communicated and the parties involved run the processes, updating and synchronizing all systems.

Source: Joachim Zaers.

Today's IT applications are designed to support users unobtrusively, and this can lead to an underestimation of the degree of process support delivered by IT. In addition, use of IT applications in hospitals grows continuously.

"The full scope of utilization is often only realized when the IT systems are shut down for maintenance," Zaers explained. "With the advances in high availability technology, these planned downtimes are reduced, and users are not really forced to implement complete and sustainable continuity plans."

As for the regulatory framework in Germany, PACS continuity plans are required for the operation of teleradiology services in rural areas; for PACS installations in hospitals, however, there are no explicit standardized requirements for continuity plans. There is a rising interest in protecting critical IT systems -- including those in healthcare -- and the German Federal Office for Information Security (BSI) has developed a standard for IT continuity planning (BSI 100-4), he explained.

"We often find that the effect of an unplanned PACS downtime is underestimated and proactive planning for such downtimes is not really effective. A proper indicator for the organization's maturity in this respect is the simple question: When is your next planned PACS downtime that is longer than one hour?" he said.

Today's PACS solutions are complex, multitier integrated IT systems, and recovery often involves second-level support from the vendor. Even worse, IT diagnostics -- as in medicine -- has turned into a multidisciplinary activity where PACS admin, network, and medical engineering staff try to track down the issue. Furthermore, not all of these team members are available 24/7.

"In this context, it is anything but an easy task to identify the 20% of effort that will deliver 80% of the benefit induced by such proactive planning -- in line with the Pareto Principle," Zaers said. "So when we work on continuity plans and downtime procedures, we resort to a simple and robust environment which is easy to operate and which is tested against the following worst case scenario or the chief information officer's nightmare: Imagine Saturday morning, 2 a.m., and images of the current emergency case are not readable in the trauma room -- this scenario is marked by low quantity and quality in staff across all tasks involved, with damage potential on the patient side at the maximum."

This worst-case scenario is a real-world example. The facility could look back on three years of excellent PACS operation, with no unplanned downtime during service hours, and downtime procedures had been neither documented nor trained because the PACS was running smoothly. The technician on duty that Saturday had joined the organization only six months after PACS rollout. As a consequence of the incident, the head radiologist implemented a process that allows him to call for a downtime drill at any time.

Integrating IT health checks

PACS vendors, indeed all healthcare IT vendors, should add "health check" features to their software to improve overall uptime of certain functionalities, Zaers urged. Continuity plans are often product-specific and can be provided as templates by the vendor, and the open system technologies available could be optimized for resilience and simple handling. On behalf of the hospitals his team works for, he has begun requiring monitoring specifications for the system and its environment to allow for a real-time health check that helps users detect abnormal system status.
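
To make the idea of a health check concrete, here is a minimal sketch in Python. It is not Zaers' specification or any vendor's feature: the component list, the TCP reachability probe, and the "OK"/"ABNORMAL" labels are all assumptions chosen for illustration. A real health check would also need application-level tests, such as a DICOM echo, which this sketch does not attempt.

```python
"""Illustrative health-check sketch; names and thresholds are hypothetical."""
import socket
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Component:
    """One monitored part of the PACS environment (hypothetical model)."""
    name: str
    host: str
    port: int


def is_reachable(component: Component, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the component can be opened.

    This only proves that something is listening on the port; it says
    nothing about whether the application behind it works correctly.
    """
    try:
        with socket.create_connection(
            (component.host, component.port), timeout=timeout
        ):
            return True
    except OSError:  # covers refused connections, timeouts, DNS failures
        return False


def health_report(
    components: Iterable[Component],
    probe: Callable[[Component], bool] = is_reachable,
) -> Dict[str, bool]:
    """Run the probe against every component and collect the results."""
    return {component.name: probe(component) for component in components}


def overall_status(report: Dict[str, bool]) -> str:
    """Summarize a report: any failed or missing check counts as abnormal."""
    if report and all(report.values()):
        return "OK"
    return "ABNORMAL"
```

In use, a department would list its own components, for example `Component("pacs-dicom", "pacs.example.org", 104)` with a placeholder host name, and show the result of `overall_status(health_report(components))` where the technician on duty can see it. The probe is passed in as a parameter so that the summary logic can be exercised in a drill without touching the live network.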

"With the advent of end-to-end electronic documentation by physicians and nurses -- electronic medical records -- it is high time to develop methods to create and maintain continuity plans for all these digitally supported processes," he underlined. "In our experience, the current staffing of IT departments is not taking into account the additional resources needed."

Real-world criteria of a continuity plan

Start out with a scenario implying that the PACS is inoperative and try to find an alternative setup using other system components. Later on, you can begin investigating threats related to the identified risks.
KISS: Keep it small, simple, and stupid. Start with one general downtime scenario. Later, with trained users, you may want to go into further detail. To start, one downtime scenario with one set of procedures is enough to communicate and practice. Keep as many process steps unchanged as possible.
Communication: Develop the continuity plan together with all involved parties -- vendors, users -- to fulfill service-level agreements and safety objectives defined by the management. IEC 80001-1 is a good framework to orchestrate this process.
Use a variety of appendices to communicate the content in different formats to the various stakeholders. The large number of part-time workers turns implementing changes to the continuity plan into a real challenge. Just think about some PACS or modality changes implemented on Thursday afternoon that need to be communicated to the typical part-time technician on the following Saturday night shift. Current Web 2.0 technologies such as wikis and forums can greatly improve communication and interaction between users and service providers.
Mandatory regular emergency drills will rapidly improve the continuity plan by correcting missing or outdated information.
Include a continuous improvement process by reviewing the drills and improving the plan (see also IEC 80001-1).
Take some time to solidify and improve your PACS by simplifying the system's architecture and application monitoring.
Start with the worst-case scenario and iteratively refine your downtime procedures. Simplicity is the key to success as it will ease communication and operation during downtime.