The phone rang just as I was going to bed.
This phone call had a number I knew all too well, even without the caller ID showing the name… K1 Remote Operations. At this time of night it would be a problem, a serious problem. This particular problem would have me on the road back to the summit an hour later.
Midnight runs to the summit are not common, but they do occur in my life. Usually we can work remotely, the night attendant serving as our remote eyes and hands. Just press the right button, flip the correct switch, done. Not this time. We tried, for over an hour we tried.
I really did not want to head back up. I had just gotten down a few hours ago, having spent the day on the summit working on the usual long list of things that need to get done. Days on the summit, in the thin air of nearly 14,000ft elevation, are physically draining.
The irony of this malfunction is that I had seen it before. The dome had tripped out inexplicably on previous occasions. The problem would occur then disappear. Once it vanished you could not troubleshoot it. Unlike most of our other systems there are no logs from the shutter drive, nothing records what was going wrong.
The Friday before this it had happened to me again. But this time was different, I had a maintenance computer attached to the PLC serial port. This time I saw the error, something in the code labeled speed mismatch. No idea what this was, or how it worked. Again the error disappeared, and I could not troubleshoot further as the weather was getting worse. No opening the shutters again.
I needed a chance to figure out what this fault was… Later that day I read through the code and figured out that this feature was a speed check to ensure that both sides of the shutter are driven evenly. It compares two words of memory, one for the right side of the shutter and one for the left, and faults the shutter drive if the difference is too large.
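In rough terms, a check like this can be sketched in a few lines. This is Python for illustration only, not the actual PLC code; the function name, the sample values, and the limit are all made up.

```python
def speed_mismatch(left_speed: int, right_speed: int, limit: int) -> bool:
    """Return True when the two shutter sides differ by more than the limit."""
    return abs(left_speed - right_speed) > limit


# Hypothetical raw values and limit, not taken from the real system.
LIMIT = 50

print(speed_mismatch(1200, 1190, LIMIT))  # both sides moving together
print(speed_mismatch(1200, 0, LIMIT))     # one side reporting no speed
```

With a limit like this, two sides that track each other pass the check, while a side that reports zero trips the fault whether or not the motor is actually turning.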
Not quite sure how it worked, the numbers just seemed to appear in memory. Maybe ask Dennis about this, he designed this system. I put this on the list of things I needed to look into as soon as the opportunity arose, a nice day on the summit when I could open the top shutter, something we have had very few of this spring.
So when the phone rang and the operator described the problem I was pretty sure what the issue was. This time was different, this time the problem would not so conveniently disappear.
Shutter open. Weather moving in on the summit. This is simply all bad.
One of the worst things that can happen to the telescope is to be exposed to bad weather. The damage to optics and equipment could take months to repair. We have to get the shutters closed. The weather is not yet that bad, somewhat foggy, but it could so easily get worse.
After struggling to close the shutter for over an hour I came to one simple conclusion. I had to go up, there was no avoiding it. A couple phone calls later I was back in the vehicle and headed to Hale Pōhaku, headed to the summit for the second time in what was going to be a very long day.
It takes a bit under an hour to arrive at HP. Driving through the night on Saddle Road, the music blasting. I try to distract myself with a favorite playlist, but instead I review the problem over and over. It does not help, I just do not have enough information. At HP I switch vehicles, grabbing the keys to a company vehicle for the last climb to the summit.
As I drive I note the time is just now turning midnight. The new day is significant… My birthday. Yeah, life just does this sort of thing to me. A couple years back there was an emergency summit run on Christmas morning. At least it is an adventure.
I arrive at the summit, grab a radio and announce my presence, a few moments later we are headed for the cold, dark dome.
The guy backing me up in all of this is Sniffen, the night attendant on duty. Sniffen is a Keck icon, one of those guys who define the place. A loud and proud native Hawaiian, born and raised on this island. Back when it was legal Sniffen raised and fought roosters for fun and extra cash, I think he still does. A born storyteller he has a lifetime collection of tales to spin, tales I never get tired of.
One thing about Sniffen, nothing fazes him. In my experience I have never seen him get grumpy or angry. He approaches life with an unwavering good cheer that you just have to admire. Someone to work beside me in a freezing dome through the wee hours of the morning? Sniffen is the guy you want.
We work in an island of light in an otherwise cold, dark cavern. There are just our flashlights and a fluorescent lamp inside the equipment cabinet. We cannot turn the dome lights on, as other telescopes may be operating around us and, of course, the shutter is stuck open.
Cold? Yes, this may be tropical Hawaii, but we are at 13,600ft elevation. The temperature in the dome hovers near freezing. At least we are out of the wind and the telescope operator has shut off the giant dome exhaust fan. Cold, dark, and more than a little bit eerie. The feeling in those domes at night is hard to convey.
First problem… Upon checking the shutters I find that one of the motor drives has faulted in a new way. The message indicates it has lost its parameter memory. This is bad. If the parameters are wrong the controller can destroy one of the big shutter motors. This must have happened in the multiple power cycles of the drive while we attempted to close the shutters.
Sniffen and I carefully go through the parameters, checking each one against the documentation. I check, he double checks over my shoulder as I scroll through all 50+ numbers on the little LCD display. All of the parameters are correct, every one. We reset the drive.
Hooking the laptop up to the PLC I scroll down to the fault I had seen before… Yes it is the same thing! OK, this I can deal with.
We try the shutter again… Looking at the fronts of the two motor controllers I can watch the speed of both on the displays. They match perfectly, or as closely as I can tell while flicking my eyes from one display to the other on the side-by-side motor controllers.
This is not a real problem, the safety check is lying and faulting the system for an error that does not exist.
I edit the PLC code to defeat the test, making the check value so high it will never fault out. Again we power up the shutters and begin to close them. As Sniffen runs the shutters I watch the speed values on the motor controllers, my hand hovering over the e-stop on Capt. Marvel, manually performing the check that was faulting.
The shutters do not fault this time, the drives hum smoothly as the enormous top shutter slowly closes.
Done… Shutters closed.
I can go home and go to bed, but the shutters are still broken. Day-crew has nobody with my knowledge of the shutter drive system. Back up in six or ten hours to fix it? Can I do it before the next night? I am still mostly functional, let us take a stab at fixing this now.
I ask Sniffen, “Got anything else to do?” Nope, let’s do this. DMM probes in hand I start checking out the system. I tell Sniffen to keep an eye on me, to tell me if I am about to do something foolish in my tired, oxygen deprived state. Basics are good, supply voltages and reference voltages on the drives are correct.
I knew what it was, I just did not know why. The two shutter speed numbers in memory did not match, one side showing no speed. Where does this number come from? In the comparison code the motor speed value simply appears in memory.
There are only a couple ways that can happen in a PLC. I feared that the shutter control computer just wrote it in there, it does that with some status information. In which case the problem is in the next cabinet over in the drive controllers, not the PLC.
Time to take a break from the cold, find a warm office to sit down with the computer, and look the code over again.
Searching on the specific address yields nothing, it is not until I examine everything that writes to the same bank of memory that I find it… A block memory transfer that writes a small block including the address with the speed… Got it!
The speed value is coming from a module, specifically an A/D module. Hitting the schematics I find that the motor controller outputs the speed as an analog voltage, through a bit of wiring, to channel zero of the analog card in the PLC.
Back out in the dome I try it again… The voltage looks right on the terminal blocks, looks good at the A/D module terminals… Nothing in memory. Yet the other A/D channels in the block write are correct.
We need to change the module. PLC modules are easy to change, just have to grab the spare from south tunnel and plug it in. I made absolutely certain those spares were there a few years back.
Final cause is a dead A/D input channel. Not really that surprising, any circuit that connects to the outside world is always a risk. On the other hand, this is the first failed Allen-Bradley PLC module I have ever encountered. They do make pretty good hardware, to have an operational life of over three decades.
I re-enable the speed balance check in the code and try it again. Working just fine. We clean up the scattered tools, close up the panels, and go to find a warm place. A few messages to everyone to let folks know before we can head down the mauna.
Sunrise finds Sniffen and me having breakfast at Hale Pōhaku, the second breakfast at HP in my very long day. We chat with the early day-crew going up, let them know that the issue is fixed. A few calories to run on a bit longer for the drive home. Home and to bed.