Everything looked good
as we started the first day of vibration tests on the
High Energy Solar Spectroscopic Imager (HESSI). We chose
to do our environmental testing at the Jet Propulsion
Laboratory (JPL) in California and, so, we had brought
our spacecraft there from Spectrum Astro in Arizona.
We planned to launch in July 2000. Heading into March
that year we were on schedule, under budget, meeting
all of our performance requirements, and ready for the
final testing. I remember feeling proud of what the
development team, led by the University of California
at Berkeley and its project manager, Peter Harvey, had
accomplished in the last two-and-a-half years. We were
in the homestretch -- or so I thought.
Near the end of the day, it was time for the sine
burst test. For 200 milliseconds we would put a non-feedback
force on our system, which meant we couldn't adjust
or halt the test in progress. Something went wrong, terribly
wrong, during the sine burst test. As mission manager,
I was standing just ten feet away from the spacecraft
when this happened. It sounded like a clap of thunder.
With the test stopped, we moved in closer to see what
had happened -- and we knew immediately that we had
damaged our spacecraft. How much, we didn't know.
Once they got our spacecraft off the table, it was
fairly obvious what had caused the problem: One of the
support bearings on the vibration test bed had failed.
This caused an abnormally high level of static friction,
which the computer read as mass. When it tried to compensate
by increasing vibration, it shook the spacecraft ten
times harder than we had planned.
If anyone knows Tom Gavin, Director of Flight Projects
at JPL, they know that he likes to share a little piece
of information with engineers during reviews: "If you
have an anomaly, you're going to meet a lot of important
people." Well, I started meeting a lot of important
people as soon as word spread about our testing disaster.
Three days later, I stood in front of the Mishap Board
to open the investigation. The Mishap Board concluded
that two primary factors contributed to the accident:
the absence of a scheduled maintenance program for the
test equipment, and the lack of proper test procedures.
I don't think that I was alone in thinking about Mishap
Boards with trepidation. But I learned a lot of valuable
lessons from this one. JPL doesn't run this particular
test very often, and we should have reviewed their test
procedures thoroughly before allowing our spacecraft to
undergo testing. Because this was a non-feedback test,
it should have been standard procedure to run a mass simulation
before running the test on our spacecraft, and if we had
been thinking straight, we would have required that. (Now,
I don't care who tells me something, I insist on seeing
it verified.) In the end, the Board concluded that my
team was partially responsible for the accident, and I
agreed with them.
Putting HESSI back together again
I didn't just accept responsibility for our mishap;
I accepted responsibility for getting the project back
on track. And if I was going to do that, I couldn't
wait for someone to tell us what to do; we simply got
to work. Our standard support structure (a machined
aluminum main support ring) had broken in two places
on each side; the test snapped it. So, the structure
had to be replaced. But that was only the beginning
of our problems.
Then there were the arrays. This was a solar mission
designed to explore the physics of solar flares, and
we wanted it up in July for the peak activity of the
11-year solar cycle. If we couldn't get up in July,
we wanted to get up as soon as possible. Solar arrays
normally require a long lead time. How could we get
new arrays in time? Well, we got Goddard engineering
involved. They found some solar cells manufactured for
the Iridium constellation, which was now bankrupt.
The next problem we faced was the instrument boxes.
We had put them through a vibration that nobody expected these boxes
to see. We went back to the vendors and asked, "If we
do an ATP [Acceptance Test Plan], will you re-qualify?"
They declined. "Buy another box" was their response.
So, I had to fall back on another organization, the
Quality Assurance Group, that I had previously seen
as little more than an obstacle standing between me
and my launch date. The Quality Assurance Group made
me an offer: If they could get involved in the Acceptance
Test Plan, they would accept the vibration and certify
our boxes. That's what we did.
But our problems weren't over. Though it didn't break
during the vibration test, two months down the road, our
flight cryocooler failed. This was a commercial product
that we had flight qualified. We still had about four
or five of them, but we had to flight qualify at least
one of the remaining coolers. So, we put together a tiger
team to do another ATP and get it done as quickly as possible.
Although it was already clear that we wouldn't make
our launch date, that team worked miraculously, as far
as I was concerned, and eventually they brought HESSI
back to its original condition.
Of course, this is just the technical part handled
by the team. As the mission manager, the person responsible
for overseeing all the project's facets, I had to be
off doing other things -- including reviews. For months
we had operated under the maxim, "If no one tells you
to stop, just keep going." So, we had kept working all
along, but if we were to complete our work on HESSI,
I needed to have our Recovery Plan approved. So, while
all the technical work was progressing, I made our case
in front of several review panels.
After an independent panel gave us their stamp of
approval in May, the Goddard Program Management Council
held a Reconfirmation Readiness Review in June. An independent
expert concluded that we probably only stood a 60 percent
chance of surviving launch. When you take that to senior
management, it's likely to be considered too high a
risk. We had to convince them that we understood the
system better than anyone else did. And you know what?
They accepted this risk. Here, again, was another organization
that I gained a new appreciation for.
After that, we had a NASA Reconfirmation Review in
August, led by Dr. Ed Weiler, then Associate Administrator
for Space Science. I had to ask him for the money we
needed to get to launch. I gave a presentation and when
we got to the slide that showed HESSI before we started
the repairs, he told me it was a good thing he hadn't
seen the slides back in March. "I would have cancelled
you," he said. But, in the end, he approved our plan
and gave us our money for a February 2001 launch. All
in all, I was astonished by the level of support from
almost everywhere I turned at NASA when I asked for
help in recovering this project.
And even more astonishing
A year after the mishap, we were ready. I remember giving
myself a mental pat on the back as I thought about how
well we were doing -- all things considered. Then we
ran into another series of problems.
HESSI was scheduled to be air-launched by a Pegasus
rocket (dropped from the belly of an aircraft flying
39,000 feet over the ocean). The Pegasus started running
into problems on other launches. Our launch date was
pushed back to June. When the time came, we integrated
our spacecraft with the Pegasus at Vandenberg Air Force
Base in California and then flew across the country
to the Kennedy Space Center. We were just four days
from launch when there was another Pegasus failure --
this one on a DoD mission. We were put on hold.
We pulled out, went back to Vandenberg to wait it
out, and put HESSI in storage. But this time Mother
Nature decided to test us. A major rainstorm swept through
the area, and they had to call out troops to sandbag
our facility because the floods were rising. The water
kept rising -- so, in the middle of the night, in the
middle of the flood, in the middle of the rainstorm,
we moved HESSI to another building across a swelling
creek.
We got a launch date in February 2002. It took that
long to resolve the various problems with the Pegasus
and to get a new place in the launch queue. Finally,
we brought HESSI back to Kennedy Space Center. Of course,
with our luck, we came in the middle of another rainstorm.
We were waved off the first time and couldn't land.
So we had to circle the landing strip with lightning
flashing around us until, finally, we saw a gap in the
weather. We were ready to land.
Then we got a radio call from our airstrip, "There's
an alligator out there on the strip. You can't land."
At this point, none of us could be astonished by much.
We got someone on the ground to go out and escort the
alligator off the skid strip. Finally, we landed --
another crisis averted.
But then we had to wait for things to dry out, because
our ground system control had been hit by the rainstorm.
If I hadn't wondered if HESSI was in some way cursed,
this was enough to make me consider the possibility:
Things began to dry up, but our ground support equipment
had been inundated with toads. We had to go out there,
of course, and get rid of all the toads and put plastic
strips around everything so the toads wouldn't come
back. We finally got to our launch date, the fifth of
February, and we were thinking, well, what's going to
happen today?
Countdown
I'll tell you what happened that day. As they say, it
was time to "open the book" four hours before launch.
So, we opened the book -- and we were red. One of our
ground antennas had gone down. It was mandatory for
launch. We started working that problem; at the same
time, we had to work a series of battery temperature
problems. We did all of this on the skid strip waiting
to get our launch off.
Finally, we got the antenna back and got waivers on
the battery. We got the plane up in the air. We were
within two minutes of our drop zone, when I heard the
launch manager give the abort command. Excessive static
on voice communication with the drop plane caused the
abort. After correcting the problem, we flew around
and headed back to the drop zone. We had only one more
opportunity.
If you've ever been involved in a situation like this,
you know that you're listening to four or five different channels
at once on your headset. You can hear everyone else
talking about any problems they see. I was listening
to all those voices as our plane was about four minutes
from drop, and I looked back down at my telemetry and
saw that the temperature on the battery had finally
gone down to the right spec. All of a sudden everything
went quiet on the net.
All I could hear then was the launch countdown. It
went smoothly. The Pegasus was dropped with HESSI aboard,
and in eleven minutes we were in orbit. The only thing
I could think at that point was that the gods must have
gotten tired of beating on us. They finally smiled on
the little spacecraft that would not give up.
It's been more than two years now since launch, and
the scientists are extremely happy with their science.
While they've studied solar flares and even taken a
look at the Crab Nebula, I've had ample opportunity
to reflect back on our trials with HESSI.
What saved us, time and again? We refused to give
up. But besides tapping reservoirs of perseverance,
I also learned to tap what I now like to call a project's
hidden resources. I learned to work with and get help
from organizations that I usually didn't think of as
"resources." I'm talking about Mishap and Failure Review
boards, program management councils, and the like. Before
HESSI, I tended to think of them as mountains in the
road. But when I was in a deep enough hole with little
margin to play with, I started to see them in a different
light. I asked for help, and I got it.
Lessons
- You can never say too much about the value of persistence
in the face of adversity. All projects suffer setbacks.
Sometimes the difference between succeeding and failing
on a project is an inexhaustible supply of persistence.
- When confronted by problematic situations, a project
manager with the determination to succeed identifies
and makes use of all available resources. That may
include looking at governing organizations in a new
light.
Question
In a crisis situation such as the one described at the
beginning of the story, what would you say to a Mishap
Board or Failure Review Board to gain their confidence
that you could lead your team to overcome this setback?