I don't want to break my arm patting my own back. But you know what? Sometimes a person has to feel satisfied with what he does.
For those of you who don't know, I fix electronic circuit boards for a living. Not in consumer equipment, but in industrial equipment. My company manufactures and repairs tester equipment for the semiconductor industry. I'm sure that many of the integrated circuits in your computer or your cellphone or in your VCR or DVD player were tested on my company's testers before being sent out to the manufacturers who build the computer motherboards and cameras and phones and VCRs and DVDs.
Anyhow, my job is to test the circuits that go in the testers. You might call me a "meta tester" in fact. Heh.
So my job is a complex mix of using computer software to run tests on the boards and to control them, and using simpler test equipment like multimeters and oscilloscopes and spectrum analyzers and whatnot to trace faults down to the actual failing component. Much of the time the boards test themselves through feedback mechanisms and the software can tell us what failed. But not always.
So in a way my job is a lot like CSI. Only less glamorous and maybe a bit easier.
The worst problems are the intermittent problems. They are the bane of technicians of every ilk -- electronics, electricians, plumbers, auto mechanics, you name it. The problem only shows up randomly and never often enough to gather enough data to make a positive decision without guessing.
That's the kind of problem that I've been working on for the past couple of weeks. We've got this circuit board that is a computer in its own right. The symptom is that it loses communication with the rest of the system intermittently, and there is no indication as to why it is losing communication. I mean, once you reset the board in order to reestablish communications with it, you lose whatever evidence there was of why it failed.
So, reaching above and beyond the call of duty, I find the tester software on the company network so I can read it and better understand what everything is doing. I learn a little bit here and there. Despite missing components of makefiles and unset paths and stuff, I figure out how to build my own copy of the test program that includes some print statements that I can use to help me figure out what is going on.
(Note: This particular circuit board is over ten years old; the previous software revision was probably three or four years ago, and some of the software is nearly twenty years old.)
So I keep turning over rocks between the software and the hardware, trying different experiments and trying to either memorize or note down the results. And then finally today I notice -- an important part of the board's operating system is being loaded into a section of memory that I know that we have had problems with! I mean, until this point I didn't know that that section of memory was even used for programs, much less for the OS! It was like a ray of light from the heavens!
I had already replaced a few of the chips in that memory section because of hard failures (not intermittent). So this afternoon I replaced the other chips in that section and set the board up for looping on the test over the weekend. It usually fails within twenty minutes or so, and the loop had been running for about twenty or thirty minutes when I left for the day.
If the board is still running without error on Monday, I will be so stoked and my ego will grow at least three sizes too big. If it has failed, then I will be crushed like pineapple.



