Oral-History:Charles Rader

About Charles Rader

Charles M. Rader was born in 1939 in Brooklyn, New York and attended Brooklyn Polytechnic Institute. He received an undergraduate degree in 1960 and a master's degree in 1961, both in electrical engineering. He accepted a position at MIT's Lincoln Laboratory in 1961, where he has been since that time. His first forays into electrical engineering grew out of an early interest in artificial intelligence and speech processing. Overall, his research has focused on speech bandwidth compression, digital signal processing, and space-based radar systems. One of his major accomplishments was as a leader of the team that helped build the LES-8 and LES-9 communications satellites launched in 1976. He is the author or co-author of Digital Processing of Signals (with Ben Gold), Number Theory in Digital Signal Processing (with James McClellan), and Digital Signal Processing (with Lawrence Rabiner). Rader is a Fellow of the IEEE (1978) [Fellow award for "contributions to digital signal processing"], and was formerly the president of ASSP. He received the ASSP Technical Achievement Award (1976), and the ASSP Society Award (1985).

The interview tells us little of the earliest influences in the life of Charles Rader, or what sparked his initial interest in electrical engineering. Instead, it focuses upon Rader's contributions to the sub-fields of speech processing and his involvement in various professional organizations, including the ASSP and the IEEE. The interview ends with an interesting look at Rader's participation in the official acoustical inquiry after the assassination of President John F. Kennedy.

Other interviews detailing the emergence of the digital signal processing field include James W. Cooley Oral History, Ben Gold Oral History, James Kaiser Oral History, Wolfgang Mecklenbräuker Oral History, Russell Mersereau Oral History, Alan Oppenheim Oral History, Lawrence Rabiner Oral History, Ron Schafer Oral History, and Tom Parks Oral History.

About the Interview

CHARLES RADER: An Interview Conducted by Andrew Goldstein, Center for the History of Electrical Engineering, 27 February 1997

Interview #324 for the Center for the History of Electrical Engineering, The Institute of Electrical and Electronics Engineers, Inc., and Rutgers, The State University of New Jersey

Copyright Statement

This manuscript is being made available for research purposes only. All literary rights in the manuscript, including the right to publish, are reserved to the IEEE History Center. No part of the manuscript may be quoted for publication without the written permission of the Director of IEEE History Center.

Request for permission to quote for publication should be addressed to the IEEE History Center Oral History Program, Rutgers - the State University, 39 Union Street, New Brunswick, NJ 08901-8538 USA. It should include identification of the specific passages to be quoted, anticipated use of the passages, and identification of the user.

It is recommended that this oral history be cited as follows:
Charles Rader, an oral history conducted in 1997 by Andrew Goldstein, IEEE History Center, Rutgers University, New Brunswick, NJ, USA.

Interview

Interview: Charles Rader
Interviewer: Andrew Goldstein
Date: February 27, 1997
Place: Cambridge, Mass.

Education

Goldstein:

I’m with Charles Rader at his office at Lincoln Labs on February 27, 1997. Let’s get started from the beginning with your early education.

Rader:

I went to high school in Brooklyn, New York. Then I went to the Polytechnic Institute of Brooklyn, which has since changed itself to become the Polytechnic University. But, at the time it was the Polytechnic Institute.

Goldstein:

When Ernst Weber was there?

Rader:

Yes. I took a bachelor's degree, and then went on for a master's degree, which took another year. So I graduated in June of 1961 with a master's degree. Then I came to Lincoln Laboratory. It’s kind of funny that I can remember the date that I came here. It was June 19, 1961, because it was a curious date that you can write as 19/6/1961, and transpose it and it reads the same. So, I have no trouble remembering the date. It’s also the date before my twenty-first birthday, which is June 20th. I came here and worked with Ben Gold.

Goldstein:

Before you get started with that, can you tell me about your undergraduate work and your master's thesis?

Rader:

The undergraduate work was just a fairly straightforward electrical engineering degree. The graduate work was also relatively unspecialized, and I did a master's paper—we didn’t call it a thesis; it was just a project—in which I studied the motion of a charged particle in certain electromagnetic fields. The major contribution that I found was that you could come up with a potential function in an axially symmetrical magnetic field, so that you could solve problems the way you did with electric fields. It allowed you to solve problems involving axially symmetric magnetic fields, the way you did electrical fields. It wasn’t particularly notable or anything. These master's papers were not of the same importance as theses. It did get me graduated, but it had absolutely nothing to do with anything that I’ve subsequently done.

Goldstein:

It sounds fairly theoretical compared to some kind of systems work or anything.

Speech processing and computers at Lincoln Laboratory

Rader:

Actually my area of interest was in work on confining particles in fusion machines, an area that I never had anything to do with subsequently. But when I came here I was interested in working on artificial intelligence. I had this idea that if we could figure out how the brain worked and duplicate it, you know, that would be a wonderful career. The group that I joined was doing speech processing, which was about as close as you could come in Lincoln Laboratory to artificial intelligence. It never really got very close. But, you know, that was the reason I chose to come here. I was assigned to work with Ben Gold, who was kind of a mentor. He was significantly older than me, and had some significant accomplishments under his own belt, which I’m sure he will tell you about.

Ben had been working with the TX-2 computer, which was a very interesting machine in the history of computing. If you look at the history of computing as a kind of a tree, where every computer was the predecessor of one or more others, right up the main trunk of that tree you’ll find the TX-2 computer, and then it branches from there. It was several years old then in 1961, but it was in its day quite an impressive machine. It had a relatively large memory of about 64,000 words. They were 36-bit words. Actually 38 bits, but some of those bits were parity bits. It had built-in hardware index registers, a thin-film memory, and a whole lot of other nice features. One of its most unusual characteristics was that it was accessible to the user. You could attach equipment to it, and you accessed it directly rather than submitting your punched card deck to a cabal of operators.

So when I came here, I knew nothing whatsoever about digital computers. I knew about analog computers, and thought of them as tools for simulating analog systems. But I knew nothing about digital computers, and Ben taught me the basic concept of a digital computer, and in particular how you use the TX-2 computer. I became very good at it and I realized that this was a universal machine. You could do almost anything with it.

The project that we were involved with was designing vocoders. At that time, you built something to try it out, so we built a vocoder. Apart from its interesting components, it was just sort of a large kludge, consisting of a couple of filter banks. You’d have a filter covering, say, from 180 hertz to 300 hertz, and another one covering from 300 hertz to 420 and so on, stepping across the audio band. There were two such identical filter banks. In those days, these filters were each about the size of a book, about a pound or two each. They had big, heavy coils that were wound to have a particular inductance.

Goldstein:

They were all RLC circuits.

Rader:

Right. They were big, and they weren’t cheap. But the major thing was that if you wanted to try something out that involved changing a filter, changing the bandwidths, or changing center frequency, etc., you had to build another one. That took a few weeks at best. So speech research, in effect, was being hampered by the need to build the hardware to try it out. One of the thoughts that we had was if we could simulate the vocoder, instead of building it, it would speed the pace of research, because we thought, “Gee, programming things doesn’t take any time.” Nowadays people understand that that isn’t true.

Vocoder simulation and filters

Rader:

So, I set out to simulate a vocoder. The difficult part of simulating the vocoder was simulating these filters. Now, what did I have to work with? I knew that a filter was a linear system and it had an impulse response, so the thought was if I sampled the data and sampled the impulse response at a high enough rate, the convolution integral could be replaced by a convolution sum. The equation is $y_n = \sum_m h_m x_{n-m}$, where $h_n$ is the sampled impulse response. But it led to a lot of multiplications, because these impulse responses were, say, thirty or forty milliseconds long, and you were sampling at thousands of samples per second. So you had impulse responses of hundreds of samples, before they died out. They never really died out; they would exponentially decay. But they died out slowly, and so you had a lot of multiplications and additions for each point of your output, and there was a point of output for each input and for each filter. It was this huge process which even for today’s computers would be a lot of computation.
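To make the cost Rader describes concrete, here is a minimal Python sketch of the direct convolution sum. The function name, the 8,000 samples-per-second rate, and the 40 ms response length are illustrative assumptions, not figures from his TX-2 programs.

```python
def direct_convolution(x, h):
    """Direct convolution sum: y[n] = sum over m of h[m] * x[n - m].

    x is treated as zero before n = 0; one output is produced per input sample.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m in range(len(h)):
            if n - m >= 0:
                acc += h[m] * x[n - m]
        y.append(acc)
    return y


# Assumed numbers for illustration: a 40 ms impulse response at 8,000 samples/s.
sample_rate = 8000
taps = 40 * sample_rate // 1000             # 320 samples of impulse response
multiplies_per_second = taps * sample_rate  # 2,560,000 per filter, per second
```

Under these assumptions every filter in the bank needs a few million multiplications per second of speech, which is the burden the recursive approach in the next paragraph removes.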

I was thinking about this problem, and somewhere along the line the thought came to me that real filters, honest to goodness filters made of capacitors and inductors, didn’t have access to this long memory. Filters don't "know" their impulse responses. Every capacitor could store one state. Namely it’s the integral of the current. Every inductor could store one state, namely the rate of change of voltage across it. And so with n capacitors and inductors in a filter, there were only n numbers involved. N was about five or six instead of hundreds. So I set about modeling these derivatives and integrals by differences and sums. And every time I did this I kept getting equations of the form $y_n = \sum_m (a_m x_{n-m} + b_m y_{n-m})$, where there were short summations. Five or six terms total. Of course, this was going to be enough to simulate a filter more efficiently, but these coefficients were complicated functions of the sampling rate and the inductance values and the resistance values, and so on.
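A short sketch may help show how few numbers such a recursive equation needs. The example below is an assumed two-pole resonator, not one of Rader's vocoder filters: the sampling rate, the 240 Hz center frequency (the middle of the 180 to 300 hertz band mentioned earlier), and the pole radius are all hypothetical choices.

```python
import math


def recursive_filter(x, a, b):
    """Difference equation y[n] = sum_m a[m]*x[n-m] + sum_{m>=1} b[m]*y[n-m].

    b[0] is a placeholder and is never used, because the feedback sum
    starts at m = 1.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m in range(len(a)):
            if n - m >= 0:
                acc += a[m] * x[n - m]
        for m in range(1, len(b)):
            if n - m >= 0:
                acc += b[m] * y[n - m]
        y.append(acc)
    return y


# Hypothetical resonator parameters (assumptions for illustration only).
sample_rate = 8000.0
center_hz = 240.0
r = 0.97                                   # pole radius; closer to 1 rings longer
theta = 2.0 * math.pi * center_hz / sample_rate

a = [1.0]                                  # one feed-forward coefficient
b = [0.0, 2.0 * r * math.cos(theta), -r * r]  # two feedback coefficients

impulse = [1.0] + [0.0] * 399
response = recursive_filter(impulse, a, b)  # 400 samples from 3 coefficients
```

Three coefficients generate a slowly decaying, oscillating impulse response hundreds of samples long, so each output costs three multiplications instead of hundreds.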

The breakthrough came when I realized that there was a way to analyze the performance of a difference equation, which leads to a frequency domain interpretation with ratios of polynomials. Of course, the people designing the analog filters in the beginning started with specifications, and then people had solved the approximation function problem--that is "How do you get these specifications with a ratio of polynomials?" Then there was another step to making that ratio of polynomials realized through an RLC circuit. But here, in effect, I was saying, “I can go directly from approximating a behavior as a ratio of polynomials to realizing that as a difference equation.” Skip the analog filter design with R’s and L’s and C’s, and go directly to a difference equation. So, for the next few years I simulated vocoders of one sort after another.
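The "ratio of polynomials" follows in one step from the difference equation. The derivation below is a standard z-transform sketch; the summation limits $M$ and $N$ and the sampling interval $T$ are notational assumptions not given in the interview.

$$
Y(z) = \sum_{m=0}^{M} a_m z^{-m} X(z) + \sum_{m=1}^{N} b_m z^{-m} Y(z)
\quad\Longrightarrow\quad
H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{m=0}^{M} a_m z^{-m}}{1 - \sum_{m=1}^{N} b_m z^{-m}}
$$

Evaluating $H(z)$ at $z = e^{j\omega T}$ gives the frequency response, so a specification met by a ratio of polynomials can be read off directly as the coefficients $a_m$ and $b_m$ of a difference equation.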

Goldstein:

You mean the filters?

Rader:

Well, the filters, and the rest of the simulation was trivial.

Goldstein:

Right.

Rader:

And we married Ben’s pitch detector simulation with the vocoder simulation, and we were able really to do a lot of work on simulating speech compression systems. Ben and I realized that this was extremely powerful. This had implications well outside the field of speech.

Goldstein:

You mean the vocoder work you were doing in total, or your technique for simulating the filter?

Rader:

The simulation of analog filters and other analog operations on sampled-data representations of real signals had huge implications in other fields. In other words, if you could do everything you wanted to do with any signal anywhere on a computer instead of doing it in hardware, then that had important implications. Not just for simulation, because we understood that you could actually realize systems by simulating them on sampled data, and then recovering continuous data from the sampled data, even if you never needed to do that.

Computer simulation of filters; the fast Fourier transform

Goldstein:

So you mean it had great potential not just for design, which is what you were working on, but for actual operating equipment.

Rader:

Yes. For example, a computer memory could replace a tape recorder. Computer simulations of filters could replace real filters and therefore do what the real filters were doing in a system. So, we had this vision that this was going to be the future. Of course the problem was that for digital filters to replace analog filters, they had to operate in real time. If you were getting 8,000 samples per second in, you had to be able to produce 8,000 samples per second out, or they’d pile up and overflow any memory. So, we understood that our vision of the future depended in part on circuit engineers designing faster and faster digital circuits and digital computers. Or even specialized digital circuits that could do these multiplications, additions and delays, and so on.

That was about the state of it in 1965. We knew about digital filters and we knew how to simulate them, and we were starting to build digital hardware that would do some of these operations as an adjunct to computers. At least they would speed up simulations by making special hardware that you could attach to the computer that would do particular things faster than the computer could.

Then this other breakthrough came about. This was the fast Fourier transform. Now, you’ll probably have access to Don Johnson’s paper on the history of the fast Fourier transform, going back to Gauss. But I got Cooley and Tukey's paper in 1965 as a copy of a pre-print or something like that. Tom Stockham, who was teaching at MIT, showed it to me. I had taken a course from him as a special student. Lincoln Laboratory let me take one MIT course a term, ostensibly paying for it, although it really paid itself for it since the lab is a part of MIT. So I had taken a course on linear systems and Fourier transforms and gotten to know Tom.

Now, this is a side issue, but Tom had come out and used TX-2 to simulate the acoustics of a room by measuring an impulse response of the room, then playing sampled sounds through this impulse response. Tom wanted to know what the room acoustics looked like as a frequency response, and I had written a program on TX-2 which I had the audacity to call “Lightning.” It was supposed to be a fast Fourier analysis program that did exactly the computation that a fast Fourier transform did, but it didn’t do it quickly. It did it as n squared. In fact, it did it by taking the correlation function, an n squared process, and then Fourier transforming the correlation function. The great thing about this program was that TX-2 could be configured so that its arithmetic unit could split into smaller arithmetic units.

Goldstein:

Quad parallel?

Rader:

Yes. I mean, you could configure the multiplication to be four smaller multiplications, and add the four smaller products. I had figured out how to do a correlation process and a Fourier transform, using shorter word lengths and more of them at once. That’s why it was “fast.” But it was only four times as fast as it would have been directly. Tom was using this to simulate room acoustics. He actually made a significant contribution to the Bose loudspeaker company, which was forming then, by trying to show how these Bose speakers would work in real rooms. But Tom had gotten to know me first as my teacher and then as the person who had written this lightning program for Fourier analysis. He showed up here one day with the preprint of the Cooley and Tukey paper on the fast Fourier transform, and we read it.

It took a little bit of understanding to try to figure out what was going on. The first thing that I tried to do was something that Tom had taught me about linear systems. We drew a Mason signal flow graph first for the individual operations involved, and then more and more of them. This butterfly structure emerged. The flow graph looked like a whole bunch of bow-tie shaped lines that fit together, and it basically showed a lot of the properties of the Fourier transform algorithm. The butterfly diagram was a huge breakthrough. It enabled us to understand fast Fourier transform algorithms. It enabled us to explain it to other people. And, I suppose, with a bit of a messianic streak, we did. We organized a talk in which we both explained what we understood about the algorithm.
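The butterfly Rader describes can be seen in a few lines of code. This is a minimal recursive radix-2 sketch for power-of-two lengths, written for illustration; it is not the Cooley-Tukey paper's formulation or the program Rader and Stockham ran, and the function names are assumptions.

```python
import cmath


def fft_radix2(x):
    """Recursive decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        # The butterfly: one product feeds two outputs, a sum and a difference.
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def naive_dft(x):
    """Direct n-squared DFT, used here only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

Each pair of lines inside the loop is one bow-tie in the flow graph; stacking log2(n) layers of n/2 butterflies is where the n log n operation count comes from.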

Goldstein:

Was there anything in the original paper that suggested the diagram?

Rader:

The original paper had summations over the bits, and it was hard for somebody with an electrical engineering background to understand. We played the role of translator into engineering terms. We organized a talk both here and at MIT, and a lot of people came to it. I guess you could say we probably introduced the algorithm to New England.

Tom did something else that was kind of significant. I talked earlier about these convolutions with long impulse responses? Well, Tom realized that there was an honest to goodness mathematical theorem which is that if you multiplied the Fourier transform of a signal by the Fourier transform of the impulse response, you got the Fourier transform of their convolution, and so if you did a Fourier transform, multiplied it by another Fourier transform, and then computed an inverse Fourier transform, you computed the convolution. It was another way to compute the convolution. But for very long sequences, this was much faster, because of the n log n behavior. It was much faster than doing it directly. So Tom invented this so-called fast convolution method. At the same time, another fellow by the name of Gordon Sande at Princeton also invented it independently. I don’t know which one did it first. I’m not sure it particularly matters.
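The three steps of fast convolution (transform, multiply, inverse transform) can be sketched as follows. This is an illustrative stdlib-only version with its own small power-of-two FFT; the zero-padding strategy and function names are assumptions, not Stockham's original program.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fast_convolution(x, h):
    """Linear convolution of two real sequences via the convolution theorem."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:          # pad so circular wrap-around cannot occur
        size *= 2
    spectrum_x = fft(list(x) + [0] * (size - len(x)))
    spectrum_h = fft(list(h) + [0] * (size - len(h)))
    product = [p * q for p, q in zip(spectrum_x, spectrum_h)]
    y = fft(product, inverse=True)
    # The inverse above is unnormalized, so divide by the transform size.
    return [v.real / size for v in y[:out_len]]
```

Padding both sequences to at least the full output length is what turns the FFT's inherently circular convolution into the ordinary linear one that a filter performs.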

Digital signal processing groups and publications; FFT convolutions

Rader:

We thought this was important enough to contact Cooley at IBM. And Jim sort of formed a little group of people interested in the algorithm, and several times a year we would meet at his facility at IBM, the Thomas Watson Research Labs. So Tom and I went down there several times and met and told them what we were doing. I never in my life met Gordon Sande, but I met a number of other people who were working on the algorithm. The big breakthrough was the n log n realization. But there were always little improvements being made in the algorithm. I made several of them, and Tom made a few. So, for a few years there was this ferment in the work on the fast Fourier transform (FFT), but I was almost uniquely positioned in that I was aware of both developments in the recursive digital filter field and those in the fast Fourier transform. Ben, who hadn’t worked on the FFT as much as on the recursive filtering and some other vocoder things, certainly was also aware. He was still my mentor.

We began to have the idea that we should publish. We wrote a paper called “Digital Filter Design Techniques in the Frequency Domain,” which was published in the IEEE Proceedings. We also organized, along with Al Oppenheim and Tom Stockham, a two-week summer course at MIT on this new field of digital signal processing. That was perhaps a turning point in my career, because we prepared extensive notes for the course. We had a thick stack of notes, and we realized this could be the basis of a book. So we got permission from the laboratory and we wrote the book. We thought it was just going to be a simple matter of transferring the notes. It actually took about two years to go from the notes to a book. The course was ‘67. The book came out in ‘69.

It was almost the first book to cover any of the material, with one very significant exception. There was a book by a group of people from Bell Labs called Systems Analysis by Digital Computer. The major authors were Jim Kaiser and Frank Kuo. But it had a chapter on digital filters that Jim Kaiser had authored. That chapter, in my opinion, deserves to be called the first book that introduced any of this material. There were earlier books on digital filters, but they concentrated on non-recursive filters and the convolution side. So, I guess I’d give Jim credit for writing the first thing in a book, and we produced the first book that covered what was then the whole field. Now, our book had two contributed chapters, one by Oppenheim and one by Stockham, to make eight chapters in all.

Goldstein:

When you were developing the book, can you tell me about the audience you had in mind? Were you thinking about what level? Was it theoretical or applications oriented?

Rader:

The first thing is to define what it was not. It was not intended to be a classroom textbook. It was intended to be a monograph. I’m not sure that that has any well-defined meaning. We intended a working engineer to be able to learn about digital signal processing from the book. By the way, I have always had this rule of thumb when publishing anything, and that is to assume that one's audience is people who knew what you knew when you started working in the field. Not what you knew when you started working on the topic, but when you started working in the field. It’s sort of a rule of thumb that has served me very well. So I wrote the book to be something that I could have understood when I started working on digital signal processing in ‘61, right out of school with a master's degree.

Ben and I shared the writing. We shared chapter one, which was kind of general introductory material, and a little bit about speech. I think he must have written chapter two, which was on the Z-transform, and he wrote the material on digital filtering. Chapter three was the one on design techniques and the approximation problem. I did the lion’s share of chapter three. Chapter four on quantization effects we shared. Chapter five, which we shared, was the simulation compiler. It had this block diagram and representation of computer programs. Chapter six was all mine. Seven was Tom Stockham’s chapter. Eight was Al Oppenheim’s chapter, and that’s the whole book. It turns out that there is some interesting material in chapter seven which I wrote and added to Tom’s chapter. One particular section, 7A, is the one that I wrote, and it turned out to be significant later on. This chapter was about his high speed convolution correlation method.

After Tom had written the chapter, and before the book was actually set, I did some work that was completely off the wall. It happened that I was trying to do something involving making random noise on a computer. Random noise is ever-present in nature, but computers don’t have it. So the algorithms for generating pseudo-random noise tend to involve doing a complicated computation, taking the result of that as a sample, and using it as the starting point for the next computation. The way you analyze these things, and make sure they don’t get stuck producing a steady-state number, is with theorems from number theory.
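As a hedged illustration of why number theory enters, consider a linear congruential generator. This is one common scheme of the kind described, chosen here as an assumption; the interview does not say which generator Rader was studying, and the tiny modulus is for demonstration only.

```python
def lcg_sequence(seed, multiplier, increment, modulus, count):
    """Each sample is computed from the previous one and becomes the next seed."""
    state = seed
    samples = []
    for _ in range(count):
        state = (multiplier * state + increment) % modulus
        samples.append(state)
    return samples


def cycle_length(seed, multiplier, increment, modulus):
    """Length of the cycle the generator eventually settles into."""
    first_seen = {}
    state = seed
    step = 0
    while state not in first_seen:
        first_seen[state] = step
        state = (multiplier * state + increment) % modulus
        step += 1
    return step - first_seen[state]


# Parameters satisfying the Hull-Dobell conditions visit all 16 states.
good = cycle_length(seed=1, multiplier=5, increment=3, modulus=16)

# A poor choice collapses to a constant: 1, 6, 4, 8, 0, 0, 0, ...
bad = cycle_length(seed=1, multiplier=6, increment=0, modulus=16)
```

Whether the cycle is long or degenerate depends only on arithmetic relationships between the multiplier, increment, and modulus, which is the kind of question number-theoretic theorems answer.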

I went ahead and learned some number theory, and as I was doing that, I realized that some of the equations in number theory involving modular arithmetic looked similar to difference equations. So I began exploring these analogies, and I discovered some interesting things. One of the things I discovered was that I could take the expression for a discrete Fourier transform and by permuting the order in which certain computations were done, I could turn the computation into a convolution. Then I could use FFT techniques to do the convolution, but of a different length. It turned out this was easiest when the sequence lengths were prime, which is exactly the length that the FFT wouldn’t do anything for you. For FFTs you needed the length of the sequence to be composite, so I found a way to do the DFT of a prime number of points, using FFT. So that got into the high speed convolution chapter. It was mathematically "cute." It was also very interesting in later years, because it became the basis of the work that Shmuel Winograd did on the Winograd-Fourier transform algorithm.
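The permutation idea can be sketched in code. Below is a minimal version of the prime-length construction, assuming an odd prime length; the helper names are hypothetical, and the cyclic convolution is computed directly for brevity where, as Rader says, FFT techniques would be used in practice.

```python
import cmath


def primitive_root(p):
    """Smallest generator of the nonzero residues modulo an odd prime p."""
    for g in range(2, p):
        seen = set()
        value = 1
        for _ in range(p - 1):
            value = value * g % p
            seen.add(value)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def cyclic_convolution(a, b):
    """Direct cyclic convolution (stand-in for an FFT-based fast convolution)."""
    n = len(a)
    return [sum(a[q] * b[(p - q) % n] for q in range(n)) for p in range(n)]


def rader_dft(x):
    """DFT of a prime-length sequence, reordered into a length n-1 convolution."""
    n = len(x)
    g = primitive_root(n)
    g_inv = pow(g, n - 2, n)   # inverse of g modulo n, by Fermat's little theorem
    # Permute the inputs by powers of g and the kernel by powers of 1/g.
    a = [x[pow(g, q, n)] for q in range(n - 1)]
    b = [cmath.exp(-2j * cmath.pi * pow(g_inv, q, n) / n) for q in range(n - 1)]
    conv = cyclic_convolution(a, b)
    out = [0j] * n
    out[0] = sum(x)            # the zero-frequency term is just the sum
    for p in range(n - 1):
        out[pow(g_inv, p, n)] = x[0] + conv[p]
    return out


def naive_dft(x):
    """Direct n-squared DFT, used here only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

The reordering works because the nonzero indices modulo a prime form a cyclic group under multiplication, so the product of indices in the DFT exponent becomes a sum of exponents of g, which is exactly the index arithmetic of a convolution.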

Now, at the same time that I had found a way to express a Fourier transform as a convolution, a colleague of mine, by the name of Leo Bluestein, who was then at Sylvania, but who had actually shared an office with me a few years earlier when he was at Lincoln Lab, came around one day and said, “I’ve got this interesting result.” He said, “I can do a Fourier transform where the number of points is a perfect square, and I have this algorithm.” His algorithm was not very interesting in itself, but part of the way through explaining the algorithm, he had done some manipulations to change a Fourier transform into a convolution, in still another way. It had nothing to do with number theory. It had to do with multiplying the input waveform by what engineers would call a chirp waveform, a sinusoid whose frequency increases continuously as time progresses. If you then agreed to un-multiply the Fourier transform by a chirp afterwards, what you found in the middle was a convolution with a chirp. So, pre-multiplication by a chirp, and post-multiplication to undo it, was another way of converting a Fourier transform to a convolution.

I said to him, “Leo, we can use the FFT to do convolutions, so forget about your clever little algorithm for doing the middle part. Let’s use the FFT for that.” The advantage of that was that the chirp algorithm would work for any length sequence, and so you could do any length sequence using any length FFT. That was the so-called chirp z-transform. I showed the chirp z-transform algorithm to Larry Rabiner, who was and still is at Bell Labs, and he got excited about it and wrote a paper about it, which he co-authored with me in the Bell System Technical Journal. It’s kind of unusual to have a non-Bell author write in the BSTJ. That was in the same time frame, around ‘68 or ‘69, when the book came out.
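The chirp trick rests on the identity nk = (n² + k² − (k − n)²)/2. The sketch below applies it to a DFT of arbitrary length; the function names are assumptions, and the middle convolution is written as a direct sum for brevity, whereas the point of Rader's suggestion was to hand that step to an FFT-based fast convolution.

```python
import cmath


def chirp(n, size):
    """Chirp sample exp(-j*pi*n^2/size): phase grows quadratically with n."""
    return cmath.exp(-1j * cmath.pi * n * n / size)


def chirp_dft(x):
    """DFT of any length via pre-multiply, convolve with a chirp, post-multiply."""
    size = len(x)
    # Step 1: pre-multiply the input by a chirp.
    a = [x[n] * chirp(n, size) for n in range(size)]
    out = []
    for k in range(size):
        # Step 2: convolve with the conjugate chirp (direct sum here).
        acc = sum(a[n] * chirp(k - n, size).conjugate() for n in range(size))
        # Step 3: post-multiply the result by a chirp.
        out.append(chirp(k, size) * acc)
    return out


def naive_dft(x):
    """Direct n-squared DFT, used here only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

Because the convolution kernel depends only on k − n, nothing in the construction requires the length to be prime, composite, or a power of two, which is why any convenient FFT size can serve for the middle step.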

IEEE Signal Processing Society

Rader:

Now, I may be a little bit confused about the sequence here, but at the same time that we were getting together with Cooley on a semi-regular basis at IBM, there was another group in the IEEE that later became the Signal Processing Society. It was the audio and electroacoustics group. They had a standards committee, which in turn had a concepts subcommittee. The subcommittee had organized a kind of traveling road show in which they would try to explain to people how you should go about measuring noise spectra correctly, using rigorous mathematics and algorithms. Basically it was the material in the book by Blackman and Tukey. In doing this, they became aware of the fast Fourier transform, and they changed their mission and became a working group on digital signal processing. Eventually they changed their name to the Signal Processing Subcommittee. But this committee was the nucleus of what eventually became the Signal Processing Society.

Goldstein:

Okay, could you elaborate on that for just a second. When you say they became aware of the FFT, they changed the focus of their presentations?

Rader:

No, it’s not that they changed the focus of the presentations; they changed the focus of the group. They realized that there was some technical "meat" in signal processing, and that there ought to be an IEEE group that would be the home to this new set of ideas. A number of people set about making the audio and electroacoustics society a home for this new set of technologies. The one whose name comes to mind as a very important contributor is Bill Lang. That is not to downplay the importance of some other people, like Howard Helms or Dave Bergland. When I was asked to join this group, we wrote a paper called “What is the Fast Fourier Transform?” The list of authors of that paper came from this so-called concept subcommittee, which eventually became the digital signal processing committee.

So, it has its origin in this subcommittee, which was the only thing going on in this society. Electroacoustics was pretty moribund. So, that’s where the action was, and in effect, that’s the direction the society took. Over the years it changed its name from the Professional Group on Audio and Electroacoustics to the Acoustics, Speech, and Signal Processing Society. I actually took the petition around to get that name change. And then to the Signal Processing Society.

Goldstein:

I’ve been curious about this, and maybe you can tell me. From the point of view of signal processing, that transition makes perfect sense, but I wonder if there were any entrenched interests in the group at that time?

Rader:

Sure there were. And we were respectful of it. We didn’t try to force out the old groups. We tried to add the new groups.

Goldstein:

So was it a diplomatic process?

Rader:

It wasn’t particularly difficult. It was win-win. Their issues of the Transactions came out more often, because there were more papers being published, and more readership enabled the society to grow and still provide a home for them. The only objection to changing the name from audio and electroacoustics to acoustics, speech and signal processing came from libraries. But the society was perfectly happy to do that. There were some other IEEE societies who worried that we were cutting in on their scope.

Goldstein:

Are there any examples? Can you remember?

Rader:

I think the Circuits and Systems Society might have been concerned. Or Information Theory. I’m not sure. But societies infringe on one another’s scope all the time.

Goldstein:

Well, were the technical boundaries of the societies formally defined, or ratified by the IEEE Technical Activities Board?

Rader:

Yes, but these boundaries were always defined before it was clear how they should be defined. Fields would grow without regard to how people thought they should grow. So it was not a big deal.

Goldstein:

Okay.

Arden House workshops; ICASSP conferences

Rader:

What was a big deal, however, was being able to have conferences on these special areas, and let me say a few words about that. It was Bill Lang’s idea to have a signal processing workshop. That workshop was held at Arden House, which is a facility of Columbia University. I guess IBM had used it at some point, and Bill knew about it. By force of personality, he went ahead and reserved the facility. It had the capacity for 105 or 110 people to come and actually sleep over on the top of a mountain and be sort of isolated for a few days and do nothing but talk to one another. So, he proposed to our little concept subcommittee that we have a workshop which resulted in the Arden House workshop on the fast Fourier transform. It took the best part of a year to organize it and invite people to it, and we had it and we thought it was great. Two years later, we had another one on digital signal processing and I believe there was a third one a few years after that.

These workshops were the way that people in signal processing had their conferences. They were kind of small in the sense that there were only about 100 or 110 people. But that’s all that the field needed at the time. Somewhere along the line the president, Reginald Kaenel, decided that we should become a society. I think that was nothing but syntax, a name change. But one of the things you had to do in order to become a society was have an annual conference. So Reg proposed that we have an annual conference, and that’s how ICASSP was started. It was his idea. Reg was an interesting fellow. As far as I can see, he neither knew nor cared about signal processing. I’m not sure why he was even part of the group, other than wanting to have power or something. And he subsequently fell away and stopped participating. But he was the spark to turn us into a society and get the ICASSP conferences going. The first one was the one in Philadelphia. The "spark plug" was an older speech researcher by the name of Charlie Teacher who was keeping up with the new technology in simulation. Then every year the previous conference was the standard to beat. The next one was done in Hartford. The key person was Harvey Silverman, who is a professor at Brown now but who had been at IBM at the time. And they just kept growing and getting bigger, more diverse.

Goldstein:

Was there a natural comparison between the Arden House workshops and the ICASSP conferences?

Rader:

Well, the Arden House workshop involved like 100 people. And ICASSP’s first one would have involved 500 or 600 people. An Arden House workshop would have had perhaps sixty or seventy papers, everybody listening to every paper. The ICASSP ran for three days and had hundreds of papers. I can’t remember whether the first ICASSP had exhibits or not, but at some point they began adding exhibits. The conference, you know, just grew as interest in the field grew. It continues to get bigger and bigger each year and has become the major signal processing conference. But as the breadth of the field expanded, we had to also meet the need for workshops on specialized topics. So we continued to have Arden House workshops for a while, and then we stopped having them at Arden House, but we had specialized workshops on, say, multidimensional signal processing. There have been a whole bunch of them.

Defining the digital signal processing field and terminology

Goldstein:

I have a few questions. Listening to you now, you’re suggesting a laissez-faire attitude about the growth of the field. You said they’ll grow naturally, and one makes mistakes when you have to define it in the beginning. I wonder whether in the early days there was an effort to define the field. Do you know whether that took place?

Rader:

This is not going to answer your question, but it’s something you reminded me of. One of the things that I took initiative to do as part of this concept group, or signal processing committee, was to try to generate a uniform notation so that people could understand one another’s papers.

Goldstein:

I saw that paper on the terminology.

Rader:

That was my initiative. I suppose it probably was not successful in the long run, but it was successful in the short run. For a number of years I was the technical editor of the Transactions. People submitted papers, the production editor would send the papers to me, and I would review them for technical content or get other people to review them for technical content. I did my best work in the first few years. In those days I could review any paper. If it was a paper on digital signal processing, I could review it. Sometimes when you’d send the paper out for review and the reviewer didn’t get it back for months, I would review it myself. Plus, I knew everybody who was working in the field, and what they were working on, so I knew who was the appropriate reviewer for a given paper. That all changed. The beginning of the change of that was when another technical development came along, which was the Burg algorithm.

Burg algorithm

Rader:

If you had a wave form and you gave it to me and said, “What is its spectrum?” to me, that meant compute its Fourier transform. There’s another kind of problem with similar terminology, and that is to say, “Here’s a wave form from a class of similar wave forms that all have the same underlying statistical spectrum. But the particular one you’ve got is only one instance of it, and the instances differ slightly from one another. What is the underlying spectrum of this process?” That question comes up when you want to measure noise spectra. For example, if we’re sitting in this office, and here’s this thing buzzing away, and you wanted to know what its spectrum was. Well, from one second to the next, the Fourier transform would differ, but on average, over time, there would be a consistent spectrum. To estimate that is a different process than simply computing the Fourier transform of one wave form. It's called power spectral estimation.
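
Editorial illustration (not part of the interview): the distinction drawn above can be sketched numerically. The code below is a minimal sketch that assumes white Gaussian noise as the process and uses plain segment averaging (often called Bartlett's method) as the estimator. The segment length, the seed, and the function names are all illustrative choices, not anything Rader describes.

```python
import cmath
import math
import random


def dft_power(x):
    """Periodogram of one waveform: squared DFT magnitude divided by length."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2 / n
        for k in range(n)
    ]


def averaged_periodogram(x, seg_len):
    """Average the periodograms of non-overlapping segments of x."""
    segments = [x[i:i + seg_len] for i in range(0, len(x) - seg_len + 1, seg_len)]
    acc = [0.0] * seg_len
    for seg in segments:
        for k, p in enumerate(dft_power(seg)):
            acc[k] += p / len(segments)
    return acc


def relative_spread(p):
    """Standard deviation across frequency bins divided by the mean."""
    mean = sum(p) / len(p)
    var = sum((v - mean) ** 2 for v in p) / len(p)
    return math.sqrt(var) / mean


rng = random.Random(0)                                  # fixed seed, arbitrary
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]      # true spectrum is flat

single = dft_power(noise[:32])              # Fourier transform of one instance
averaged = averaged_periodogram(noise, 32)  # estimate of the underlying spectrum

print("spread, single instance:", relative_spread(single))
print("spread, averaged       :", relative_spread(averaged))
```

The single periodogram fluctuates strongly from bin to bin even though the underlying spectrum is flat, while the averaged estimate is much smoother. That variation is the statistical issue the passage is pointing at.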

You could use a Fourier transform, a fast Fourier transform, DFT, or you could do power spectral estimation. But it’s not the essence of the process. The essence is the statistics of how much it varies from one second to the next, and how should you process the wave form to keep that variation from showing up and appearing to be the signal. John Burg had done a Ph.D. thesis on a new way to approach that problem. It was extremely influential, and frankly I neither understood it nor even knew it existed for many years. The way I understand it today is in the sense that if you had a completely random, unstructured noise, and you played it through a linear system to produce a filtered, structured noise, you could ask yourself, “What filter would be most likely to have produced this noise?” Then that filter is your spectrum.

It's a very powerful concept, because the mathematics of answering the question, “What filter would most likely produce the noise?” first of all led to some very pretty algorithms, like the Levinson recursion, and second of all, led to some interesting filter structures, ways of connecting delays, and adds and multiplies together that I hadn’t known of any interest in investigating until it turned out that they were intimately connected with these algorithms. So around 1974 or 1975, people working on this field (the Burg algorithm, linear prediction, lattice filters, Levinson recursion) began publishing more and more in the society.
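
Editorial illustration (not part of the interview): a minimal sketch of the Levinson recursion mentioned above, in its common Levinson-Durbin form. It assumes the autocorrelation values are already known and solves for the prediction-error filter. This is the autocorrelation method, not Burg's own procedure, which estimates the reflection coefficients directly from the data. The function name and the sign convention (A(z) = 1 + a1 z^-1 + ...) are assumptions made for the example.

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations for a linear predictor.

    r     : autocorrelation values r[0] .. r[order]
    order : predictor order
    Returns (a, err, refl): a[0] == 1 and a[1:] are the prediction-error
    filter coefficients, err is the final prediction-error power, and refl
    holds the reflection coefficients used by lattice filter structures.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for m in range(1, order + 1):
        # Correlation between the current prediction error and the new lag.
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err
        refl.append(k_m)
        # Order update: combine the old coefficients with their reversal.
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err, refl


# Autocorrelation of the process x[n] = 0.5 * x[n-1] + w[n], unit-variance w.
r = [4.0 / 3.0, 2.0 / 3.0, 1.0 / 3.0]
a, err, refl = levinson_durbin(r, 2)
print(a, err, refl)
```

For this first-order example the recursion recovers the filter that shaped the noise: the first coefficient comes out as -0.5 and the second-order term vanishes. The reflection coefficients are the quantities that appear in the lattice filter structures named in the passage.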

Number theory transformations; modular arithmetic

Rader:

At that point, I could no longer claim to know the whole field. I mean, I know that stuff now, but I didn’t then. Some other things that happened as the field grew. I did something interesting with number theory, though I’m not sure if it ever turned out to be important or not. The idea that I had was if you had a computation in ordinary arithmetic, like computing a filter response, or something like that, and if you kind of mimicked that computation in a completely different arithmetic system, or modular arithmetic, then theorems which had analogies in the modular arithmetic could be used to mimic algorithms in the modular arithmetic. You could, in effect, map your data from integers to integers in modular arithmetic. Then in this other arithmetic domain, where arithmetic is different, you could do an entire computation, and then map the answers back. As long as you had some way of convincing yourself that they survived the mapping process, you could do your computation in the modular arithmetic area.

There were some interesting properties of the modular arithmetic area. Namely, there was the analogy of a fast Fourier transform that didn’t involve any multiplications but only bit shifts. I called these ideas number theoretic transforms. I was invited to give a talk at Arden House on my number theoretic transform idea. They asked me how long it would take, and I said I thought it would take fifty minutes, and even though I ended up talking for two hours and fifty minutes, nobody left. It was really interesting stuff. It was the kind of stuff that professors and graduate students loved.

Unfortunately, the problem was these number theoretic ideas tended to fit in these very narrow niches. The sequence length had to be exactly right. The word lengths you used had to be exactly right. It wasn’t as if you could say, “Well, I want to do a fifty point transform.” It had to be sixty-four. It couldn’t be sixty-three, it couldn’t be sixty-five, it couldn’t be forty-nine; it could only be sixty-four. That was the seed that created another corner of the field, which is mostly the domain of graduate students and professors.
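
Editorial illustration (not part of the interview): a small sketch of the idea of mapping a convolution into modular arithmetic and back. It assumes the modulus 257 = 2^8 + 1 (a Fermat number) and a transform length of 16, because 2 has multiplicative order 16 modulo 257, so every "twiddle factor" is a power of two and can be applied with a bit shift. The transform is written in direct O(N^2) form for readability rather than as a fast algorithm, and all names are illustrative.

```python
M = 257     # Fermat number 2**8 + 1 (also prime)
N = 16      # transform length: 2**16 == 1 (mod 257), and no smaller power is
ROOT = 2    # the root of unity is 2, so its powers are shifts


def shift_mul(value, shift):
    """Multiply by 2**shift modulo M using a shift, not a general multiply."""
    return (value << shift) % M


def fnt(x, inverse=False):
    """Length-16 number theoretic transform modulo 257 (direct form)."""
    out = []
    for k in range(N):
        acc = 0
        for n in range(N):
            e = (n * k) % N
            if inverse:
                e = (N - e) % N          # 2**(-e) equals 2**(N - e) modulo M
            acc = (acc + shift_mul(x[n], e)) % M
        out.append(acc)
    if inverse:
        n_inv = pow(N, M - 2, M)         # inverse of N modulo the prime M
        out = [(v * n_inv) % M for v in out]
    return out


def cyclic_convolve(a, b):
    """Cyclic convolution done entirely in arithmetic modulo 257."""
    fa, fb = fnt(a), fnt(b)
    return fnt([(p * q) % M for p, q in zip(fa, fb)], inverse=True)


a = [1, 2, 3, 4] + [0] * 12
b = [5, 6, 7] + [0] * 13
print(cyclic_convolve(a, b))
```

The answers survive the mapping here only because every true output value is smaller than 257. Longer sequences or larger samples would need a different modulus and root, which is the narrow-niche constraint described above.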

Goldstein:

Is it still known by that, number theory transformations?

Rader:

I suppose so, yes. Some folks call them Rader transforms, but not too many. The Fermat number transform and the Mersenne number transform were two that I invented that had particularly neat properties. But that stuff, along with the Winograd Fourier transform and some ways of permuting data so that you turned the transforms into convolutions, and vice versa, became a kind of corner of the field. Jim McClellan and I wrote a book about that, which is this Prentice Hall book that has the green spine.

Multidimensional signal processing; Butler matrix

Rader:

Another thing that happened around the same time (I mean, all these things kind of happened one on top of another) was that people became more and more interested in multidimensional signal processing. Processing pictures, for example. And multidimensional signal processing began to become more and more important. Eventually we created a separate committee to do multidimensional signal processing, along with one to do speech signal processing. Speech is more than signal processing. It's algorithms of trying to understand, and synthesize, and quality judgment, and so on. But a lot of signal processing ideas, including some of these linear prediction spectral estimation ideas, were originated in the speech area. Two other areas that generated a lot of useful results were radar and seismic processing. So, the society really helped bring these people together, exchanging concepts with one another.

Here’s an interesting story. I don’t know if it’s an answer to your question. Do you remember that I told you that Tom Stockham and I had organized this talk about the fast Fourier transform?

Goldstein:

Yes.

Rader:

We gave the talk and a large number of people from Lincoln attended. One of the attendees looked at this flow diagram, and he said, “That looks like a Butler matrix.” I had never heard of a Butler matrix. What was a Butler matrix? A Butler matrix was a way that you could take received signals from an array of antennas and combine them in pairs, each pair taking two inputs and giving two outputs that were, in effect, the sum and difference with phase shifts. When you combine them in this way, and combine 2^n antennae in stages, you have basically formed 2^n synthetic antennae, you know, each set of antennae taken to one output, a synthetic antenna, and it was as if that synthetic antenna had a beam pattern that was a combination of the beam patterns of the individual ones. It turns out the relationship between the beam patterns of the individual ones and the resulting ones is a Fourier transform.

The antenna guys had known for a long time that a phase progression becomes a spatial angle. So they had known, or some of them had known, about these ways of combining their antennae signals by adding and subtracting with phase shifts, which would produce a set of these spaced beams. They knew that this was a Fourier transform and they had, in fact, used computers and Fourier transform programs to analyze the performance of this matrix. It never occurred to them to use the theory behind the Butler matrix to devise a fast algorithm for computing the Fourier transforms. I guess if you’re trained to think in terms of analog components, thinking in terms of computational algorithms isn’t as natural as it seems after you’ve come to be used to it.
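
Editorial illustration (not part of the interview): a minimal sketch of the statement that a phase progression across an array becomes a spatial angle, and that forming the full set of beams amounts to a Fourier transform across the antenna elements. It assumes an idealized eight-element array and a single noiseless plane wave. The element count, the beam index, and the function name are illustrative.

```python
import cmath
import math


def form_beams(element_signals):
    """Combine the element signals into one output per beam (a DFT across elements)."""
    n = len(element_signals)
    return [
        sum(element_signals[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]


N = 8                # number of antenna elements (a power of two, as in a Butler matrix)
arriving_beam = 3    # the wave's direction, expressed as a phase step of 2*pi*3/8 per element

# A plane wave shows up as a steady phase progression from element to element.
element_signals = [cmath.exp(2j * math.pi * arriving_beam * i / N) for i in range(N)]

beam_outputs = form_beams(element_signals)
beam_powers = [abs(v) ** 2 for v in beam_outputs]
strongest = max(range(N), key=lambda k: beam_powers[k])
print("strongest beam:", strongest)
```

All of the received energy lands in the one output whose phase progression matches the arriving wave. A Butler matrix computes these same sums in log2(N) stages of pairwise combiners, which is the structure the attendee recognized in the fast Fourier transform flow diagram.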

There was another fellow in the audience, by the way, who was a programmer for a radar group who had actually seen Cooley and Tukey’s pre-print. He had needed a Fourier transform, so he programmed it. Even before I did. But he hadn’t thought that it was important enough to make any fuss about. It was just, you know, he did it for his application, and that was that.

Goldstein:

How did you learn about it?

Rader:

Yes. He said, “I had written this program. Used it for this application.” His name was Ernst Gehrels. I don’t think he even saved the code. There was not a uniform appreciation that this was of tremendous, earthshaking importance. You know, some people just looked at it as, “You know, that’s what I did today, and tomorrow I’ll do something else.”

Bill Lang

Goldstein:

All right. Can I backtrack and pick up with a few questions I had as you were talking?

Rader:

Of course.

Goldstein:

Where was Bill Lang coming from? Where did he work, and what were his interests?

Rader:

He worked at IBM, the same laboratory as Cooley, but a different part of it. I guess his part of it must have been concerned, I’m guessing now, must have been concerned with making IBM’s equipment, you know, office machines, a little more pleasant to live with. They didn’t “clack-clack” quite so loud. Anyway, it was a noise analysis group. I don’t know why he had the interest in pushing these technologies and making them more accessible. I have to assume it was the same as mine. This is what a professional does. But he certainly had the vision to create the society that became the home for this technology.

Goldstein:

Okay. I wonder if we can read anything into his motives, and how he responded to certain dilemmas. You know, if there were any issues?

Rader:

What sorts of dilemmas?

Goldstein:

Well, I’m not sure. I don’t want to use the word “controversy” because I don’t necessarily mean anything that was acrimonious. But, you know, if there was a decision that had to be made, or some issue that needed to be settled, perhaps his motivation is latent in the positions that he took on things.

Rader:

Yes. I cannot recall anything acrimonious or controversial, or anything like that.

Goldstein:

I didn’t mean to suggest that there was that kind of thing. But even civil, you know, totally amiable issues?

Rader:

I think all of us in this group may have had different degrees of awareness of the long term implications of things. We all thought it was important to be able to publish and see other people’s publications, talk to one another and compare notes. In the short term there just wasn’t any question that these were good things to do. I don’t know how many people realized long term that this was going to become a technology that would be one of the most important parts of electrical engineering. I did. But, you know, I was in my early twenties and maybe had no authority to make prognostications.

Group meetings with Jim Cooley at IBM

Goldstein:

Another question I had was about this group that got together with Jim Cooley at IBM. Do you remember who was in it? Can you pick out some individuals?

Rader:

Well, I mentioned myself and Tom Stockham and Jim Cooley. I think Howard Helms was part of it. I know that Cooley was in touch with Tukey and Gordon Sande, but I never met Sande, ever. I only met Tukey years later. Who else can I remember? It’s really embarrassing to say that, you know, some of these people I worked with then I can’t remember.

Goldstein:

I heard you mention that you would drive down there, and meet in a room, and you’d all get together and just sort of chat.

Rader:

I remember once Tom Stockham rented an airplane and flew down. He had learned to fly when he lived in the West. But no, I would drive down even though it's a four and one-half hour drive.

Goldstein:

Once a year? Or more often than that?

Rader:

More like three times a year. It didn’t last for more than a few years either, because these other groups displaced the need for it around 1966 and 1967.

Developing standard DSP terminology

Goldstein:

You worked on that paper, you said you wanted to try to standardize the phraseology and terminology. Can you remember what some of the problems were, from there not being a standard terminology?

Rader:

I think so. I mentioned earlier that there are two different meanings of spectrum analysis, whether it was taking the Fourier transform or estimating a process. There were a number of different names being used for the different kinds of filter structures. You know, what some people would call a coupled form other people would call a recursive form. FIR, finite impulse response and "non-recursive" are the same thing, but people used different terminology and didn’t know what each other's terms meant.

Goldstein:

Would some of the usages correlate to different application areas? Did you find the seismic people using these terms?

Rader:

That might have been the case, but I’m not really sure. In the fast Fourier transform, as people began investigating variants of the algorithm, somebody would call a particular algorithm the "Sande form" of the algorithm. Other people talked about decimation in time and decimation in frequency, and how were these related to one another. You could read somebody’s paper and not realize that he had done something that somebody else had done, just because the terms were different. So we did a lot of that. We had some recommended terminology, but maybe the most important thing was showing what some of the differences were and how they entered into the literature.
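
Editorial illustration (not part of the interview): a minimal sketch of how decimation in time and decimation in frequency relate. Both are radix-2 factorizations of the same transform. One splits the input into even and odd samples before transforming, the other forms sums and differences first and produces even and odd outputs. The recursive style and the function names are choices made for clarity here, not how the early programs were written.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) transform, used only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_dit(x):
    """Decimation in time: split the INPUT into even and odd samples."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft_dit(x[0::2]), fft_dit(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]   # twiddle applied after
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_dif(x):
    """Decimation in frequency: split the OUTPUT into even and odd bins."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [(x[i] - x[i + half]) * cmath.exp(-2j * math.pi * i / n)  # twiddle applied before
             for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(sums)     # even-numbered frequency bins
    out[1::2] = fft_dif(diffs)    # odd-numbered frequency bins
    return out


x = [1, 2, 3, 4, 0, -1, -2, 5]
reference = dft(x)
print(max(abs(p - q) for p, q in zip(fft_dit(x), reference)))
print(max(abs(p - q) for p, q in zip(fft_dif(x), reference)))
```

Both routines agree with the direct transform to rounding error. The decimation-in-frequency arrangement is the one usually associated in the literature with Sande's name, which is one example of two labels for closely related flow graphs.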

Goldstein:

This picture is forming in my mind of different groups sort of doing parallel work, and only when that work becomes well established do the workers start communicating with each other.

Rader:

That’s always the case. People far apart don’t do work together. They may share an idea or two, but basically they do work alone. When they tell each other about it, it’s of value to sort of recognize what’s the same and what’s not. As the editor, for example, I used to get design techniques papers on a certain mapping technique, where you would take an analog filter design and map it to a digital filter design. There was one technique, the bilinear transform technique, that I would get two or three papers an issue on the same technique, none of them aware that it had already been published in the same journal a year before. They weren’t aware that they had discovered something that somebody else had discovered. It was just notational differences.
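
Editorial illustration (not part of the interview): a minimal sketch of the bilinear transform mapping mentioned above, applied to the simplest case. It assumes a one-pole analog low-pass filter H(s) = wc / (s + wc) and substitutes s = (2/T)(1 - z^-1)/(1 + z^-1). The cutoff, the sampling interval, and the function names are illustrative.

```python
import cmath
import math


def bilinear_one_pole(wc, T):
    """Map H(s) = wc / (s + wc) to H(z) = (b0 + b1 z^-1) / (1 + a1 z^-1)."""
    k = wc * T / 2.0
    b0 = b1 = k / (1.0 + k)
    a1 = (k - 1.0) / (k + 1.0)
    return (b0, b1), a1


def digital_response(b, a1, omega, T):
    """Frequency response of the digital filter at radian frequency omega."""
    z_inv = cmath.exp(-1j * omega * T)
    return (b[0] + b[1] * z_inv) / (1.0 + a1 * z_inv)


def analog_response(wc, omega):
    """Frequency response of the analog prototype at radian frequency omega."""
    return wc / (1j * omega + wc)


wc = 2.0 * math.pi * 100.0     # 100 Hz analog cutoff (arbitrary)
T = 1.0 / 1000.0               # 1 kHz sampling rate (arbitrary)
b, a1 = bilinear_one_pole(wc, T)

omega = 2.0 * math.pi * 150.0
warped = (2.0 / T) * math.tan(omega * T / 2.0)   # analog frequency that lands on omega
print(abs(digital_response(b, a1, omega, T)), abs(analog_response(wc, warped)))
```

The digital filter reproduces the analog response exactly, but at a warped frequency: the digital response at omega equals the analog response at (2/T) tan(omega T / 2). The whole analog frequency axis is compressed into the range below half the sampling rate.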

Goldstein:

Did you notice more standardization after you did your paper on terminology?

Rader:

For a while. But then the field changed and outgrew it. In some cases when we made recommended choices, we made the wrong choice, and that got recognized eventually.

Goldstein:

It would be interesting to think of it in these terms, that different isolated communities are almost speaking in a different language and can’t communicate, and only once a Rosetta stone is developed can they communicate with each other and start seeing more cooperation and synergy.

Rader:

The synergy was going to be there anyway. I don’t know that the paper had that big an influence. The annual ICASSP conference certainly had a huge influence, because it always had a substantial non-U.S. participation, and many, many excellent papers from the beginning came from researchers from Germany, France, Europe, and other parts of the world.

Editing Transactions

Goldstein:

One other thing you said is one of the other things I wanted to talk about. When you were editor of the Transactions, was there a surplus of papers? Did you have the freedom to provide shape to the literature that was circulated?

Rader:

I accepted some papers and rejected others. The ones that were rejected, a lot of them were revised and resubmitted and ultimately published. But, in that sense, yes. The basic reasons for rejecting a paper could fall into several categories. One is that they could be nonsense. Just bad work. People publishing things that were wrong. Another was that they could be trivial. Another was that they could be work that had already been published, that the author wasn’t aware of. I guess that the main thing that an editor can do is reject things for those reasons. Rejecting something for not being easy to understand is rarer. It’s difficult to write well. Technical editors are usually not willing to rewrite a paper, so, when you say “shape the literature,” do you mean just to keep things out?

Goldstein:

Well, if there was an overwhelming surplus of papers, then the editor would have to make decisions about what areas the audience is interested in. You know, which papers to defer on, or maybe reject outright, simply because there aren’t enough people interested in that subject. And publish the other ones. Would that situation occur?

Rader:

Every once in a while a paper would come in that I thought would be of greater interest to readers of another journal, and I would suggest that it be submitted to that other journal. Let me think. Sometimes people would submit something to another journal, get rejected, and then they’d submit it to us. I doubt that I fell for that too many times, but I’m sure it was tried many times. I did another thing for the Signal Processing Society that I’m kind of proud of, and that is I instituted a procedure where you could submit a paper and have the abstract of the paper published before the paper was accepted or rejected. That way you never had the problem as an editor of feeling either that you were keeping good work out of the public domain for too long, or that work that might have been important to somebody would be lost. You know, you could just look through these abstracts and say, “That sounds interesting,” and contact the guy. I thought it was a good innovation. They’ve been doing it for years, and I think it’s been very successful. I don’t know whether other societies do it or not. But they ought to, I think.

Goldstein:

I guess what I’m getting at is that one could imagine that the editor of a journal has some power to provide direction for the field. I’m wondering if you felt like you had that power.

Rader:

Oh, I’m sure I had the power, but I didn’t think it was my job. I did in one case. There was a development called the Hadamard transform, also called Walsh functions or Rademacher-Walsh functions. Papers on this began coming in at a great rate. Some of them were published, but after a while it was just mushrooming. I judged that there was not really important technical content being added by publishing yet another one, you know, generalizing the generalization of a previous generalization of these things. I basically just decided I won’t publish them anymore.

Goldstein:

Did you have any invited papers, or special issues with themes?

Rader:

We sometimes had invited papers, and we sometimes had special issues. Special issues grew out of an Arden House conference, for example. We had the conference, and we had a special issue that included the most important papers from the conference. But, you know, we could handle the growth in the Transactions. As the number of papers was growing, the membership was growing, so we could afford to publish more. There wasn’t really any need to keep things out of the Transactions, that I knew of, that I could detect.

Goldstein:

So it sounds like you’re saying that the decisions about what not to publish were based on the narrative of the paper, not the interest of the work.

Rader:

Right, for the most part. There were exceptions. If something was a computer design paper, I’d send it to the Computer Society, or if something was a network synthesis paper, I would send it to Circuits and Systems, or Information Theory.

Goldstein:

You see, I think some of those cases might be interesting, because perhaps they show the areas where people believed this newly configured society might be headed.

Rader:

I’m going to guess that most of the papers that I directed elsewhere were papers that had been rejected from that elsewhere.

Arden House, ICASSP, and growth of digital signal processing

Goldstein:

You mentioned Arden House, and that reminds me still of another question I had. Earlier you said that at the Arden House you had an attendance of 100 or so.

Rader:

There were beds for 104 people. So counting a few people who were willing to commute from someplace else, you had to keep it to about 100 or 110 people, so it was by invitation.

Goldstein:

I thought I remembered you saying earlier that the people who attended were most of the people who should be there. The difference between the Arden House workshops and the ICASSP was that ICASSP was a lot bigger, but you said that the field had grown at that point, and there were more people involved.

Rader:

ICASSP kept getting bigger too. The working groups that I’m talking about, the groups like this concept subcommittee would have had ten people. Arden House would have been 100 people, and ICASSP would have been 1,000 people. A working group that gets together and decides to do something is ten people. You can’t have 100 people in a working group. They also tended to be East Coast. I mean, the Washington to Boston area of the East Coast could meet in New York, and it was a day trip. When West Coast people began to work in this field, they weren’t going to come and spend a day, you know, discussing the business of the Acoustics, Speech, and Signal Processing Society, but they could come for a conference, for a three day conference, or something like that. So just because of geographic increase, Arden House could have 100 people belong there.

Goldstein:

Well, I guess here’s what I want to clarify. I thought I had heard you say that the 100 or so people who went to Arden House were the entirety of the people who would be interested in this sort of thing.

Rader:

At first, yes.

Goldstein:

Okay, so the contrast I want to make is that a fairly small number of people attended Arden House, and the presumably larger audience you were reaching out to with your book was different. What’s the difference in the audiences?

Rader:

Well, first of all, there’s always a difference in time. If something is going to grow exponentially, it starts out small. In the beginning it hadn’t grown very much. Another factor is there’s a difference between people who are actively researching in an area and people who want to know what those other researchers have done. That difference can be a large factor at the time.

Goldstein:

So Arden House was for the active researchers.

Rader:

For the most part. I think a lot of people who attend ICASSP now are there to learn what other people are doing, but are not themselves doing research. They’re, you know, trying to apply these ideas. So that’s one major difference between a conference and a workshop.

Digital filter design and performance, 1960s

Goldstein:

In the early ‘60s when you were first working on digital filter techniques what were the engineering trade-offs that were involved in designing digital filters compared to their analog counterparts?

Rader:

Well, at first, a digital filter was a computer and it filled an entire room, and an analog filter was the size of a small box.

Goldstein:

Okay, that’s one.

Rader:

Digital circuits got smaller. One could predict that they were going to get smaller, and faster, and lighter, but I don’t know whether I would have predicted that they’d get to individual chip size or below. I certainly thought that they’d ultimately become competitive with the analog filters, and would have several advantages. One was flexibility, because you could program a digital filter, or you could program in its coefficients. You could multiplex them, and there was no limit to the accuracy you could obtain, whereas maybe you could make resistors and capacitors to within a tenth of a percent of what you wanted them to be, but more than that was hard to do. There was no problem in getting any desired accuracy with digital filters. They were reproducible.

Other advantages were that you could, in effect, stop time. You could do things like play data through them backwards. You could do all kinds of interesting things in the digital world that you couldn’t do in the analog world, which just marches by like the river of time. So there were all these potentials, and I just knew that this was revolutionary. It was just a question of when. I must say, during the first few years the technology hadn’t caught up with the potential. I have to give credit to a gentleman named Irwin Lebow, my boss, for supporting us. You know, letting us work on this stuff the potential of which was some years off. I’m sure that he felt some pressures to get things out the door. He insulated us from it, and encouraged us, and I hope that history will show that his contribution made my contribution and Ben Gold's contribution possible. Irwin Lebow had written a book earlier about computer-- it was one of the earlier computers, before TX-2. But he must be retired by now.
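
Editorial illustration (not part of the interview): a minimal sketch of one thing that playing data through a filter backwards makes possible. Running a recursive filter forward, reversing the result, running it again, and reversing once more cancels the filter's delay, which a real-time analog filter cannot do. The one-pole smoother, its coefficient, and the function names are assumptions chosen for the example.

```python
def one_pole(x, a):
    """Recursive smoother y[n] = (1 - a) * x[n] + a * y[n - 1]."""
    y, prev = [], 0.0
    for v in x:
        prev = (1.0 - a) * v + a * prev
        y.append(prev)
    return y


def forward_backward(x, a):
    """Filter forward in time, then filter the stored result backward in time."""
    forward = one_pole(x, a)
    backward = one_pole(forward[::-1], a)
    return backward[::-1]


# A single pulse in the middle of a stored record.
center = 50
pulse = [0.0] * 101
pulse[center] = 1.0

one_way = one_pole(pulse, 0.5)            # response trails after the pulse only
two_way = forward_backward(pulse, 0.5)    # response is symmetric about the pulse

print(one_way[center - 1], one_way[center + 1])
print(two_way[center - 1], two_way[center + 1])
```

The one-way output is zero before the pulse and decays after it, as any causal filter's must. The forward-backward output is the same on both sides of the pulse, so the smoothing introduces no net time shift.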

Goldstein:

His book is Information Highways and Byways. It’s a good book. This is the first I’ve heard of the work in his career. All the advantages you list seem evident and significant in retrospect. I wonder if at the time there were any performance characteristics of the analog circuits that seemed difficult to emulate, or that seemed like a shame to lose, when you went into the digital domain?

Rader:

Just speed. You could do analog filtering in real time by making enough filters. They kept up with real time without having any problems. Digital filters were slowed down by how long it took to multiply and how long it took to add, and how many multiplications you were doing, and so on. When I used to use TX-2 to simulate vocoders, it would take about three minutes to do three seconds of speech, and that was after really working hard on the program and doing four multiplies at once.

TX-2 programming and uses

Goldstein:

That reminds me of one minor question I had about TX-2. Was there a high level language for it? How did you program it?

Rader:

No. You programmed in assembly language. The assembly language was something called Mark IV. It had been written by Larry Roberts. He programmed in assembly language. There was no FORTRAN for that machine. FORTRAN existed for other machines, but not for TX-2. It wasn’t a real problem to program in assembly language, but what was a problem was that if you wanted to compute a sine or a logarithm, you had to write the routine yourself. And sometimes that meant you had to invent the routine yourself because you couldn’t just call the library routine that somebody had written, or research how they’d done it and try to copy their method.

Goldstein:

Didn’t the library develop over time?

Rader:

Did the library develop over time? Probably it did develop over time, but with enormous glacial slowness. There’s a tendency, when nobody’s officially supposed to maintain a library, for it to fall apart. A routine was normally stored as this reel of punched paper tape. They were big and they were heavy, and they would be in the bottom drawer of a file until there wasn’t room for any more. You know, people didn’t know who had what. It wasn’t convenient.

Goldstein:

You mentioned that it wasn’t much to operate, you didn’t have to turn over your cards. I think Ben Gold had said the same thing, that you could work on it, that it wasn’t bureaucratized. But wasn’t there some scheduling for who had access, and how was that arranged?

Rader:

Yes. I kept it pretty much from midnight until seven o'clock in the morning.

Goldstein:

Would you submit a request?

Rader:

There was a sign-up sheet, and usually the blank places in the sign-up sheet were either far in the future or late in the night, and I would just take the late in the night.

Goldstein:

So its use wasn’t partitioned out according to laboratory priorities?

Rader:

Its use was partitioned out by saying each individual could sign up for a certain amount of time. Signing up for an hour during working time was the same as signing up for four hours late night. So, in order to get lots of time I would sign up for the graveyard shift as soon as possible.

Goldstein:

Is that an indication that all of the TX-2’s uses were all at the same level of priority? Or, is that a characteristic of the allocation of all resources at this lab?

Rader:

The major use of the machine was as a test bed for developing computer techniques. The first thin-film memory, the first large, transistor driven core memory, index registers, and the first time-sharing system. A whole bunch of techniques were tried out. One of the first magnetic tape drives was one of them. It wasn’t the first magnetic tape drive, but it was the first addressable magnetic tape drive, where addresses were on it. There were a whole bunch of innovations that were introduced in this test bed.

But it was also a working computer and it was available for people to use. There was a small community of six or eight of us people who used it for our research. Ben and I used it for bandwidth compression research. There were other people who used it for speech recognition research and for research in graphics. The mouse was not invented on TX-2, but pointing devices allowing you to point to objects on the screen was a TX-2-introduced concept. We had a gadget called a light pen, which was sensitive to the flash of the CRT, and would tell you the time that it happened. Then from that you could figure out what you were pointing at. But these applications ideas were usually secondary.

Lincoln Laboratory communications group

Goldstein:

You had started to describe where speech processing was within the overall structure of the research here at the lab. You said that you were under Irwin Lebow. How many people were in the group, and what department was it in?

Rader:

It was called Group 52, and it was a communications group. If you look up the history of communication satellites, there’s a strange little chapter called “needles,” where instead of putting up active satellites or reflectors, we put up in orbit a belt of little copper fibers, which stretched out, dispersed over a large portion of an arc of the sky, and would reflect one particular frequency because of the resonant frequency of the fibers. This was a communication channel that had interesting military applications. There’s no way you could either shoot it down or jam it. It also had a very low data rate. We designed, in that group, the sequential coding techniques and the machines that would enable advanced coding techniques to protect the channel from its bad features.

We were a communications group, and a small part of the group was concerned with speech communications and whether you could get the bandwidth requirements down low enough that you could digitize the speech and then encrypt it. We didn’t work on encryption, but we knew that if you turned it into bits, the bits could be encrypted, and decrypted, and so it would have military communications potential. A special problem was that vocoders of those days turned normal speaking voices into extremely machine sounding, Donald Duck-sounding voices. We wanted to do speech compression without horribly corrupting the quality. A major argument, which I never thought made any sense, was that if the president was to give an order on one of these encrypted channels, people should be able to recognize his voice. But there was funding to do this, and that was what supported our speech research at the time.

Goldstein:

So how big was Group 52? Was it the entire communications group there?

Rader:

Group 52 probably had thirty or forty people in it, maybe half staff and half technicians. But, the speech part of it was me and Ben Gold and Joe Tierney.

Goldstein:

That’s a small group. How did your work on vocoders relate to what was going on at Bell Labs? Were they similar?

Rader:

It was friendly competition. They had a much larger group with a much stronger background in physics of speech. I don’t think I ever did anything very significant in speech research. There were a few little inventions that I think were of no great importance in the long run. Ben did very important work on pitch detection. Joe Tierney worked on the hardware implementation of some of these ideas. Over time the group working on speech got bigger, but I got less interested in it as these digital signal processing ideas became more and more compelling.

Satellite communications research

Rader:

Roughly speaking, after 1966, I would say, I stopped working on speech. In 1969, I thought digital signal processing theory had gone as far as it was going to go for a while, and I changed fields completely. Talk about a bad decision. I changed into a group that was doing communications satellite work, and I became an assistant group leader of what was then called Group 69. I worked on a couple of communication satellites that were launched in 1975, and are still working since I was responsible for reliability! [laughter] Those satellites hold the record, and every day set a new record for long-lasting reliable satellites in orbit.

I watched signal processing and it surprised me. It continued to develop new innovations and new techniques. I’ve mentioned a few of them. The only one that I introduced during that time was the number theoretic transform stuff, which was kind of done in my spare time. What got me back into signal processing was another development. I worked on the problem of radar and communication systems in the presence of jamming. You can form antenna patterns that actually have a deep null in the direction the jammer is coming from. But you don’t know how to do that when you design the system, so you have to do it adaptively. There are some analog systems that have adaptivity built into them to do this. But you can also do it by collecting the signal and computing on the signals. You compute how to combine your antenna elements to form a pattern that’s guaranteed to minimize jamming.
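
Editor's note: As a rough illustration of "computing on the signals" to place a null on a jammer, the sketch below uses a two-element array and sample matrix inversion, a standard textbook approach that is not necessarily the method used at Lincoln Laboratory. The array geometry, jammer direction, power levels, and all function names are assumptions chosen only for illustration.

```python
import cmath
import math
import random

def steering(theta_deg, n=2, spacing=0.5):
    """Response of an n-element line array (spacing in wavelengths) to direction theta."""
    phase = 2 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * k) for k in range(n)]

def sample_covariance(snapshots):
    """2x2 sample covariance: the average of x x^H over the recorded snapshots."""
    r = [[0j, 0j], [0j, 0j]]
    for x in snapshots:
        for a in range(2):
            for b in range(2):
                r[a][b] += x[a] * x[b].conjugate() / len(snapshots)
    return r

def nulling_weights(r, look):
    """Weights w = R^-1 a / (a^H R^-1 a): unit gain toward 'look', minimum output power."""
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    inv = [[r[1][1] / det, -r[0][1] / det], [-r[1][0] / det, r[0][0] / det]]
    u = [inv[0][0] * look[0] + inv[0][1] * look[1],
         inv[1][0] * look[0] + inv[1][1] * look[1]]
    norm = look[0].conjugate() * u[0] + look[1].conjugate() * u[1]
    return [u[0] / norm, u[1] / norm]

def response(w, theta_deg):
    """Magnitude of the adapted antenna pattern, |w^H a(theta)|."""
    a = steering(theta_deg)
    return abs(w[0].conjugate() * a[0] + w[1].conjugate() * a[1])

# Simulated data (assumed values): a strong jammer at 40 degrees plus weak receiver noise.
random.seed(0)
jam_dir, look_dir = 40.0, 0.0
aj = steering(jam_dir)
snapshots = []
for _ in range(500):
    jam = complex(random.gauss(0, 10), random.gauss(0, 10))
    noise = [complex(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(2)]
    snapshots.append([jam * aj[k] + noise[k] for k in range(2)])

# The jammer direction is never given to the weight computation; it is learned from the data.
w = nulling_weights(sample_covariance(snapshots), steering(look_dir))
print("gain toward look direction:", response(w, look_dir))
print("gain toward jammer:        ", response(w, jam_dir))
```

The weights keep unit gain in the look direction while the gain toward the jammer collapses, which is the "deep null" described above. A practical system with many elements would avoid the explicit matrix inverse.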

So I began working on that. At first, the thought was to have a computer that was part of the system. Have the computer that was part of these computations. But we really wanted to put these antennas on satellites. On satellites you don’t have much power to work with, and you don’t have much weight, and size, and stuff like that. So I became very interested in trying to take these algorithms, and both change the algorithm and the hardware and the methods, all sort of together, and see if we could actually shrink a roomful of equipment into something that you could put on a satellite.

I came up with this [pointing to a picture of the cover of an engineering magazine]. It is an entire sixty-four element nulling system. This is an integrated circuit, but it’s not an integrated circuit like we have today, it’s an integrated circuit that’s this size. It’s approximately life-size. Of course, an integrated circuit that big will never work, because you can’t make it without flaws. So, this is made of a bunch of identical pieces, and there’s wiring on the surface of the integrated circuit that is discretionary. There are discretionary opportunities to make cuts and weld together wires. That technology of making a big integrated circuit and testing it and making welds and cuts with a laser had been developed at the laboratory in the early ’80s, and had been used to make several other systems of this size, but they were sort of toy systems. They were systems that demonstrated that you could work the technique.

I used a technique called a CORDIC, which is a way of doing rotations with digital hardware. And a way of organizing the algorithm so that rotation was the fundamental arithmetic step of this antenna nulling. I showed how to do it with CORDIC, and then I realized this array of CORDICs was spun together, and the whole process is CORDIC rotation being directed-- you know, each rotation being directed to the next circuit to do the next rotation, and so on. It does the whole algorithm, which originally nobody even realized was rotation based. So this represents about a giga-op on something that weighs less than a package of cigarettes. Having done that, I was kind of hooked on being back in signal processing, and in particular, antenna array processing. And that’s what I’ve been working on since. But hopefully the next success will come before I retire.
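
Editor's note: For readers unfamiliar with CORDIC, the sketch below shows the basic idea of building a rotation out of a fixed sequence of small shift-and-add steps. It is a generic floating-point illustration, not the circuit described above. The function names and the iteration count are assumptions, and the multiplications by `2.0 ** -i` stand in for what would be bit shifts in hardware.

```python
import math

N_ITER = 24  # number of micro-rotations (assumed; more iterations give more precision)

# Each step rotates by +/- atan(2^-i); the angles are fixed and can be tabulated in advance.
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# Every micro-rotation stretches the vector slightly; the total stretch is a known constant.
GAIN = 1.0
for i in range(N_ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(x, y, theta):
    """Rotation mode: rotate (x, y) by theta radians (|theta| up to about 1.74)."""
    z = theta  # angle still left to rotate through
    for i in range(N_ITER):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, y / GAIN

def cordic_vector(x, y):
    """Vectoring mode: rotate (x, y), with x > 0, onto the x axis.

    Returns the vector's length and the angle it originally made with the x axis.
    """
    z = 0.0
    for i in range(N_ITER):
        d = -1.0 if y >= 0 else 1.0  # always turn toward y = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, z

print(cordic_rotate(1.0, 0.0, math.pi / 6))  # close to (cos 30 deg, sin 30 deg)
print(cordic_vector(3.0, 4.0))               # close to (5, atan2(4, 3))
```

Vectoring mode finds the angle that zeroes one component, and rotation mode applies that same angle to other data. Chaining such steps is one way an algorithm can be organized entirely around rotations.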

Goldstein:

When did you do this work?

Rader:

From ’69 to ’76, I worked on satellite reliability.

Airborne radar signal processing research

Goldstein:

But you were still involved with signal processing by virtue of your association with the society?

Rader:

Only as a listener. In the early 1980s, I was the president of the society. But mostly I was working on satellite communication technology. Some of what I was doing at that point was administering other people's work. Some of it was doing work that was particularly devoted to the communications problem, only a small part of which was signal processing. I worried a little bit about something similar to cryptography that you use in anti-jam assembly, mixing things up and seeking new random ways. But when this communications problem was so clearly a signal processing problem, I got back into it, and I’ve been working in that area ever since.

This group that I’m in now, Group 102, is an airborne radar signal processing group that has a huge number of people, and talks about building, or even buying, very large capacity computational systems and applying them to the radar problem.

Evolution of the digital signal processing field

Goldstein:

I’m interested when you said in ’69 you felt that DSP theory had reached some kind of plateau?

Rader:

I figured it was finished for a while.

Goldstein:

At that point, can you remember what the cornerstones were of this theory that was, more or less, finished?

Rader:

There was digital filtering, and much of the work on digital filtering had to do with different ways of connecting adders and multipliers and delays to make a digital filter. Another had to do with analyzing the finite word length effects. In any computer you pick a word length, and you have to make approximations, and it’s all in the effect of these approximations. So there’s the finite word length effect, both with the filter coefficients and the signals. There was the FFT and related versions. There were various versions of the FFT that also had a finite word length problem. There were uses of the FFT in the convolution problem. And sort of that was it, you know, for 1969. There were other small problems. The Burg algorithm had, by that point, I think been invented, but I didn’t realize its importance. Certainly the way that it changed the signal processing field was not anticipated. I thought that multidimensional signal processing was a minor pedestrian extension that wasn’t worthy of spending the rest of my career on.
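
Editor's note: The finite word length effect on filter coefficients can be seen in a small numerical sketch. The example below rounds the two coefficients of a second-order recursive filter to a given number of fractional bits and reports where the filter's pole ends up. The pole location, the bit widths, and the function names are assumptions chosen for illustration only.

```python
import cmath
import math

def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits, as a fixed-point register would."""
    step = 2.0 ** -frac_bits
    return round(value / step) * step

def poles(a1, a2):
    """Roots of z^2 + a1*z + a2, the poles of y[n] = x[n] - a1*y[n-1] - a2*y[n-2]."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

# Intended design (assumed): a sharp resonance with poles at radius 0.99, angle +/- 5 degrees.
r, theta = 0.99, math.radians(5.0)
a1, a2 = -2 * r * math.cos(theta), r * r

for bits in (16, 8, 6):
    p, _ = poles(quantize(a1, bits), quantize(a2, bits))
    print(bits, "fractional bits -> pole radius", abs(p),
          "angle (deg)", math.degrees(cmath.phase(p)))
```

With generous word lengths the pole stays where it was designed. With short ones the resonant frequency moves noticeably, even though each coefficient changed only in its last retained bit. Signal quantization adds a separate roundoff-noise effect that this sketch does not model.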

Goldstein:

And there are these adaptive filters.

Rader:

I didn’t know anything about them.

Goldstein:

That came up later.

Rader:

So, you see, I called it wrong.

Goldstein:

Well, that was the other thing I wanted to ask about. Earlier when you were still running through chronologically, you had taken it up to the mid-’70s when you mentioned that all these new developments suddenly caused the field to expand beyond the scope of any one person’s ability to keep track of it. So it was around that time that you became involved with the CORDIC?

Rader:

That was more like the ’80s. Between about ’75 and ’83 or so, we built and launched these satellites. We had this group of people that knew how to do that. But we weren’t funded to do another one, but we were funded to keep working on the kinds of techniques that would become part of some future satellite. Somewhere along the line we actually did get another satellite contract that used some of all that stuff that we had been working on. I worked on that for a while. And this didn’t get done until like ’86 or ’87. It took a few years to go from a few graphs to the product. Since then I’ve been looking for another great thing to do. Hoping for one more success before I can’t do it anymore.

Goldstein:

How would you describe the growth in signal processing in that period, beginning in the mid-’70s, when it began mushrooming off in all these different directions? Is there a way to characterize the field?

Rader:

There were a lot of different things, and it’s hard to keep them all together. It’s hard for any one person, me included, to summarize the whole field. Back somewhere before the ’80s, I was thinking in the terms of leaving the laboratory and becoming a professor somewhere, and when you do that, they like you to give a talk. I put together a talk on the field of digital signal processing. And I was able to put on one sheet of paper a set of topics and connect them all together with lines, showing what had led to what and what was related to what. I don’t think anybody can do that anymore. First of all, if you tried to put it all on one piece of paper, no matter how big, there would be so many lines crossing one another that you couldn’t follow it.

A very important development in the ’70s and ’80s was the importance of the relation between algorithms and architectures. What do I mean by architectures? In a computer, like an IBM PC, you basically have a central processing unit that can do arithmetic.

Goldstein:

It’s the Von Neumann architecture.

Rader:

So, the architecture is CPU and memory. Your algorithm in effect is a program. In specialized signal processing, where you know what the algorithm is and you’re willing to commit it to hardware, you can have a lot of multiplications, additions, registers, divisions, and square roots, and so on, all working together, all working at the same time. And then one of the issues is how do these things all get connected together so the results of one computation could get to where they’re needed next, and things can all be kept busy at the same time. It doesn’t do to put on this corner of the chip or in this corner of the rack the thing that’s going to communicate with something way over here, because it will take a long time to pass the data back and forth. So architectures, the way you arrange components, is related to algorithms, which are the way that data is going to be processed. That’s become an important area. This MUSE is an excellent example of an architecture algorithm technology marriage, where you get every part working at 100 percent efficiency, if you’re lucky, and no data has to travel very far to where it’s needed next, and so on. That’s become an important part of the signal processing field.

What are some others? Another part that I paid very little attention to is the artificial intelligence approaches, neural nets and so on, that people are applying to recognition problems. I paid absolutely no attention to that, which is intriguing, because do you remember what I wanted to do at the beginning of my career? But artificial intelligence approaches and related things have become more and more important. Recognition techniques for recognizing the meaning of data have become important. Statistical decision making has become an important part of the field. It was sort of always there in other fields, and it came into the digital signal processing field, because we were the ones producing the data that a decision could be made on, so people interested in making the decision on how to make them with good probability statistics began contributing to the society. There’s a lot of that. I don’t know that I could coherently give you the picture of the whole field.

What one can do, is one can look at the program of the annual conference, or the papers and proceedings, and say, “How much of this represents something, you know, related to what was being done in ’69 or ’70?” The answer is "relatively little." Certainly under ten percent. Not very long ago a group of us here at Lincoln put together a course for the local IEEE called "Advanced Techniques in Digital Signal Processing." We had perhaps a dozen lectures. It was entirely composed of material that was not present in the Gold and Rader book, or the one by Oppenheim and Schafer that came along a few years later, and which is the textbook that the field needed, and from which most people learn digital signal processing. The modern material that the conferences are about today is completely absent. Lattice filters are not in that book. Linear predictions are not in that book. Architectures are not in the book. We just didn’t know it existed.

Goldstein:

What is different in that presentation, the treatment in the Schafer and Oppenheim book, from the book you did with Gold? Are they trying different things?

Rader:

I can give a facetious answer: Theirs is a book that has problems. A serious answer is that it’s a book that is meant to be taught from. We had neither the time nor the inclination to write a textbook, we just wanted to get the information published. We didn’t put nearly enough attention into the book for it to be a teaching tool. Although it was used, you know, for lack of anything better. For a long time, it was used as a teaching tool. Ben and Larry Rabiner wrote another book, more like this, at about the same time that Oppenheim and Schafer wrote the textbook. Their book is more like this, except it has more material. But it was not meant to be a textbook. It was meant to be just a collection of techniques presented with some semblance of order, and not something that you would build a course with.

Goldstein:

Yes. I’m wondering if, in a fast moving field, if the growth in the field has an impact on the way that the fundamentals would be presented. So I might look for that kind of thing by comparing books over time.

Rader:

It would be hard for you to do, because, you know, the books were not written for somebody to scan. They were written to teach the material in depth. Nobody can write a general book on digital signal processing anymore. There may be a book on linear prediction, or a book on multi-rate systems, or a book on filter forms, or fast Fourier transform techniques. Nobody can write a book that is reasonably titled “Digital Processing of Signals,” in 1997, because there’s too much, and nobody can organize it. At least I don’t think anybody could. Certainly it would be this thick.

Acoustic committee investigating Kennedy assassination

Rader:

In the mid-’70s there was a congressional committee on assassinations, the assassination of President Kennedy and Martin Luther King, among others. That committee was given testimony from some acoustics experts who believed that one of the police radio channels in Dallas had recorded the gunshot sounds. And by a further analysis, they thought they could prove that one of what they thought were four shots had originated not from the school depository building, but from the so-called grassy knoll. The House Assassinations Committee believed that it had proven a second gunman. They then sent their report to the Justice Department, who asked the FBI to comment, and the FBI said, “This is total, utter nonsense.” Interestingly, the FBI used the exact same technique on data recorded somewhere in North Carolina and decided they were able to prove, tongue in cheek, that the gunshots recorded in Dallas had been fired in North Carolina. Clearly pooh-poohing the technique.

So, the Justice Department asked the National Academy of Engineering to put together a team and look at this data and technique and judge its validity. Along with several Nobel Prize winners, like Luis Alvarez, I was on this committee. I played a significant role in proving the negative conclusion that the sounds that were being identified as gunshots were in fact distorted human speech. I knew what the speech was, and could associate it with a time, about a whole minute after the assassination, because it was a Dallas sheriff talking about moving men around to check out the possible site where somebody had seen somebody suspicious. We were able to figure out a lot about this horribly distorted data.

Then I did something that I’m very proud of. You know, assassination buffs will never let this die. Perhaps they’re right. Perhaps there were cover-ups and so on. But, this particular piece of work that we did, which was critical of work that somebody else did, instead of just doing a press release as the agent of the Director of the National Academy of Engineering on this committee wanted us to do, I said, “No, instead of doing a press release, let’s circulate our report to the people whose work we are criticizing, and then get their comments, react to their comments, fix it up to the point where they agree that we’ve successfully explained the data and supported our conclusions. That’s the time to do the press release.” It lasted a few more months, and after that we in fact had put this particular piece of the assassination investigation to bed. It was no longer something that even the most rabid proponent of one view or another was going to find controversial. So it ended there.

Goldstein:

Right, so you couldn’t be seen as part of the cover-up.

Rader:

We weren’t playing the game of press releases. We were doing good science.

Goldstein:

That's a nice way to finish up the interview. Thank you very much.