Oral-History:Hans Musmann

From ETHW

Revision as of 16:37, 30 June 2014

About Hans Georg Musmann

Hans Musmann

Dr. Musmann was born in Germany in 1935, and received his education in that country. In 1966 he received his Ph.D. in the communications field from the Technical University of Braunschweig. He has worked with tunnel diodes, microwaves, satellite communications, computer and facsimile transmission, and computer-generated images. After obtaining the doctorate, Musmann began focusing on facsimile transmission and source coding, digitizing facsimile images. In the 1970s he participated in fax standardization, and also in reducing cost and time factors in fax transmission. Musmann then concentrated on satellite communication of video and audio signals, including the transmission of moving images. In 1972 he founded the Institute for High Frequency Science, which researched innovations in communications technologies, including video transmission techniques for satellites.

The interview begins with Musmann's descriptions of his secondary and university education. Musmann concentrates on his career in the electronic communications field, and provides detailed descriptions of his innovations with source coding for facsimile, television, telephone, and satellite technology. He explains his contributions to developing international standards for fax use and to computerized generation of moving and three-dimensional images. Musmann describes his decision to found the Institute for High Frequency Science, as well as the Institute's research activities and funding sources. In addition, he outlines his experiences with motion estimators, digital communications networks, image coding, and computer-generated virtual representation. The interview concludes with Musmann's optimistic opinions on the prospects for technological progress.

About the Interview

Hans Georg Musmann: An Interview Conducted by Frederik Nebeker, IEEE History Center, 30 August 1994

Interview # 223 for the IEEE History Center, The Institute of Electrical and Electronics Engineers, Inc.

Copyright Statement

This manuscript is being made available for research purposes only. All literary rights in the manuscript, including the right to publish, are reserved to the IEEE History Center. No part of the manuscript may be quoted for publication without the written permission of the Director of IEEE History Center.

Request for permission to quote for publication should be addressed to the IEEE History Center Oral History Program, IEEE History Center at Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030 USA. It should include identification of the specific passages to be quoted, anticipated use of the passages, and identification of the user.

It is recommended that this oral history be cited as follows:

Hans Georg Musmann, an oral history conducted in 1994 by Frederik Nebeker, IEEE History Center, Hoboken, NJ, USA.

Interview

Interview: Hans Georg Musmann

Interviewer: Frederik Nebeker

Date: 30 August 1994

Place: University of Hannover

Childhood and Education

Nebeker:

Professor Musmann, could you describe your present position?

Musmann:

In 1972 I got a post as professor at the University of Hannover, and I became head of this Institute for Theoretical Information Technology (Institut für Theoretische Nachrichtentechnik). I built up this Institute, which now has three professors and about 45 research staff members. In addition to this Institute, I am also head of the Directory Board of the Laboratory for Information Technologies, which is just opposite. There, another 40 research scientists are working under the advice of two other professors with whom I collaborate.

Nebeker:

Could I ask you briefly about your education? You were born, I know, in 1935. Do you remember the war years?

Musmann:

I remember the war. At that time I went to the fundamental school in Salz. I think it was the tenth of April 1945 that the Allied troops—the Americans—came over, so I was a young boy of about ten years. That area is a more countryside area; we were not so much affected by the war.

Nebeker:

So you weren't bombed?

Musmann:

No.

Nebeker:

Was it hard for your family in those years?

Musmann:

Of course, at that time there was a lack of food, but in the countryside it was not so difficult as in the big cities.

Nebeker:

I know that also in the immediate postwar period it was very difficult for some people.

Musmann:

Especially in the big cities, not so much in the countryside. And this was [true] only in the western part of Germany. We're very close to the border of East Germany. I went to high school in Salz, which is close to Goslar. Then, after finishing high school (I was about nineteen), I first went to work in industry to learn how industries operated. I worked for some months with Siemens, with Bosch, Blaupunkt and with Rohde & Schwarz. These are industries in different fields of electrical engineering.

Nebeker:

This was specifically to try out different areas?

Musmann:

Yes, there were two things. First, in order to start a study in electrical engineering, it was necessary to have some practical experience.

Nebeker:

So you had to go out in industry.

Musmann:

Yes, I had to go out, but I was free to go where I wanted to. And I was also interested in learning how these industries operated, and I wanted to learn about the different methodologies these industries employed.

Nebeker:

What sorts of electrical technology did you deal with?

Musmann:

Communications, I wanted to go into that direction.

Nebeker:

You knew that at the time?

Musmann:

Yes, I started my studies at the Technical University of Braunschweig. This takes about five years: first you have to pass the Vordiplom, and then the main Diplom. After finishing the Diplom I worked on a Ph.D. At that time we did not have microelectronics.

Nebeker:

Right, but the transistor was around!

Musmann:

Yes, it was around at that time.

Nebeker:

You're talking about the middle 60s?

Musmann:

Yes. I finished my Ph.D. in 1966. I intended to go into a research institute.

Ph.D. Thesis on Tunnel Diodes

Nebeker:

You did a thesis for your doctorate. What was that?

Musmann:

It was in the field of microwaves. I wanted to use microwaves to make very fast switching circuits. In my thesis, I investigated a special effect of tunnel diodes. If you operate them in a certain condition, then these tunnel diodes have only two stable states. I wanted to use these two states to indicate zero and one, that is, binary information, to switch between the two with an outside control, and then to connect several of these to construct a very fast counter. If we had had fast transistors at that time it would not have been necessary. And during the time that I developed this, high-speed transistors and the first microelectronics circuits came out.

Nebeker:

This was an alternative to microelectronics switches?

Musmann:

Yes, at that time.

Nebeker:

Has that work been followed up anywhere?

Musmann:

No, because it is much more complicated. You need microwaves to operate these tunnel diodes, and it is much easier with the transistor.

Nebeker:

So that was your thesis that you finished in 1966?

Musmann:

Yes. I got some experience in microwaves.

Nebeker:

Has that proved useful?

Early Experiment in Voice Recognition

Musmann:

To a certain extent. During the time I worked on my thesis, I also built up a small computer with transistors—I had to connect the transistors of course by hand! I wanted to learn how computers work and operate. So, together with a colleague, I built up a small computer that could be operated by voice. And that was my first publication.

Nebeker:

How was that done? Did you have any voice recognition techniques?

Musmann:

Well, to a certain extent. I developed a very simple technique in which the voice is split up into frequency bands. Then I sampled the output of the frequency bands and used these patterns to distinguish between ten digits. Some commands like "add in," or "subtract," or "multiply," and so on, were for operating the computer.
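The pattern comparison described here can be sketched as a nearest-template lookup. This is only an illustration under stated assumptions: the function name, the four bands, and the energy values are hypothetical toy numbers, and the transcript does not say how the original hand-wired machine compared its patterns.

```python
def nearest_word(pattern, templates):
    """Return the word whose stored band-energy template is closest to pattern."""
    def distance(a, b):
        # Squared Euclidean distance between two band-energy patterns.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(templates, key=lambda word: distance(pattern, templates[word]))


# Hypothetical templates: one sampled energy value per frequency band.
templates = {
    "one": [0.9, 0.4, 0.1, 0.0],
    "two": [0.2, 0.8, 0.5, 0.1],
    "add": [0.1, 0.1, 0.7, 0.9],
}

spoken = [0.8, 0.5, 0.2, 0.1]          # made-up pattern for a new utterance
print(nearest_word(spoken, templates))  # one
```

A vocabulary of fifteen to twenty words would simply mean fifteen to twenty stored templates.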

Nebeker:

So it could recognize a couple dozen?

Musmann:

I think altogether there were between fifteen and twenty words it recognized.

Nebeker:

Did it often mistake a three for a four, or did it make mistakes very often?

Musmann:

Of course. But you could recognize it, because it showed what it understood. If you said "one" and it indicated "two," then you recognize it. This was a very early publication on speech recognition. Later on I learned that there was only one earlier publication in that field on speech recognition. That was published by IBM, and the system was called the Shoebox.

Nebeker:

Why did you choose to use voice input for that device?

Musmann:

I thought it would be very nice to have a computer you can operate with voice. That is what you need still today!

Nebeker:

Yes, I am just surprised that you would attempt it at that time.

Musmann:

We still have too much technology where the human being has to adapt to the technology instead of vice versa.

Nebeker:

You said that you wanted to develop a counter with these tunnel diodes and microwaves. What application did you have in mind, or was it purely a matter of demonstrating that one could do this?

Musmann:

In physics there was a need for very fast counters. I wanted to develop a very fast counter.

Nebeker:

I know in high-energy physics the fast electronics meant a lot.

Musmann:

Yes. In reality, I wanted to construct a fast switch, which later on was substituted by the transistor. But I tried to do it with the tunnel diode. At that time we had the first transistors, but they were germanium transistors, which were very slow. They were switching with clock rates of kilohertz at that time. I wanted to make a switch with a clock rate of a hundred megahertz.

Nebeker:

So you finished your thesis on that. And your work on the computer: that was just your own independent work?

Musmann:

Yes. I also used this work to educate students in digital signal processing.

Nebeker:

That must have been very impressive for the time that a person could just talk to this computer.

Musmann:

Yes, at that time, it was impressive.

Nebeker:

Did it attract much interest?

Musmann:

Yes, there were several publications on it then, speculating on how this might develop, but the development of speech recognition from that time until today has taken about thirty years. Today we have the first systems that you can really use.

Nebeker:

I have a colleague with a computer with some voice recognition, and though it usually works, it often doesn't.

Analog-to-Digital Conversion

Musmann:

But my real interest was in a different direction. At that time, we learned from publications that all kinds of information could be represented in digital form. This was fascinating. We learned how to represent a speech signal, and we learned about the effects of analog-to-digital conversions. We also learned that it is very complicated to convert video signals or any kind of moving images to digital representation. If you compare a TV signal with a speech signal, then you recognize that the transmission bit rate is about three thousand times that of the speech signal. But it was fascinating that all kinds of information could be represented in a digital form. My special interest at that time was the representation of visual information for future visual communications. But the problem was the very high bit rate.

Then I learned from the current literature how you have to sample the analog signal in order to convert it into a digital representation. If you have band limited signals, you can do it with perfect reconstruction, theoretically. Then, if you quantize the samples in order to have a completely digital representation, you introduce quantization noise. By sampling and then quantizing—we call this PCM for Pulse Code Modulation—you come up with a bit rate for speech signals which is about sixty-four kilobits per second. You need an eight-kilohertz sampling rate, which is two times the bandwidth, in order to represent fast signal changes, and you need at least eight bits per sample, in order to avoid quantization noise. The first papers came out at this time from NASA and Bell Laboratories in the U.S., which showed that it was possible to represent this signal with fewer bits than PCM by processing the bits.
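The PCM figure quoted above follows from two multiplications. The sketch below is a minimal check, assuming a 4 kHz speech bandwidth (inferred from the statement that the 8 kHz sampling rate is twice the bandwidth); the function name is illustrative only.

```python
def pcm_bit_rate(bandwidth_hz, bits_per_sample):
    """Bit rate of PCM: sample at twice the bandwidth, then quantize each sample."""
    sampling_rate_hz = 2 * bandwidth_hz  # sampling rate for a band-limited signal
    return sampling_rate_hz * bits_per_sample


speech_rate = pcm_bit_rate(bandwidth_hz=4_000, bits_per_sample=8)
print(speech_rate)  # 64000 bits per second
```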

Nebeker:

Is this an audio signal you're talking about?

Musmann:

Yes, at that time it was audio, because it was impossible to convert a video signal at that time. You would have needed a very fast sampling system that was not available.

Nebeker:

One could certainly understand why NASA was interested in digital communication. What was your interest?

Musmann:

I saw the advantage of having every kind of information represented in one form by bits. This is really a big advantage; it offers the opportunity to transmit different kinds of information all on one line. At that time the different kinds of communication services had separate lines. Digital representation, however, allows sound, speech, video, and facsimile to be transmitted by bits on one and the same line. This was fascinating. The main question for me was whether it is possible to reduce the number of bits of the PCM representation. What is the real theory behind this?

Nebeker:

Were you more mathematically inclined?

Musmann:

Yes. I wanted to understand what the theoretical background [of digital representation] was. So I studied the work of Claude Shannon. I think he had prepared the fundamentals of information theory, including source coding which addresses the problem of "How many bits are required for representing information?" At that time we did not have digital communications, with the exception of space probes. That was about 1966, 1967. Since I was looking especially for representation of visual information, I started with facsimile.

Development of the Digital Facsimile

Nebeker:

Why were you particularly interested in visual information?

Musmann:

I thought, "We have a present type of communication, the telephone; visual information should be the next step in communications." But it took a long time! So I started when I was still in Braunschweig at the time when I finished my Ph.D. thesis and I wanted to change to go to the southern part of Germany to a research institute near Munich. I presented my ideas about what I wanted to do to the institute and its head, Professor Hartl.

Nebeker:

At what Institute?

Musmann:

At DLR. Professor Hartl talked to my then head in Braunschweig, Professor Kirstein. Hartl said, "These ideas are very interesting and I am interested in financing them, but I think he has a better opportunity to study these coding problems in Braunschweig than at DLR." So I was supported by DLR, and I developed some of these ideas in my Habilitation.

Nebeker:

And why did Hartl think Braunschweig would be a better place?

Musmann:

If I had gone to the DLR, I would have been an exception by doing work which was initiated by me and which was theoretically based, while all the other people were involved in projects.

Nebeker:

Was this work on source coding theoretical?

Musmann:

Mainly, yes, but I also tried to realize it concretely. I studied this in terms of facsimile and speech, because we did not have very fast A-to-D converters.

Nebeker:

So while your ultimate interest was visual, you did do some work on audio and facsimile? Was there already an established technology for facsimile?

Musmann:

Yes, there was an analog facsimile system—we still have an old machine here—which was invented by Doctor Hell. This was invented, I think, around 1927.

Nebeker:

Was it commercially used?

Musmann:

Yes, it was. It was used on telephone lines. But the resolution was relatively low compared to today's facsimile. So I studied that.

Nebeker:

And that was because you could handle the digitization of that signal?

Musmann:

Yes. Since we had the A-to-D converter, I could realize the digitization. In the meantime—that was at the end of the 60s—the first computers had just come onto the market—the big ones. We had one central computer at the university; all the terminals were connected to the computer. We started to use the computer to investigate the efficiency of these facsimile-coding techniques.

Nebeker:

This was a mainframe computer?

Musmann:

Yes, we had one computer for the whole university. That was about 1967 or 1968.

Nebeker:

I can imagine that that was a very useful tool for this study.

Musmann:

We digitized the facsimile images. At that time we used punch cards to transfer an image into the computer. At that time, I remember, facsimile was considered to be a "sleeping giant," but nobody believed it. At that time nobody used it, with the exception I think of police.

Nebeker:

It may be that meteorologists used it for weather maps.

Musmann:

Yes, you are right. We learned that by going digital and then reducing the bit rate, you can reduce the transmission time, and thus the cost. In the meantime, the facsimile machines were increasing in resolution even though they were still analog. But one page required about fifteen minutes for transmission. I think at that time the transmission cost to the United States was about twenty to thirty D-marks, so it was very expensive to transmit one image.

Early Uses of the Fax

Nebeker:

Was this commercially used at that time?

Musmann:

Only for police and industry, because it was too expensive for most people. Also the machines were very expensive: ten thousand to twenty thousand D-marks. But we recognized that there was potential development in the facsimile, which was related to the development of printers for computers. Computers require printers, and facsimile machines are similar to a printer. So there was a synergy between these two developments.

Nebeker:

Were there people in Braunschweig, or people you had contact with, who were working on developing printers? Or did you know about this through the literature?

Musmann:

The company of Dr. Hell, which was in Kiel, constructed facsimile machines for the police, and we were in contact with them. We found that it is possible to reduce the number of bits for representing an image by about a factor of ten. This was relatively complicated; it was the Ph.D. work of D. Preuss, and it is still used as a reference today. There was a parallel development in both the United States and Japan. At this point we were asked by the German PTT [Post, Telegraph, and Telephone administration] to support the standardization of facsimile. The facsimile was starting to be standardized. Everybody recognized that it was possible to reduce the transmission time by representing a facsimile in a digital form, then transmitting it on an analog telephone line with the help of modems, and then converting it back.

Nebeker:

It was the digitizing of the information that allowed that compression?

Musmann:

Yes. If you go digital, you can apply complex processing to reduce the bits.

Nebeker:

With transmission costs as high as they were, compression by a factor of ten must have been of great importance.

International Standards for the Fax

Musmann:

In the middle of the 70s—1975 or 1976—the standard for facsimile was fixed.

Nebeker:

Was that for Europe only?

Musmann:

No, it was worldwide.

Nebeker:

Did the ISO [International Organization for Standardization] set it?

Musmann:

At that time it was the CCITT, the Consultative Committee of International Telephone and Telegraph. Now it has a different name.

Nebeker:

You were involved in that organization?

Musmann:

We were asked by the PTT to transfer results to the CCITT standardization effort.

Nebeker:

Were you involved in the discussions of different possible standards?

Musmann:

Yes. At that time, Mr. Preuss, a student working on his Ph.D., was the one immediately involved in the standardization.

Nebeker:

What was your relation to him? You were an associate professor by now, at Braunschweig?

Musmann:

I was associate professor. I advised Preuss, who was a student in Braunschweig. I got support from industry and from PTT, and I have carried on this collaboration with industry and PTT into today.

I remember that there were several proposals for coding facsimile: from the United States, by IBM; from Japan, by KDD; from Germany, by the PTT and University of Braunschweig; and several others. CCITT tried to find the best solution for a standard. But this is competition of interests, of course. The proposal we presented was too complicated to be adopted. The bit-rate reduction was the best of all, but the system was too complicated. Electronics had not developed sufficiently at that time.

So the result was a compromise, called a modified Huffman code. That was the first digital facsimile coding standard, and every machine today has this standard. I think it was only three or four years later when everyone realized that the more complicated coding techniques could now be realized in just one chip.

The first standard reduced the bit rate only by a factor of four, or something like that. In order to get a reduction by a factor of ten, we worked on a second standard. The second standard, which is a two-dimensional coding system, is also still used today. Thus facsimile became the first digital transmission of visual information, I think. The big industries, especially the Western industries, still hesitated. It was Japanese industry that recognized the potential behind the facsimile and pushed this to make products out of it. There was a special need for Japan; the Western countries already had Telex.
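The one-dimensional idea behind a modified Huffman code is to describe each black-and-white scan line by the lengths of its alternating white and black runs, and then assign short codewords to frequent run lengths. The sketch below shows only the run-length step and its exact inverse. It is an illustration, not the standardized code tables, and the function names and the white-first convention used here are assumptions made for the example.

```python
def run_lengths(scan_line):
    """Turn a line of pixels (0 = white, 1 = black) into alternating run lengths.

    The first run is treated as white, so a line that starts with black
    begins with a white run of length 0.
    """
    runs = []
    current_color = 0  # assumed convention: every line starts with a white run
    count = 0
    for pixel in scan_line:
        if pixel == current_color:
            count += 1
        else:
            runs.append(count)
            current_color = pixel
            count = 1
    runs.append(count)
    return runs


def pixels_from_runs(runs):
    """Inverse of run_lengths: rebuild the scan line exactly (lossless)."""
    line = []
    color = 0
    for length in runs:
        line.extend([color] * length)
        color = 1 - color
    return line


line = [0] * 12 + [1] * 3 + [0] * 9
runs = run_lengths(line)
print(runs)                            # [12, 3, 9]
print(pixels_from_runs(runs) == line)  # True
```

Three numbers stand in for 24 pixels here; a two-dimensional scheme can save more by also exploiting the similarity between neighboring lines.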

Nebeker:

Which, of course, doesn't work with Japanese characters.

Musmann:

I think the need for facsimile machines in Japan was the reason that the Japanese industry became the leader in facsimile manufacturing.

Nebeker:

In the West there just didn't seem to be a large enough potential market for companies to put a lot of money into developing it?

Musmann:

There was, in fact, an interest in developing more advanced Telex systems that would operate digitally. At that time, they were operating mechanically. The moment the coding standard was available, Japanese industry started high volume production.

Nebeker:

When was the first standard set?

Musmann:

In about 1975. I have to check that.

Nebeker:

When was the second standard set?

Musmann:

About three or four years later.

Nebeker:

Was it the second standard that was adopted for the fax machines that the Japanese used?

Musmann:

Both. Today's machines work with two standards. When you begin sending a fax, the machines exchange header information. In this header information the receiving machine is asked, "What code do you operate on? Code one or code two?" So both are used. But when the first standard was fixed, it was not possible to realize the second standard on one chip. The development of microelectronics made it possible.

Nebeker:

Would it be possible today, many years later, to devise a substantially better source coding of facsimile?

Musmann:

Not really. If there were a substantial factor for improvement, then I think we would have seen it by now.

Nebeker:

The coding, then, is a fairly stable technology.

Musmann:

There are new coding techniques that are going to be standardized for very high resolution and color facsimile. At that time everything was black and white.

Nebeker:

Was your work on all this mainly theoretical?

Musmann:

Yes, mainly, but also we wanted to have practical codes, of course.

Nebeker:

So you were designing algorithms that would do this?

Musmann:

Yes. Designing algorithms that process the bits in order to reduce their number is mathematics! We call it coding. So we were looking for codes that would represent information exactly, without losing any of the information. Today, you know, facsimile machines need only about one minute, instead of ten or fifteen, for a page. To transmit a document from a desk in Germany to one in the United States costs about three D-marks. This is convincing! But I remember discussions in 1975 worrying that these machines were too expensive for widespread use.

Nebeker:

Did you foresee that facsimile would become so important in commerce?

Musmann:

Here we were convinced. The main problem was the printing mechanism. It had to be done cheaper. The cost of transmission was reduced by these coding techniques. Today you can buy the facsimile machine for six hundred D-marks, or something like that.

Development of Digital Visual Communication

Musmann:

But at the same time I was working on the facsimile machine, I was also interested in looking for a visual communication system, a system that would allow you not only to speak to another person, but also to see the other person. At that time the problem was the digitization of a video signal, which is required for realizing such a visual communication system, and reducing the bit rate for moving images, which is almost three thousand times that of a speech signal. I always said our goal would be to cut down the bit rate of a video signal to that of a speech signal. Otherwise it's too expensive to use it.

Nebeker:

The factor of three thousand is for television resolution?

Musmann:

Yes. Of course, you can use a smaller picture. That was also proposed later on. But even if you have an image one fourth as large, then you'll still have a factor of eight hundred or something like that. You can reduce the frame frequency from fifty hertz to ten hertz. Then you come down to a hundred times the rate of the speech signal. Nobody thought at that time that this compression factor could be achieved by coding.
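The rough factors in this answer can be reproduced with simple arithmetic. The sketch below takes the quoted ratio of three thousand as its starting point; the variable names are illustrative, and the exact results (750 and 150) are of the same order as the rounded figures given in conversation.

```python
tv_to_speech = 3000                            # full TV picture vs. speech, as quoted
quarter_picture = tv_to_speech / 4             # image one fourth as large
lower_frame_rate = quarter_picture * 10 / 50   # frame frequency cut from 50 Hz to 10 Hz

print(quarter_picture)   # 750.0
print(lower_frame_rate)  # 150.0
```

Everything beyond that remaining factor had to come from coding.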

We made some very early experiments in 1977, when the facsimile work was finished. We tried to come down to 64 kilobits per second. I recognized that the first satellites used this bit rate for transmitting speech signals. In order to have a visual communication system that operates worldwide, there are no other digital lines which can be used.

The transmission lines, which were available worldwide, were 64 kilobit per second lines for speech channels. When you want to extend the normal communication system, you cannot afford a satellite link of a hundred times the speech signal.

Nebeker:

The 64 kilobits per second was each speech channel?

Musmann:

Yes. Of course, these satellites have several speech channels. This indicated to me that if we wanted to create a visual communication system, we would have to cut down the bit rate for visual information to that of speech—or be close to it. Otherwise, nobody would be able to afford it. So we studied some techniques. And, in 1979, two years after the facsimile work, we demonstrated in the United States a transmission of moving images requiring just 64 kilobits per second. I still have this here.

Nebeker:

You still have the original system?

Musmann:

Yes. We recorded it to show what image quality can be realized with a speech channel transmitting moving images.

Nebeker:

How did you record it at that time?

Musmann:

We recorded it on film.

Nebeker:

On celluloid film?

Musmann:

Yes. I also made a video recording with an early videotape recorder.

Frame Memory

Nebeker:

And was this demonstration well received?

Musmann:

Yes. This was at the International Communications Conference in Boston in 1979.

I think that was the first time it was shown that it is possible to transmit moving images with 64 kilobits per second. It was, of course, in monochrome, not in color, and only a small picture. A girl was shown there speaking with a telephone in her hand. Whenever she did not move too fast, the image was clear, but when she moved quickly, it became smeared. This was the beginning. There was a big interest. The communications industry saw that it might be possible to transmit moving images via speech channels. But they were still hesitating, waiting to see if the problems with motion could be solved in the future. We transmitted mainly those parts of an image that had changed, and we took the other parts from the stored preceding image in the memory of the receiver.
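The principle of sending only the changed parts and taking the rest from the receiver's stored frame can be sketched as follows. This is a minimal illustration under assumptions: the block size, the threshold, and the function names are hypothetical, and the 1979 system is not described in enough detail here to say how it actually detected change.

```python
def changed_blocks(previous, current, block=2, threshold=0):
    """Return {(row, col): pixels} for blocks that differ from the stored frame."""
    updates = {}
    height, width = len(current), len(current[0])
    for r in range(0, height, block):
        for c in range(0, width, block):
            new = [row[c:c + block] for row in current[r:r + block]]
            old = [row[c:c + block] for row in previous[r:r + block]]
            diff = sum(abs(a - b)
                       for new_row, old_row in zip(new, old)
                       for a, b in zip(new_row, old_row))
            if diff > threshold:
                updates[(r, c)] = new   # only these blocks are transmitted
    return updates


def apply_updates(frame_memory, updates):
    """Receiver side: overwrite only the transmitted blocks in the stored frame."""
    for (r, c), pixels in updates.items():
        for dr, row in enumerate(pixels):
            frame_memory[r + dr][c:c + len(row)] = row
    return frame_memory


previous = [[10] * 4 for _ in range(4)]
current = [row[:] for row in previous]
current[0][0] = 200                      # one pixel changes in the top-left block

updates = changed_blocks(previous, current)
receiver = apply_updates([row[:] for row in previous], updates)
print(list(updates))        # [(0, 0)]
print(receiver == current)  # True
```

When much of the picture changes at once, almost every block has to be sent, which is the overload case that produced the smeared images.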

Nebeker:

So the receiver had to have a frame memory?

Musmann:

Yes, and this memory was a problem at that time. The memories of computers were not transistor memories, but magnetic-core memories. Each core was one bit. I needed a frame memory: 400,000 picture elements and 8 bits per picture element. The price was 250,000 D-marks! So I wrote an application to the German Science Foundation, the Deutsche Forschungsgemeinschaft. They asked me to come to explain my application to a committee of experienced people in that field. I was a young man at that time, and I came with an application that required mountains of money! People argued, "Nobody knows if we will really have a digital communication system in the future." This was 1973 or '74. I remember there was Professor Unger from Braunschweig, who had spent several years at Bell Labs. He said, "What I have seen in Bell Labs is that people working in that field are only thinking about digital transmission." That was very helpful for me.

The result was a compromise. "Can you start with half a frame?" they asked. So I started with memory for half a frame, and I got the second half later on. I still have this in our laboratory. You can see a small part of the old memory, and you can see today's frame memory. Ten D-marks! One chip! This was a beginning at that time.
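The size of the frame memory in this account can be worked out from the quoted figures. The sketch below is plain arithmetic; the variable names are illustrative, and the price per bit is derived here rather than stated in the interview.

```python
picture_elements = 400_000      # per frame, as quoted
bits_per_element = 8
price_dm = 250_000              # quoted price of the full frame memory

frame_bits = picture_elements * bits_per_element
frame_bytes = frame_bits // 8
price_per_bit_dm = price_dm / frame_bits

print(frame_bits)        # 3200000 bits, one magnetic core each
print(frame_bytes)       # 400000 bytes
print(price_per_bit_dm)  # 0.078125 D-marks per bit
print(frame_bits // 2)   # 1600000 bits for the half frame he started with
```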

Nebeker:

Was that something that you developed yourself, that technique of transmitting only the changed parts of the picture?

Musmann:

That was an invention that came from the leading group in the field at that time: Cutler's group at Bell Labs. He built [the] prototype of a video telephone in 1974, with a bit rate of 1.5 megabits per second. It was too expensive.

Nebeker:

But he was using that technique?

Musmann:

Yes, the idea of transmitting only the changed parts of an image came from that group. The main people in that group then were John Limb, Charlie Rubenstein, F. W. Mounts, and A. N. Netravali, who is today vice-president of the AT&T research labs in the United States. But we had the problem that if there was a lot of change then the capacity of the 64 kilobits per second line was still too small. And we investigated techniques to solve that.

Nebeker:

New source coding?

Visual Perception & Reducing Bit Rates

Musmann:

Yes. We transmitted only those parts that had changed. And we tried two techniques to avoid these problems when not enough bits are available. You can reduce the number of bits by mathematics, exploiting the statistical characteristics of the signal. And you can exploit the visual perception of the eye. You hide distortions where your eye will not notice them.

Nebeker:

Can you give me an example of a type of distortion that wouldn't be noticed by the human eye?

Musmann:

Yes. We know that the human eye is more sensitive to low frequency detail than to high frequency detail. Your eye can distinguish very nicely small changes of the gray value in low-frequency image areas. But in the high-frequency image areas, your eye is less sensitive. It cannot distinguish small changes of the gray level differences, so you can allow coarse quantization in the high frequency part and thus you save bits.

But there are even more of these effects, especially if you have temporal changes. The eye recognizes details only after some time, not immediately. If you have a very fast change, you can't recognize immediately the detail in the next picture. You need some time to recognize it.

Nebeker:

Your work was then in part guided by research on human perception?

Musmann:

Yes. We collaborated with physiologists, and still do it today. We exploit both statistics and human perception.

Nebeker:

Was any research in the physiology of perception carried out specifically for your project?

Musmann:

We found that their interest is oriented toward different problems than ours. They are interested in understanding the processing of the visual information in the brain. But we want to know answers to very specific questions. We got some support, but the investigations, especially for application in coding, are then verified by engineers. So we try to study what physiologists have found, and then, in order to find out if we can use it, we try to verify it in connection with our application.

Nebeker:

So when you try a different manner of coding, you conduct experiments yourself to judge its effectiveness.

Motion Estimators

Musmann:

We did try to exploit characteristics of human visual perception, but there was still obvious distortion in the images transmitted at 64 kilobits per second. We spent years investigating one way of reducing the quantity of information transmitted, one that exploits statistics.

If you have a moving object, then its position differs from frame to frame. The area covered by the moving object in the two frames is the area that has changed from one frame to the next. If it is possible to measure the motion of the object, then you can use the object from the previous frame to put the object in the correct position in the next frame. You do not need to transmit all the areas that have changed; you transmit only those parts that have been uncovered. The moving object itself is reconstructed from the information you already have transmitted, which is stored in the frame memory. But in order to do this, you need some technique for measuring the motion of the objects and you have to transmit the motion information.

The first paper that investigated how to measure motion was, I think, one by John Limb at Bell Labs. But these techniques were not accurate enough at the beginning to be used in coding. We needed more precise motion measurements. So we spent a great effort developing what we call motion estimators that were reliable and had an accuracy sufficient for coding.

Nebeker:

What other groups, besides the group at Bell Labs and your group, were working on these problems?

Musmann:

The PTTs worldwide.

Nebeker:

There were a lot of people?

Musmann:

Worldwide, yes, and a lot of universities. In the United States at that time, there was Professor Pratt at the University of Southern California in Los Angeles. Also, MIT was always working in this field, and there was Professor Huang in Illinois. Nowadays you also find the University of California, Santa Barbara (UCSB) and the University of California, Davis (UCD) working in this field.

Nebeker:

It seems that it was a hot topic that a lot of people were working on.

Musmann:

Yes. We recognized that motion estimators might be the key to a greater bit-rate reduction. Later on, this turned out to be true.

Funding for Institute's Research

Nebeker:

Can you tell me how you started the Institute here in 1972? What was the staff initially?

Musmann:

I started, together with a secretary and three or four engineers. These were researchers or, as we call them, wissenschaftliche Mitarbeiter.

Nebeker:

They were working on their Ph.D.s?

Musmann:

Yes. I started the work with these four staff members. But then I came in contact with industries and with PTTs worldwide. Today 45 people are working here and forty are paid by industries.

Nebeker:

You've been able to build this up by bringing in outside money?

Musmann:

Yes.

Nebeker:

Where did these initial graduate students come from? Did you know some of them at Braunschweig?

Musmann:

Yes. The first came from Braunschweig, and then I picked up a few from Hannover.

Nebeker:

So it was you and this group of graduate students who were doing this?

Musmann:

Yes. The graduate students, what we call wissenschaftliche Mitarbeiter, are financed, for the most part, from outside the Institute.

Nebeker:

What were the main sources of outside money?

Musmann:

Of course, there is the support of the DFG—the German science foundation. There are also the German PTT, the German ministry of research, and industries.

Nebeker:

Which industries were interested in this Institute?

Musmann:

Communication industries.

Nebeker:

Which ones in particular?

Musmann:

At the beginning, there were for instance—and this is always changing—Hell, BASF, Bosch, Telefunken, and Siemens.

Nebeker:

These were all German companies?

Musmann:

Yes, but only at the beginning. Now we are collaborating with European and Japanese industries. We always have guest scientists here who are paid from the outside—from outside governments and industries. They are supported to study here. The research topic is recommended by the Institute and agreed to by the home institution. I do not take care of the financial aspects, because they are paid by their company.

Nebeker:

But they work with you?

Musmann:

Yes.

Nebeker:

Do they conduct projects of their own or do they take part in projects that you already have going here?

Musmann:

They take part in the projects we have going here. I think that all together there are about ten scientists here who are supported by industries from other countries.

Nebeker:

So how many researchers do you have now?

Musmann:

All together we are five professors and about 85 researchers. They are divided up into groups, but they collaborate closely.

Motion-Estimator Progress & ISDN Standards

Nebeker:

To return to your career: it was about 1980 and you were working on video research.

Musmann:

At that time we tried to improve video transmission techniques or coding techniques by developing motion estimators. I remember that at the beginning, people who did not have motion estimators that worked precisely enough came to incorrect conclusions. We made some theoretical investigations and found out that it is true: if the estimator is not accurate enough, it makes the results worse. So we worked to develop stable and accurate motion estimators. Then there was a breakthrough.

Nebeker:

When did that take place?

Musmann:

It was between 1983 and 1986. It took three to four years to develop these estimators. With their help, it was possible to save bits as long as an object was moving in pure translation. This breakthrough cut down the bit rate.

At about the same time, at many labs, people showed that it was possible to transmit color moving pictures with 64 kilobits per second. At this point, the CCITT initiated standardization for transmitting moving images with 64 kilobits per second. The solution was the following: a new image is split into blocks of sixteen by sixteen picture elements. And for each block, a motion vector is transmitted, which indicates the position of the content of a block in the stored preceding image. Using the preceding image and the motion vector, the receiver composes, block by block, a so-called prediction image, assuming that the blocks move in pure translation. Only very few remaining prediction errors have to be transmitted. This technique allows us to transmit slowly moving images at 64 kilobits per second. At this point, the standard was fixed.
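[Editor's illustration: the block-based scheme described above can be sketched as an exhaustive block-matching search. This is a minimal sketch and not the estimator developed at the Institute or the procedure of the standard; the sum-of-absolute-differences cost, the search range, and all function names are assumptions for the example.]

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n-by-n block at (bx, by) in cur
    and the block displaced by (dx, dy) in the stored preceding frame ref."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_vector(cur, ref, bx, by, n=16, search=7):
    """Try every displacement within +/- search pixels and keep the cheapest one."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue  # displaced block would fall outside the stored frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


def predict_block(ref, bx, by, vector, n=16):
    """Receiver side: copy the displaced block out of the stored preceding frame."""
    dx, dy = vector
    return [row[bx + dx: bx + dx + n] for row in ref[by + dy: by + dy + n]]
```

[The sender transmits the vector for each block plus whatever prediction error remains; when the block really did move in pure translation, the remaining cost is zero and nothing more needs to be sent for it.]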

The ISDN network was developed at the same time, so it turned out to be a good idea to work on 64 kbit/s transmission rates. ISDN uses an existing analog telephone line for transmitting two times 64 kbit/s plus 16 kbit/s. A worldwide ISDN network is being built up. We need one 64-kbit/s channel for speech, and the other one now can be used for video application. But these systems are very expensive. They still cost ten thousand to twenty thousand D-marks, at the present. We expect that as soon as there are single chip solutions, the price will go down to two thousand D-marks, and even below two thousand.

Nebeker:

Do you think this will be a big market?

Musmann:

Yes, I think so. At the beginning it will be mostly used for special applications, such as surveillance.

Nebeker:

Excuse me for asking a mercenary question, but if this develops into a large industry, do you have patents on this? Might it be valuable to the Institute?

Musmann:

Yes. We have some patents. Some of our research members get money already.

Nebeker:

Do they hold individual patents? What is the patent situation here at the university?

Musmann:

We normally get patents through collaborations with the industries or PTTs. That means that these industries help us to write up and file for a patent. Because they have financed the research, we feel obliged to offer the patent to them. The research staff members here are considered as employees of the industry. This is very helpful to us. We don't have the problem of financing the patent. After patent granting, we offer the patent to the industry that we have collaborated with. We have an agreement with the industry that the individual who did the research has the same advantages as an employee of that industry.

Nebeker:

When was this standard for visual communication agreed upon?

Musmann:

It took about three or four years to work on the standard, and it was fixed four years ago, in about 1989. But now the industries will need several years to develop the chips and the systems.

Nebeker:

Can you tell me about the process of reaching the standard?

Musmann:

It was initiated by the CCITT. They invited the members of the CCITT, mainly the PTTs, to contribute to a group set up to develop the standard. Then the PTTs and the companies sent delegates to this group, which met, I think, every three or four months and developed this standard. I will come back to the question of how this is done later. I have done this type of work myself in the 1990s.

Object-based Coding

Musmann:

Analyzing the coding standard, we recognize that the prediction errors, which have to be transmitted, are mainly at the boundaries of the moving objects. Because of the block-wise processing there are parts outside the moving object and parts inside. Thus you cannot define a correct motion vector. The next step is to figure out what you have to do. We decided not to use blocks. Instead of blocks we use arbitrarily shaped moving objects.

Nebeker:

You then somehow have to code the shape of the object?

Musmann:

Yes, and that is the problem we're working on today. In 1989 we suggested object-based coding instead of block-based coding. We represented each moving object by three sets of parameters: the shape, the motion, and the color of the object. If you transmit these parameters, you can synthesize the image at the receiver. The code estimates the motion of an object, and also tries to estimate the three-dimensional shape. The shape is represented by a wire frame. The color is projected on top of the wire frame surface.

We have developed algorithms that automatically estimate the three-dimensional shape and the three-dimensional motion of an object. Then we move these model objects and calculate a projection of the changed scene. Thus we generate a moving image sequence, which is used for prediction, parallel to the real one. By this technique, we have reduced the areas of prediction errors to only four percent of each image. Ninety-six percent is predicted correctly and needs no transmitted information. This makes it possible to transmit moving images with a bit rate in the range of 8 kbit/s to 64 kbit/s. That means you can transmit moving images in the mobile telephone system.

In the next step of our research, this wire frame will use a predefined wire frame for the head. Then, according to a proposal of Professor Forchheimer of Sweden, we need only one so-called action unit to describe what the eye is doing. Motion around the eye is automatically synthesized or animated by the transmitted action unit.

Nebeker:

So this is a move to a smart camera, a camera that understands the world as objects.

Musmann:

Now there are new outcomes from this work. If you can automatically model a complicated 3-D object, this is of great interest for producing movies. Nowadays in animated movies, the animators produce the objects by hand, and then they move the objects through computer animation. Now you are able to automatically model real-looking objects, which is very complicated to do by hand. This means that you can combine real and virtual objects. If you combine these two, then you can generate movies of a world that is a mixture of real and virtual worlds.

Nebeker:

So, for example, you could have a person, for which you have a real image, saying things that he never said?

Musmann:

That is also possible. You can mix, for instance, the texture of one person with the body of another one.

Nebeker:

Is any of this making its way into film making?

Musmann:

Yes. Such a film is being produced in our laboratory at this moment. A newspaper asked us to model a package of newspapers. And we did it. Of course, if you have such a computer model of a package of newspapers, you can do all kinds of computer animation. And it now looks very realistic. We animated a sequence where a spot goes into space and finds this package. The package flies toward you and comes closer to your eyes than you could normally see. And then you see the letters running like a runway. It looks realistic. But everything is virtual. There was no real camera. Everything was done by the computer.

3-D Communication

Nebeker:

Your interest has always been for communications over telephone channels.

Musmann:

Yes. Our view of the future is that we will be able to have visualization of a human being. Then the long-term future of a communication system could look like this: if you want to speak with somebody, then the other person should appear on the other side of your desk, in actual size, without a monitor; you talk to him, and you can put some documents on the table that he sees also, and both can write on them. That should be the future.

We are preparing experiments for that. Since our algorithm is able to calculate this three-dimensional shape of a person, we can transmit the three-dimensional shape. We are able to reproduce the three-dimensional object here, and with motion. What we need is a projection system that gives the impression that you are really seeing a 3-D object or a 3-D person. In reality, we only transmit the model and visualize the model object.

At present we use a stereoscopic projection system for which you need polarized glasses. There are also autostereoscopic displays that do not require glasses. By stereoscopic vision you can put an object on the table, but you cannot look behind the object. However, if you track the motion of the head of the viewer, then you can immediately calculate and project the views from that side. That means you always have the impression that you're looking around the object. As soon as you have the 3-D model, you can do that, because then the computer knows your position and can calculate the view from that position.
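[Editor's illustration: calculating the view from the viewer's position amounts to re-projecting each model point toward the current eye position. This is a geometric sketch only, not the Institute's projection system; the coordinate convention (screen in the plane z = 0, viewer at positive z, model behind the screen at negative z) and the function name are assumptions.]

```python
def project_to_screen(point, eye):
    """Intersect the ray from the eye through a model point with the screen plane z = 0."""
    px, py, pz = point
    ex, ey, ez = eye
    if ez == pz:
        raise ValueError("ray is parallel to the screen plane")
    t = ez / (ez - pz)  # fraction of the way from the eye to the point
    return (ex + t * (px - ex), ey + t * (py - ey))
```

[A point lying in the screen plane stays where it is when the viewer moves, while a point behind the screen shifts on the screen as the eye position changes. Redrawing with the tracked eye position is what gives the impression of looking around the object.]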

Nebeker:

You don't really have full information of that three-dimensional object. You're starting with just some projections of it.

Musmann:

In the case of coding, we do not have a complete model of the object. But it appears to be complete with respect to the angle of view of the original camera.

Nebeker:

But what if you, at your desk, move quite a bit in looking at this object for which you don't have full information?

Musmann:

Then you see a distorted object. There are still a lot of unsolved problems. There are situations where it is complicated to estimate the shape correctly. I remember it took four years to develop an estimator for motion. Shape estimation will also take four years. Furthermore, the estimators could not have been realized if there had not been microelectronics. To make these complicated calculations and estimations, in real time, you need microelectronics. That was a necessary parallel development to this.

Influential Publications in the Field

Nebeker:

How will this development look when one looks back on the 1960s, '70s, and '80s from the future? What will one think of as the really important breakthroughs? One way to document that is to determine what books, textbooks, or research papers were very influential. So one thing I like to ask people is what publications have been very influential to them.

Musmann:

The main influence came from the fundamental work of Claude Shannon. There are only a few books in the special field of image coding. The first book on image coding was one edited by Pratt. It was made up of contributions from several research groups.

Nebeker:

Pratt's Image Transmission Techniques, 1979.

Musmann:

The next book was one edited by Netravali and Haskell. It is five years old. There was a long time between the two books. There aren't so many. Another book, which is more on speech coding, is edited by Jayant and Noll. Their content is based on many papers published by various authors.

Nebeker:

It was such a new field that there were no fundamental books.

Musmann:

No. It was just the beginning of digital communications. I think it is about twelve years ago that the first proposals were made for ISDN (Integrated Services Digital Network). Just now ISDN is coming onto the market, so it takes a very long time. Remember, we showed the first 64-kbit/s transmissions in 1979, and only now can the first systems be bought.

Digital Audio

Musmann:

But in digital audio the development was much faster. You know the compact disc. Compact discs present a digital sound signal of 1.5 megabits per second, providing excellent quality. That's why the compact disc is growing so rapidly and records are vanishing. Of course there will be a need for multimedia communications in the future (we haven't talked about that), so that the bit rate for sound coding, 1.5 Mbit/s, has to be reduced. There was also a standardization initiative on coding of audio and video for broadcast and computer applications, which is still going on. It was initiated in 1990. The ISO (International Organization for Standardization) has established a special group, MPEG (Moving Picture Experts Group). The first aim of this standardization group was to represent a sound signal by two times 128 kbit/s (which is 256 kbit/s) instead of 1.5 Mbit/s, providing a sound quality that cannot be distinguished from the original. The second aim was to develop a video coding standard that cuts down the bit rate of a TV signal with reduced resolution to 1.1 Mbit/s in a first step, and that of a full resolution TV to 4 to 8 Mbit/s in a second step.
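[Editor's illustration: the audio figures above can be checked with a short calculation. The compact disc parameters used here (44,100 samples per second, 16 bits per sample, two channels) are the standard CD audio values and are not stated in the interview; they give about 1.4 Mbit/s, which the interview rounds to 1.5 Mbit/s. The function names are hypothetical.]

```python
CD_SAMPLE_RATE_HZ = 44_100   # standard CD audio value, not from the interview
CD_BITS_PER_SAMPLE = 16      # standard CD audio value, not from the interview
CD_CHANNELS = 2              # stereo


def cd_bit_rate():
    """Raw audio bit rate of a compact disc in bits per second."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS


def compression_ratio(source_bps, coded_bps):
    """How many times smaller the coded bit rate is than the source bit rate."""
    return source_bps / coded_bps


MPEG1_TARGET_BPS = 2 * 128_000  # two channels at 128 kbit/s, as stated above
```

[Under these assumptions the first MPEG audio target corresponds to a reduction by a factor of roughly 5.5.]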

The first step—we call it MPEG-1—was finished two years ago. I was chairman of the audio part. My colleague Dr. LeGall was responsible for the video. The best researchers from industries and universities contributed to the standardization, and we succeeded.

I wanted to mention this because in this audio coding technique there is a special processor that simulates the processing of the ear. The sound signal is split into frequency bands as our ear does in the cochlea. The signals of the frequency bands are then sampled, and the quantization introduced is controlled by a special model called the psycho-acoustic model. The model simulates the perception thresholds of the ear.

Nebeker:

Which are derived from physiological information?

Musmann:

Yes. So the model of the ear is continuously calculating the sensitivity thresholds for additional noise in the different frequency bands, and then the coding and quantization is automatically and continuously adapted to this perception. By this technique it was possible to cut down the bit rate to two times 128 kbit/s. Now we want to cut down the bit rate to two times 64 kbit/s. Then there are many completely new applications.
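[Editor's illustration: adapting the quantization to perception thresholds can be sketched as a bit allocation per frequency band. This is a strongly simplified sketch and not the procedure of the MPEG audio standard; the rule of thumb of about 6.02 dB of signal-to-noise ratio per quantizer bit is a general property of uniform quantizers, and the band levels in the usage note are invented for the example.]

```python
import math

DB_PER_BIT = 6.02  # approximate signal-to-noise gain per extra quantizer bit


def allocate_bits(signal_db, mask_db):
    """Give each band just enough bits to push quantization noise below its threshold.

    signal_db: signal level per band in dB
    mask_db:   audibility threshold for added noise per band in dB
    """
    bits = []
    for signal, mask in zip(signal_db, mask_db):
        signal_to_mask = signal - mask
        bits.append(max(0, math.ceil(signal_to_mask / DB_PER_BIT)))
    return bits
```

[For hypothetical band levels of 60, 40, and 20 dB with thresholds of 30, 38, and 25 dB, this gives 5, 1, and 0 bits: a band whose signal already lies below the threshold is not transmitted at all.]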

Nebeker:

This is CD quality?

Musmann:

Yes. Then, you could use the telephone line to call a concert hall and connect your high fidelity set to the line. Then you would have CD quality music from a concert hall. This will also allow broadcasting in the terrestrial ISDN network. Furthermore, instead of an optical CD player, you can construct a CD player without moving parts by use of such a coding system. You can store one hour of music on a chip card. This means you don't need a laser. There would be no motion, so no moving parts.
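[Editor's illustration: the memory needed for one hour of music at two times 64 kbit/s follows from a one-line calculation. The sketch assumes 1 kbit = 1,000 bits and ignores any storage overhead; the function name is hypothetical.]

```python
def storage_bytes(bit_rate_bps, duration_s):
    """Bytes needed to store a stream of the given bit rate for the given duration."""
    return bit_rate_bps * duration_s // 8


ONE_HOUR_S = 3600
CODED_STEREO_BPS = 2 * 64_000  # two channels at 64 kbit/s, as stated above
```

[Under these assumptions one hour of music takes about 57.6 million bytes.]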

Nebeker:

You could put this into your sound system?

Musmann:

Yes. We published this one or two years ago, and we want to reach this point by the end of the next standardization. By use of the video coding standard, which is blockwise motion-compensated coding (and not yet with the contours), the bit rate of a TV signal can be reduced from 166 Mbit/s down to 4 Mbit/s. And that is with a very good digital TV quality. This allows satellite broadcasters to transmit ten digital TV signals in one analog TV channel. That is happening in the United States at the present time, and in Europe it's coming in the next two years. We have a few satellites that have been developed to transmit twenty analog channels. But this means instead of twenty analog, you have two hundred digital channels. What is important is that the price of a channel is reduced. The reduced costs will allow channels that can be used not only by broadcasters, but also by publishers who want to make advertisements or to provide services.
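[Editor's illustration: the television figures above work out as follows. All numbers are taken from the interview; the function names are hypothetical.]

```python
def reduction_factor(source_mbps, coded_mbps):
    """Ratio between the uncompressed and the coded bit rate."""
    return source_mbps / coded_mbps


def digital_channels(analog_channels, digital_per_analog):
    """Digital TV signals that fit into a satellite built for analog channels."""
    return analog_channels * digital_per_analog
```

[Going from 166 Mbit/s to 4 Mbit/s is a reduction by a factor of about 41, and ten digital signals in each of twenty analog channels give the two hundred digital channels mentioned.]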

Progress of Communications Technology

Nebeker:

It's fascinating what the implications of better coding are.

Musmann:

We did not recognize this development from the beginning. We knew that it was interesting to reduce the bit rate, but we were unable to see all the outcomes. If you look at these 3-D representations, you see that it is possible to make use of this technique for various applications, including medicine. If you have a precise model for instance of a head, its topography, you can visualize the head in three dimensions. You can look around it. If you need to operate in the head, then in the future it may be possible that the surgeon will do this using the model.

Nebeker:

He could do a virtual operation in advance.

Musmann:

Yes. You can do this much more precisely because you see everything. If a tumor is behind your eye, it is very complicated today for the surgeon to get to it. The surgeons use tomographic images in order to measure how to get to that place. Going there they can look locally but miss the three-dimensional orientation, which may be provided by 3D visualization.

Nebeker:

That's fascinating.

Musmann:

To come back to your question. What I have learned during all this time is that information and telecommunication technologies have developed faster and faster from year to year. I see no reason to say we are coming to any saturation. Beginning in about 1920, the development became faster and faster, from decade to decade. What argument is there for believing this will change in the next decade?

Nebeker:

There are certainly examples of technologies—some transportation technologies, for example—where there were sudden spurts of development. The automobile of today isn't all that different from one of the 1930s. But it's hard to see that there is any plateau in communications technologies.

Musmann:

There will be new ideas in communications. There will also be new ideas in microelectronics. I heard a presentation a few weeks ago about communications between human cells—involving the structure of DNA, the coding of DNA. Every part of the DNA describes a part of the human body. When I imagine the size of this kind of coding and processing and compare this with today's microelectronics, I see much room for future developments!

Nebeker:

Who knows what the commercial possibilities are for these things?

Musmann:

The size of the memory cells is 10,000 to 100,000 times smaller than our smallest computer cells today.

Nebeker:

A lot of progress might still be made.

Musmann:

There is certainly still a big field for research in communications and information technologies.