IEEE
You are not logged in, please sign in to edit > Log in / create account  

First-Hand:Cryo CMOS and 40+ layer PC Boards - How Crazy is this?

From GHN

(Difference between revisions)
Jump to: navigation, search
m (Text replace - "[[Category:Computers_and_information_processing" to "[[Category:Computing and electronics")
 
(39 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 +
'''Contributed by:''' Tony Vacca
 +
 
== How it started  ==
 
== How it started  ==
  
It was in the early 80's.  Control Data (CDC) had just launched the CYBER - 205 with modest success and the team was now focused on the next generation machine, the 2XX as I recall.  Speed, cost and meeting the schedule were all key objectives.  Speed because Cray Research under the guidance of Seymour Cray was setting  milestones for Supercomputers with the Cray 1 and then the Cray 2.  Cost, since Supercomputers were extremely expensive.  Schedules since the CYBER - 205 had established patience records as a machine that may never get out the door and this must not be repeated.  
+
<p>It was in the early 80's. Control Data (CDC) had just launched the CYBER - 205 with modest success and the team was now focused on the next generation machine, the 2XX as I recall. Speed, cost and meeting the schedule were all key objectives. Speed because Cray Research under the guidance of Seymour Cray was setting milestones for Supercomputers with the Cray 1 and then the Cray 2. Cost, since Supercomputers were extremely expensive. Schedules since the CYBER - 205 had established patience records as a machine that may never get out the door and this must not be repeated. </p>
  
A conventional evolutionary approach for Integrated Circuit (IC) logic was initially selected. &nbsp;Motorola, with some prodding, agreed to launch an 8,000 gate equivalent ECL (emitter-coupled-logic - the circuitry of choice for high performance processing units) provided that Control Data do the actual circuit development. &nbsp;There were insufficient customers for Motorola to commit their resources to this lofty development. &nbsp;Motorola did, however, commit their advanced ECL processes to CDC and a joint team was formed with the two companies. &nbsp;
+
<p>A conventional evolutionary approach for Integrated Circuit (IC) logic was initially selected. Motorola, with some prodding, agreed to launch an 8,000 gate equivalent ECL (emitter-coupled-logic - the circuitry of choice for high performance processing units) provided that Control Data do the actual circuit development. There were insufficient customers for Motorola to commit their resources to this lofty development. Motorola did, however, commit their advanced ECL processes to CDC and a joint team was formed with the two companies. </p>
  
Logic designers at the CDC Advanced Design Laboratory were given preliminary design rules based on computer device &nbsp;models and estimates of gate per chip densities. &nbsp;There was a natural follow up of grumbling by the logic design team led by very experienced and innovative folks (Ray Kort, Maurice Hudson and Dave Hill to name three) &nbsp;but circuit designers had learned to accept this since logic designers always found the circuits to be too slow and insufficient an quantity of gates and pins (I/O ports) per die. There was a lot of cooperation too. &nbsp;Basic building blocks were defined by the logic designers - gate functionality, register functionality, etc. From this set of preliminary rules &nbsp;function blocks were defined and capacity per reasonably-sized Printed Circuit (PC) boards defined. The initial design using the Cray CYBER - 205 based architecture was launched.  
+
<p>Logic designers at the CDC Advanced Design Laboratory were given preliminary design rules based on computer device models and estimates of gate per chip densities. There was a natural follow up of grumbling by the logic design team led by very experienced and innovative folks (Ray Kort, Maurice Hudson and Dave Hill to name three) but circuit designers had learned to accept this since logic designers always found the circuits to be too slow and insufficient an quantity of gates and pins (I/O ports) per die. There was a lot of cooperation too. Basic building blocks were defined by the logic designers - gate functionality, register functionality, etc. From this set of preliminary rules function blocks were defined and capacity per reasonably-sized Printed Circuit (PC) boards defined. The initial design using the Cray CYBER - 205 based architecture was launched. </p>
  
<br>  
+
<p>In parallel with this effort, and in the same design group; i.e.; circuit, packaging, PC board and newly formed CAD (tools for layout and design of chips and boards) - chief chip design engineer - Randy Bach - was assigned to develop an advanced CMOS chip for the Canadian Computer Development organization. At this time, early 80's CMOS was in it's infancy being used for memory devices, low performance peripherals and also for low performance microprocessors (5 to 10 MHz clock speeds). The design contained 5,000 gates plus appropriate input and output communication devices. Gate arrays for CMOS was also nearly non-existent so Randy and his small team of two assistants developed a cell library and worked closely with the Canadian Development team to meet their objectives as well. </p>
  
''In parallel with this effort, and in the same design group; i.e.; circuit, packaging, PC board and newly formed CAD (tools for layout and design of chips and boards) - &nbsp;chief chip design engineer - Randy Bach - was assigned to develop an advanced CMOS chip for the Canadian Computer Development organization. &nbsp;At this time, early 80's CMOS was in it's infancy being used for memory devices, low performance peripherals and also for low performance microprocessors (5 to 10 MHz clock speeds). &nbsp;The design contained 5,000 gates plus appropriate input and output communication devices. &nbsp;Gate arrays for CMOS was also nearly non-existent so Randy and his small team of two assistants developed a cell library and worked closely with the Canadian Development team to meet their objectives as well. &nbsp;''
+
<p>This effort was completely separate from the ECL based gate array to be used for the next generation Supercomputer. The product was developed for a low cost application. </p>
  
''This effort was completely separate from the ECL based gate array to be used for the next generation Supercomputer. &nbsp;The product was developed for a low cost application.''
+
<p>It was customary for Neil Lincoln - chief architect, Dale Handy - manufacturing manager and me to go off to lunch every 8 to 10 days to discuss status at either Author Treacher's Fish &amp; Chips or Zantigo's (high class - NOT) fast food restaurants. As a side note, both of these fast food places disappeared during the ETA Systems brief duration. Zantigo's has returned (I think because they know it is safe now that the three of us cannot visit together any longer - Neil unfortunately passed on a few years ago). </p>
  
<br> It was customary for Neil Lincoln - chief architect, Dale Handy - manufacturing manager and me to go off to lunch every 8 to 10 days to discuss status at either Author Treacher's Fish &amp; Chips or Zantigo's (high class - NOT - fast food restaurants). &nbsp;''As a side note, both of these fast food places disappeared during the ETA Systems brief duration.&nbsp;''&nbsp;&nbsp; ''Zantigo's has returned (I think because they know it is safe now that the three of us cannot visit together any longer - Neil unfortunately passed on a few years ago).''
+
<p>At one of these meetings, Neil had "news" for me. Simply stated, the gate array in active co-development with Motorola had unacceptable goals. The chip had too few I/O pins, consumed too much power and insufficient gates. In addition, he completed a cost model which indicated an unacceptable cost figure for the CPU. He also determined that the CPU (some 3 Million gates) had to be assembled on a single board. "It was time for this goal to be reached". He also reached the conclusion that a proper logic design required at least 15,000 gates per chip to meet these goals. </p>
  
At one of these meetings, Neil had "news" for me. &nbsp;Simply stated, the gate array in co-development with Motorola had unacceptable goals. &nbsp;The chip had too few I/O pins, consumed too much power and insufficient gates. He had determined that the CPU (some 3 Million gates) had to be assembled on a single board. &nbsp;"It was time for this to be done". &nbsp;He also reached the conclusion that the logic design required at least 15,000 gates per chip to meet these goals. &nbsp;
+
<p>The logic designers had gotten to him I surmised. Schedules, Neil reminded us, could not be altered - and that was that. To soften the blow he bought lunch that day, three Cokes and three orders of fish and chips - Neil's was a large order. </p>
  
''The logic designers had gotten to him I surmised.'' Schedules, Neil reminded us, could not be altered - and that was that. &nbsp;To soften the blow he bought lunch that day, three Cokes and three orders of fish and chips - Neil's was a large order.  
+
<p>The trip back to the lab was pretty quiet, fortunately short since our eating places were all very close to the lab. </p>
  
The trip back to the lab was pretty quiet, fortunately short since our eating places were all very close to the lab.  
+
<p>That afternoon, I assembled the key folks - I might miss one or two but Randy Bach, Doug Carlson, Dave Resnick and John Ketzler were four that I recall now. Doug was a mechanical engineer that I assigned the Motorola project to because of his management skills - something he probably never forgave me for - John was the key circuit engineer on the Motorola project and Dave was and still is a very versatile and perceptive engineer. </p>
  
<br>  
+
<p>Doug and I would inform Motorola of the decision not to continue. The team would package up what was accomplished and turn it over to Motorola to carry the ball forward if they wished. </p>
  
that afternoon, I assembled the key folks - I might miss one or two but Randy Bach, Doug Carlson, Dave Resnick and John Ketzler were four that I recall now. &nbsp;Doug was a mechanical engineer that I assigned the Motorola project to because of his management skills - ''something he probably never forgave me for ''- John was the key circuit engineer on the Motorola project and Dave was ''and still is'' a very versatile and perceptive engineer. &nbsp;
+
<p>As a side note, Motorola and Cray did continue the design. It was the circuit design used in the Cray C90, a very successful Cray Research Supercomputer. </p>
  
Doug and I would inform Motorola of the decision not to continue. &nbsp;The team would package up what was accomplished and turn it over to Motorola to carry the ball forward if they wished. &nbsp;'''''As a side note, Motorola and Cray did continue the design. &nbsp;It was the circuit design used in the Cray C90, a very successful computer'''''.  
+
<p>The meeting turned to what were the next steps. </p>
  
The meeting turned to what were the next steps.
+
<p>The key challenges that emerged were: </p>
 
+
The key challenges that emerged were:&nbsp;
+
  
 
*IC Technology that could meet the new lofty goals  
 
*IC Technology that could meet the new lofty goals  
Line 40: Line 40:
 
*Testing of complex IC technology and complex PCB technology
 
*Testing of complex IC technology and complex PCB technology
  
== == Summary of IC technology accomplishments == <!--StartFragment-->  ==
+
== Results  ==
 
+
== <br><!--StartFragment-->  ==
+
 
+
==== <span style="font-size: 21.0pt;font-family:Helvetica">'''ETA Systems Hardware'''</span><span style="font-size:21.0pt;font-family:Helvetica"> '''Technology&nbsp;'''</span><span style="font-size: 21.0pt;font-family:Helvetica">'''1980 – 1989'''</span> ====
+
 
+
==== '''<span style="font-size:19.0pt; font-family:Helvetica">Preface:</span>'''  ====
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">To restate the challenge: ETA Systems Inc. was spun off
+
as the Supercomputer subsidiary from a struggling Control Data Corporation
+
(CDC).&nbsp; The objective was to develop and manufacture High Performance Computers
+
or commonly called in the 80’s and 90’s simply Supercomputers.&nbsp; Cray
+
Research Inc. dominated this market during this time frame and CDC had a minor
+
market position introducing the Star-100 followed by the CYBER-203 and
+
CYBER-205 systetms.&nbsp; Novel architecture (fast scalar performance and the
+
efficient use of vectors), innovative software and highest performance
+
integrated circuit (resulting in the fastest clock period), innovative
+
packaging (to optimize device spacing and thermal management) differentiated
+
Supercomputers from conventional computer systems during this period. It must
+
be sated to be "fair and balanced" that Supercomputers also had the
+
highest price tag and demanded the largest memories and highest performance
+
peripherals and system bandwidths. Systems dominating the market during the
+
80’s were the Cray-1, Cray XMP and CYBER-205.&nbsp; NEC, Fujitsu and Hitachi
+
also developed systems in this market.&nbsp; The word Supercomputer was applied
+
to other products as well. It is not intentional to dismiss their recognition.</span>
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">The following overview will not enter into the decisions
+
to separate ETA Systems from Control Data Corporation organizationally,
+
although that topic is interesting as well.&nbsp; Nor will the following
+
discuss software innovations at ETA Systems – and there were many.</span>
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Architecture had a role in dictating the technology in
+
terms of number of logic circuits that were serial per clock cycle.&nbsp;
+
Architecture also demanded high performance large registers (temporary storage
+
devices) to be included which also dictated performance (clock cycle) of the
+
system.&nbsp; Other architecture features (instructions) dictated the number of
+
functions that constituted a processor (gates / CPU) that, in turn, determined
+
technology selection from a point of preferred Gates per Chip and Ports per
+
Chip. Proximity of chips to each other for processor design was crucial during
+
this time period since a CPU could not reside within the boundary of a single
+
chip as it easily does today. Bandwidth, i.e.; number of bytes per unit of time
+
that could be moved between functions within the CPU and the CPU and associated
+
memory is key and places demands on pins or logic paths between functions that
+
usually requires compromise in each and every design.</span>
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Those reading this now find this humorous I am sure with
+
multiprocessing units (multi CPUs) now residing within the boundaries of a
+
single chip or IC die. In the 80’s and well into the mid 90’s, however, a CPU
+
processor partitioning of necessary logic or Boolean functions on multiple
+
integrated circuit chips (usually multiple hundreds of chips) and multiple
+
complex printed circuit boards (2 to 8) was an integral part of determining the
+
overall performance, power consumption, cost and reliability of the system.</span>
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">&nbsp;</span><span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
</span><span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
</span>
+
 
+
== <span style="font-size:19.0pt; font-family:Helvetica">'''Introduction'''</span><span style="font-size:13.0pt; font-family:Helvetica">&nbsp;</span> ==
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">ETA Systems technology was selected in 1980 (the
+
organization was the Advanced Design Laboratory of Control Data Corporation at
+
the inception) with the following objectives:</span>
+
 
+
*
+
==== <span style="font-size:13.0pt;font-family:Helvetica">The highest performance
+
</span>Supercomputer at the time of product delivery ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt;font-family:Helvetica">The most cost effective technology
+
</span>available ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt; font-family:Symbol"><span>&nbsp;</span></span><span style="font-size:13.0pt;font-family:Helvetica">The lowest possible power
+
</span>consumption while meeting other objectives ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt;font-family:Helvetica">The largest product diversity
+
</span>with a single design&nbsp; ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt;font-family:Helvetica">The highest possible
+
</span>reliability. pins and interconnects usually dictated the reliability since by that time Integrated Circuit technology reached a very high reliability for both logic and storage devices ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt; font-family:Symbol"><span>&nbsp;&nbsp;</span></span><span style="font-size:13.0pt;font-family:Helvetica">Leverage as much of the
+
</span>technology as possible to the follow-on computer generation. This usually fell by the wayside until ECAD and MCAD technologies were introduced into the design ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt;font-family:Helvetica">Utilize only standard IC
+
</span>technology processes being developed for other markets. ====
+
 
+
 
+
*
+
==== <span style="font-size:13.0pt; font-family:Symbol"><span>&nbsp;&nbsp;</span></span><span style="font-size:13.0pt;font-family:Helvetica">Demonstrate the prototype of the
+
</span>product in less than four years. ====
+
 
+
 
+
=== <span style="font-size:18.0pt; font-family:Helvetica">'''Digging deeper into objectives:'''</span> ===
+
 
+
*
+
==== <span style="font-size:14.0pt;font-family:Helvetica">The highest performance
+
Supercomputer at the time of product delivery</span> ====
+
 
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">Simply stated, the highest performance processor solved
+
the largest problems most effectively.&nbsp; Performance was usually measured
+
in clock cycle that was unfortunate since variable amounts of calculations
+
could be made per clock cycle.&nbsp; This single parameter was the bragging
+
rights although later Gigaflops became the stated parameter and that also did
+
not necessarily reflect the true performance of a supercomputer.&nbsp; The
+
“king of the hill” at any given cycle (usually 2 to 4 years) held the largest market
+
share.</span>
+
 
+
*
+
==== <span style="font-size:14.0pt;font-family:Helvetica">The most cost effective
+
technology available</span> ====
+
 
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">''<span style="mso-spacerun: yes">&nbsp;</span>''</span><span style="font-size:13.0pt;font-family:Helvetica">The most “bang” for the buck
+
applied to Supercomputers as well as other markets.&nbsp; The customer was
+
willing to get the highest performance when solving his particular challenges
+
as a higher priority provided that the performance clearly exceeded lower cost
+
alternatives.''&nbsp;A legend of the Supercomputer industry - Jim Thornton -''
+
once described the requirement as getting through an intersection without
+
having an accident. &nbsp;Since a Supercomputer required so many components and
+
interconnects, they were bound to fail more rapidly than small computers. So -
+
Jim surmised, the faster the computer, the more things that could be solved
+
before something went wrong. &nbsp;Go through an intersection as fast as possible
+
- not at a slow rate and you have a better chance of getting through safely.</span>
+
 
+
*<span style="font-size:14.0pt;font-family:Helvetica"></span>'''<span style="font-size: 13.0pt;font-family:Helvetica">&nbsp;</span>'''
+
 
+
<br> <span style="font-size:13.0pt;font-family:Helvetica">Each follow-on generation of Supercomputer products
+
witnessed increased power per processor, which was justified by the resultant
+
performance realized.&nbsp;(The lower the RC time constant (resistance -
+
capacitance) the faster the computer clock cycle.) &nbsp;Since the lower the R,</span> the higher the power, this was a trend. &nbsp;As multi-processor units increased per system the power consumption became a major issue; the largest users for site related “wall plug” power capacity limits and for the small users for basic life-of-system cost concerns (mainly power consumption, cooling and system reliability).
+
 
+
*
+
==== <span style="font-size:13.0pt; font-family:Symbol"><span>&nbsp;</span></span><span style="font-size:13.0pt;font-family:Helvetica">'''The largest product diversity&nbsp;<span class="Apple-style-span" style="font-family: -webkit-sans-serif; font-size: 13px; font-weight: normal; ">with a single design'''<span style="font-size:13.0pt;font-family:Helvetica">&nbsp;</span>'''&nbsp;</span>'''</span> ====
+
 
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Design cycles were three to five years in duration and
+
large teams were assembled to complete a single design.&nbsp; Development costs
+
per product were significant. Product cost ($2M to $40M per system) and
+
performance ranges (greater than 20:1) for each generation of Supercomputers
+
were increasing.&nbsp; Since optimized cost points for each product were
+
technology dependent, this required multiple design teams – each design
+
utilizing a unique technology.&nbsp; Desire to utilize a single total design,
+
i.e., packaging, IC selection, manufacturing tooling, and associated boards,
+
connectors, etc. was desirable, therefore, for a myriad of obvious reasons.
+
&nbsp;In fact, most companies were focused on only a portion of the computer
+
market and dedicated to only a small portion of the product
+
"bandwidth". &nbsp;Cray Research focused on the high end, IBM, CDC,
+
Unisys and others did an admirable job in the middle and companies like DEC and
+
HP were at the lower end. &nbsp;There were others too across the world, but
+
these companies are only examples.</span>
+
 
+
*
+
==== <span style="font-size:14.0pt;font-family:Helvetica">'''The highest possible'''
+
</span>reliability'''<span style="font-size:13.0pt;font-family:Helvetica">&nbsp;</span>''' ====
+
 
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Supercomputers required significant bandwidth to pass
+
data between processing units and processors and memory.&nbsp; Bandwidth is a
+
key differentiator that separates true Supercomputers from conventional
+
computer systems.&nbsp; Interconnects were and still are dominant reliability
+
concerns in large systems.&nbsp; Thermal management is also significant. Large
+
systems, by the nature of the design require a large quantity of components to
+
be simultaneously functional.&nbsp; Each interconnect and each active logic
+
device (integrated circuit) requires the highest reliability to permit large
+
user problems to be solved using Supercomputers. Simply stated; Supercomputer
+
operational time had to exceed the size of the largest customer problem.</span>
+
 
+
*
+
==== <span style="font-size:14.0pt; font-family:Symbol"><span>&nbsp;</span></span><span style="font-size:14.0pt;font-family:Helvetica">Leverage as much of the
+
</span>technology as possible to the follow-on generation ====
+
 
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Significant cost for development of a given technology
+
“kit” had to be leveraged for as long a period as possible.&nbsp; It was a
+
“given” that most Integrated circuits would be developed for each generation of
+
computer. What about interconnects (connectors)? What about printed circuit
+
boards? What about support technologies like simulation tools, assembly tooling
+
and basic packaging?&nbsp; Can any of these technologies extend to the next
+
generation?&nbsp; And, are there any “mid life kickers” that could be inserted
+
into a successful product to extend its market life? IBM did exceptional work
+
in taking hardware across product boundaries and generations of new products.&nbsp;
+
The initial tooling for packaging was expensive but results in later products
+
appeared to prove dividends.&nbsp; Cray Research Inc. and Control Data, by
+
contrast, generated new packaging and connector technology with each new
+
generation of product.&nbsp;</span>
+
 
+
*
+
==== <span style="font-size:14.0pt;font-family:Helvetica">Utilize only standard IC
+
</span>technology processes being developed for other markets ====
+
 
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">This addresses two major issues, cost and access to
+
technology.&nbsp; Cost, since dedicated IC processing lines with unique
+
processes for low volume products – even if it could be realized – would not
+
allow effective amortization of costs for process development and
+
manufacturing.&nbsp; Access to technology addresses advanced popular (high
+
volume) processes to accommodate unique system designs. Innovation in the IC
+
industry was and is applied to the highest return on investment markets.&nbsp;
+
The “trick” was to apply this ”standard” and most innovative technology to low
+
volume Supercomputer applications.</span>
+
 
+
*
+
==== <span style="font-size:14.0pt;font-family:Helvetica">Demonstrate operating product
+
prototypes in less than four years&nbsp;</span> ====
+
 
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">This requires discipline as well as good management and
+
leadership.&nbsp; Due to the complexity of Supercomputers, we felt that tools
+
had to be upgraded significantly and checkout and diagnostics improved as well.
+
At any given time improvements are made in Supercomputer technologies. Allowing
+
each incremental development to be incorporated extends the product development
+
cycle significantly.&nbsp; Selecting known (proven) technologies at the time of
+
product development initiation results in a non-competitive product.&nbsp; Risk
+
must be taken.&nbsp; What areas to take risk (technologies that are not
+
available at product initiation but look the most promising) as well as the
+
return on the risk must be carefully evaluated with factors clearly
+
understood.&nbsp; Where to invest in new technology development must be understood
+
as well as the return on investment and the leverage of investment where other
+
markets are interested in common technologies must be understood. &nbsp;Back-up
+
alternatives should be identified. The CEO of Cray Research - Jon Rollwagon -
+
defined the challenge as "How many 3-point shots should each project
+
take?" Missing market cycles is costly.&nbsp; Under reaching
+
(conservative) and over-reaching (bad choices in “betting on the come”) were
+
also costly and prohibitive.&nbsp; All of these factors were carefully
+
evaluated.&nbsp;&nbsp;It might be added, using the same basketball comparison,
+
cannot have too many “''lay ups''</span><span style="font-size:13.0pt; font-family:Helvetica">” (sure things already developed) either!</span>
+
 
+
<br> <span style="font-size:13.0pt;font-family:Helvetica;mso-ansi-language:EN-US; mso-fareast-language:EN-US">
+
</span>
+
 
+
== &nbsp;<span style="font-size:13.0pt;font-family:Helvetica">&lt;o:p&gt;&lt;/o:p&gt;</span>  ==
+
 
+
== <span style="font-size:16.0pt; font-family:Helvetica">'''Results'''</span><span style="font-size:16.0pt; font-family:Helvetica">&lt;o:p&gt;&lt;/o:p&gt;</span> ==
+
 
+
''<span style="font-size:13.0pt; font-family:Helvetica">Before getting to the details as to how decisions were</span>'' made and how the ETA System technologies the “kit” was selected and developed, a list of noteworthy accomplishments achieved are listed:''<span style="font-size:13.0pt; font-family:Helvetica">&lt;o:p&gt;&lt;/o:p&gt;</span>''
+
 
+
*
+
 
+
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry competitive'''</span>'''===='''
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">&nbsp; Since 1995 – to the present (beginning 12 years after
+
the technology selection by ETA Systems I might add) ALL HPC (High Performance
+
Computers) are developed and manufactured using CMOS IC technology. &nbsp;Until
+
as late as 2000, bipolar technology (higher power, more costly to manufacture
+
and lower gate count per chip) dominated high performance computers throughout
+
the world.</span>
+
 
+
*
+
 
+
==== <span style="font-size:13.0pt; font-family:Helvetica"></span><span style="font-size:14.0pt;font-family:Helvetica">'''First Industry Single Board'''</span>'''===='''
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica"><span style="mso-spacerun: yes">&nbsp;</span>The chip density (gates per chip) allowed by advanced CMOS, the use of layout and design Computer aided design tools for optimum layout and simulation, the successful design of a 45 layer advance Printed Circuit board (''you read it right 45 layers''</span><span style="font-size:13.0pt;font-family:Helvetica">) and
+
innovative chip attachment and cooling permitted a single processor containing
+
nearly 3 million gates to be packaged on a single board.&lt;o:p&gt;&lt;/o:p&gt;</span>
+
 
+
*
+
 
+
==== <span style="font-size:13.0pt; font-family:Symbol"><span>&nbsp;</span></span><span style="font-size:14.0pt;font-family:Helvetica">'''First Industry system to be'''</span>'''<span style="font-size:13.0pt;font-family: Helvetica"> &lt;o:p&gt;&lt;/o:p&gt;</span> ===='''
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">CPU Processing units (''≈3Million gates each'') were validated for
+
functionality and performance in less than 4 hours.&nbsp;&nbsp;Any interconnect
+
errors were recorded and allowed chip-to-chip replacement to occur in a minimal
+
time.&nbsp;Other CPU checkout during this same period required weeks to months
+
to check out and validate a processing unit.&nbsp; Incoming testing of the logic
+
IC Chip (''function and performance'') also used the same self-test innovations.&lt;o:p&gt;&lt;/o:p&gt;</span>
+
 
+
*
+
 
+
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry production'''</span>'''===='''
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica"><span style="mso-spacerun: yes">&nbsp;</span>The ETA Systems CPU was immersed in Liquid Nitrogen – 77 degrees Kelvin – to improve performance greater than two times that CMOS technology operated at room temperature – 300 degrees Kelvin.&lt;o:p&gt;&lt;/o:p&gt;</span>
+
 
+
*<span style="font-size:14.0pt;font-family:Helvetica">'''First system at CDC to fully'''
+
</span>
+
 
+
utilize Computer Design Software to design Chips, boards, validate Logic design and Auto Diagnostic test the system with Synergistic tools'''<span style="font-size:13.0pt;font-family:Helvetica">.&nbsp; &lt;o:p&gt;&lt;/o:p&gt;</span>'''
+
 
+
<span style="font-size:13.0pt; font-family:Helvetica">Permitted checkout of a CPU to be completed in less than
+
4 hours.&nbsp; Manufacturing costs were greatly reduced. &nbsp;This technique
+
was also used at the IC Supplier and greatly reduced any probe test hardware
+
and software.&lt;o:p&gt;&lt;/o:p&gt;</span>
+
 
+
*
+
 
+
==== <span style="font-size:13.0pt;font-family:Helvetica">'''First Industry system to have'''</span>'''<span style="font-size:13.0pt;font-family:Helvetica">&nbsp; &lt;o:p&gt;&lt;/o:p&gt;</span> ===='''
+
 
+
<br> <span style="font-size:13.0pt; font-family:Helvetica">Performance range of the ETA System products was greater
+
than 24:1 (8 processor system operating at 7 nanoseconds Clock period and a
+
single processor system operating at 24 nanoseconds.).&nbsp; Processors were
+
manufactured, tested and validated from a single manufacturing line using
+
identical components.&nbsp; (''IC''</span><span style="font-size:13.0pt; font-family:Helvetica"> ''Chips were performance sorted using auto self test''</span>''<span style="font-size:13.0pt; font-family:Helvetica">). Product differences began at the system packaging</span>'' level.&lt;o:p&gt;&lt;/o:p&gt;
+
 
+
== <span style="font-size:13.0pt; font-family:Helvetica">&nbsp;&lt;o:p&gt;&lt;/o:p&gt;</span>  ==
+
 
+
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
<!--StartFragment-->
+
 
+
 
+
<span style="font-size:
+
20.0pt;font-family:Helvetica;color:red">Boring into the details<o:p></o:p></span>
+
 
+
<span style="font-size:
+
20.0pt;font-family:Helvetica">Boring into details<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">Any Technology kit must be driven by a customer
+
need.&nbsp; In the case of Supercomputers the craving for increased computer
+
performance at a lower cost (overall cost) was the deciding factor.&nbsp; In
+
any Supercomputer company a combination of marketing requirements, architecture
+
innovations and logic design demands dictate the initial objectives of the
+
hardware circuit and packaging organization.&nbsp; I state “initial” since once
+
the objectives are digested and key technologies are evaluated for the time
+
frame addressed, compromises are the norm. In the case of ETA Systems
+
technology selections in the early 1980’s, this was the strategy implemented.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The following paragraphs sequence the thought process and
+
the technology selection strategy utilized.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">'''Integrated Circuit selection:'''</span><span style="font-size:13.0pt;font-family:Helvetica"><o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The objectives, listed in earlier paragraphs were first
+
integrated into the architecture and logic design requirements.&nbsp; A market
+
survey of key integrated circuit suppliers was conducted with emphasis on what
+
was in development and planned for product introduction – not what was
+
available at the time of the survey.&nbsp; A risk assessment was made.&nbsp;
+
Primary focus was on the most dynamic technology, the IC Logic technology.&nbsp;
+
All decisions as to volume requirements, pins, packaging, etc. resulted from
+
what was determined by this survey and risk analysis.&nbsp; Merging the logic
+
design objectives (gates, bandwidth and performance of key functions) was
+
next.&nbsp;&nbsp;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">An ECL (emitter coupled logic) high performance bipolar
+
gate array using Motorola advanced IC technology was selected.&nbsp; Since
+
Motorola was not fully staffed to begin the actual product development&nbsp;
+
(application) but did have the process development underway, a cooperative
+
development agreement was struck with the two companies (this occurred between
+
Motorola and Control Data since ETA Systems had yet not been formed).&nbsp; The
+
design called for basic logic cells to be incorporated into a larger version of
+
their existing gate array advancing the process for increased performance and
+
chip size for increased gate capacity.&nbsp; The existing gate or function
+
array utilized approximately 2,500 gates (which was used as the primary gate
+
array for the Cray Research very popular Y-MP Supercomputer) and the planned
+
gate array would contain an excess of 8,000 equivalent gates.&nbsp;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">Logic cell libraries were agreed to (acceptable to both
+
Motorola for the general market and to CDC for the logic designs).&nbsp; Pin
+
counts (for power, ground and input/output logic communications) were
+
established and power consumption estimates were made.&nbsp; Once these
+
parameters were established, board size, power systems and thermal control were
+
evaluated in a trade off give-and-take.&nbsp; Features of Printed Circuit
+
Boards, (line widths, spacing, interconnect vias and number of layers were
+
compared to the board size capacities, laminating press capabilities, drill
+
designs and printed pc board processing limits.&nbsp; IC packaging, limits,
+
i.e.; minimum size of package, pin spacing, thermal removal, etc. was evaluated
+
in parallel with PC Board limits.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The chip design began, the cell library began and the
+
packaging began once all parameters&nbsp; (pins, power consumption and die size
+
objectives) were agreed to.&nbsp; Printed circuit board experiments also
+
began.&nbsp; Once feasibility was established and practical limits established
+
(original goals could be met as to physical design and performance based on IC
+
Modeling and extrapolation from previous established functional systems, a
+
preliminary specification was presented to the architects and logic designers
+
for review.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">From initial design data, logic design based on the
+
parameters provided established a physical size for the Central Processing unit
+
or CPU, the heart of the system.&nbsp; A multiple board processor was required.
+
This placed additional constraints on packaging since within a single processor
+
all distances are crucial between circuits.&nbsp; Three-dimensional packaging
+
concepts were considered. Three dimensional packaging effectively meant a
+
“sandwich” effect of multiple boards with interconnects from board to board
+
were throughout the area – not exclusive to the periphery of the board such
+
that chips on each of the boards would minimize distances between them.&nbsp;
+
In addition, power consumption estimates were made; thermal removal paths and
+
techniques were considered.&nbsp; A cost model was generated as well. All of
+
these factors resulted in a preliminary estimate of the CPU volume. &nbsp;In
+
the introduction portion of the document, you already know that this was rejected
+
- more to follow for sure.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">In parallel with these efforts, memory design was
+
underway.&nbsp; Less freedom was available to memory since the basic
+
semiconductor device could not be altered to accommodate specific users.&nbsp;
+
There were a few packaging alternatives, very few, and device configurations
+
(Word – Bit architecture, pin numbering, power considerations, etc.) were
+
dictated by the industry.&nbsp; Since memory design has its own objectives for
+
cost, reliability and performance, this effort could continue quite independently
+
with one exception, the packaging of the total system must be synergistic and
+
compatible.&nbsp; A crucial parameter of this is the interconnect mechanism
+
between processors and memory.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">A hardware system cost model was established – not only
+
for current cost considerations but also estimates on volume costs based on
+
learning-curve estimates as well for the life of the system.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The chief architect, after careful review, rejected the
+
design; this was covered in the introduction. Three key reasons were sited;
+
performance would be impacted due to the 8,000 gate limit, (worst case logic
+
paths could not reside in a single chip and multiple chip distances would
+
increase the clock period), power consumption per CPU, although lower on a
+
performance ratio basis to previous generations, was too high when the total
+
system size (including the multiprocessor objectives) were considered and
+
system cost appeared prohibitive – always a subjective issue but never-the-less
+
a key component of the design.&nbsp; Reliability concerns were also stated
+
since the pin-count per CPU, although quite reduced from previous designs, were
+
of concern. &nbsp;The architecture was committed to four CPUs (max) per system
+
so the interconnect "bar" was raised.<o:p></o:p></span>
+
 
+
<span style="font-size:19.0pt;
+
font-family:Helvetica">Back to the drawing board</span><span style="font-size:
+
13.0pt;font-family:Helvetica">.&nbsp;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">mini tutorial <o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica"><span style="mso-spacerun: yes">&nbsp;</span>Bipolar technology refers to conventional NPN and PNP transistors operating in a non-saturating mode (collector-base).&nbsp; By not saturating the operating transistors (not allowing the base voltage being higher than the collector voltage) the switching characteristics were improved and balanced (off logic level and on logic levels had identical delays).&nbsp; In addition, the non-saturating circuitry – titled ECL for Emitter coupled logic – provided the TRUE and COMPLIMENT outputs for each logic function (i.e.; AND &amp; NAND, OR &amp; NOR, etc).&nbsp; This provided advantages to logicians to design complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc). Under the category of “no free ride” ECL circuitry consumed higher power than the more popular but much slower saturating logic circuitry (TTL – transistor-transistor logic).&nbsp; Other improvements in performance for integrated versions of ECL logic circuitry included replacing conventional junction isolation between circuit devices on a single die with Oxide isolation between circuits (lower capacitance per circuit so less charging and discharging when logic levels switched).&nbsp; <o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">''CMOS (Complementary Metal Oxide Silicon) circuitry,
+
especially at the time of ETA System, was a simpler and more efficient logic
+
circuit. This form of logic also had a simpler process.&nbsp; Stacking of P
+
channel and N channel transistors in series between voltage bus rails defines a
+
single complementary gate. Functionality of the logic devices is much more
+
forgiving to process variations due to the larger voltage swing and only active
+
transistors used to define the circuitry (no resistors, diodes, etc.).&nbsp;
+
The physical size of a logic function when compared to a bipolar equivalent is
+
significantly smaller, resulting in an increase in circuitry per equivalent die
+
(chip) size. CMOS technology also consumed power ONLY when the circuit was
+
switching (changing states) so power consumption was directly proportional to
+
the frequency it was operating.&nbsp;&nbsp;(P = CV''</span><span style="font-size:11.0pt;font-family:Helvetica">''<sup>2</sup>''</span><span style="font-size:13.0pt;font-family:Helvetica">''f) &nbsp;&nbsp;ECL circuitry,
+
by contrast, consumed approximately the same power – while switching or in a
+
quiescent state.&nbsp; (Later forms of CMOS – especially those designed in
+
early 2000 and beyond, had increased power consumption primarily caused by
+
increased bulk leakage currents as a result of processes developed for
+
lithography having features en excess (smaller) than 90 nanometers.&nbsp;''</span><span style="font-size:13.0pt;font-family:Helvetica"> Technology at the time of the
+
development of the ETA Supercomputers had minimum features of 1,200 nanometers.
+
(In 2009, by contrast, the production capability is 45 nanometers)<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">''Advantages of CMOS were obvious; more circuits per
+
given chip area, lower power consumption and higher functional yield.&nbsp; It
+
is important to stress “functional yield”. The CMOS devices functioned over a
+
much larger range of processing variations (&gt; 50% Vs. &lt; 15% to 25% for
+
ECL).&nbsp;''</span><span style="font-size:13.0pt;font-family:Helvetica">
+
Performance variations for a given process were approximately 2 to 3 times for
+
CMOS and 20% to 30% for ECL.&nbsp; For this reason CMOS devices were sold at a
+
much lower performance than any bipolar counterpart.&nbsp;(I.e.; if the product
+
was specified to accommodate the entire functional lot (wafers processed at the
+
same time), more IC devices yielded. &nbsp;There is one other key difference in
+
defining performance differences between Bipolar and CMOS devices.&nbsp; For
+
ECL (or any other bipolar device) the maximum operating frequency is defined,
+
in part, by the base width – the physical distance between the emitter and
+
collector of the transistor.&nbsp; This is determined by the spacing based on
+
diffusion or implant of the emitter and is controlled in the vertical direction
+
and limited by process control that is quite precise.&nbsp; This parameter is
+
very thin and the frequency is determined indirectly proportional to the base
+
width.&nbsp; For CMOS the gate length defines the critical performance
+
parameter.&nbsp; Gate length is defined by mask optic limitations for any
+
generation of processes.&nbsp; Bipolar devices in the 1980’s and well into the
+
later half on the 1990’s, therefore, had higher operating maximum frequencies
+
than their CMOS counterparts.&nbsp; As capital equipment – primary optics to
+
generate masking and etching capabilities defined smaller and smaller
+
geometries, CMOS technology improved dramatically in performance.&nbsp; This
+
was a result of smaller gate lengths but also each generation had smaller
+
devices resulting in lower capacitance loading and lower time constants to
+
charge and discharge.&nbsp; During the time of the ETA Systems Supercomputer
+
development, CMOS technology had not seen the advantages that bipolar devices
+
could realize – but the potential for future improvements was obvious and
+
projections clearly indicated that by the second half of the 1990’s (nearly 10
+
years after the first ETA Systems Supercomputer would be available), CMOS would
+
overtake Bipolar in the last and most important parameter – performance.<span style="mso-spacerun: yes">&nbsp; </span>To restate this; the IC industry was transitioning to CMOS technology and more funding at the device, and equipment level was being expended to accommodate new markets focused on potential of CMOS than was being expended for Bipolar devices. <o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">Bipolar technology was stretched to a practical limit
+
for the time frame in question.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The IC industry, therefore, had only one other
+
technology candidate, CMOS, which was, in 1983, used exclusively for lower cost
+
and considerable lower performance applications and memory device technology
+
where more bits per die could be fabricated at the expense of lower performance
+
of the Bipolar counterpart(s).&nbsp; The impressive characteristic of CMOS
+
technology at this time was: Lower power consumption per function, smaller size
+
per logic function and lower cost per die due to two key factors (smaller
+
physical size per function meant more logical functions per unit of area, and
+
higher chip yield – chip functionality per wafer manufactured – due to reduced
+
number of processing steps to generate CMOS devices.&nbsp; That was the good
+
news.&nbsp; The concern was system performance.&nbsp; While bipolar technology
+
had set the standard for clock periods of 10 NSec for Supercomputer
+
architectures such as the ETA System projection, CMOS was at least 5 times
+
slower – in most cases 10 to 20 times slower for equivalent architectures.
+
Based on this parameter alone, CMOS was not a candidate for Supercomputers in
+
the 1999-1990 time frame (the time frame where the ETA Systems Supercomputer
+
would be in high volume production).<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The next steps for CDC (recall that at this time CDC
+
still had a Supercomputer Division) were dramatic and at times emotional.&nbsp;
+
First, the team had to discard the ECL design and terminate the effort with
+
Motorola.&nbsp; This was very difficult since both companies depended on each
+
other and secondly, all objectives of the ECL product were being met within the
+
specifications established.&nbsp; CDC (team which later became ETA Systems)
+
provided Motorola with all of the design details to date.&nbsp; Considerable
+
effort was made to insure that the program was successful at Motorola.&nbsp;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">A sidelight to this discussion – Motorola completed this
+
product as an industry product.&nbsp; Cray Research Inc. (the key competition
+
and leader of the Supercomputer market) engaged with Motorola to successfully
+
complete this complex IC development for a product announced in the late
+
1980’s. The product (Cray C-90) under the leadership of Les Davis, Steve Nelson
+
and other notable scientists (a key circuit designer was Mark Birrittella),
+
became another very successful supercomputer products developed and
+
manufactured by Cray Research Inc.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">Next, a full effort evaluation of all technology
+
candidates occurred.&nbsp; CMOS futures were explored in depth.&nbsp; GaAs
+
technology was also evaluated.&nbsp; Alternative ECL (bipolar) candidates were
+
also considered.&nbsp; CMOS was viewed as the technology of the future but the
+
future was beyond the time frame necessary for product introduction.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Helvetica">The following paragraphs summarize key events that led
+
to the decision to use CMOS technology.&lt;o:p&gt;&lt;/o:p&gt;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Wingdings">Ø&nbsp;&nbsp; </span><span style="font-size:13.0pt;
+
font-family:Helvetica">Moore’s law (invented by the great innovator and
+
co-founder of Intel – Gordon Moore) stated that IC technology (CMOS)
+
technology, would double in performance and density every 18 months to two
+
years.&nbsp; The actual Moore’s law may have been stated somewhat differently
+
but this captured all the project cared about.&nbsp; To achieve this predicted
+
growth, several parameters had to occur:<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">The
+
die size would increase (more gates per manufactured chip).<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Features
+
on the chip (metal widths and spaces to interconnect devices and actual device
+
parameters) would be reduced every 16 months to 2 years. Reducing parameter
+
sizes have two positive results to goals of ETA Systems: increased performance
+
and more gates per die.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">The
+
technology would gain popularity – this would mean that capital equipment would
+
keep pace with the “law”, applications would increase thus increasing volume,
+
thus lowering cost and increasing performance and more applications and
+
industries would drive CMOS technology – the Supercomputer industry could not
+
drive such a large industry.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;font-family:Helvetica">Key industry activities also
+
emerged at this time:<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">CDC
+
validated operational performance gains operating CMOS technology in a
+
cryogenic environment.<span style="mso-spacerun: yes">&nbsp; </span>Several ring counter configurations generated with the 5,000 gate chip discussed earlier were dipped in a Liquid Nitrogen thermos jug expecting to witness the shattering of the silicon and the detachment of the solder joints attached to the oscilloscope only to find the frequency of the ring oscillators double and the system operate for weeks until we turned off the experiment.<span style="mso-spacerun: yes">&nbsp; </span>Analytical analysis applied to the Silicon design validated the research done previously by others.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Key
+
US Government agencies began a technology acceleration program based on CMOS
+
technology – the Very High Speed Integrated Circuits (VHSIC) program under
+
direction of the Army, Navy and Air Force certainly captured our attention.<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Honeywell,
+
one of the participants in the VHSIC program held a technology luncheon IEEE
+
symposium in which they presented an 11,000-gate CMOS development effort.&nbsp;
+
Attendees from CDC were impressed (especially the key designer – Randy Bach -
+
with what the efforts.&nbsp; The chip was certainly larger than any that had
+
been developed to date and the performance was accelerated beyond what was
+
predicted for the 1988 time frame by the conventional IC industry (the
+
introduction date set for the ETA Systems Supercomputer – then the next
+
generation CDC Supercomputer).<span style="mso-spacerun: yes">&nbsp;
+
</span>Honeywell was a recipient of one of the VHSIC contracts..<o:p></o:p></span>
+
 
+
<span style="font-size:13.0pt;
+
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Logicians
+
and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort,
+
Maurice Hudson and Dave Hill and others - determined that an minimum gate
+
density of 15,000 gates per die would allow them to achieve a key objective;
+
having a worst case Register to Register clock path residing within a single
+
chip.&nbsp; Now additional explanation is required here.&nbsp; There were
+
technical reasons that the logicians wanted more beyond the knee jerk reaction
+
that asking for 50% more than offered was a standard mode of operation for
+
these guys. Each architecture configuration has a method of achieving its goals
+
of applying computational instructions to problems.&nbsp; The number of gates
+
that are connected in serial fashion between the input and output registers
+
(and this is truly simplifying the problem) determine the clock period that is
+
allowed.&nbsp; For the ETA Systems Supercomputer, therefore, it was determined
+
that a functional unit clock period could reside within the boundary of the
+
chip if the chip could provide 15,000 gates of logic to the designer.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Before getting to the details as to how decisions were made and how the ETA System technologies the “kit” was selected and developed, a list of noteworthy accomplishments achieved are listed: </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Research
+
into technology experiments uncovered significant performance features of CMOS
+
technology.&nbsp; First of all, the technology was functional across a wide
+
range of voltages and temperatures but performance was significantly
+
altered.&nbsp; The higher the operating voltage (within semiconductor
+
constraints, of course) the higher the performance resulted.&nbsp;
+
Unfortunately the Power consumption, although significantly lower than any
+
alternative technology, increased as the Square of the operating voltage.&nbsp;
+
The lower the operating temperature of CMOS the higher performance as well.
+
This factor was studied by others and carefully documented from 400 degrees
+
Kelvin (100 degrees above room temperature) to 77 degrees Kelvin.&nbsp; (77
+
Degrees Kelvin is the boiling point temperature of liquid Nitrogen.)&nbsp; <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First Industry competitive CMOS CPU
font-family:Helvetica">So, let’s summarize what was learned with this
+
evaluation:<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Since 1995 – to the present (beginning 12 years after the technology selection by ETA Systems I might add) ALL HPC (High Performance Computers) are developed and manufactured using CMOS IC technology. Until as late as 2000, bipolar technology (higher power, more costly to manufacture and lower gate count per chip) dominated high performance computers throughout the world. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">IC
+
chips currently (four years before the need for an ETA Systems product) had a
+
capacity of 11,000 gates.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First Industry Single Board CPU
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">The
+
performance of these gates, when operated at liquid Nitrogen temperatures,
+
would perform at least two times faster than at room temperature – not yet
+
validated at CDC.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The chip density (gates per chip) allowed by advanced CMOS, the use of layout and design Computer aided design tools for optimum layout and simulation, the successful design of a 45 layer advance Printed Circuit board (you read it right 45 layers) and innovative chip attachment and cooling permitted a single processor containing nearly 3 million gates to be packaged on a single board </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">15,000
+
useable gates were required per chip to meet logic designer chip boundary
+
requirements.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First Industry system to bedesigned with self-test
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">If
+
Moore’s law was applied to these parameters, within the time frame required, it
+
was possible to achieve both gates per chip densities and performance goals (if
+
the system operated in a liquid Nitrogen environment).&nbsp;&nbsp;<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>CPU Processing units (≈3Million gates each) were validated for functionality and performance in less than 4 hours.Any interconnect errors were recorded and allowed chip-to-chip replacement to occur in a minimal time.Other CPU checkout during this same period required weeks to months to check out and validate a processing unit. Incoming testing of the logic IC Chip (function and performance) also used the same self-test innovations. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">There
+
were at least two IC Suppliers (those having contracts with the US government)
+
that were pursuing CMOS as a performance and high gate/chip density technology
+
(the other known corporation was TRW).<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First Industry production Liquid Nitrogen CPU
font-family:Helvetica">Computer Aided Design (CAD) tools were, during the
+
period of the 80’s, in the infancy stage if one was to compare them to today’s
+
capabilities.&nbsp; To design, place cells within the matrix of the gates
+
provided on the IC Chip, and route the interconnections of these cells
+
accurately to the logic or Boolean design required by the logicians and to
+
clock period constraints was a challenge.&nbsp; This challenge applied to board
+
layout designs as well.&nbsp; Control Data Corporation (CDC) recognized the
+
challenges and established a small but efficient and dedicated organization to
+
address these challenges.&nbsp; The industry had established a metric that to
+
use CAD tools for gate or cell arrays, an additional 20% to 30% gates were
+
required.&nbsp; This meant if the ETA Supercomputer required at least 15,000
+
useable gates to accomplish necessary designs based on its architecture, an 18,000
+
to ≈20,000-gate capacity was required.&nbsp; The technology organization set at
+
its objectives a design of 20,000 gates plus necessary circuitry to self-test
+
each gate or cell array.&nbsp; This as compared to the gate array in
+
development at Honeywell was nearly 2 times the capacity (11,000 total gates
+
Vs. 20,000 total gates plus circuitry for self test). <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The ETA Systems CPU was immersed in Liquid Nitrogen – 77 degrees Kelvin – to improve performance greater than two times that CMOS technology operated at room temperature – 300 degrees Kelvin. </p>
font-family:Helvetica">The task was to convince Honeywell to project the next
+
generation size and layout rules and to accept an R&amp;D effort that would
+
allow CDC / ETA Systems achieve its objectives. Honeywell, an innovative
+
organization, took on the task after considerable discussion with key
+
requirements:<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First system at CDC to fully utilize Computer Design Software to design Chips, boards, validate Logic design and Auto Diagnostic test the system with Synergistic tools
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">ETA
+
Systems (we were now ETA Systems by the time these discussions reached
+
negotiations) accept costs based on wafers processed, not functional
+
chips.&nbsp; Honeywell would provide necessary processing data to reflect
+
wafers were processed within process parameter specifications.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Permitted checkout of a CPU to be completed in less than 4 hours. Manufacturing costs were greatly reduced. This technique was also used at the IC Supplier and greatly reduced any probe test hardware and software. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">ETA
+
Systems provide test equipment for wafer testing and test parameters for chip
+
acceptance prior to packaging.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*First Industry system to havemultiple cost designs from single design effort
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Both
+
companies would share facilities and key resources and work as a single team –
+
as “open a Kimono relationship” that one could ever imagine during this dynamic
+
period of complex process developments within the IC Industry. – David Frankel
+
was assigned the task as ETA Systems interface and engergetically took on the
+
challenging task.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Performance range of the ETA System products was greater than 24:1 (8 processor system operating at 7 nanoseconds Clock period and a single processor system operating at 24 nanoseconds.). Processors were manufactured, tested and validated from a single manufacturing line using identical components. (IC Chips were performance sorted using auto self test). Product differences began at the system packaging level. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
== Boring into details  ==
font-family:Helvetica">Self-test circuitry was designed into the basic cell
+
array periphery.&nbsp; The area consumed by this custom set of pseudo-random
+
generated logic and registers was less than 15% of the total chip area.&nbsp;
+
(David Resnick, resident do-it-all reduced concepts explored by ex CDC
+
scientist Nick Van Brunt who left the company a year previous to the formation
+
of ETA Systems.)<span style="mso-spacerun: yes">&nbsp; </span>This was one of many extra ordinary contributions David made to ETA Systems.<span style="mso-spacerun: yes">&nbsp; </span>Additionally to providing self test capability to accept or reject the circuitry – both functionality and performance sorting – the circuitry included in each 20,000 gate array had capability to test for interconnect between circuits on the final PC Board as well as circuit to I/O connections.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Any Technology kit must be driven by a customer need. In the case of Supercomputers the craving for increased computer performance at a lower cost (overall cost) was the deciding factor. In any Supercomputer company a combination of marketing requirements, architecture innovations and logic design demands dictate the initial objectives of the hardware circuit and packaging organization. I state “initial” since once the objectives are digested and key technologies are evaluated for the time frame addressed, compromises are the norm. In the case of ETA Systems technology selections in the early 1980’s, this was the strategy implemented. </p>
font-family:Helvetica">When the logic design team first heard of this area
+
“waste” of test circuits that could be used for logic design, they lobbied for
+
it to be removed in favor of more logic gates for function designs.&nbsp; Fortunately
+
this request was not honored.&nbsp; IC validation at both the supplier in wafer
+
form and at ETA Systems in packaged chip configuration coupled with the use of
+
the same circuitry in manufacturing checkout to detect board opens and shorts
+
between circuits assembled both in room temperature and cryogenic temperature
+
environments proved to be well worth this “waste” of circuitry area. Small,
+
relatively inexpensive testing systems were designed by ETA Systems and
+
provided to the supplier.&nbsp; The operands for initialization of the
+
pseudo-random logic were also supplied for each design (chip type).<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The following paragraphs sequence the thought process and the technology selection strategy utilized. </p>
font-family:Helvetica">Chip types (array design options) were carefully managed
+
as to not proliferate the chip types in the system.&nbsp; This was a new
+
constraint placed on logic designers and was dealt with most professionally and
+
responsibly by all participants once understood.&nbsp; The resultant chip total
+
for the CPU (processing unit) was fewer than 150 while the chip types including
+
clock chips and all logic design chips was fewer than 20 as best recalled.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
==== Integrated Circuit selection  ====
font-family:Helvetica">During the development cycle of the ETA System
+
Supercomputer, Honeywell moved the manufacturing capability from a local
+
Minneapolis facility to a state-of-the-art manufacturing facility in Colorado
+
Springs, CO.&nbsp; The transition was very transparent to ETA Systems (with the
+
exception of the traveling budget, of course). To accomplish this team
+
membership from both companies acted as one in all decisions addressing
+
scheduling and timing of needs of various chips, testing, packaging, etc. The
+
open book relationship was very beneficial to both companies. On one milestone
+
occasion – where Honeywell successfully completed an initial order – Dave
+
Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility,
+
and served cake and coffee to all designers and operators – it was below zero
+
when this milestone was reached and no one cared.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The objectives, listed in earlier paragraphs were first integrated into the architecture and logic design requirements. A market survey of key integrated circuit suppliers was conducted with emphasis on what was in development and planned for product introduction – not what was available at the time of the survey. A risk assessment was made. Primary focus was on the most dynamic technology, the IC Logic technology. All decisions as to volume requirements, pins, packaging, etc. resulted from what was determined by this survey and risk analysis. Merging the logic design objectives (gates, bandwidth and performance of key functions) was next. </p>
font-family:Helvetica">One design that was incorporated into the chip was to
+
allow for next generation critical processing parameters to be added to the
+
existing design (present chip layout).&nbsp; Although this would not optimize
+
the features of new process features (all parameters were not considered), key
+
performance enhancements could be and were added to the present design.&nbsp; A
+
key feature was gate length and this was added transparently to the physical
+
chip and offered appreciable performance enhancements to the design.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>An ECL (emitter coupled logic) high performance bipolar gate array using Motorola advanced IC technology was selected. Since Motorola was not fully staffed to begin the actual product development (application) but did have the process development underway, a cooperative development agreement was struck with the two companies (this occurred between Motorola and Control Data since ETA Systems had yet not been formed). The design called for basic logic cells to be incorporated into a larger version of their existing gate array advancing the process for increased performance and chip size for increased gate capacity. The existing gate or function array utilized approximately 2,500 gates (which was used as the primary gate array for the Cray Research very popular Y-MP Supercomputer) and the planned gate array would contain an excess of 8,000 equivalent gates. </p>
font-family:Helvetica"><u>Chip design summary</u>: The decision to utilize CMOS
+
technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame
+
(prematurely by all industry metrics) resulted in the following additional
+
“technology kit” decisions:<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Logic cell libraries were agreed to (acceptable to both Motorola for the general market and to CDC for the logic designs). Pin counts (for power, ground and input/output logic communications) were established and power consumption estimates were made. Once these parameters were established, board size, power systems and thermal control were evaluated in a trade off give-and-take. Features of Printed Circuit Boards, (line widths, spacing, interconnect vias and number of layers were compared to the board size capacities, laminating press capabilities, drill designs and printed pc board processing limits. IC packaging, limits, i.e.; minimum size of package, pin spacing, thermal removal, etc. was evaluated in parallel with PC Board limits. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Addition
+
of chip self-test. Feature established functionality at wafer test and
+
functionality and performance sorting at ETA Systems<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The chip design began, the cell library began and the packaging began once all parameters (pins, power consumption and die size objectives) were agreed to. Printed circuit board experiments also began. Once feasibility was established and practical limits established (original goals could be met as to physical design and performance based on IC Modeling and extrapolation from previous established functional systems, a preliminary specification was presented to the architects and logic designers for review. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Computer
+
Layout tools that validated logic prior to chip release for fabrication<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>From initial design data, logic design based on the parameters provided established a physical size for the Central Processing unit or CPU, the heart of the system. A multiple board processor was required. This placed additional constraints on packaging since within a single processor all distances are crucial between circuits. Three-dimensional packaging concepts were considered. Three dimensional packaging effectively meant a “sandwich” effect of multiple boards with interconnects from board to board were throughout the area – not exclusive to the periphery of the board such that chips on each of the boards would minimize distances between them. In addition, power consumption estimates were made; thermal removal paths and techniques were considered. A cost model was generated as well. All of these factors resulted in a preliminary estimate of the CPU volume. In the introduction portion of the document, you already know that this was rejected - more to follow for sure. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Requirement
+
to operate the chip at 77 degrees Kelvin or in liquid Nitrogen<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>In parallel with these efforts, memory design was underway. Less freedom was available to memory since the basic semiconductor device could not be altered to accommodate specific users. There were a few packaging alternatives, very few, and device configurations (Word – Bit architecture, pin numbering, power considerations, etc.) were dictated by the industry. Since memory design has its own objectives for cost, reliability and performance, this effort could continue quite independently with one exception, the packaging of the total system must be synergistic and compatible. A crucial parameter of this is the interconnect mechanism between processors and memory. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Packaging,
+
interconnect &amp; assembly decisions based on liquid Nitrogen operation
+
challenges<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>A hardware system cost model was established – not only for current cost considerations but also estimates on volume costs based on learning-curve estimates as well for the life of the system. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Remote
+
testing of the CPU because of liquid Nitrogen operation challenges<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The chief architect, after careful review, rejected the design; this was covered in the introduction. Three key reasons were sited; performance would be impacted due to the 8,000 gate limit, (worst case logic paths could not reside in a single chip and multiple chip distances would increase the clock period), power consumption per CPU, although lower on a performance ratio basis to previous generations, was too high when the total system size (including the multiprocessor objectives) were considered and system cost appeared prohibitive – always a subjective issue but never-the-less a key component of the design. Reliability concerns were also stated since the pin-count per CPU, although quite reduced from previous designs, were of concern. The architecture was committed to four CPUs (max) per system so the interconnect "bar" was raised. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Logic
+
design partitioning challenges to design within 15,000-gate per chip boundaries
+
and a minimum of IC chip types<o:p></o:p></span>
+
  
<span style="font-size:14.0pt;
+
== Back to the drawing board  ==
font-family:Helvetica">Printed Circuit Board Design Selection</span><span style="font-size:13.0pt;font-family:Helvetica">:<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Bipolar technology refers to conventional NPN and PNP transistors operating in a non-saturating mode (collector-base). By not saturating the operating transistors (not allowing the base voltage being higher than the collector voltage) the switching characteristics were improved and balanced (off logic level and on logic levels had identical delays). In addition, the non-saturating circuitry – titled ECL for Emitter coupled logic – provided the TRUE and COMPLIMENT outputs for each logic function (i.e.; AND &amp; NAND, OR &amp; NOR, etc). This provided advantages to logicians to design complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc). Under the category of “no free ride” ECL circuitry consumed higher power than the more popular but much slower saturating logic circuitry (TTL – transistor-transistor logic). Other improvements in performance for integrated versions of ECL logic circuitry included replacing conventional junction isolation between circuit devices on a single die with Oxide isolation between circuits (lower capacitance per circuit so less charging and discharging when logic levels switched). </p>
font-family:Helvetica">In the period of the 1980s, the time frame of the ETA
+
Systems Supercomputer development, Printed circuit boards had maximum
+
dimensions of approximately a square foot and the number of total layers fewer
+
than 20.&nbsp; (Layers provide power and ground stability, interconnect
+
capability for the circuits attached to the board as well as inputs and outputs
+
to and from the board.)&nbsp; If these total layers are allocated properly,
+
approximately 50% are used for interconnect and the remaining for power and
+
ground.&nbsp; Positioning of power and ground layers also serve to provide
+
interconnect layers that have transmission line capabilities to insure signal
+
integrity throughout the board. During this period, a state-of-the-art printed
+
circuit board was approximately one square foot of active circuitry and as
+
stated earlier, 20 layers or fewer usually restricted to a total thickness of
+
0.063 inches.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>CMOS (Complementary Metal Oxide Silicon) circuitry, especially at the time of ETA System, was a simpler and more efficient logic circuit. This form of logic also had a simpler process. Stacking of P channel and N channel transistors in series between voltage bus rails defines a single complementary gate. Functionality of the logic devices is much more forgiving to process variations due to the larger voltage swing and only active transistors used to define the circuitry (no resistors, diodes, etc.). The physical size of a logic function when compared to a bipolar equivalent is significantly smaller, resulting in an increase in circuitry per equivalent die (chip) size. CMOS technology also consumed power ONLY when the circuit was switching (changing states) so power consumption was directly proportional to the frequency it was operating.(P = CV2f) ECL circuitry, by contrast, consumed approximately the same power – while switching or in a quiescent state. (Later forms of CMOS – especially those designed in early 2000 and beyond, had increased power consumption primarily caused by increased bulk leakage currents as a result of processes developed for lithography having features en excess (smaller) than 90 nanometers. Technology at the time of the development of the ETA Supercomputers had minimum features of 1,200 nanometers. (In 2009, by contrast, the production capability is 45 nanometers) </p>
font-family:Helvetica">It was determined that a maximum of 150 chips would be
+
required to design the ETA Systems Supercomputer CPU.&nbsp; Packaging of the IC
+
and interconnecting the chip to a PC board with minimum spacing between chips
+
(some spacing was required to allow interconnects to all of the necessary
+
layers) resulted in a 1.2x1.2 sq. inch “footprint”.&nbsp; Doing the simple math
+
results in a pc board of a minimum of 220 sq. inches.&nbsp; The number of total
+
layers required to interconnect the 150 chips and the necessary Input and
+
Output at the board periphery was determined to be 45.&nbsp; Looking at design
+
parameters of the board layers in more depth and insuring transmission line
+
features to insure signal integrity defined the board thickness at slightly
+
greater than 0.25 inches.&nbsp; This thickness was approximately three times
+
greater than high-end printed circuit boards produced in this time frame.&nbsp;
+
With a board having an area of greater than 1.5 times the size of what was able
+
to be produced, a thickness of 300% of what was produced and a the number of
+
layers 2.5 times of what was produced in this time frame it was clear that the
+
printed circuit board industry was not ready for the ETA Systems design! The
+
design has other limitations.&nbsp; A key factor when designing pc boards is to
+
insure proper connecting of the layers, i.e.; connecting the chip pins to the
+
board and the proper layer of interconnect in the board and back to the proper
+
receiving chip.&nbsp; Drilling holes in the layers and plating the wall of the
+
holes with copper for conduction make these connections.&nbsp; These are called
+
plated thru holes or PTH. A key parameter to insure that plating occurs in
+
these holes is the hole diameter to depth ratio.&nbsp; The industry at this
+
period (not much better today) is 6:1, i.e.; the thickness of the board must be
+
no more than 6 times the diameter of the hole.&nbsp; This ratio would dominate
+
the size of the board. If this ratio is used to design the board the board size
+
would be increased in area by greater than 9 times.&nbsp; Talk about piling
+
on!&nbsp; Since it was deemed not feasible, issues like cost and time to
+
fabricate the board were not even addressed.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Advantages of CMOS were obvious; more circuits per given chip area, lower power consumption and higher functional yield. It is important to stress “functional yield”. The CMOS devices functioned over a much larger range of processing variations (&gt; 50% Vs. &lt; 15% to 25% for ECL). Performance variations for a given process were approximately 2 to 3 times for CMOS and 20% to 30% for ECL. For this reason CMOS devices were sold at a much lower performance than any bipolar counterpart.(I.e.; if the product was specified to accommodate the entire functional lot (wafers processed at the same time), more IC devices yielded. There is one other key difference in defining performance differences between Bipolar and CMOS devices. For ECL (or any other bipolar device) the maximum operating frequency is defined, in part, by the base width – the physical distance between the emitter and collector of the transistor. This is determined by the spacing based on diffusion or implant of the emitter and is controlled in the vertical direction and limited by process control that is quite precise. This parameter is very thin and the frequency is determined indirectly proportional to the base width. For CMOS the gate length defines the critical performance parameter. Gate length is defined by mask optic limitations for any generation of processes. Bipolar devices in the 1980’s and well into the later half on the 1990’s, therefore, had higher operating maximum frequencies than their CMOS counterparts. As capital equipment – primary optics to generate masking and etching capabilities defined smaller and smaller geometries, CMOS technology improved dramatically in performance. This was a result of smaller gate lengths but also each generation had smaller devices resulting in lower capacitance loading and lower time constants to charge and discharge. During the time of the ETA Systems Supercomputer development, CMOS technology had not seen the advantages that bipolar devices could realize – but the potential for future improvements was obvious and projections clearly indicated that by the second half of the 1990’s (nearly 10 years after the first ETA Systems Supercomputer would be available), CMOS would overtake Bipolar in the last and most important parameter – performance.<span style=""> </span>To restate this; the IC industry was transitioning to CMOS technology and more funding at the device, and equipment level was being expended to accommodate new markets focused on potential of CMOS than was being expended for Bipolar devices. </p>
font-family:Helvetica">Nestled into the design laboratory of Control Data
+
Corporation was a small but very innovative printed circuit board prototype
+
facility. The leader of this group, LeRoy Beckman, never said “no” to
+
challenges.<span style="mso-spacerun: yes">&nbsp; </span>He just bit his pipe a little harder and tried not to snicker out loud.<span style="mso-spacerun:
+
yes">&nbsp; </span>LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had previously displayed innovation (evolutionary in nature) in previous generations.&nbsp; Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing another, and higher plated through hole (pth) ratios than the industry a third.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Bipolar technology was stretched to a practical limit for the time frame in question. </p>
font-family:Helvetica">New technologies in the printed circuit board were few
+
and far between.&nbsp; The industry was set in it’s ways of subtractive etching
+
of circuit layers (removing unwanted copper from a pre-copper clad layer,
+
convention wet etch processes and relatively simple assembly, i.e.; lamination
+
of layers with pressure.&nbsp; One inventor, Mr. Peter P. Pellegrino, arrived
+
on the scene to discuss innovative, revolutionary and proven pc board processing.&nbsp;
+
At first the claims appeared to be too good to be true.&nbsp; Board size
+
relatively independent, aspect ratios exceeding 20:1 for PTH, an additive
+
process that permitted finer lines to be fabricated on individual layers.&nbsp;
+
The layers were also embedded into the laminate so the opportunity for higher
+
yield with reduced features.&nbsp; An additional benefit of additive plating is
+
reduction in waste and water usage.&nbsp; <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The IC industry, therefore, had only one other technology candidate, CMOS, which was, in 1983, used exclusively for lower cost and considerable lower performance applications and memory device technology where more bits per die could be fabricated at the expense of lower performance of the Bipolar counterpart(s). The impressive characteristic of CMOS technology at this time was: Lower power consumption per function, smaller size per logic function and lower cost per die due to two key factors (smaller physical size per function meant more logical functions per unit of area, and higher chip yield – chip functionality per wafer manufactured – due to reduced number of processing steps to generate CMOS devices. That was the good news. The concern was system performance. While bipolar technology had set the standard for clock periods of 10 NSec for Supercomputer architectures such as the ETA System projection, CMOS was at least 5 times slower – in most cases 10 to 20 times slower for equivalent architectures. Based on this parameter alone, CMOS was not a candidate for Supercomputers in the 1999-1990 time frame (the time frame where the ETA Systems Supercomputer would be in high volume production). </p>
font-family:Helvetica">A special plating cell was also introduced that
+
permitted uniform deep hole plating by forcing plating fluid into each of the
+
thousands of PTH.&nbsp; The process titled “Push-Pull<sup>TM</sup>” also
+
accelerated the plating manufacturing cycle by over an order of magnitude,
+
reducing cost.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The next steps for CDC (recall that at this time CDC still had a Supercomputer Division) were dramatic and at times emotional. First, the team had to discard the ECL design and terminate the effort with Motorola. This was very difficult since both companies depended on each other and secondly, all objectives of the ECL product were being met within the specifications established. CDC (team which later became ETA Systems) provided Motorola with all of the design details to date. Considerable effort was made to insure that the program was successful at Motorola. </p>
font-family:Helvetica">A small plating cell was incorporated into the prototype
+
facility at CDC and a controlled set of experiments conducted.&nbsp;
+
Experiments were thorough and challenging since no one in the industry could
+
approach the lofty objectives of the ETA Systems Supercomputer CPU board nor
+
the lofty claims of the inventor. The results were simply outstanding.&nbsp;
+
From the results and a commitment to fabricate a larger manufacturing line of
+
plating insert cells, the 45 layer 15” x 24” CPU board became a realistic
+
finalized goal of ETA Systems.&nbsp;Anyone told of this goal openly scoffed at
+
this as too risky and unrealistic.<span style="mso-spacerun: yes">&nbsp;
+
</span>This included some in the company as well.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>A sidelight to this discussion – Motorola completed this product as an industry product. Cray Research Inc. (the key competition and leader of the Supercomputer market) engaged with Motorola to successfully complete this complex IC development for a product announced in the late 1980’s. The product (Cray C-90) under the leadership of Les Davis, Steve Nelson and other notable scientists (a key circuit designer was Mark Birrittella), became another very successful supercomputer products developed and manufactured by Cray Research Inc.. </p>
font-family:Helvetica">Later, when manufacturing of the systems was viable, a
+
production capacity was developed for manufacturing.&nbsp; It is noted that
+
hundreds of these boards were fabricated from a period of 1987 through early
+
1989.&nbsp; The yield of final boards was nearly perfect – only one finished
+
board was scrapped. <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Next, a full effort evaluation of all technology candidates occurred. CMOS futures were explored in depth. GaAs technology was also evaluated. Alternative ECL (bipolar) candidates were also considered. CMOS was viewed as the technology of the future but the future was beyond the time frame necessary for product introduction. </p>
font-family:Helvetica">To this day (2009) few realize what a monumental
+
accomplishment this was and still is. This a tribute to LeRoy Beckman, Peter
+
Pellegrino, the manufacturing facility at ETA Systems (now a banking building
+
in St. Paul) and those who trusted that the lofty objectives could be realized.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
==== Key events that led to the decision to use CMOS technology. ====
font-family:Helvetica">To accommodate routing and designing for minimum
+
distance between IC chips, CAD tools were developed and the first use of diagonal
+
routed layers were introduced.&nbsp; Prior to this only x–y layers were
+
permitted with manual and/or auto tools (CAD).&nbsp; This enhancement permitted
+
timing constraints to be realized between chips.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Moore’s law (invented by the great innovator and co-founder of Intel – Gordon Moore) stated that IC technology (CMOS) technology, would double in performance and density every 18 months to two years. The actual Moore’s law may have been stated somewhat differently but this captured all the project cared about. To achieve this predicted growth, several parameters had to occur: </p>
font-family:Helvetica">The final board had the following noteworthy
+
characteristics:<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*The die size would increase (more gates per manufactured chip).  
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*Features on the chip (metal widths and spaces to interconnect devices and actual device parameters) would be reduced every 16 months to 2 years. Reducing parameter sizes have two positive results to goals of ETA Systems: increased performance and more gates per die.  
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Board
+
*The technology would gain broad industry popularity – this would mean that capital equipment would keep pace with the “law”, applications would increase thus increasing volume, thus lowering cost and increasing performance and more applications and industries would drive CMOS technology – the Supercomputer industry could not drive such a large industry.
size: 15 inches by 22 inches by 0.26 inches<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
==== Key industry activities also emerged at this time: ====
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Pth
+
hole ratio ≈ 20:1 – plating time – less than 20 minutes<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*CDC validated operational performance gains operating CMOS technology in a cryogenic environment. Several ring counter configurations generated with the 5,000 gate chip discussed earlier were dipped in a Liquid Nitrogen thermos jug expecting to witness the shattering of the silicon and the detachment of the solder joints attached to the oscilloscope only to find the frequency of the ring oscillators double and the system operate for weeks until we turned off the experiment. Analytical analysis applied to the Silicon design validated the research done previously by others.  
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*Key US Government agencies began a technology acceleration program based on CMOS technology – the Very High Speed Integrated Circuits (VHSIC) program under direction of the Army, Navy and Air Force certainly captured our attention.  
</span></span><span style="font-size:13.0pt;font-family:Helvetica">45
+
*Honeywell, one of the participants in the VHSIC program held a technology luncheon IEEE symposium in which they presented an 11,000-gate CMOS development effort. Attendees from CDC were impressed (especially the key designer – Randy Bach - with what the efforts. The chip was certainly larger than any that had been developed to date and the performance was accelerated beyond what was predicted for the 1988 time frame by the conventional IC industry (the introduction date set for the ETA Systems Supercomputer – then the next generation CDC Supercomputer). Honeywell was a recipient of one of the VHSIC contracts.
total layers per CPU panel<o:p></o:p></span>
+
*Logicians and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort, Maurice Hudson and Dave Hill and others - determined that an minimum gate density of 15,000 gates per die would allow them to achieve a key objective; having a worst case Register to Register clock path residing within a single chip. Now additional explanation is required here. There were technical reasons that the logicians wanted more beyond the knee jerk reaction that asking for 50% more than offered was a standard mode of operation for these guys. Each architecture configuration has a method of achieving its goals of applying computational instructions to problems. The number of gates that are connected in serial fashion between the input and output registers (and this is truly simplifying the problem) determine the clock period that is allowed. For the ETA Systems Supercomputer, therefore, it was determined that a functional unit clock period could reside within the boundary of the chip if the chip could provide 15,000 gates of logic to the designer.
 +
*Research into technology experiments uncovered significant performance features of CMOS technology. First of all, the technology was functional across a wide range of voltages and temperatures but performance was significantly altered. The higher the operating voltage (within semiconductor constraints, of course) the higher the performance resulted. Unfortunately the Power consumption, although significantly lower than any alternative technology, increased as the Square of the operating voltage. The lower the operating temperature of CMOS the higher performance as well. This factor was studied by others and carefully documented from 400 degrees Kelvin (100 degrees above room temperature) to 77 degrees Kelvin. (77 Degrees Kelvin is the boiling point temperature of liquid Nitrogen.)
  
<span style="font-size:13.0pt;
+
==== Summary of what was learned with this evaluation  ====
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
</span></span><span style="font-size:13.0pt;font-family:Helvetica">150
+
IC chip locations (fewer were used in final design)<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*IC chips currently (four years before the need for an ETA Systems product) had a capacity of 11,000 gates.  
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*The performance of these gates, when operated at liquid Nitrogen temperatures, would perform at least two times faster than at room temperature – not yet validated at CDC.
</span></span><span style="font-size:13.0pt;font-family:Helvetica">More
+
*15,000 useable gates were required per chip to meet logic designer chip boundary requirements.&lt;o:p&gt;&lt;/o:p&gt; ¨ If Moore’s law was applied to these parameters, within the time frame required, it was possible to achieve both gates per chip densities and performance goals (if the system operated in a liquid Nitrogen environment).
than 30,000 board plated thru holes (pth) were used for interconnect<o:p></o:p></span>
+
*There were at least two IC Suppliers (those having contracts with the US government) that were pursuing CMOS as a performance and high gate/chip density technology (the other known corporation was TRW).  
 +
*Computer Aided Design (CAD) tools were, during the period of the 80’s, in the infancy stage if one was to compare them to today’s capabilities. To design, place cells within the matrix of the gates provided on the IC Chip, and route the interconnections of these cells accurately to the logic or Boolean design required by the logicians and to clock period constraints was a challenge. This challenge applied to board layout designs as well. Control Data Corporation (CDC) recognized the challenges and established a small but efficient and dedicated organization to address these challenges. The industry had established a metric that to use CAD tools for gate or cell arrays, an additional 20% to 30% gates were required. This meant if the ETA Supercomputer required at least 15,000 useable gates to accomplish necessary designs based on its architecture, an 18,000 to ≈20,000-gate capacity was required. The technology organization set at its objectives a design of 20,000 gates plus necessary circuitry to self-test each gate or cell array. This as compared to the gate array in development at Honeywell was nearly 2 times the capacity (11,000 total gates Vs. 20,000 total gates plus circuitry for self test).
  
<span style="font-size:13.0pt;
+
<p>The task was to convince Honeywell to project the next generation size and layout rules and to accept an R&amp;D effort that would allow CDC / ETA Systems achieve its objectives. Honeywell, an innovative organization, took on the task after considerable discussion with key requirements: </p>
font-family:Helvetica">In 2009 this board development and manufacturing stands
+
out as one of the major technology developments by ETA Systems<o:p></o:p></span>
+
  
<span style="font-size:14.0pt;
+
*ETA Systems (we were now ETA Systems by the time these discussions reached negotiations) accept costs based on wafers processed, not functional chips. Honeywell would provide necessary processing data to reflect wafers were processed within process parameter specifications.  
font-family:Helvetica">Packaging <o:p></o:p></span>
+
*ETA Systems provide test equipment for wafer testing and test parameters for chip acceptance prior to packaging.
 +
*Both companies would share facilities and key resources and work as a single team – as “open a Kimono relationship” that one could ever imagine during this dynamic period of complex process developments within the IC Industry. – David Frankel was assigned the task as ETA Systems interface and engergetically took on the challenging task.
 +
*Self-test circuitry was designed into the basic cell array periphery. The area consumed by this custom set of pseudo-random generated logic and registers was less than 15% of the total chip area. (David Resnick, resident do-it-all reduced concepts explored by ex CDC scientist Nick Van Brunt who left the company a year previous to the formation of ETA Systems.) This was one of many extra ordinary contributions David made to ETA Systems. Additionally to providing self test capability to accept or reject the circuitry – both functionality and performance sorting – the circuitry included in each 20,000 gate array had capability to test for interconnect between circuits on the final PC Board as well as circuit to I/O connections.
  
<span style="font-size:13.0pt;
+
<p>When the logic design team first heard of this area “waste” of test circuits that could be used for logic design, they lobbied for it to be removed in favor of more logic gates for function designs. Fortunately this request was not honored. IC validation at both the supplier in wafer form and at ETA Systems in packaged chip configuration coupled with the use of the same circuitry in manufacturing checkout to detect board opens and shorts between circuits assembled both in room temperature and cryogenic temperature environments proved to be well worth this “waste” of circuitry area. Small, relatively inexpensive testing systems were designed by ETA Systems and provided to the supplier. The operands for initialization of the pseudo-random logic were also supplied for each design (chip type). </p>
font-family:Helvetica">The key challenge for packaging the ETA Supercomputer
+
processing unit was the cryogenic chamber for the processor.&nbsp; The Cryostat
+
to contain the processor (two processor units) had a conventional (and quite
+
heavy) circular cryostat containing a vacuum chamber between the outside
+
environment and the inner environment.&nbsp; Input of liquid Nitrogen was at the
+
bottom of the chamber and the escaping of the gaseous Nitrogen was provided for
+
near the top of the unit.&nbsp; The piping containing the Nitrogen to and from
+
the regeneration unit was also temperature protected with vacuum lines.&nbsp;
+
Dan Sullivan and his design team led this admirable effort.<span style="mso-spacerun: yes">&nbsp; </span>(Unfortunately, Dan passed on a few years ago). It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske – a very innovative scientist) could be designed if time permitted but the selection of the vacuum based design was conservative to accommodate schedule and also to familiarize the team with the challenges of Cryogenics.&nbsp; The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market.&nbsp; The system was not pretty.<span style="mso-spacerun: yes">&nbsp;
+
</span>Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers to show this to perspective customers fearful that this would scare them away.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Chip types (array design options) were carefully managed as to not proliferate the chip types in the system. This was a new constraint placed on logic designers and was dealt with most professionally and responsibly by all participants once understood. The resultant chip total for the CPU (processing unit) was fewer than 150 while the chip types including clock chips and all logic design chips was fewer than 20 as best recalled. </p>
font-family:Helvetica">Thought was given to actually eliminate the need to
+
regenerate the system in a closed system but rather purchase Liquid Nitrogen –
+
readily available in tanks - and have them periodically refilled as is done in
+
the IC and other industries using Liquid Nitrogen.&nbsp; This was discarded for
+
the initial design since several customer sites did not easily accommodate the
+
external access to Liquid Nitrogen tanks.&nbsp; It was to be an option for
+
future systems and those customers that easily accommodated such an option.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>During the development cycle of the ETA System Supercomputer, Honeywell moved the manufacturing capability from a local Minneapolis facility to a state-of-the-art manufacturing facility in Colorado Springs, CO. The transition was very transparent to ETA Systems (with the exception of the traveling budget, of course). To accomplish this team membership from both companies acted as one in all decisions addressing scheduling and timing of needs of various chips, testing, packaging, etc. The open book relationship was very beneficial to both companies. On one milestone occasion – where Honeywell successfully completed an initial order – Dave Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility, and served cake and coffee to all designers and operators – it was below zero when this milestone was reached and no one cared. </p>
font-family:Helvetica">The final design was then a closed recycled Liquid Nitrogen
+
system with the compressor located remote, much like Freon compressors, which
+
many Supercomputer customers were already accommodating.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>One design that was incorporated into the chip was to allow for next generation critical processing parameters to be added to the existing design (present chip layout). Although this would not optimize the features of new process features (all parameters were not considered), key performance enhancements could be and were added to the present design. A key feature was gate length and this was added transparently to the physical chip and offered appreciable performance enhancements to the design. </p>
font-family:Helvetica">The design challenge was at the surface (looked much
+
like a two slice toaster) where the processing boards were inserted. This seal
+
had to accommodate the connecting transmission to the external and room
+
temperature memory and I/O subsystems. A printed circuit board was designed to
+
connect the processor to the outside world.&nbsp; Heaters were applied to the
+
surface to prevent icing at the cryostat surface.<span style="mso-spacerun:
+
yes">&nbsp; </span>The separation, only a few short inches had memory operating at 300 degrees Kelvin and the CPU operating at 77 Degrees Kelvin.<span style="mso-spacerun: yes">&nbsp; </span>There were a few “frosty” events in this development cycle!<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
==== Chip design summary: ====
font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The decision to utilize CMOS technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame (prematurely by all industry metrics) resulted in the following additional “technology kit” decisions: </p>
font-family:Helvetica">The third challenge was to provide reliable soldering of
+
the circuitry to the board amidst the severe temperature difference that the
+
solder joints would be subjected to (greater than 250 degrees) during the cool
+
down and warm up cycles.&nbsp; Studies at the National Bureau of Standards provided
+
input that the temperature cycle should be profiled in a precise sequence as
+
the board was cooled and heated.&nbsp; In addition, care as not to remove the
+
board and to care for condensation that would occur if the board had not been
+
heated to room temperature was considered.&nbsp; The result was a 20-minute
+
cycle to remove or insert the board was designed with a specifically prescribed
+
sequence of temperature lowering and rising for both cycles.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Addition of chip self-test. Feature established functionality at wafer test and functionality and performance sorting at ETA Systems </p>
font-family:Helvetica">At the time of the unfortunate termination of ETA
+
Systems, a more refined, lower cost and lower weight design as stated earlier
+
was on the drawing boards.&nbsp; Although the cryostat and associated cooling
+
was costly, an analysis clearly showed that for the performance resulting from
+
the design, the cost was less than any Bipolar IC system designed at the
+
time.<span style="mso-spacerun: yes">&nbsp; </span>Once the connector was finalized and the process and assembly designed, the system operated flawlessly.<span style="mso-spacerun: yes">&nbsp; </span><o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
*Computer Layout tools that validated logic prior to chip release for fabrication
font-family:Helvetica">Checkout on the manufacturing floor of the system
+
*Requirement to operate the chip at 77 degrees Kelvin or in liquid Nitrogen
utilized the “Self-Test” capability exhaustively so specific interconnect flaws
+
*Packaging, interconnect &amp; assembly decisions based on liquid Nitrogen operation challengesRemote testing of the CPU because of liquid Nitrogen operation challenges
were clearly understood prior to removing a CPU from the cryostat, thus
+
*Logic design partitioning challenges to design within 15,000-gate per chip boundaries and a minimum of IC chip types
reducing checkout time considerably as well.<span style="mso-spacerun:
+
yes">&nbsp; </span>These designs were well done, significant and challenged laws of thermodynamics and physics to new limits.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
== Printed Circuit Board Design Selection: ==
font-family:Helvetica">'''AIR COOLED SYSTEM '''</span><span style="font-size:
+
13.0pt;font-family:Helvetica"><o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>In the period of the 1980s, the time frame of the ETA Systems Supercomputer development, Printed circuit boards had maximum dimensions of approximately a square foot and the number of total layers fewer than 20. (Layers provide power and ground stability, interconnect capability for the circuits attached to the board as well as inputs and outputs to and from the board.) If these total layers are allocated properly, approximately 50% are used for interconnect and the remaining for power and ground. Positioning of power and ground layers also serve to provide interconnect layers that have transmission line capabilities to insure signal integrity throughout the board. During this period, a state-of-the-art printed circuit board was approximately one square foot of active circuitry and as stated earlier, 20 layers or fewer usually restricted to a total thickness of 0.063 inches. </p>
font-family:Helvetica">As stated earlier in the document, an Air-cooled
+
processor would operate considerably slower (2x slower) when operated in normal
+
or “room temperature” environments.&nbsp; ETA Systems by sorting the devices
+
for performance at incoming inspection, allowed for a three times performance
+
differential to be realized.&nbsp; Only the highest performance devices were
+
reserved for the Cryogenic cooled system.&nbsp; The remaining parts were then
+
re-sorted into two categories for room temperature; the differential would be a
+
4-nanosecond clock period between the two room temperature systems and 17
+
nanoseconds (24 to 7) for the total system product set.<span style="mso-spacerun: yes">&nbsp; </span>The sorting and using the entire distribution of Integrated Circuits had a significant cost reduction factor for the entire product line.<span style="mso-spacerun: yes">&nbsp; </span>Bipolar devices, by contrast had lower functional yield to begin with coupled with additional loss of product due to performance yield.<span style="mso-spacerun:
+
yes">&nbsp; </span>This was a definite cost reduction asset to the ETA System.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>It was determined that a maximum of 150 chips would be required to design the ETA Systems Supercomputer CPU. Packaging of the IC and interconnecting the chip to a PC board with minimum spacing between chips (some spacing was required to allow interconnects to all of the necessary layers) resulted in a 1.2x1.2 sq. inch “footprint”. Doing the simple math results in a pc board of a minimum of 220 sq. inches. The number of total layers required to interconnect the 150 chips and the necessary Input and Output at the board periphery was determined to be 45. Looking at design parameters of the board layers in more depth and insuring transmission line features to insure signal integrity defined the board thickness at slightly greater than 0.25 inches. This thickness was approximately three times greater than high-end printed circuit boards produced in this time frame. With a board having an area of greater than 1.5 times the size of what was able to be produced, a thickness of 300% of what was produced and a the number of layers 2.5 times of what was produced in this time frame it was clear that the printed circuit board industry was not ready for the ETA Systems design! The design has other limitations. A key factor when designing pc boards is to insure proper connecting of the layers, i.e.; connecting the chip pins to the board and the proper layer of interconnect in the board and back to the proper receiving chip. Drilling holes in the layers and plating the wall of the holes with copper for conduction make these connections. These are called plated thru holes or PTH. A key parameter to insure that plating occurs in these holes is the hole diameter to depth ratio. The industry at this period (not much better today) is 6:1, i.e.; the thickness of the board must be no more than 6 times the diameter of the hole. This ratio would dominate the size of the board. If this ratio is used to design the board the board size would be increased in area by greater than 9 times. Talk about piling on! Since it was deemed not feasible, issues like cost and time to fabricate the board were not even addressed. </p>
font-family:Helvetica">To cool the CPU air was forced on to the processor chips
+
by using a plenum that was designed to cover each chip.&nbsp; Holes were
+
designed in the plenum such that equal operating temperature would result for
+
each operating chip.&nbsp; Since the power consumption variation significant
+
for several part types, designing the appropriate number of holes above each
+
chip location could provide custom cooling.&nbsp; The plenum could then be
+
molded for mass production of the processing unit.&nbsp; Large volume cooling
+
fans were designed for the system as well.&nbsp; Cost was the focus for the
+
air-cooled systems since the price tag was below $1M. Recall, that the
+
air-cooled design was identical in parts at the CPU and storage level.<span style="mso-spacerun: yes">&nbsp; </span>A single development was achieved for a wide range of products with one design team.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>Nestled into the design laboratory of Control Data Corporation was a small but very innovative printed circuit board prototype facility. The leader of this group, LeRoy Beckman, never said “no” to challenges. He just bit his pipe a little harder and tried not to snicker out loud. LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had previously displayed innovation (evolutionary in nature) in previous generations. Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing another, and higher plated through hole (pth) ratios than the industry a third.New technologies in the printed circuit board were few and far between. The industry was set in it’s ways of subtractive etching of circuit layers (removing unwanted copper from a pre-copper clad layer, convention wet etch processes and relatively simple assembly, i.e.; lamination of layers with pressure. One inventor, Mr. Peter P. Pellegrino, arrived on the scene to discuss innovative, revolutionary and proven pc board processing. At first the claims appeared to be too good to be true. Board size relatively independent, aspect ratios exceeding 20:1 for PTH, an additive process that permitted finer lines to be fabricated on individual layers. The layers were also embedded into the laminate so the opportunity for higher yield with reduced features. An additional benefit of additive plating is reduction in waste and water usage.A special plating cell was also introduced that permitted uniform deep hole plating by forcing plating fluid into each of the thousands of PTH. The process titled “Push-PullTM” also accelerated the plating manufacturing cycle by over an order of magnitude, reducing cost.A small plating cell was incorporated into the prototype facility at CDC and a controlled set of experiments conducted. Experiments were thorough and challenging since no one in the industry could approach the lofty objectives of the ETA Systems Supercomputer CPU board nor the lofty claims of the inventor. The results were simply outstanding. From the results and a commitment to fabricate a larger manufacturing line of plating insert cells, the 45 layer 15” x 24” CPU board became a realistic finalized goal of ETA Systems.Anyone told of this goal openly scoffed at this as too risky and unrealistic. This included some in the company as well.Later, when manufacturing of the systems was viable, a production capacity was developed for manufacturing. It is noted that hundreds of these boards were fabricated from a period of 1987 through early 1989. The yield of final boards was nearly perfect – only one finished board was scrapped. </p>
font-family:Helvetica">Storage<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>To this day (2009) few realize what a monumental accomplishment this was and still is. This a tribute to LeRoy Beckman, Peter Pellegrino, the manufacturing facility at ETA Systems (now a banking building in St. Paul) and those who trusted that the lofty objectives could be realized. </p>
font-family:Helvetica">Stacks using three-dimensional characteristics were
+
designed under the leadership of Brent Doyle for the memory – both static (high
+
performance) and dynamic (high density and lower performance) memories of the
+
ETA Systems Supercomputer.&nbsp; These unique designs provided for highest
+
density and optimum performance of the standard memory devices used.&nbsp;
+
Ability to upgrade to future generations of memory (more storage capacity
+
Integrated Circuits) was built into the design as well.&nbsp; The design worked
+
well and stacking became commonplace in the computer industry for future
+
designs – eventually eliminating the chip package entirely.<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>To accommodate routing and designing for minimum distance between IC chips, CAD tools were developed and the first use of diagonal routed layers were introduced. Prior to this only x–y layers were permitted with manual and/or auto tools (CAD). This enhancement permitted timing constraints to be realized between chips. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
<span style="font-size:13.0pt;font-family:Helvetica;mso-ansi-language:EN-US;
+
mso-fareast-language:EN-US">
+
  
</span>
+
<p>The final board had the following noteworthy characteristics: </p>
<span style="font-size:13.0pt;
+
font-family:Helvetica">The Air-cooled system was defined as a Piper.<span style="mso-spacerun: yes">&nbsp; </span>An Illustration of “Piper” is shown below.<o:p></o:p></span>
+
  
<!--[if gte vml 1]><v:shapetype
+
*Board size: 15 inches by 22 inches by 0.26 inches
id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t"
+
*Pth hole ratio ≈ 20:1 – plating time – less than 20 minutes
path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
+
*45 total layers per CPU panel
<v:stroke joinstyle="miter"/>
+
*150 IC chip locations (fewer were used in final design)
<v:formulas>
+
*More than 30,000 board plated thru holes (pth) were used for interconnect
  <v:f eqn="if lineDrawn pixelLineWidth 0"/>
+
  <v:f eqn="sum @0 1 0"/>
+
  <v:f eqn="sum 0 0 @1"/>
+
  <v:f eqn="prod @2 1 2"/>
+
  <v:f eqn="prod @3 21600 pixelWidth"/>
+
  <v:f eqn="prod @3 21600 pixelHeight"/>
+
  <v:f eqn="sum @0 0 1"/>
+
  <v:f eqn="prod @6 1 2"/>
+
  <v:f eqn="prod @7 21600 pixelWidth"/>
+
  <v:f eqn="sum @8 21600 0"/>
+
  <v:f eqn="prod @7 21600 pixelHeight"/>
+
  <v:f eqn="sum @10 21600 0"/>
+
</v:formulas>
+
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
+
<o:lock v:ext="edit" aspectratio="t"/>
+
</v:shapetype><v:shape id="_x0000_i1025" type="#_x0000_t75" style='width:399pt;
+
height:391pt'>
+
<v:imagedata src="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image001.gif"
+
  o:althref="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image002.pct"
+
  o:title=""/>
+
</v:shape><![endif]-->[[Image:]]<span style="font-size:13.0pt;font-family:
+
Helvetica">
+
  
<o:p></o:p></span>
+
<p>In 2009 this board development and manufacturing stands out as one of the major technology developments by ETA Systems </p>
  
<span style="font-size:13.0pt;
+
== Packaging  ==
font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>The key challenge for packaging the ETA Supercomputer processing unit was the cryogenic chamber for the processor. The Cryostat to contain the processor (two processor units) had a conventional (and quite heavy) circular cryostat containing a vacuum chamber between the outside environment and the inner environment. Input of liquid Nitrogen was at the bottom of the chamber and the escaping of the gaseous Nitrogen was provided for near the top of the unit. The piping containing the Nitrogen to and from the regeneration unit was also temperature protected with vacuum lines. Dan Sullivan and his design team led this admirable effort. (Unfortunately, Dan passed on a few years ago). It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske – a very innovative scientist) could be designed if time permitted but the selection of the vacuum based design was conservative to accommodate schedule and also to familiarize the team with the challenges of Cryogenics. The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market. The system was not pretty. Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers to show this to perspective customers fearful that this would scare them away. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
+
  
<span style="font-size:14.0pt;
+
<p>Thought was given to actually eliminate the need to regenerate the system in a closed system but rather purchase Liquid Nitrogen – readily available in tanks - and have them periodically refilled as is done in the IC and other industries using Liquid Nitrogen. This was discarded for the initial design since several customer sites did not easily accommodate the external access to Liquid Nitrogen tanks. It was to be an option for future systems and those customers that easily accommodated such an option. The final design was then a closed recycled Liquid Nitrogen system with the compressor located remote, much like Freon compressors, which many Supercomputer customers were already accommodating. The design challenge was at the surface (looked much like a two slice toaster) where the processing boards were inserted. This seal had to accommodate the connecting transmission to the external and room temperature memory and I/O subsystems. A printed circuit board was designed to connect the processor to the outside world. Heaters were applied to the surface to prevent icing at the cryostat surface. The separation, only a few short inches had memory operating at 300 degrees Kelvin and the CPU operating at 77 Degrees Kelvin. There were a few “frosty” events in this development cycle! The third challenge was to provide reliable soldering of the circuitry to the board amidst the severe temperature difference that the solder joints would be subjected to (greater than 250 degrees) during the cool down and warm up cycles. Studies at the National Bureau of Standards provided input that the temperature cycle should be profiled in a precise sequence as the board was cooled and heated. In addition, care as not to remove the board and to care for condensation that would occur if the board had not been heated to room temperature was considered. The result was a 20-minute cycle to remove or insert the board was designed with a specifically prescribed sequence of temperature lowering and rising for both cycles. At the time of the unfortunate termination of ETA Systems, a more refined, lower cost and lower weight design as stated earlier was on the drawing boards. Although the cryostat and associated cooling was costly, an analysis clearly showed that for the performance resulting from the design, the cost was less than any Bipolar IC system designed at the time. Once the connector was finalized and the process and assembly designed, the system operated flawlessly. Checkout on the manufacturing floor of the system utilized the “Self-Test” capability exhaustively so specific interconnect flaws were clearly understood prior to removing a CPU from the cryostat, thus reducing checkout time considerably as well. These designs were well done, significant and challenged laws of thermodynamics and physics to new limits. </p>
font-family:Helvetica">Summary <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
== Air Cooled System  ==
font-family:Helvetica">The design of the ETA Systems Supercomputer hardware had
+
many unique features.&nbsp; The brief pages highlight some of them.&nbsp; <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>As stated earlier in the document, an Air-cooled processor would operate considerably slower (2x slower) when operated in normal or “room temperature” environments. ETA Systems by sorting the devices for performance at incoming inspection, allowed for a three times performance differential to be realized. Only the highest performance devices were reserved for the Cryogenic cooled system. The remaining parts were then re-sorted into two categories for room temperature; the differential would be a 4-nanosecond clock period between the two room temperature systems and 17 nanoseconds (24 to 7) for the total system product set. The sorting and using the entire distribution of Integrated Circuits had a significant cost reduction factor for the entire product line. Bipolar devices, by contrast had lower functional yield to begin with coupled with additional loss of product due to performance yield. This was a definite cost reduction asset to the ETA System. </p>
font-family:Helvetica">It would be remiss not to briefly discuss the “team”
+
concept used to design the hardware.&nbsp; By having the CAD, Packaging,
+
memory, circuit and power expertise located in a close proximity and holding
+
concise project reviews at all levels at periodic and timely phases, all were
+
kept abreast of the progress and challenges of each other.&nbsp; This permitted
+
changes to be made to necessary designs to properly accommodate the challenges
+
and opportunities in a timely fashion.&nbsp; Hardware was demonstrated on or
+
near schedule despite the innovations required in each aspect of the
+
design.&nbsp; The team was truly a “team”.&nbsp; <o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
<p>To cool the CPU air was forced on to the processor chips by using a plenum that was designed to cover each chip. Holes were designed in the plenum such that equal operating temperature would result for each operating chip. Since the power consumption variation significant for several part types, designing the appropriate number of holes above each chip location could provide custom cooling. The plenum could then be molded for mass production of the processing unit. Large volume cooling fans were designed for the system as well. Cost was the focus for the air-cooled systems since the price tag was below $1M. Recall, that the air-cooled design was identical in parts at the CPU and storage level. A single development was achieved for a wide range of products with one design team. </p>
font-family:Helvetica">A missing link to the team was the logic design.&nbsp;
+
These folks were separate and actually on another floor of the ETA Systems
+
facility.&nbsp; It was strongly suggested and accepted for future designs, that
+
the logic team would be a part of this common organization. I had the
+
opportunity to lead one additional hardware development that included the logic
+
design team (at Cray Research, not ETA Systems) later.<span style="mso-spacerun: yes">&nbsp; </span>It was a smoother and more effective and thorough team.<span style="mso-spacerun: yes">&nbsp; </span>Like ETA Systems the communications were open and included both manufacturing and software participation (the later two were voluntary).<span style="mso-spacerun: yes">&nbsp; </span><o:p></o:p></span>
+
  
<span style="font-size:13.0pt;
+
== Storage  ==
font-family:Helvetica">Clearly, communications – effective communications at
+
all levels of the organization was key to this hardware design success.<o:p></o:p></span>
+
  
&nbsp;<o:p></o:p>
+
<p>Stacks using three-dimensional characteristics were designed under the leadership of Brent Doyle for the memory – both static (high performance) and dynamic (high density and lower performance) memories of the ETA Systems Supercomputer. These unique designs provided for highest density and optimum performance of the standard memory devices used. Ability to upgrade to future generations of memory (more storage capacity Integrated Circuits) was built into the design as well. The design worked well and stacking became commonplace in the computer industry for future designs – eventually eliminating the chip package entirely. </p>
<span style="font-size:12.0pt;font-family:&quot;Times New Roman";mso-ansi-language:
+
EN-US;mso-fareast-language:EN-US">
+
  
</span>  
+
<p>The Air-cooled system was defined as a Piper. An Illustration of “Piper” is shown below. </p>
<!--[if gte vml 1]><v:shape id="_x0000_s1026"
+
type="#_x0000_t75" style='position:absolute;left:0;text-align:left;
+
margin-left:0;margin-top:180.2pt;width:414pt;height:335.65pt;z-index:1'>
+
<v:imagedata src="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image004.gif"
+
  o:althref="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image005.pct"
+
  o:title=""/>
+
</v:shape><v:rect id="_x0000_s1027" style='position:absolute;left:0;
+
text-align:left;margin-left:65.7pt;margin-top:78.15pt;width:231.5pt;height:64.55pt;
+
z-index:2;mso-wrap-style:none' filled="f" fillcolor="#618ffd" stroked="f"
+
strokeweight="1pt">
+
<v:shadow color="#919191"/>
+
<v:textbox style='mso-fit-shape-to-text:t' inset="90487emu,3.5pt,90487emu,3.5pt"/>
+
</v:rect><![endif]--><span style="mso-ignore:vglayout">
+
  
 +
<p>[[Image:Piper processor.jpg|thumb|left]] </p>
  
{| cellpadding="0" cellspacing="0" align="left"
+
== Summary  ==
|-
+
| width="0" height="78" |
+
| width="66" |
+
| width="233" |
+
| width="115" |
+
|-
+
| height="67" |
+
|
+
| width="233" height="67" align="left" valign="top" style="vertical-align:top" | <span style="position:absolute;z-index:2">
+
 
+
{| cellpadding="0" cellspacing="0" width="100%"
+
|-
+
|
+
    <div v:shape="_x0000_s1027" style="padding:3.5pt 7.1249pt 3.5pt 7.1249pt" class="shape">
+
   
+
<span style="font-size:24.0pt;font-family:Helvetica;color:#00279F;text-shadow:
+
    auto">'''ETA 10 CPU<o:p></o:p>'''</span>
+
  
<span style="font-size:24.0pt;font-family:Helvetica;color:#00279F;text-shadow:
+
<p>The design of the ETA Systems Supercomputer hardware had many unique features. The brief pages highlight some of them. </p>
    auto">'''1988<span style="mso-spacerun: yes">&nbsp; </span>Product<o:p></o:p>'''</span>
+
</div>
+
|}
+
</span>&nbsp;
+
|-
+
| height="35" |
+
|-
+
| height="336" |
+
| colspan="3" align="left" valign="top" | [[Image:]]
+
|}
+
</span>&nbsp;<o:p></o:p>
+
<!--EndFragment--></span> ==
+
  
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
<p>It would be remiss not to briefly discuss the “team” concept used to design the hardware. By having the CAD, Packaging, memory, circuit and power expertise located in a close proximity and holding concise project reviews at all levels at periodic and timely phases, all were kept abreast of the progress and challenges of each other. This permitted changes to be made to necessary designs to properly accommodate the challenges and opportunities in a timely fashion. Hardware was demonstrated on or near schedule despite the innovations required in each aspect of the design. The team was truly a “team”. </p>
</span> ==
+
  
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
<p>A missing link to the team was the logic design. These folks were separate and actually on another floor of the ETA Systems facility. It was strongly suggested and accepted for future designs, that the logic team would be a part of this common organization. I had the opportunity to lead one additional hardware development that included the logic design team (at Cray Research, not ETA Systems) later. It was a smoother and more effective and thorough team. Like ETA Systems the communications were open and included both manufacturing and software participation (the later two were voluntary). </p>
</span> ==
+
  
== <span style="font-size:13.0pt;font-family:Helvetica;mso-ansi-language:EN-US; mso-fareast-language:EN-US">
+
<p>Clearly, communications – effective communications at all levels of the organization was key to this hardware design success.</p>
</span>
+
  
<!--EndFragment--> ==
+
[[Category:Microprocessors|{{PAGENAME}}]]
 +
[[Category:Computing and electronics|{{PAGENAME}}]]

Latest revision as of 16:16, 22 July 2014

Contributed by: Tony Vacca

Contents

How it started

It was in the early 80's. Control Data (CDC) had just launched the CYBER - 205 with modest success and the team was now focused on the next generation machine, the 2XX as I recall. Speed, cost and meeting the schedule were all key objectives. Speed because Cray Research under the guidance of Seymour Cray was setting milestones for Supercomputers with the Cray 1 and then the Cray 2. Cost, since Supercomputers were extremely expensive. Schedules since the CYBER - 205 had established patience records as a machine that may never get out the door and this must not be repeated.

A conventional evolutionary approach for Integrated Circuit (IC) logic was initially selected. Motorola, with some prodding, agreed to launch an 8,000 gate equivalent ECL (emitter-coupled-logic - the circuitry of choice for high performance processing units) provided that Control Data do the actual circuit development. There were insufficient customers for Motorola to commit their resources to this lofty development. Motorola did, however, commit their advanced ECL processes to CDC and a joint team was formed with the two companies.

Logic designers at the CDC Advanced Design Laboratory were given preliminary design rules based on computer device models and estimates of gate per chip densities. There was a natural follow up of grumbling by the logic design team led by very experienced and innovative folks (Ray Kort, Maurice Hudson and Dave Hill to name three) but circuit designers had learned to accept this since logic designers always found the circuits to be too slow and insufficient an quantity of gates and pins (I/O ports) per die. There was a lot of cooperation too. Basic building blocks were defined by the logic designers - gate functionality, register functionality, etc. From this set of preliminary rules function blocks were defined and capacity per reasonably-sized Printed Circuit (PC) boards defined. The initial design using the Cray CYBER - 205 based architecture was launched.

In parallel with this effort, and in the same design group; i.e.; circuit, packaging, PC board and newly formed CAD (tools for layout and design of chips and boards) - chief chip design engineer - Randy Bach - was assigned to develop an advanced CMOS chip for the Canadian Computer Development organization. At this time, early 80's CMOS was in it's infancy being used for memory devices, low performance peripherals and also for low performance microprocessors (5 to 10 MHz clock speeds). The design contained 5,000 gates plus appropriate input and output communication devices. Gate arrays for CMOS was also nearly non-existent so Randy and his small team of two assistants developed a cell library and worked closely with the Canadian Development team to meet their objectives as well.

This effort was completely separate from the ECL based gate array to be used for the next generation Supercomputer. The product was developed for a low cost application.

It was customary for Neil Lincoln - chief architect, Dale Handy - manufacturing manager and me to go off to lunch every 8 to 10 days to discuss status at either Author Treacher's Fish & Chips or Zantigo's (high class - NOT) fast food restaurants. As a side note, both of these fast food places disappeared during the ETA Systems brief duration. Zantigo's has returned (I think because they know it is safe now that the three of us cannot visit together any longer - Neil unfortunately passed on a few years ago).

At one of these meetings, Neil had "news" for me. Simply stated, the gate array in active co-development with Motorola had unacceptable goals. The chip had too few I/O pins, consumed too much power and insufficient gates. In addition, he completed a cost model which indicated an unacceptable cost figure for the CPU. He also determined that the CPU (some 3 Million gates) had to be assembled on a single board. "It was time for this goal to be reached". He also reached the conclusion that a proper logic design required at least 15,000 gates per chip to meet these goals.

The logic designers had gotten to him I surmised. Schedules, Neil reminded us, could not be altered - and that was that. To soften the blow he bought lunch that day, three Cokes and three orders of fish and chips - Neil's was a large order.

The trip back to the lab was pretty quiet, fortunately short since our eating places were all very close to the lab.

That afternoon, I assembled the key folks - I might miss one or two but Randy Bach, Doug Carlson, Dave Resnick and John Ketzler were four that I recall now. Doug was a mechanical engineer that I assigned the Motorola project to because of his management skills - something he probably never forgave me for - John was the key circuit engineer on the Motorola project and Dave was and still is a very versatile and perceptive engineer.

Doug and I would inform Motorola of the decision not to continue. The team would package up what was accomplished and turn it over to Motorola to carry the ball forward if they wished.

As a side note, Motorola and Cray did continue the design. It was the circuit design used in the Cray C90, a very successful Cray Research Supercomputer.

The meeting turned to what were the next steps.

The key challenges that emerged were:

  • IC Technology that could meet the new lofty goals
  • The PC board technology required to meet a single board CPU
  • Packaging and interconnect technology required to support the two above requirements
  • Computer Aided Design (CAD) technology necessary to accurately design IC and PCB technologies
  • Suppliers for all - do they exist?
  • What additional internal resources were required to achieve objectives
  • System packaging beyond a single CPU. (Memory, peripherals, I/O, etc.)
  • Testing of complex IC technology and complex PCB technology

Results

Before getting to the details as to how decisions were made and how the ETA System technologies the “kit” was selected and developed, a list of noteworthy accomplishments achieved are listed:

  • First Industry competitive CMOS CPU

Since 1995 – to the present (beginning 12 years after the technology selection by ETA Systems I might add) ALL HPC (High Performance Computers) are developed and manufactured using CMOS IC technology. Until as late as 2000, bipolar technology (higher power, more costly to manufacture and lower gate count per chip) dominated high performance computers throughout the world.

  • First Industry Single Board CPU

The chip density (gates per chip) allowed by advanced CMOS, the use of layout and design Computer aided design tools for optimum layout and simulation, the successful design of a 45 layer advance Printed Circuit board (you read it right 45 layers) and innovative chip attachment and cooling permitted a single processor containing nearly 3 million gates to be packaged on a single board

  • First Industry system to bedesigned with self-test

CPU Processing units (≈3Million gates each) were validated for functionality and performance in less than 4 hours.Any interconnect errors were recorded and allowed chip-to-chip replacement to occur in a minimal time.Other CPU checkout during this same period required weeks to months to check out and validate a processing unit. Incoming testing of the logic IC Chip (function and performance) also used the same self-test innovations.

  • First Industry production Liquid Nitrogen CPU

The ETA Systems CPU was immersed in Liquid Nitrogen – 77 degrees Kelvin – to improve performance greater than two times that CMOS technology operated at room temperature – 300 degrees Kelvin.

  • First system at CDC to fully utilize Computer Design Software to design Chips, boards, validate Logic design and Auto Diagnostic test the system with Synergistic tools

Permitted checkout of a CPU to be completed in less than 4 hours. Manufacturing costs were greatly reduced. This technique was also used at the IC Supplier and greatly reduced any probe test hardware and software.

  • First Industry system to havemultiple cost designs from single design effort

Performance range of the ETA System products was greater than 24:1 (8 processor system operating at 7 nanoseconds Clock period and a single processor system operating at 24 nanoseconds.). Processors were manufactured, tested and validated from a single manufacturing line using identical components. (IC Chips were performance sorted using auto self test). Product differences began at the system packaging level.

Boring into details

Any Technology kit must be driven by a customer need. In the case of Supercomputers the craving for increased computer performance at a lower cost (overall cost) was the deciding factor. In any Supercomputer company a combination of marketing requirements, architecture innovations and logic design demands dictate the initial objectives of the hardware circuit and packaging organization. I state “initial” since once the objectives are digested and key technologies are evaluated for the time frame addressed, compromises are the norm. In the case of ETA Systems technology selections in the early 1980’s, this was the strategy implemented.

The following paragraphs sequence the thought process and the technology selection strategy utilized.

Integrated Circuit selection

The objectives, listed in earlier paragraphs were first integrated into the architecture and logic design requirements. A market survey of key integrated circuit suppliers was conducted with emphasis on what was in development and planned for product introduction – not what was available at the time of the survey. A risk assessment was made. Primary focus was on the most dynamic technology, the IC Logic technology. All decisions as to volume requirements, pins, packaging, etc. resulted from what was determined by this survey and risk analysis. Merging the logic design objectives (gates, bandwidth and performance of key functions) was next.

An ECL (emitter coupled logic) high performance bipolar gate array using Motorola advanced IC technology was selected. Since Motorola was not fully staffed to begin the actual product development (application) but did have the process development underway, a cooperative development agreement was struck with the two companies (this occurred between Motorola and Control Data since ETA Systems had yet not been formed). The design called for basic logic cells to be incorporated into a larger version of their existing gate array advancing the process for increased performance and chip size for increased gate capacity. The existing gate or function array utilized approximately 2,500 gates (which was used as the primary gate array for the Cray Research very popular Y-MP Supercomputer) and the planned gate array would contain an excess of 8,000 equivalent gates.

Logic cell libraries were agreed to (acceptable to both Motorola for the general market and to CDC for the logic designs). Pin counts (for power, ground and input/output logic communications) were established and power consumption estimates were made. Once these parameters were established, board size, power systems and thermal control were evaluated in a trade off give-and-take. Features of Printed Circuit Boards, (line widths, spacing, interconnect vias and number of layers were compared to the board size capacities, laminating press capabilities, drill designs and printed pc board processing limits. IC packaging, limits, i.e.; minimum size of package, pin spacing, thermal removal, etc. was evaluated in parallel with PC Board limits.

The chip design began, the cell library began and the packaging began once all parameters (pins, power consumption and die size objectives) were agreed to. Printed circuit board experiments also began. Once feasibility was established and practical limits established (original goals could be met as to physical design and performance based on IC Modeling and extrapolation from previous established functional systems, a preliminary specification was presented to the architects and logic designers for review.

From initial design data, logic design based on the parameters provided established a physical size for the Central Processing unit or CPU, the heart of the system. A multiple board processor was required. This placed additional constraints on packaging since within a single processor all distances are crucial between circuits. Three-dimensional packaging concepts were considered. Three dimensional packaging effectively meant a “sandwich” effect of multiple boards with interconnects from board to board were throughout the area – not exclusive to the periphery of the board such that chips on each of the boards would minimize distances between them. In addition, power consumption estimates were made; thermal removal paths and techniques were considered. A cost model was generated as well. All of these factors resulted in a preliminary estimate of the CPU volume. In the introduction portion of the document, you already know that this was rejected - more to follow for sure.

In parallel with these efforts, memory design was underway. Less freedom was available to memory since the basic semiconductor device could not be altered to accommodate specific users. There were a few packaging alternatives, very few, and device configurations (Word – Bit architecture, pin numbering, power considerations, etc.) were dictated by the industry. Since memory design has its own objectives for cost, reliability and performance, this effort could continue quite independently with one exception, the packaging of the total system must be synergistic and compatible. A crucial parameter of this is the interconnect mechanism between processors and memory.

A hardware system cost model was established – not only for current cost considerations but also estimates on volume costs based on learning-curve estimates as well for the life of the system.

The chief architect, after careful review, rejected the design; this was covered in the introduction. Three key reasons were sited; performance would be impacted due to the 8,000 gate limit, (worst case logic paths could not reside in a single chip and multiple chip distances would increase the clock period), power consumption per CPU, although lower on a performance ratio basis to previous generations, was too high when the total system size (including the multiprocessor objectives) were considered and system cost appeared prohibitive – always a subjective issue but never-the-less a key component of the design. Reliability concerns were also stated since the pin-count per CPU, although quite reduced from previous designs, were of concern. The architecture was committed to four CPUs (max) per system so the interconnect "bar" was raised.

Back to the drawing board

Bipolar technology refers to conventional NPN and PNP transistors operating in a non-saturating mode (collector-base). By not saturating the operating transistors (not allowing the base voltage being higher than the collector voltage) the switching characteristics were improved and balanced (off logic level and on logic levels had identical delays). In addition, the non-saturating circuitry – titled ECL for Emitter coupled logic – provided the TRUE and COMPLIMENT outputs for each logic function (i.e.; AND & NAND, OR & NOR, etc). This provided advantages to logicians to design complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc). Under the category of “no free ride” ECL circuitry consumed higher power than the more popular but much slower saturating logic circuitry (TTL – transistor-transistor logic). Other improvements in performance for integrated versions of ECL logic circuitry included replacing conventional junction isolation between circuit devices on a single die with Oxide isolation between circuits (lower capacitance per circuit so less charging and discharging when logic levels switched).

CMOS (Complementary Metal Oxide Silicon) circuitry, especially at the time of ETA System, was a simpler and more efficient logic circuit. This form of logic also had a simpler process. Stacking of P channel and N channel transistors in series between voltage bus rails defines a single complementary gate. Functionality of the logic devices is much more forgiving to process variations due to the larger voltage swing and only active transistors used to define the circuitry (no resistors, diodes, etc.). The physical size of a logic function when compared to a bipolar equivalent is significantly smaller, resulting in an increase in circuitry per equivalent die (chip) size. CMOS technology also consumed power ONLY when the circuit was switching (changing states) so power consumption was directly proportional to the frequency it was operating.(P = CV2f) ECL circuitry, by contrast, consumed approximately the same power – while switching or in a quiescent state. (Later forms of CMOS – especially those designed in early 2000 and beyond, had increased power consumption primarily caused by increased bulk leakage currents as a result of processes developed for lithography having features en excess (smaller) than 90 nanometers. Technology at the time of the development of the ETA Supercomputers had minimum features of 1,200 nanometers. (In 2009, by contrast, the production capability is 45 nanometers)

Advantages of CMOS were obvious; more circuits per given chip area, lower power consumption and higher functional yield. It is important to stress “functional yield”. The CMOS devices functioned over a much larger range of processing variations (> 50% Vs. < 15% to 25% for ECL). Performance variations for a given process were approximately 2 to 3 times for CMOS and 20% to 30% for ECL. For this reason CMOS devices were sold at a much lower performance than any bipolar counterpart.(I.e.; if the product was specified to accommodate the entire functional lot (wafers processed at the same time), more IC devices yielded. There is one other key difference in defining performance differences between Bipolar and CMOS devices. For ECL (or any other bipolar device) the maximum operating frequency is defined, in part, by the base width – the physical distance between the emitter and collector of the transistor. This is determined by the spacing based on diffusion or implant of the emitter and is controlled in the vertical direction and limited by process control that is quite precise. This parameter is very thin and the frequency is determined indirectly proportional to the base width. For CMOS the gate length defines the critical performance parameter. Gate length is defined by mask optic limitations for any generation of processes. Bipolar devices in the 1980’s and well into the later half on the 1990’s, therefore, had higher operating maximum frequencies than their CMOS counterparts. As capital equipment – primary optics to generate masking and etching capabilities defined smaller and smaller geometries, CMOS technology improved dramatically in performance. This was a result of smaller gate lengths but also each generation had smaller devices resulting in lower capacitance loading and lower time constants to charge and discharge. During the time of the ETA Systems Supercomputer development, CMOS technology had not seen the advantages that bipolar devices could realize – but the potential for future improvements was obvious and projections clearly indicated that by the second half of the 1990’s (nearly 10 years after the first ETA Systems Supercomputer would be available), CMOS would overtake Bipolar in the last and most important parameter – performance. To restate this; the IC industry was transitioning to CMOS technology and more funding at the device, and equipment level was being expended to accommodate new markets focused on potential of CMOS than was being expended for Bipolar devices.

Bipolar technology was stretched to a practical limit for the time frame in question.

The IC industry, therefore, had only one other technology candidate, CMOS, which was, in 1983, used exclusively for lower cost and considerable lower performance applications and memory device technology where more bits per die could be fabricated at the expense of lower performance of the Bipolar counterpart(s). The impressive characteristic of CMOS technology at this time was: Lower power consumption per function, smaller size per logic function and lower cost per die due to two key factors (smaller physical size per function meant more logical functions per unit of area, and higher chip yield – chip functionality per wafer manufactured – due to reduced number of processing steps to generate CMOS devices. That was the good news. The concern was system performance. While bipolar technology had set the standard for clock periods of 10 NSec for Supercomputer architectures such as the ETA System projection, CMOS was at least 5 times slower – in most cases 10 to 20 times slower for equivalent architectures. Based on this parameter alone, CMOS was not a candidate for Supercomputers in the 1999-1990 time frame (the time frame where the ETA Systems Supercomputer would be in high volume production).

The next steps for CDC (recall that at this time CDC still had a Supercomputer Division) were dramatic and at times emotional. First, the team had to discard the ECL design and terminate the effort with Motorola. This was very difficult since both companies depended on each other and secondly, all objectives of the ECL product were being met within the specifications established. CDC (team which later became ETA Systems) provided Motorola with all of the design details to date. Considerable effort was made to insure that the program was successful at Motorola.

A sidelight to this discussion – Motorola completed this product as an industry product. Cray Research Inc. (the key competition and leader of the Supercomputer market) engaged with Motorola to successfully complete this complex IC development for a product announced in the late 1980’s. The product (Cray C-90) under the leadership of Les Davis, Steve Nelson and other notable scientists (a key circuit designer was Mark Birrittella), became another very successful supercomputer products developed and manufactured by Cray Research Inc..

Next, a full effort evaluation of all technology candidates occurred. CMOS futures were explored in depth. GaAs technology was also evaluated. Alternative ECL (bipolar) candidates were also considered. CMOS was viewed as the technology of the future but the future was beyond the time frame necessary for product introduction.

Key events that led to the decision to use CMOS technology.

Moore’s law (invented by the great innovator and co-founder of Intel – Gordon Moore) stated that IC technology (CMOS) technology, would double in performance and density every 18 months to two years. The actual Moore’s law may have been stated somewhat differently but this captured all the project cared about. To achieve this predicted growth, several parameters had to occur:

  • The die size would increase (more gates per manufactured chip).
  • Features on the chip (metal widths and spaces to interconnect devices and actual device parameters) would be reduced every 16 months to 2 years. Reducing parameter sizes have two positive results to goals of ETA Systems: increased performance and more gates per die.
  • The technology would gain broad industry popularity – this would mean that capital equipment would keep pace with the “law”, applications would increase thus increasing volume, thus lowering cost and increasing performance and more applications and industries would drive CMOS technology – the Supercomputer industry could not drive such a large industry.

Key industry activities also emerged at this time:

  • CDC validated operational performance gains operating CMOS technology in a cryogenic environment. Several ring counter configurations generated with the 5,000 gate chip discussed earlier were dipped in a Liquid Nitrogen thermos jug expecting to witness the shattering of the silicon and the detachment of the solder joints attached to the oscilloscope only to find the frequency of the ring oscillators double and the system operate for weeks until we turned off the experiment. Analytical analysis applied to the Silicon design validated the research done previously by others.
  • Key US Government agencies began a technology acceleration program based on CMOS technology – the Very High Speed Integrated Circuits (VHSIC) program under direction of the Army, Navy and Air Force certainly captured our attention.
  • Honeywell, one of the participants in the VHSIC program held a technology luncheon IEEE symposium in which they presented an 11,000-gate CMOS development effort. Attendees from CDC were impressed (especially the key designer – Randy Bach - with what the efforts. The chip was certainly larger than any that had been developed to date and the performance was accelerated beyond what was predicted for the 1988 time frame by the conventional IC industry (the introduction date set for the ETA Systems Supercomputer – then the next generation CDC Supercomputer). Honeywell was a recipient of one of the VHSIC contracts.
  • Logicians and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort, Maurice Hudson and Dave Hill and others - determined that an minimum gate density of 15,000 gates per die would allow them to achieve a key objective; having a worst case Register to Register clock path residing within a single chip. Now additional explanation is required here. There were technical reasons that the logicians wanted more beyond the knee jerk reaction that asking for 50% more than offered was a standard mode of operation for these guys. Each architecture configuration has a method of achieving its goals of applying computational instructions to problems. The number of gates that are connected in serial fashion between the input and output registers (and this is truly simplifying the problem) determine the clock period that is allowed. For the ETA Systems Supercomputer, therefore, it was determined that a functional unit clock period could reside within the boundary of the chip if the chip could provide 15,000 gates of logic to the designer.
  • Research into technology experiments uncovered significant performance features of CMOS technology. First of all, the technology was functional across a wide range of voltages and temperatures but performance was significantly altered. The higher the operating voltage (within semiconductor constraints, of course) the higher the performance resulted. Unfortunately the Power consumption, although significantly lower than any alternative technology, increased as the Square of the operating voltage. The lower the operating temperature of CMOS the higher performance as well. This factor was studied by others and carefully documented from 400 degrees Kelvin (100 degrees above room temperature) to 77 degrees Kelvin. (77 Degrees Kelvin is the boiling point temperature of liquid Nitrogen.)

Summary of what was learned with this evaluation

  • IC chips currently (four years before the need for an ETA Systems product) had a capacity of 11,000 gates.
  • The performance of these gates, when operated at liquid Nitrogen temperatures, would perform at least two times faster than at room temperature – not yet validated at CDC.
  • 15,000 useable gates were required per chip to meet logic designer chip boundary requirements.<o:p></o:p> ¨ If Moore’s law was applied to these parameters, within the time frame required, it was possible to achieve both gates per chip densities and performance goals (if the system operated in a liquid Nitrogen environment).
  • There were at least two IC Suppliers (those having contracts with the US government) that were pursuing CMOS as a performance and high gate/chip density technology (the other known corporation was TRW).
  • Computer Aided Design (CAD) tools were, during the period of the 80’s, in the infancy stage if one was to compare them to today’s capabilities. To design, place cells within the matrix of the gates provided on the IC Chip, and route the interconnections of these cells accurately to the logic or Boolean design required by the logicians and to clock period constraints was a challenge. This challenge applied to board layout designs as well. Control Data Corporation (CDC) recognized the challenges and established a small but efficient and dedicated organization to address these challenges. The industry had established a metric that to use CAD tools for gate or cell arrays, an additional 20% to 30% gates were required. This meant if the ETA Supercomputer required at least 15,000 useable gates to accomplish necessary designs based on its architecture, an 18,000 to ≈20,000-gate capacity was required. The technology organization set at its objectives a design of 20,000 gates plus necessary circuitry to self-test each gate or cell array. This as compared to the gate array in development at Honeywell was nearly 2 times the capacity (11,000 total gates Vs. 20,000 total gates plus circuitry for self test).

The task was to convince Honeywell to project the next generation size and layout rules and to accept an R&D effort that would allow CDC / ETA Systems achieve its objectives. Honeywell, an innovative organization, took on the task after considerable discussion with key requirements:

  • ETA Systems (we were now ETA Systems by the time these discussions reached negotiations) accept costs based on wafers processed, not functional chips. Honeywell would provide necessary processing data to reflect wafers were processed within process parameter specifications.
  • ETA Systems provide test equipment for wafer testing and test parameters for chip acceptance prior to packaging.
  • Both companies would share facilities and key resources and work as a single team – as “open a Kimono relationship” that one could ever imagine during this dynamic period of complex process developments within the IC Industry. – David Frankel was assigned the task as ETA Systems interface and engergetically took on the challenging task.
  • Self-test circuitry was designed into the basic cell array periphery. The area consumed by this custom set of pseudo-random generated logic and registers was less than 15% of the total chip area. (David Resnick, resident do-it-all reduced concepts explored by ex CDC scientist Nick Van Brunt who left the company a year previous to the formation of ETA Systems.) This was one of many extra ordinary contributions David made to ETA Systems. Additionally to providing self test capability to accept or reject the circuitry – both functionality and performance sorting – the circuitry included in each 20,000 gate array had capability to test for interconnect between circuits on the final PC Board as well as circuit to I/O connections.

When the logic design team first heard of this area “waste” of test circuits that could be used for logic design, they lobbied for it to be removed in favor of more logic gates for function designs. Fortunately this request was not honored. IC validation at both the supplier in wafer form and at ETA Systems in packaged chip configuration coupled with the use of the same circuitry in manufacturing checkout to detect board opens and shorts between circuits assembled both in room temperature and cryogenic temperature environments proved to be well worth this “waste” of circuitry area. Small, relatively inexpensive testing systems were designed by ETA Systems and provided to the supplier. The operands for initialization of the pseudo-random logic were also supplied for each design (chip type).

Chip types (array design options) were carefully managed as to not proliferate the chip types in the system. This was a new constraint placed on logic designers and was dealt with most professionally and responsibly by all participants once understood. The resultant chip total for the CPU (processing unit) was fewer than 150 while the chip types including clock chips and all logic design chips was fewer than 20 as best recalled.

During the development cycle of the ETA System Supercomputer, Honeywell moved the manufacturing capability from a local Minneapolis facility to a state-of-the-art manufacturing facility in Colorado Springs, CO. The transition was very transparent to ETA Systems (with the exception of the traveling budget, of course). To accomplish this team membership from both companies acted as one in all decisions addressing scheduling and timing of needs of various chips, testing, packaging, etc. The open book relationship was very beneficial to both companies. On one milestone occasion – where Honeywell successfully completed an initial order – Dave Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility, and served cake and coffee to all designers and operators – it was below zero when this milestone was reached and no one cared.

One design that was incorporated into the chip was to allow for next generation critical processing parameters to be added to the existing design (present chip layout). Although this would not optimize the features of new process features (all parameters were not considered), key performance enhancements could be and were added to the present design. A key feature was gate length and this was added transparently to the physical chip and offered appreciable performance enhancements to the design.

Chip design summary:

The decision to utilize CMOS technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame (prematurely by all industry metrics) resulted in the following additional “technology kit” decisions:

Addition of chip self-test. Feature established functionality at wafer test and functionality and performance sorting at ETA Systems

  • Computer Layout tools that validated logic prior to chip release for fabrication
  • Requirement to operate the chip at 77 degrees Kelvin or in liquid Nitrogen
  • Packaging, interconnect & assembly decisions based on liquid Nitrogen operation challengesRemote testing of the CPU because of liquid Nitrogen operation challenges
  • Logic design partitioning challenges to design within 15,000-gate per chip boundaries and a minimum of IC chip types

Printed Circuit Board Design Selection:

In the period of the 1980s, the time frame of the ETA Systems Supercomputer development, Printed circuit boards had maximum dimensions of approximately a square foot and the number of total layers fewer than 20. (Layers provide power and ground stability, interconnect capability for the circuits attached to the board as well as inputs and outputs to and from the board.) If these total layers are allocated properly, approximately 50% are used for interconnect and the remaining for power and ground. Positioning of power and ground layers also serve to provide interconnect layers that have transmission line capabilities to insure signal integrity throughout the board. During this period, a state-of-the-art printed circuit board was approximately one square foot of active circuitry and as stated earlier, 20 layers or fewer usually restricted to a total thickness of 0.063 inches.

It was determined that a maximum of 150 chips would be required to design the ETA Systems Supercomputer CPU. Packaging of the IC and interconnecting the chip to a PC board with minimum spacing between chips (some spacing was required to allow interconnects to all of the necessary layers) resulted in a 1.2x1.2 sq. inch “footprint”. Doing the simple math results in a pc board of a minimum of 220 sq. inches. The number of total layers required to interconnect the 150 chips and the necessary Input and Output at the board periphery was determined to be 45. Looking at design parameters of the board layers in more depth and insuring transmission line features to insure signal integrity defined the board thickness at slightly greater than 0.25 inches. This thickness was approximately three times greater than high-end printed circuit boards produced in this time frame. With a board having an area of greater than 1.5 times the size of what was able to be produced, a thickness of 300% of what was produced and a the number of layers 2.5 times of what was produced in this time frame it was clear that the printed circuit board industry was not ready for the ETA Systems design! The design has other limitations. A key factor when designing pc boards is to insure proper connecting of the layers, i.e.; connecting the chip pins to the board and the proper layer of interconnect in the board and back to the proper receiving chip. Drilling holes in the layers and plating the wall of the holes with copper for conduction make these connections. These are called plated thru holes or PTH. A key parameter to insure that plating occurs in these holes is the hole diameter to depth ratio. The industry at this period (not much better today) is 6:1, i.e.; the thickness of the board must be no more than 6 times the diameter of the hole. This ratio would dominate the size of the board. If this ratio is used to design the board the board size would be increased in area by greater than 9 times. Talk about piling on! Since it was deemed not feasible, issues like cost and time to fabricate the board were not even addressed.

Nestled into the design laboratory of Control Data Corporation was a small but very innovative printed circuit board prototype facility. The leader of this group, LeRoy Beckman, never said “no” to challenges. He just bit his pipe a little harder and tried not to snicker out loud. LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had previously displayed innovation (evolutionary in nature) in previous generations. Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing another, and higher plated through hole (pth) ratios than the industry a third.New technologies in the printed circuit board were few and far between. The industry was set in it’s ways of subtractive etching of circuit layers (removing unwanted copper from a pre-copper clad layer, convention wet etch processes and relatively simple assembly, i.e.; lamination of layers with pressure. One inventor, Mr. Peter P. Pellegrino, arrived on the scene to discuss innovative, revolutionary and proven pc board processing. At first the claims appeared to be too good to be true. Board size relatively independent, aspect ratios exceeding 20:1 for PTH, an additive process that permitted finer lines to be fabricated on individual layers. The layers were also embedded into the laminate so the opportunity for higher yield with reduced features. An additional benefit of additive plating is reduction in waste and water usage.A special plating cell was also introduced that permitted uniform deep hole plating by forcing plating fluid into each of the thousands of PTH. The process titled “Push-PullTM” also accelerated the plating manufacturing cycle by over an order of magnitude, reducing cost.A small plating cell was incorporated into the prototype facility at CDC and a controlled set of experiments conducted. Experiments were thorough and challenging since no one in the industry could approach the lofty objectives of the ETA Systems Supercomputer CPU board nor the lofty claims of the inventor. The results were simply outstanding. From the results and a commitment to fabricate a larger manufacturing line of plating insert cells, the 45 layer 15” x 24” CPU board became a realistic finalized goal of ETA Systems.Anyone told of this goal openly scoffed at this as too risky and unrealistic. This included some in the company as well.Later, when manufacturing of the systems was viable, a production capacity was developed for manufacturing. It is noted that hundreds of these boards were fabricated from a period of 1987 through early 1989. The yield of final boards was nearly perfect – only one finished board was scrapped.

To this day (2009) few realize what a monumental accomplishment this was and still is. This a tribute to LeRoy Beckman, Peter Pellegrino, the manufacturing facility at ETA Systems (now a banking building in St. Paul) and those who trusted that the lofty objectives could be realized.

To accommodate routing and designing for minimum distance between IC chips, CAD tools were developed and the first use of diagonal routed layers were introduced. Prior to this only x–y layers were permitted with manual and/or auto tools (CAD). This enhancement permitted timing constraints to be realized between chips.

The final board had the following noteworthy characteristics:

  • Board size: 15 inches by 22 inches by 0.26 inches
  • Pth hole ratio ≈ 20:1 – plating time – less than 20 minutes
  • 45 total layers per CPU panel
  • 150 IC chip locations (fewer were used in final design)
  • More than 30,000 board plated thru holes (pth) were used for interconnect

In 2009 this board development and manufacturing stands out as one of the major technology developments by ETA Systems

Packaging

The key challenge for packaging the ETA Supercomputer processing unit was the cryogenic chamber for the processor. The Cryostat to contain the processor (two processor units) had a conventional (and quite heavy) circular cryostat containing a vacuum chamber between the outside environment and the inner environment. Input of liquid Nitrogen was at the bottom of the chamber and the escaping of the gaseous Nitrogen was provided for near the top of the unit. The piping containing the Nitrogen to and from the regeneration unit was also temperature protected with vacuum lines. Dan Sullivan and his design team led this admirable effort. (Unfortunately, Dan passed on a few years ago). It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske – a very innovative scientist) could be designed if time permitted but the selection of the vacuum based design was conservative to accommodate schedule and also to familiarize the team with the challenges of Cryogenics. The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market. The system was not pretty. Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers to show this to perspective customers fearful that this would scare them away.

Thought was given to actually eliminate the need to regenerate the system in a closed system but rather purchase Liquid Nitrogen – readily available in tanks - and have them periodically refilled as is done in the IC and other industries using Liquid Nitrogen. This was discarded for the initial design since several customer sites did not easily accommodate the external access to Liquid Nitrogen tanks. It was to be an option for future systems and those customers that easily accommodated such an option. The final design was then a closed recycled Liquid Nitrogen system with the compressor located remote, much like Freon compressors, which many Supercomputer customers were already accommodating. The design challenge was at the surface (looked much like a two slice toaster) where the processing boards were inserted. This seal had to accommodate the connecting transmission to the external and room temperature memory and I/O subsystems. A printed circuit board was designed to connect the processor to the outside world. Heaters were applied to the surface to prevent icing at the cryostat surface. The separation, only a few short inches had memory operating at 300 degrees Kelvin and the CPU operating at 77 Degrees Kelvin. There were a few “frosty” events in this development cycle! The third challenge was to provide reliable soldering of the circuitry to the board amidst the severe temperature difference that the solder joints would be subjected to (greater than 250 degrees) during the cool down and warm up cycles. Studies at the National Bureau of Standards provided input that the temperature cycle should be profiled in a precise sequence as the board was cooled and heated. In addition, care as not to remove the board and to care for condensation that would occur if the board had not been heated to room temperature was considered. The result was a 20-minute cycle to remove or insert the board was designed with a specifically prescribed sequence of temperature lowering and rising for both cycles. At the time of the unfortunate termination of ETA Systems, a more refined, lower cost and lower weight design as stated earlier was on the drawing boards. Although the cryostat and associated cooling was costly, an analysis clearly showed that for the performance resulting from the design, the cost was less than any Bipolar IC system designed at the time. Once the connector was finalized and the process and assembly designed, the system operated flawlessly. Checkout on the manufacturing floor of the system utilized the “Self-Test” capability exhaustively so specific interconnect flaws were clearly understood prior to removing a CPU from the cryostat, thus reducing checkout time considerably as well. These designs were well done, significant and challenged laws of thermodynamics and physics to new limits.

Air Cooled System

As stated earlier in the document, an Air-cooled processor would operate considerably slower (2x slower) when operated in normal or “room temperature” environments. ETA Systems by sorting the devices for performance at incoming inspection, allowed for a three times performance differential to be realized. Only the highest performance devices were reserved for the Cryogenic cooled system. The remaining parts were then re-sorted into two categories for room temperature; the differential would be a 4-nanosecond clock period between the two room temperature systems and 17 nanoseconds (24 to 7) for the total system product set. The sorting and using the entire distribution of Integrated Circuits had a significant cost reduction factor for the entire product line. Bipolar devices, by contrast had lower functional yield to begin with coupled with additional loss of product due to performance yield. This was a definite cost reduction asset to the ETA System.

To cool the CPU air was forced on to the processor chips by using a plenum that was designed to cover each chip. Holes were designed in the plenum such that equal operating temperature would result for each operating chip. Since the power consumption variation significant for several part types, designing the appropriate number of holes above each chip location could provide custom cooling. The plenum could then be molded for mass production of the processing unit. Large volume cooling fans were designed for the system as well. Cost was the focus for the air-cooled systems since the price tag was below $1M. Recall, that the air-cooled design was identical in parts at the CPU and storage level. A single development was achieved for a wide range of products with one design team.

Storage

Stacks using three-dimensional characteristics were designed under the leadership of Brent Doyle for the memory – both static (high performance) and dynamic (high density and lower performance) memories of the ETA Systems Supercomputer. These unique designs provided for highest density and optimum performance of the standard memory devices used. Ability to upgrade to future generations of memory (more storage capacity Integrated Circuits) was built into the design as well. The design worked well and stacking became commonplace in the computer industry for future designs – eventually eliminating the chip package entirely.

The Air-cooled system was defined as a Piper. An Illustration of “Piper” is shown below.

Summary

The design of the ETA Systems Supercomputer hardware had many unique features. The brief pages highlight some of them.

It would be remiss not to briefly discuss the “team” concept used to design the hardware. By having the CAD, Packaging, memory, circuit and power expertise located in a close proximity and holding concise project reviews at all levels at periodic and timely phases, all were kept abreast of the progress and challenges of each other. This permitted changes to be made to necessary designs to properly accommodate the challenges and opportunities in a timely fashion. Hardware was demonstrated on or near schedule despite the innovations required in each aspect of the design. The team was truly a “team”.

A missing link to the team was the logic design. These folks were separate and actually on another floor of the ETA Systems facility. It was strongly suggested and accepted for future designs, that the logic team would be a part of this common organization. I had the opportunity to lead one additional hardware development that included the logic design team (at Cray Research, not ETA Systems) later. It was a smoother and more effective and thorough team. Like ETA Systems the communications were open and included both manufacturing and software participation (the later two were voluntary).

Clearly, communications – effective communications at all levels of the organization was key to this hardware design success.