Title: Synchronization Ideas
1Synchronization Ideas
- Charles E. Dike
- Intel Corporation
2Introduction
- Tutorial
- Share some ideas about synchronization and metastability
- Introduce NEW, IMPROVED theory on metastability
- Charles Dike (cdike_at_ichips.intel.com)
3Why and where synchronize?
- Reduce latency between independent clock domains.
- Asynchronous domain to synchronous clock.
- Synchronous clock to an independent synchronous clock.
- Benefit: higher performance in critical circuits.
4Design Direction
80s towards 100MHz
90s towards 1GHz
00s multi-GHz
VALUE ADDED
5Chip Area Networks
Late 00s multi-GHz
6I believe.
- We must be able to synchronize all domains to a PLL controlled clock
- Interconnect on chip will be asynchronous (GALS)
- We need to minimize latency
- There will be two basic synchronizer uses: near neighbor and the chip net
7Topics of Discussion
- Generic synchronizer of the type used in the TeraFlops computer
- Simple synchronizer of the type used in StrongArm
- The Myrinet pipeline synchronization scheme
- Latest understanding of metastability
8Generic Synchronizer
- Handles self timed to synchronous interfaces and vice-versa
- Supports synchronous to synchronous interfaces
- Can handle streaming data
- Adaptable to any speed range
- Possibly used over the chip network
9Two flop synch
[Figure: two-flop synchronizer. Labels: VALID, CLK; flops 1 and 2.]
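The two-flop arrangement named on this slide can be sketched behaviorally. This is a minimal model, assuming two flip-flops in series clocked by the receiving clock; the class and method names are hypothetical, and the model shows only the added latency, not metastability itself.

```python
class TwoFlopSynchronizer:
    """Behavioral sketch: two flip-flops in series on the receiving clock.

    Assumption: the first flop is the one that may go metastable and the
    second gives it one clock period to resolve. Only latency is modeled.
    """

    def __init__(self, reset_value=0):
        self.flop1 = reset_value
        self.flop2 = reset_value

    def clock_edge(self, async_in):
        """Apply one active clock edge and return the synchronized output."""
        # Both flops update together: flop 2 captures the OLD value of flop 1.
        self.flop2, self.flop1 = self.flop1, async_in
        return self.flop2


sync = TwoFlopSynchronizer()
outputs = [sync.clock_edge(v) for v in [1, 1, 1, 0, 0]]
print(outputs)  # [0, 1, 1, 1, 0]
```

In this model the output follows the input one clock edge after the first flop captures it, which is the latency cost the later slides try to minimize.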
10Single latch synch
[Figure: single-latch synchronizer. Signals: REQ, ACK, CLK1, CLK2, Write Valid, Read Valid.]
11Multi latch synch
[Figure: multi-latch synchronizer. Signals: REQ, ACK, CLK1, CLK2, Write Valid, Read Valid.]
12General Case
[Figure: general-case synchronizer. Blocks: WRITE POINTER, READ POINTER, STATUS REGISTER, SYNCHRONIZERS. Signals: EMPTY, FULL, EN, Write Clock, Read Clock, Write Enable. Annotations: PADDING, LATENCY.]
13empty case
[Figure: empty case. Blocks: WRITE POINTER, READ POINTER, STATUS REGISTER, SYNCHRONIZER. Signals: EMPTY, Write Pointer a, Write Pointer b, Read Pointer a, Read Pointer b, Write Enable, Write Clock, Read Clock.]
14General Case
[Figure: general-case synchronizer. Blocks: WRITE POINTER, READ POINTER, STATUS REGISTER, SYNCHRONIZERS. Signals: EMPTY, FULL, EN, Write Clock, Read Clock, Write Enable. Annotations: PADDING, LATENCY.]
15Topics of Discussion
- Generic synchronizer of the type used in the TeraFlops computer
- Simple synchronizer of the type used in StrongArm μprocessor
- The Myrinet pipeline synchronization scheme
- Latest understanding of metastability
16Simple Synchronizer
- Constrained by frequency ratio
- Supports synchronous to synchronous interfaces
- Does it support asynch to synch? Yes, with restrictions.
- Possibly used in local neighbor synchronizers
17Simple Synchronizer
[Figure: simple synchronizer schematic. Signals: SYNC, A, A1, A2, A3, w, x, y, z, SLOW CLK, FAST CLK. Blocks: MI, Divide by 2.]
MI = Metastable Immune
18timing1
[Timing diagram: SYNC, A, A1, A2, A3, MI, SLOW, FAST, Divide by 2; markers numbered 1 to 6.]
19timing2
[Timing diagram: SYNC, A, A1, A2, A3, MI, SLOW, FAST, Divide by 2.]
20timing3
[Timing diagram: SYNC, A, A1, A2, A3, MI, SLOW, FAST, Divide by 2. Waveforms: FAST CLOCK (markers numbered 1 to 6), SLOW CLOCK, SYNC, CHEATER CLOCK.]
21timing4
[Timing diagram: two schematic variants with SYNC, A, A1, A2, A3, MI, SLOW, FAST, Divide by 2. Waveforms: FAST CLOCK (markers numbered 1 to 6), and SLOW CLOCK with SYNC for each variant.]
22transfers
[Figure: FAST TO SLOW TRANSFER and SLOW TO FAST TRANSFER, each shown against SLOW CLOCK.]
23Topics of Discussion
- Generic synchronizer of the type used in the TeraFlops computer
- Simple synchronizer of the type used in StrongArm
- The Myrinet pipeline synchronization scheme
- Latest understanding of metastability
24Pipeline Synchronizer
- Supports synchronous to synchronous interfaces
- Supports asynch to synch and vice-versa
- Possibly used in local neighbor synchronizers
- Essentially a distributed fifo and synchronizer
25Pipeline Synchronizer
26ME element
[Figure: ME element. Labels: X, φ0, REQ.]
27Fifo element
[Figure: FIFO element. Signals: Ri, Ro, Ai, Ao, Data.]
28Async to sync
Synchronous
Asynchronous
29Sync to async
Synchronous
Asynchronous
30Points to ponder 1
- All synchronizing interfaces have one thing in common: a latching element that holds data while metastabilities are being resolved.
- There is no way to avoid the latency which is required to resolve metastabilities.
- To minimize latency the latching element characteristics can be improved.
- We will be required to understand and use this knowledge. This is the future of digital design.
31Topics of Discussion
- Generic synchronizer of the type used in the TeraFlops computer
- Simple synchronizer of the type used in StrongArm
- The Myrinet pipeline synchronization scheme
- Latest understanding of metastability
32Role of the Synchronizing Flop
- Reorients incoming information to a clock edge
- Its performance determines system failure rate or
latency
33Real Life
- There is no magic bullet
- There is a lot of misinformation on metastability around
- To date many circuits have been over designed through planning and luck
- Whenever a circuit fails because its frequency is too high, the ultimate cause of failure is metastability
- There is no way to synchronize a signal faster than about the time it takes to pass a signal through six static gates
34Metastability is....
OUT
SET
OUT
RESET
35Technical terms
- Tw (window size): likelihood of entering a metastable state, in units of time
- Tau (τ): rate at which metastability resolves, in units of time
- MTBF (Mean Time Between Failures)
- Thermal noise: $\langle V_n^2 \rangle = 4kT/C$
36Simple jamb latch
[Figure: simple jamb latch. Labels: NODE A, NODE B, OUT, DATA, CLOCK, RESET.]
37Simple jamb latch
[Figure: simple jamb latch. Labels: NODE A, NODE B, OUT, DATA, CLOCK, RESET. Annotation: RC time constant.]
38Rough Histogram
[Figure: rough histogram. Axes: Δ time of data after clock (log scale); propagation delay. Annotations: Tw; the slope is the τ.]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
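The MTBF relation on this slide can be evaluated numerically. The sketch below assumes the usual reading of the symbols: t is the time allowed for resolution, τ the resolution time constant, Tw the window size, fd the data rate and fc the clock rate. The function name and all numeric values are illustrative assumptions, not measurements from the talk.

```python
import math


def mtbf_seconds(t_resolve, tau, t_w, f_data, f_clock):
    """MTBF = e^(t/tau) / (Tw * fd * fc), all quantities in SI units."""
    return math.exp(t_resolve / tau) / (t_w * f_data * f_clock)


# Illustrative (assumed) values: tau = 20 ps, Tw = 15 ps,
# 100 MHz data rate, 1 GHz clock.
TAU = 20e-12
T_W = 15e-12
F_DATA = 100e6
F_CLOCK = 1e9

for n_tau in (0, 10, 20, 30):
    mtbf = mtbf_seconds(n_tau * TAU, TAU, T_W, F_DATA, F_CLOCK)
    print(f"{n_tau:2d} tau of resolution time -> MTBF {mtbf:.3e} s")
```

Each additional τ of resolution time multiplies the MTBF by e, which is why the slope on the log-scale plot is read as τ.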
39 Why is the theory a problem?
- It assumes a uniform distribution of data about the clock
- What happens when data always violates the setup/hold window?
- It is not detailed enough
- Doesn't consider a deterministic region
- Doesn't account for thermal noise
- People tend to extrapolate the theory improperly
40Overview of refined theory
- Not everything past a normal propagation is a metastable event
- The Tw window can't be improved by input edge rates
- Tw has a complex relationship to τ based on load
- The MTBF formula needs to be modified due to non-uniform distribution of data about the clock input
41Schematic
42Simulation of a typical latching device
43Test case
44Measuring real data
45Histogram
0.6 mV / 0.1 ps
time
46Histogram
0.6 mV / 0.1 ps
time
47Measured versus Basic
[Figure: measured versus basic histogram. Axes: Δ time of data after clock (log scale); propagation delay. Annotations: Tw; the slope is the τ; 0.6 mV / 0.1 ps.]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
48τ Simulated....
Battery
Voltage Controlled Switch R1 100 Ω R1 100 MΩ
49Tau Simulated 2
50 $\langle V_n^2 \rangle = 4kT/C = 4kTBR$
$V_n$ = 0.6 mV
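As a rough check on the noise figure, the relation on this slide can be inverted to see what node capacitance an RMS noise of 0.6 mV would imply. This sketch takes the factor of 4 as it appears on the slide and assumes T = 300 K, which the slide does not state; the function names and the `factor` parameter are hypothetical. Passing `factor=1.0` gives the common kT/C form instead.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def rms_noise_voltage(capacitance, temperature=300.0, factor=4.0):
    """RMS noise voltage from <Vn^2> = factor * k * T / C."""
    return math.sqrt(factor * BOLTZMANN * temperature / capacitance)


def implied_capacitance(v_rms, temperature=300.0, factor=4.0):
    """Capacitance that would give the stated RMS noise voltage."""
    return factor * BOLTZMANN * temperature / v_rms ** 2


c_node = implied_capacitance(0.6e-3)  # Vn = 0.6 mV, assumed T = 300 K
print(f"implied node capacitance: {c_node * 1e15:.1f} fF")
```

Under these assumptions the result lands in the tens of femtofarads, a plausible order of magnitude for an internal latch node.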
51Putting it all together
normal
A
(picoseconds)
52Putting it all together
deterministic
?
B
(picoseconds)
53Putting it all together
deterministic
Thermal noise point
C
(picoseconds)
54Putting it all together
true metastability
deterministic
T = 19 ps
D
(picoseconds)
55Putting it all together
true metastability
deterministic
Tw = 15 ps
T = 19 ps
E
(picoseconds)
57Points to ponder 2
Jakov Seizovic postulated a malicious asynchronous signal: no matter how we position the sampling window, and no matter how small we make the sampling window, the asynchronous transition will appear in that window. This case has to be assumed when interfacing to a signal of unknown probability distribution. We know something about just how malicious a signal can be.
58Exploring
59Worst case bound
60Not worst case bound
Uniform distribution
12 ps jitter
< 0.1 ps
61Final comments
- With the proper synchronizing device it may be possible to synchronize a signal within a single clock cycle. The constraints are:
- You require about 35 τ in order to get the MTBF out to about 1 century.
- Each typical static gate delay is equivalent to about 5 τ in a properly designed synchronizing flop.
- The metastability MTBF of a device should probably be an order of magnitude better than the mechanical MTBF.
- You must assume a malicious input to the synchronizer. Nevertheless, this only adds about 5 τ to the delay.
- Standard flop designs are generally very poor synchronizers. Use a jamb structure. It has the best transconductance.
- You should never require more than two synchronizing flops in series
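The "about 35 τ for a century" figure can be checked by inverting the basic MTBF relation, MTBF = e^(t/τ) / (Tw · fd · fc), for t/τ. The sketch below does that arithmetic. The window size, data rate and clock rate are illustrative assumptions, and the function names are hypothetical; only the 5 τ per gate delay conversion comes from the slide.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def taus_required(mtbf_target_s, t_w, f_data, f_clock):
    """Solve MTBF = e^(t/tau) / (Tw * fd * fc) for t/tau."""
    return math.log(mtbf_target_s * t_w * f_data * f_clock)


def gate_delays(n_tau, taus_per_gate=5.0):
    """Convert a tau budget to static gate delays (about 5 tau per gate)."""
    return n_tau / taus_per_gate


# Assumed operating point: Tw = 15 ps, 100 MHz data rate, 1 GHz clock.
n_tau = taus_required(100 * SECONDS_PER_YEAR, t_w=15e-12,
                      f_data=100e6, f_clock=1e9)
print(f"tau budget for a 1-century MTBF: {n_tau:.1f}")
print(f"equivalent static gate delays:   {gate_delays(n_tau):.1f}")
```

With these assumed rates the budget comes out in the mid-30s of τ, or roughly seven gate delays. Because the dependence is logarithmic, a tenfold change in either rate shifts the budget by only about 2.3 τ.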
62Conclusion
- There are several ways to communicate between independent domains
- I believe more asynchronous domains will appear that are embedded within synchronous designs
- Latency must be reduced to maximize the use of asynchronous designs.
- This is a burden that asynch designers must bear
- We need to know the limitations of synchronization and metastability
- Chip area networks are coming and they will open up opportunities for asynchronous design
63References
- T. Sakurai, "Optimization of CMOS Arbiter and Synchronizer Circuits with Submicrometer MOSFETs," IEEE J. Solid-State Circuits, vol. 23, no. 4, pp. 901-906, Aug. 1988.
- L. Kleeman and A. Cantoni, "Metastable Behavior in Digital Systems," IEEE Design & Test of Computers, pp. 4-19, Dec. 1987.
- I. E. Sutherland, "Micropipelines," Turing Award Lecture, Communications of the ACM, 32(6), pp. 720-738, 1989.
- J. N. Seizovic, "Pipeline Synchronization," Proc. Int'l Symp. Advanced Research in Asynchronous Circuits and Systems, CS Press, 1994.
- C. Dike and E. Burton, "Miller and Noise Effects in a Synchronizing Flip-Flop," IEEE J. Solid-State Circuits, vol. 34, no. 6, pp. 849-855, June 1999.
- A. Van der Ziel, Noise in Measurements. New York: Wiley, 1976.
64Overview of present theory
- Everything past a normal propagation is considered a metastable event
- A deterministic region doesn't exist
- Tw has no fixed relationship to τ
- The MTBF formula assumes a uniform distribution of data about the clock input