VLSI Physical Design: December 2015

Saturday, 26 December 2015

Capacitive loading & its effect on slew rate

By definition, the slew rate of a circuit is the rate at which it can charge and discharge a capacitance. This capacitance may be an external load capacitor CL or the gate capacitances Cg of the transistors connected to the circuit's output.

During switching, a digital circuit must charge or discharge CL or Cg quickly, and the charging rate depends on the output current of the circuit.

Capacitive loading occurs when this output current is insufficient to drive the load capacitance CL and the one or more gates connected to the circuit. As a result, the slew rate decreases and the circuit becomes slow (it takes more time to charge the capacitors connected to it).
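
The dependence on output current can be sketched with a first-order estimate. This assumes the driver behaves as an ideal constant-current source, so that t = C * ΔV / I; real drivers are nonlinear, and the function name and numbers below are purely illustrative.

```python
def charge_time(c_load_f, delta_v, i_drive_a):
    """Time (s) for a constant current i_drive_a to move c_load_f farads by delta_v volts."""
    if i_drive_a <= 0:
        raise ValueError("drive current must be positive")
    return c_load_f * delta_v / i_drive_a


# Hypothetical numbers: 1 V swing, 100 uA of drive current.
light = charge_time(10e-15, 1.0, 100e-6)  # 10 fF load
heavy = charge_time(40e-15, 1.0, 100e-6)  # 40 fF load (more gates attached)
print(f"10 fF: {light * 1e12:.0f} ps, 40 fF: {heavy * 1e12:.0f} ps")
```

With the same drive current, four times the load takes four times as long to charge, which is the slowdown described above.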

Is it Possible to have Zero skew??

Theoretically it is possible....!

Practically it is impossible....!!

In practice we cannot reduce any delay to zero; some delay will always exist. Hence we try to make the clock delay to every flop "equal" (or the same) rather than "zero". With this optimization all flops get the clock edge with the same delay relative to each other, so we can say they effectively have "zero skew", or that the skew is "balanced".

Chip Level Vs Block level design

· Chip design has I/O pads; block design has pins.

· Chip design uses all metal layers available; block design may not use all metal layers.

· Chip is generally rectangular in shape; blocks can be rectangular, rectilinear.

· Chip design requires packaging; block design ends in a macro.

Fall Time

· Fall time is the difference between the time when the signal crosses a high threshold to the time when the signal crosses the low threshold.

· The low and high thresholds are fixed voltage levels around the mid voltage level or it can be either 10% and 90% respectively or 20% and 80% respectively. The percent levels are converted to absolute voltage levels at the time of measurement by calculating percentages from the difference between the starting voltage level and the final settled voltage level.

· For an ideal square wave with 50% duty cycle, the rise time will be 0. For a symmetric triangular wave, this is reduced to just 50%.

· The rise/fall definition is set on the meter to 10% and 90% based on the linear power in Watts. These points translate into the -10 dB and -0.5 dB points in log mode (10 log 0.1) and (10 log 0.9). The rise/fall time values of 10% and 90% are calculated based on an algorithm, which looks at the mean power above and below the 50% points of the rise/fall times.
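
The conversion from linear power fractions to log-mode points can be checked with a short calculation. This is only a sketch of the arithmetic 10 log10(x); the helper name is illustrative.

```python
import math


def to_db(power_ratio):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(power_ratio)


print(round(to_db(0.1), 2))  # 10% power point
print(round(to_db(0.9), 2))  # 90% power point
```

The 90% point works out to roughly -0.46 dB, which is commonly rounded to -0.5 dB.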

Rise Time

· Rise time is the difference between the time when the signal crosses a low threshold to the time when the signal crosses the high threshold. It can be absolute or percent.

· Low and high thresholds are fixed voltage levels around the mid voltage level or it can be either 10% and 90% respectively or 20% and 80% respectively. The percent levels are converted to absolute voltage levels at the time of measurement by calculating percentages from the difference between the starting voltage level and the final settled voltage level.
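
As a rough sketch of the percent-to-absolute conversion described above, the following measures rise time on a sampled waveform. It assumes the first sample is the starting level and the last sample is the settled level, and uses linear interpolation between samples; the function names and the example ramp are hypothetical.

```python
def crossing_time(times, volts, level):
    """First time the waveform crosses `level` going upward (linear interpolation)."""
    for i in range(1, len(volts)):
        v0, v1 = volts[i - 1], volts[i]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def rise_time(times, volts, low_pct=10, high_pct=90):
    """Rise time between low_pct and high_pct of the start-to-settled swing."""
    v_start, v_final = volts[0], volts[-1]
    swing = v_final - v_start
    v_low = v_start + swing * low_pct / 100    # percent -> absolute voltage
    v_high = v_start + swing * high_pct / 100
    return crossing_time(times, volts, v_high) - crossing_time(times, volts, v_low)


# Hypothetical linear ramp from 0 V to 1 V over 100 ps, then settled.
t = [0, 100, 200]    # ps
v = [0.0, 1.0, 1.0]  # V
print(rise_time(t, v))          # 10%-90% thresholds
print(rise_time(t, v, 20, 80))  # 20%-80% thresholds
```

Fall time can be measured the same way by searching for downward crossings of the high and low thresholds.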

Latch Vs Flipflops

· Both latches and flip-flops are circuit elements whose output depends not only on the present inputs, but also on previous inputs and outputs.
· Both are hence referred to as "sequential" elements.
· In electronics, a latch is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby can store one bit of information. Today the word is mainly used for simple transparent storage elements, while slightly more advanced non-transparent (or clocked) devices are described as flip-flops. Informally, as this distinction is quite new, the two words are sometimes used interchangeably.
· In digital circuits, a flip-flop is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby is capable of serving as one bit of memory. Today, the term flip-flop has come to generally denote non-transparent (clocked or edge-triggered) devices, while the simpler transparent ones are often referred to as latches.
· A flip-flop is controlled by (usually) one or two control signals and/or a gate or clock signal.
· Latches are level sensitive i.e. the output captures the input when the clock signal is high, so as long as the clock is logic 1, the output can change if the input also changes.
· Flip-Flops are edge sensitive i.e. flip flop will store the input only when there is a rising or falling edge of the clock.
· A positive-level latch is transparent while enable is high, and it latches the last input value present before the level changes (i.e. before enable goes to '0', or before the clock goes to its negative level).
· A positive edge flop will have its output effective when the clock input changes from '0' to '1' state ('1' to '0' for negative edge flop) only.
· Latches are faster, flip flops are slower.
· A latch is sensitive to glitches on its enable pin, whereas a flip-flop is much less sensitive to glitches, since it samples only at the clock edge.
· Latches take fewer gates (and less power) to implement than flip-flops.
· A D flip-flop is built from two latches in a master-slave configuration.
· A latch may be clocked or clockless, but a flip-flop is always clocked.
· For a transparent latch generally D to Q propagation delay is considered while for a flop clock to Q and setup and hold time are very important.
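
The level-sensitive versus edge-sensitive behaviour can be illustrated with a small behavioural model. This is a zero-delay sketch, not a timing-accurate simulation; the class names and the stimulus are made up for illustration.

```python
class DLatch:
    """Positive-level-sensitive latch: transparent while enable is 1."""

    def __init__(self):
        self.q = 0

    def step(self, d, en):
        if en:          # transparent: output follows input
            self.q = d
        return self.q   # opaque: output holds the last value


class DFlipFlop:
    """Positive-edge-triggered flip-flop: samples d only on a 0 -> 1 clock edge."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, d, clk):
        if clk and not self._prev_clk:  # rising edge detected
            self.q = d
        self._prev_clk = clk
        return self.q


clk = [0, 1, 1, 1, 0, 0, 1, 1]
d = [1, 1, 0, 1, 0, 0, 0, 1]
latch, flop = DLatch(), DFlipFlop()
latch_q = [latch.step(di, ci) for di, ci in zip(d, clk)]
flop_q = [flop.step(di, ci) for di, ci in zip(d, clk)]
print("latch:", latch_q)
print("flop: ", flop_q)
```

While the clock is high the latch output tracks every change on D, whereas the flip-flop output changes only at the two rising edges.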

Saturday, 12 December 2015

Common Path Pessimism removal

Timing defines the performance of a chip. If timing constraints are not met, the chip is as good as dead. Any extra pessimism in timing analysis not only requires more time to fix the critical paths but could negatively impact other important parameters such as power and area. In the worst case, it might leave no option but to reduce the functional frequency of the design. On the other hand, optimism in timing analysis might result in silicon failure. Finding a bug in silicon can be a ponderous task, not to mention the monetary and goodwill loss for design companies. It is therefore prudent to remove undue pessimism and optimism from timing analysis.

Clock architectures have become fairly complex for modern SoCs. In synchronous design, clock controls the switching of sequential elements of the design and functionality of logic is ensured through meeting the required setup and hold checks. Timing engineers must remove any undue pessimism/optimism in the calculation of clock path delay because it can be detrimental for the design.

On-chip variation is one of the most important factors that necessitate pessimism introduction in timing analysis. It refers to the intra-die variations that may exist between different cells in different parts of the chip under the same operating condition. These variations may be:

Variation in the manufacturing process
Variation in the voltage: Due to different IR drops for different cells
Variation in the temperature: Due to formation of localized hot-spots on the chip.

Timing engineers model these variations in the form of derates. Applying derates on clock paths is the most popular and accepted way to model these variations. Assuming 10% derates for both early and late paths, a delay X for a cell under an operating point can be modeled as 0.9X in the capture clock path and 1.1X in the launch clock path for setup analysis.

Common clock path pessimism is one of the most common sources of pessimism in the design.

Common path pessimism
Common path pessimism arises when the launching and capturing clocks share a common path. The difference between the max delay and min delay of this common clock path segment is called the common path pessimism. EDA tools take care of this using Common Path Pessimism Removal (CPPR).
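
A minimal numeric sketch of this definition, assuming the 10% late/early derates mentioned earlier are applied to clock paths only. The function names, the split into common and non-common clock delays, and all delay values are hypothetical and not taken from any particular tool.

```python
def common_path_pessimism(common_delays, late=1.1, early=0.9):
    """Late-minus-early delay of the clock segment shared by launch and capture."""
    nominal = sum(common_delays)
    return nominal * late - nominal * early


def setup_slack(period, common_delays, launch_only, capture_only, data, setup,
                late=1.1, early=0.9, cppr=True):
    """Setup slack with the launch clock derated late and the capture clock early."""
    common = sum(common_delays)
    arrival = (common + launch_only) * late + data
    required = period + (common + capture_only) * early - setup
    slack = required - arrival
    if cppr:
        # The shared segment cannot be both slow and fast at once: credit it back.
        slack += common_path_pessimism(common_delays, late, early)
    return slack


# Hypothetical delays in ns.
common = [0.2, 0.3]
args = dict(period=2.0, common_delays=common, launch_only=0.1,
            capture_only=0.1, data=1.2, setup=0.05)
print("pessimism:", round(common_path_pessimism(common), 3))
print("slack, CPPR off:", round(setup_slack(cppr=False, **args), 3))
print("slack, CPPR on: ", round(setup_slack(cppr=True, **args), 3))
```

The slack with CPPR on is larger by exactly the pessimism of the shared segment.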

Figure 1: Common Clock Path Pessimism

There are two ways of calculating common path pessimism:

1. Critical-path based approach:
a) The timing analysis tool finds the top critical paths with CPPR off.
b) Only these critical paths are re-evaluated considering CPPR for the common clock path.

While this method is relatively fast compared to the exhaustive approach, it can miss critical paths. Hence, for some corner cases, it might lead to an optimistic timing analysis.

2. Exhaustive approach: This method does an exhaustive CPPR analysis and therefore does not miss any critical path. However, the analysis requires more memory and CPU resources than the critical-path based method.

Assuming two reg-2-reg timing paths with the same data path delay, the path with the lesser common clock path might get missed using critical-path based approach. Consider the following example:

Figure 2: Case study showing how path-based approach can be optimistic

As evident from Figure 2, path 2 was critical with CPPR off, but with CPPR on, path 1 became more critical.
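
The reordering can be reproduced with a toy comparison of the two approaches. The slack and credit values below are invented for illustration (they are not the numbers from the figure), and the function names are hypothetical.

```python
# Hypothetical paths: slack with CPPR off and the CPPR credit each would receive (ns).
paths = {
    "path1": {"slack_no_cppr": -0.10, "credit": 0.02},  # short common clock path
    "path2": {"slack_no_cppr": -0.12, "credit": 0.10},  # long common clock path
}


def with_cppr(p):
    return p["slack_no_cppr"] + p["credit"]


def critical_path_based(paths, top_n):
    """Rank with CPPR off, then re-evaluate only the top_n worst paths."""
    ranked = sorted(paths, key=lambda name: paths[name]["slack_no_cppr"])
    return {name: with_cppr(paths[name]) for name in ranked[:top_n]}


def exhaustive(paths):
    """Apply the CPPR credit to every path."""
    return {name: with_cppr(p) for name, p in paths.items()}


fast = critical_path_based(paths, top_n=1)
full = exhaustive(paths)
print("critical-path based worst:", min(fast, key=fast.get))
print("exhaustive worst:         ", min(full, key=full.get))
```

With only the top path re-evaluated, the reported worst slack is better than the true worst slack found exhaustively, which is the optimism described above.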

Timing engineers therefore tend to analyze their design using the exhaustive approach once the design has achieved logic freeze.

Corner Cells

Corner cells are used for pad ring connectivity. They contain only metal layers and no active layers. These special cells contain metal structures that are bent at 45 degrees to maintain continuity of the I/O power bus structures.

Useful Skew

Useful skew: if the clock is skewed intentionally to resolve violations, it is called useful skew.

For example, if there is a setup violation in the design, we add some skew along the clock path in order to eliminate it.

Suppose you have three registers A, B and C, each driven by the same clock, clockA, with data flowing from A to B to C.
Say the latency of clockA at A is 1 (slack 12, say) and at B is 12 (slack -12), so B has a setup violation. There are many ways of fixing a setup violation, such as adding buffers to speed up the data path, but let us assume that none of these can be used and the only option is to tweak the clock path (quite dangerous). Since setup slack = required time (clock) - arrival time (data), increasing the required time at B removes the violation. To do so, add delay buffers in the clock path after the point where the clock branches off to A (not before), so that only the clock reaching B is delayed.
Note that delaying the clock to B also delays the launch of data from B towards the flop that its output feeds (assume C). The skew can be used only if the path into C has enough setup margin to absorb that delay. This is useful skew: violations are fixed by taking slack from one path and giving it to another.
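
The slack borrowing can be sketched numerically. This assumes an ideal single-cycle setup check and invented values (period, latencies and data delays in ns are hypothetical, unrelated to the numbers in the example above); the helper names are illustrative.

```python
def setup_slack(period, launch_latency, capture_latency, data_delay, setup):
    """Setup slack = required time (clock) - arrival time (data)."""
    required = period + capture_latency - setup
    arrival = launch_latency + data_delay
    return required - arrival


def apply_useful_skew(period, lat, delays, setup, extra_b):
    """Delay only register B's clock by extra_b and report both stage slacks."""
    lat_b = lat["B"] + extra_b
    a_to_b = setup_slack(period, lat["A"], lat_b, delays["A->B"], setup)
    b_to_c = setup_slack(period, lat_b, lat["C"], delays["B->C"], setup)
    return a_to_b, b_to_c


# Hypothetical design: balanced clock latencies, slow A->B path, fast B->C path.
lat = {"A": 1.0, "B": 1.0, "C": 1.0}
delays = {"A->B": 10.5, "B->C": 6.0}
before = apply_useful_skew(10.0, lat, delays, 0.5, extra_b=0.0)
after = apply_useful_skew(10.0, lat, delays, 0.5, extra_b=1.5)
print("before (A->B, B->C):", before)
print("after  (A->B, B->C):", after)
```

Every unit of delay added to B's clock moves one unit of slack from the B->C stage to the A->B stage, so the fix works only while B->C stays non-negative.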

Max Cap Violations

The capacitance on a node is the sum of the input pin capacitances of its fan-out and the capacitance of the net. This check ensures that a cell does not drive more capacitance than it is characterized for.

The violation can be removed by:

Increasing the drive strength of the cell.
Buffering some of the fan-out paths to reduce the capacitance seen by the output pin.
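
A small sketch of the check and of both fixes. All capacitance values (in pF), the max-cap limits and the buffer input capacitance are hypothetical; a real flow would read them from the library and the extracted parasitics.

```python
def total_load(net_cap, sink_pin_caps):
    """Load seen by the driver: net capacitance plus all fan-out pin capacitances."""
    return net_cap + sum(sink_pin_caps)


def max_cap_violation(net_cap, sink_pin_caps, max_cap):
    """Amount by which the load exceeds the driver's limit (0.0 if it does not)."""
    return max(0.0, total_load(net_cap, sink_pin_caps) - max_cap)


sinks = [0.005] * 8  # eight fan-out pins, 5 fF each
original = max_cap_violation(0.020, sinks, max_cap=0.050)

# Fix 1: a stronger driver, characterized for a larger load (assumed limit).
upsized = max_cap_violation(0.020, sinks, max_cap=0.080)

# Fix 2: move six sinks behind a buffer; the driver now sees two sinks, the
# buffer's input pin (assumed 4 fF) and a shorter net (assumed 10 fF).
buffered = max_cap_violation(0.010, [0.005, 0.005, 0.004], max_cap=0.050)

print(original, upsized, buffered)
```

Upsizing raises the limit, while buffering lowers the load; either can clear the violation.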

Max trans Violation

In some cases a signal takes too long to transition from one logic level to another, and then a transition violation is caused. The transition violation can be caused by node resistance and capacitance.

It can be fixed by:

Upsizing the driver cell.
Decreasing the net length by moving cells closer together or reducing long routed nets.
Adding buffers.
Increasing the width of the route at the violating instance pin. This decreases the resistance of the route and fixes the transition violation.
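
The effect of route width on transition can be estimated with a lumped RC model. This is a first-order sketch: it assumes a single-pole RC response (10%-90% transition = ln(9) * R * C), wire resistance = sheet resistance * length / width, and it ignores the extra capacitance a wider wire adds. All numbers are hypothetical.

```python
import math


def wire_resistance(sheet_res_ohm_sq, length_um, width_um):
    """Resistance of a rectangular wire from its sheet resistance."""
    return sheet_res_ohm_sq * length_um / width_um


def transition_10_90(r_ohm, c_farad):
    """10%-90% transition time of a single-pole RC: ln(9) * R * C."""
    return math.log(9) * r_ohm * c_farad


r_driver = 1000.0  # ohms, assumed driver output resistance
c_load = 50e-15    # farads, assumed total node capacitance

narrow = transition_10_90(r_driver + wire_resistance(0.1, 500, 0.1), c_load)
wide = transition_10_90(r_driver + wire_resistance(0.1, 500, 0.2), c_load)
print(f"narrow: {narrow * 1e12:.1f} ps, wide: {wide * 1e12:.1f} ps")
```

Upsizing the driver acts on the same formula by lowering the driver resistance term instead of the wire term.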

Thursday, 10 December 2015

Transition Time

Transition delay or slew is defined as the time taken by a signal to rise from 10% (20%) to 90% (80%) of its maximum value. This is known as “rise time”. The time taken to fall from 90% (80%) to 10% (20%) is the “fall time”.

Equations for Setup and Hold Time

Let’s first define clock-to-Q delay (T_clock-to-Q). In a positive edge triggered flip-flop, input signal is captured on the positive edge of the clock and corresponding output is generated after a small delay called the T_clock-to-Q. The flip flop can only do the job correctly if the data at its input does not change for some time before the clock edge (T_setup) and some time after the clock edge (T_hold). Again, the clock signal which circulates via clock tree throughout the design has its own variability termed as skew.

From Figure 1 below, we derive equations for setup time and hold time. Figure 1 shows two talking flops, the first being the launching flop and the second is obviously the capturing flop. We shall derive equation for setup time for the capturing flop and equation for hold time for the launching flop. However, the derived equations will be true for either of the flops or for that matter any flops in the design.

Figure 1. Two Talking Flops Scenario

In the diagram above, at time zero FF1 is to process D2 and FF2 is to process D1. The time taken for data D2 to propagate to FF2, counting from the clock edge at FF1, is T_c2q + T_comb, and for FF2 to successfully latch it, D2 has to be held at the D input of FF2 for T_setup before the clock tree delivers the next positive clock edge to FF2. Hence, to fulfill the setup time requirement, the following equation must hold.

T_c2q + T_comb + T_setup ≤ T_clk + T_skew ------- (1)

Let’s have a look at the timing diagram below to have a better understanding of the setup and hold time.

Figure 2. Setup and Hold Timing Diagram

Now, to avoid the hold violation at the launching flop, the data should remain stable for some time (T_hold) after the clock edge. The equation to be satisfied to avoid hold violation looks somewhat like below:

T_c2q + T_comb ≥ T_hold + T_skew ------- (2)

As seen from the above two equations, it can be easily judged that positive skew is good for setup but bad for hold. The only region where the input can vary is the ‘valid input window’ as shown in Figure 3.
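
Equations (1) and (2) can be turned into a quick check that shows the effect of skew on both conditions. The function names and the picosecond values are hypothetical, chosen only to illustrate the trade-off.

```python
def setup_ok(t_c2q, t_comb, t_setup, t_clk, t_skew):
    """Equation (1): T_c2q + T_comb + T_setup <= T_clk + T_skew."""
    return t_c2q + t_comb + t_setup <= t_clk + t_skew


def hold_ok(t_c2q, t_comb, t_hold, t_skew):
    """Equation (2): T_c2q + T_comb >= T_hold + T_skew."""
    return t_c2q + t_comb >= t_hold + t_skew


# Hypothetical values in ps.
T_C2Q, T_COMB, T_SETUP, T_HOLD, T_CLK = 100, 300, 50, 50, 400

results = {}
for skew in (0, 100, 400):
    results[skew] = (setup_ok(T_C2Q, T_COMB, T_SETUP, T_CLK, skew),
                     hold_ok(T_C2Q, T_COMB, T_HOLD, skew))
    print(f"skew={skew:3d} ps  setup_ok={results[skew][0]}  hold_ok={results[skew][1]}")
```

With these numbers, no skew fails setup, a moderate positive skew satisfies both checks, and a large positive skew fixes setup but breaks hold.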

Figure 3. Valid Input Window

Full Custom ASIC Design

Full-custom ASIC design defines all the photolithographic layers of the device. Full-custom design is used for both ASIC design and for standard product design.

The benefits of full-custom design usually include reduced area (and therefore recurring component cost), performance improvements, and also the ability to integrate analog components and other pre-designed — and thus fully verified — components, such as microprocessor cores that form a system-on-chip.

The disadvantages of full-custom design can include increased manufacturing and design time, increased non-recurring engineering costs, more complexity in the computer-aided design (CAD) system, and a much higher skill requirement on the part of the design team.

For digital-only designs, however, "standard-cell" cell libraries, together with modern CAD systems, can offer considerable performance/cost benefits with low risk. Automated layout tools are quick and easy to use and also offer the possibility to "hand-tweak" or manually optimize any performance-limiting aspect of the design.

A full-custom design is built from basic logic gates, circuits, or layout created specifically for that design.

Routing Grid

The router works on a grid, generally called the routing grid, which is chosen to match the pin positions of the standard cells.

Historically, the backend designer had to align this grid to the standard cells; now the tools are generally able to analyze the standard-cell pin positions themselves.

The router has a routing grid plus a sub-routing grid, with a pitch one fourth of the routing grid, to provide more routing possibilities.

Inputs to the RedHawk tool

Below are the basic inputs of the RedHawk tool:

1. Design Data (Milkyway database)
2. Parasitic Information
3. Technology file
4. Timing and power Info (.lib files)

Threshold Voltage

The threshold voltage, commonly abbreviated as Vth or VGS(th), of a field-effect transistor (FET) is the minimum gate-to-source voltage differential that is needed to create a conducting path between the source and drain terminals.

For an nMOS device at gate-to-source voltages above the threshold voltage (VGS > Vth) but still below saturation (less than "fully on", (VGS − Vth) > VDS), the transistor is in its 'linear region', also known as ohmic mode, where it behaves like a voltage-controlled variable resistor.
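
The conditions above can be written as a small classifier. This is a sketch of the ideal long-channel model only (it ignores subthreshold conduction and other second-order effects), and the function name and voltages are illustrative.

```python
def nmos_region(vgs, vds, vth):
    """Classify the operating region of an ideal nMOS transistor."""
    if vgs <= vth:
        return "cutoff"      # no conducting channel
    if vds < vgs - vth:
        return "linear"      # (VGS - Vth) > VDS: voltage-controlled resistor
    return "saturation"      # VDS >= VGS - Vth


print(nmos_region(0.3, 1.0, 0.4))
print(nmos_region(1.0, 0.2, 0.4))
print(nmos_region(1.0, 0.8, 0.4))
```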

How it affects timing

Low VT (LVT) cells: these cells are faster but consume more power, mainly as leakage.

High VT (HVT) cells: these cells are slower than LVT cells but consume less power.

HVT and LVT cells are chosen based on the requirement, since there is always a trade-off between power and speed.

Cloning Vs Buffering

Cloning is where a clock-gate (a special gate in the clock tree that switches off the clock signal to a number of flip-flops to save power when they are not needed) is duplicated, so that one clock-gate driving, for example, 40 flip-flops can be "cloned" to become 2 clock-gates driving 20 flip-flops each.

A buffer is a basic electronic gate that serves to strengthen a signal. It is needed when you wish to drive a signal along a long wire, or when you want to drive a signal to very many receiving pins. A single driving gate can only drive a short length of wire and only a small fanout. Buffering is the insertion of buffers to help drive the signal to bigger loads.
