Friday, March 28, 2014

CLKDA Isadore warns "Don't drink the SNPS/CDNS timing Kool-Aid!"


( ESNUG 535 Item 1 ) -------------------------------------------- [12/20/13]

  Editor's Note: For non-U.S. readers, "Don't drink the Kool-Aid!" is
  a reference to when cult leader Jim Jones had his 900 followers drink
  Kool-Aid laced with cyanide in a mass suicide to follow Jones to heaven.
  "To drink the Kool-Aid" is to unquestioningly believe something.  - John

             ----    ----    ----    ----    ----    ----

Subject: CLKDA Isadore warns "Don't drink the SNPS/CDNS timing Kool-Aid!"

> It is becoming almost impossible to close timing in all corners, given the
> ambitious (and often conflicting) specs for power and frequency, process
> variance, corner spread, large derates -- and the sheer number of PVT
> combinations, aka sign-off deadlock.
>
>     - from http://www.deepchip.com/items/0534-03.html


From: [ Isadore Katz of CLKDA ]
 
Hi John,

I want to follow up with some "check to see if there is water in the pool
before you dive in" warnings.  

There is a lot of confusion out there on:

    - what are corners,
    - the difference between traditional sign-off corners and global
      corners,
    - how derates work, and
    - how STA uses all of the above.  

WARNING: Synopsys and Cadence are pushing new capabilities in derates and
timing that are NOT compatible with existing sign-off practices -- and these
derates are NOT blessed by TSMC, Samsung, nor GlobalFoundries.  

Before you sign up to Synopsys or Cadence derates for sign-off, it pays to
learn what this all means.  
   
             ----    ----    ----    ----    ----    ----

SIGN OFF AND DESIGN CORNERS

Sign-off corners describe the boundaries of a semiconductor manufacturing
process such that by designing to and signing off at their corners, your
silicon can be guaranteed to work -- and yield is 'maximized'.  Corners are
determined through lot, wafer, and die measurements at the foundry.

                    

   Fig 1.) Yield is the percentage of die that meets your target
           power (average energy) and frequency (max delay).
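
As a minimal sketch of the yield definition in Fig 1, the snippet below
counts the dies that meet both a max-delay target and a power target.  The
die measurements and the limits are made-up numbers for illustration only;
they are not foundry data.

```python
from dataclasses import dataclass


@dataclass
class Die:
    max_delay_ns: float  # slowest measured path delay (sets the frequency)
    power_mw: float      # average power


def yield_percent(dies, delay_limit_ns, power_limit_mw):
    """Percentage of dies that meet BOTH the delay and the power target."""
    if not dies:
        return 0.0
    good = sum(
        1
        for d in dies
        if d.max_delay_ns <= delay_limit_ns and d.power_mw <= power_limit_mw
    )
    return 100.0 * good / len(dies)


# Hypothetical measurements: slow dies miss frequency, fast dies burn power.
dies = [
    Die(0.95, 480.0),  # passes both targets
    Die(1.08, 430.0),  # too slow: fails the delay target
    Die(0.82, 560.0),  # fast but leaky: fails the power target
    Die(0.99, 470.0),  # passes both targets
]

print(yield_percent(dies, delay_limit_ns=1.0, power_limit_mw=500.0))  # 50.0
```

Note that a die can fail on either axis, which is why both the slow and the
fast side of the process matter for yield.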

Most importantly, sign-off corners are something that can be physically
measured to see if a wafer or die is "good" or "bad".  During wafer and
chip fabrication, acceptance tests will include frequency measurements of
ring oscillators and DC measurements of device or device arrays that must
fall within your corner boundaries.  The two graphics below illustrate how
corners are derived in practice.

         

   Fig 2.) The original Shmoo, and a Shmoo scatter plot that's
           used to define corners.  Corners are defined at the
           boundaries and center of the process.

This corner information is incorporated into a SPICE model, which is in
turn used to generate corner timing models for each library cell used in
static timing.  Corners are defined both for the front-end-of-line (FEOL)
transistor device, and back-end-of-line (BEOL) metal (cell personalization,
power, ground, and interconnect).


TRADITIONAL CORNER-BASED SIGNOFF AND DESIGN

Historically, most SoCs used a corner based sign-off and design approach;
except for GPUs and CPUs -- which are performance binned.  The corners
engineers discuss are based on the relative NFET and PFET speeds in a chip's
transistors: slow-slow (SS), fast-fast (FF), slow-fast (SF), fast-slow (FS)
plus typical-typical (TT).  E.g. "FS" means "fast NFETs and slow PFETs".
TT refers to both FETs being typical.  

In this model your approach is to design to and pass all corners.  This
means that you verified against all corners, and fixed the problems at
each corner during optimization.  This approach traditionally produces the
highest yield, but sets your specification to the slowest (SS) corner for
both frequency and power.

To get better specs, many SoC houses are making -- or contemplating -- a
shift away from this traditional corner-based sign-off.  

             ----    ----    ----    ----    ----    ----

DESIGN TO "TYPICAL" IS GAINING TRACTION

The "new" alternate approach that's getting traction is design only to TT,
but do corner-based sign-off.  TT will produce a more compelling spec
than SS.  You verify at all the corners, but you only fix problems in TT.
This will give the best specification -- but creates problems for yield and
hitting volume production -- which impacts both price and availability.  

If you spec at TT, you may not get as many working parts.  There are silicon
architectural approaches which can ameliorate the impact of process corners,
such as dynamic voltage scaling, but they do not eliminate them.  However,
with the increasing amount of variance and corners, this new TT approach is 
increasingly attractive.  
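
To make the spec-versus-yield trade-off concrete, here is a minimal sketch
comparing "design to all corners" with "design to TT".  The per-corner
critical-path delays are hypothetical, and only the setup (max delay) side
is modeled; hold checks are ignored.

```python
# Hypothetical critical-path delay (ns) of the same design at each corner.
path_delay_ns = {"SS": 1.25, "SF": 1.08, "FS": 1.06, "TT": 1.00, "FF": 0.80}


def spec_period_all_corners(delays):
    """Design to every corner: the clock period is set by the slowest one."""
    return max(delays.values())


def spec_period_typical(delays):
    """Design to TT only: the clock period is set by the TT delay."""
    return delays["TT"]


def failing_corners(delays, period_ns):
    """Corners whose critical path does not fit in the chosen clock period."""
    return sorted(c for c, d in delays.items() if d > period_ns)


all_corner_period = spec_period_all_corners(path_delay_ns)
tt_period = spec_period_typical(path_delay_ns)

print(all_corner_period)                              # 1.25
print(failing_corners(path_delay_ns, all_corner_period))  # []
print(tt_period)                                      # 1.0
print(failing_corners(path_delay_ns, tt_period))      # ['FS', 'SF', 'SS']
```

With these numbers the TT spec is 25% faster, but silicon that lands near
SS, SF, or FS would miss it, which is the yield exposure described above.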

             ----    ----    ----    ----    ----    ----

GLOBAL CORNERS

While analog and custom engineers have been familiar with global corners for
years, it is a relatively new concept for digital designers -- particularly
for sign-off and STA.  The global corners (FFG/SSG) separate out all local
on-die variance -- and only include global variance (die-to-die).  Analog
and custom designers use these corners to run Monte Carlo analysis to build
robust structures.  For STA, global corners are the basis for all of the
advanced derates: AOCV, POCV, SOCV, Liberty Variance Format (LVF).

The following graphic illustrates the relationship between global corners
and traditional corners. 

          

   Fig 3.) WARNING: global-corners-FFG/SSG-plus-3-sigma are outside
                    of SS and FF !!!

Global corners sit inside of the full corners.  Global corners plus some
amount of on-die variance (also called "local variance") equals the full
corners.  

In TSMC terms, the full corners are called SS, FF, SF, FS, and TT, and the
corresponding global corners are called SSG, FFG, SFG, FSG, and TT.  Five
important notes about global corners: 

    1. You can NOT measure a global corner.  Global corners are
       derived values, and there is always some local variation.
       You can measure global plus something, though!

    2. A global-corner-plus-3-sigma does not equal a corner.  It'll
       usually land outside the full corner's 3 sigma boundary, and
       is more conservative with respect to yield.  The full corners
       are often defined as +/- 3 sigma yield.  The nature of the
       statistical population is such that the global corners +/-
       3 sigma do NOT sum up to the same distribution.  
 
    3. Every foundry has its own approach to extracting global
       corners, and you can NOT assume that they are equivalent.  

    4. Signing off with SSG or FFG does NOT mean wafer acceptance
       occurs at SSG or FFG.  It will be at global corner +/- X
       (where "X" is a negotiated value with your foundry).  

    5. Perhaps most importantly, a global corner sign-off approach
       has to be agreed to and negotiated with the foundry.  This
       is right at the heart of the specification-yield trade-off.
       Most foundry/customer agreements still rely on SS- and FF-
       based sign-off.  
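
The arithmetic behind note 2 can be sketched under a simplifying assumption:
global and local variation are independent and Gaussian, the full corner
sits at 3 sigma of the combined distribution, and the global corner sits at
3 sigma of the global part alone.  Real foundry corner extraction differs
(see note 3), and all of the numbers below are hypothetical.

```python
import math


def full_corner_delay(mean, sigma_global, sigma_local, n=3.0):
    """Slow full corner: n sigma of the combined (root-sum-square) spread."""
    return mean + n * math.hypot(sigma_global, sigma_local)


def global_corner_delay(mean, sigma_global, n=3.0):
    """Slow global corner: n sigma of the die-to-die spread only."""
    return mean + n * sigma_global


def global_plus_local(mean, sigma_global, sigma_local, n=3.0):
    """Global corner with n sigma of local variation stacked on top."""
    return global_corner_delay(mean, sigma_global, n) + n * sigma_local


mean_ps, sigma_g_ps, sigma_l_ps = 100.0, 4.0, 3.0  # hypothetical cell delay

ss = full_corner_delay(mean_ps, sigma_g_ps, sigma_l_ps)
ssg = global_corner_delay(mean_ps, sigma_g_ps)
ssg_plus_3s = global_plus_local(mean_ps, sigma_g_ps, sigma_l_ps)

print(ssg, ss, ssg_plus_3s)  # 112.0 115.0 121.0
```

In this model SSG sits inside SS, while SSG-plus-3-sigma lands beyond it,
because the sigmas add linearly in one case and as a root-sum-square in the
other.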
  
DERATES, CORNERS AND MARGINING

Margining is a way of measuring and representing factors in the timing flow
that cannot be captured in delay or crosstalk; for example process variance,
voltage drop, and clock jitter.  Using global corners and the local variance
parameters, SoC teams can now create process margin adjustments -- so-called
"derates" (OCV, AOCV, POCV, SOCV, Liberty...).  Many of the teams also add
in adjustments for voltage variance -- particularly for use in analyzing the
clock for hold-side violations, and clock noise, and jitter.  

Whether it is pure corners or TT based sign-off, the need to ensure yield at
the highest possible spec has led to new approaches to account for process
and voltage variance -- as well as clock jitter and insertion delay.  This
is what Jim Hogan describes as "Systematic Margining":

   "Systematic Margining reduces guard bands by replacing rule-of-thumb
    approaches with SPICE-accurate metrics to deliver a much more accurate
    margin analysis in libraries, which propagate to SoC level...  Data,
    analytics, and tools are added to give a more accurate and complete
    picture of the actual performance and yield characteristics of a
    library, IP, or a full SoC design."

        - Jim Hogan of Vista Ventures in ESNUG 524 #4

So how do you design to these margin factors, and how and when do you apply
them?  This is partly a question of what guard-banding is needed for yield,
but also what the STA tools can consume (more on that later).  

Here are the basic considerations related to Systematic Margining:

    - Clock-side derates are meant to be pessimistic with respect to
      hold violations.  Full stop.  Switching from OCV (a global factor)
      to AOCV, POCV or SOCV will be less pessimistic by taking into
      account path depth, but it is still adding pessimism.

    - Data-side derates can add or remove pessimism depending on which
      corners they are applied to.  If they are applied to global
      corners (SSG, FFG), or TT, they have to add variance (remember a
      global corner is not a real corner -- it has local variance
      removed).  If they are applied from the SS or FF corners, they 
      will never go past the SS or FF corner.  The corners are still
      the corners.  

    - All derates can NOT be applied to all corners.

         - AOCV, which is a multiplier, can be applied to any corner.
         - POCV, which is a sigma/coefficient, can only be applied to
           global corners, and ties directly into your foundry agreements.
         - Liberty Variance Format (LVF) is raw sigma data that can be
           used to generate derates, coefficients or sigmas.

      And no one has published any data on SOCV, so it is unclear how
      it is used.  

    - Device process variance is only one part of the equation.  Clock-
      side derates can also include a voltage component to differentiate
      launch and capture clock.  Traditionally, clock skew, noise and
      jitter were captured through STA insertion delay and clock
      uncertainty adjustments -- but these are also being pushed into
      the derate factor as well.  Alternately, more detailed clock tree
      circuit simulation can be used to capture all of these effects.

              

   Fig 4.) Derate Comparison: Some Derates Only Work in Some Corners
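
The difference between a multiplier-style derate and a sigma-style derate
can be sketched as follows.  This is a simplified illustration, not the
algorithm of PrimeTime, Tempus, or any other tool: the derate values, the
depth table, and the per-cell sigmas are all hypothetical, and real AOCV
tables are typically indexed by more than path depth.

```python
import math


def ocv_late(delays, derate=1.10):
    """Flat OCV: one global multiplier applied to every cell delay."""
    return sum(d * derate for d in delays)


def aocv_late(delays, depth_table):
    """AOCV-style: the multiplier is looked up by path depth."""
    depth = len(delays)
    # Use the largest tabulated depth that does not exceed the path depth.
    key = max(k for k in depth_table if k <= depth)
    return sum(d * depth_table[key] for d in delays)


def pocv_late(delays, sigmas, n_sigma=3.0):
    """POCV-style: nominal sum plus n_sigma times the RSS of cell sigmas."""
    return sum(delays) + n_sigma * math.sqrt(sum(s * s for s in sigmas))


# Hypothetical 8-stage path, 50 ps per cell.
delays_ps = [50.0] * 8
# Hypothetical depth table: deeper paths get a smaller late derate.
depth_table = {1: 1.10, 4: 1.07, 8: 1.05, 16: 1.04}
# Hypothetical local sigma per cell; the nominal delays would have to come
# from a global-corner (or TT) library for this to be meaningful.
sigmas_ps = [2.0] * 8

ocv = ocv_late(delays_ps)
aocv = aocv_late(delays_ps, depth_table)
pocv = pocv_late(delays_ps, sigmas_ps)

print(round(ocv, 2), round(aocv, 2), round(pocv, 2))
```

The multiplier can be applied to delays from any corner library, whereas the
sigma term only makes sense on top of delays that have had local variance
removed, which is the compatibility point made in the list above.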

In addition, timing constraints based on corners can be very optimistic.
Adjustments to the timing constraints, called constraint uncertainty,
are also being added to libraries, or alternately the timing constraints
are actually calculated using statistical methods.

             ----    ----    ----    ----    ----    ----

HOW THIS ALL PLAYS WITH STATIC TIMING ANALYSIS

Static Timing Analysis (STA) tools are both evolving and being applied very
differently.  Regardless of what corners or margining users want to apply,
the two questions always come back:

   "Does my Synopsys or Cadence STA tool support it?"
               
and
    
   "Can I use this capability in my physical design timing flow?"

All STA tools, like Synopsys PrimeTime, Cadence ETS/Tempus, CLKDA Path FX,
as well as the timing tools embedded in the Physical Design (PD) tools like
Synopsys IC Compiler, Cadence Encounter, Mentor Olympus, and AtopTech Aprisa,
bound what can or cannot be measured and how it is measured.  For example,
PrimeTime, Tempus/ETS are very capable tools, but they have limitations as
well.  And the timing engines in the physical design tools are strictly
graph-based, which is starting to create an important mismatch as sign-off
goes path-based.

There are four key trends: 

1. The shift to Path Based Sign-off

More and more SoC teams have switched to path-based (PBA) sign-off versus
the traditional graph-based (GBA) sign-off.  Graph-based sign-off, while
much faster by considering all paths at once, can be more pessimistic.
(This is due to something called 'condensing' which determines which delay
and slew are propagated forward in the graph when there is more than one
path through a pin.)

The default condensing behavior of most static timing analysis tools such as
PrimeTime is worst/worst propagation -- which is inherently pessimistic
since you may be combining the worst slew from one input and the worst delay
from another.  Path-based timing, which considers only one path at a
time, eliminates this problem and will find more timing slack.
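
A toy model of worst/worst condensing, assuming a hypothetical linear stage
delay (a base delay plus a factor times the input slew) and made-up numbers.
Two paths converge on one pin and then drive one more stage:

```python
def stage_delay(input_slew_ps, base_ps=30.0, slew_factor=0.5):
    """Hypothetical delay model: the next stage slows down with input slew."""
    return base_ps + slew_factor * input_slew_ps


def gba_arrival(candidates):
    """Graph-based: propagate the worst arrival AND the worst slew,
    even if they come from different paths."""
    worst_arrival = max(arrival for arrival, _ in candidates)
    worst_slew = max(slew for _, slew in candidates)
    return worst_arrival + stage_delay(worst_slew)


def pba_arrival(candidates):
    """Path-based: evaluate each path with its own arrival and slew."""
    return max(arrival + stage_delay(slew) for arrival, slew in candidates)


# (arrival_ps, slew_ps) of the two paths converging on the same pin.
candidates = [
    (100.0, 20.0),  # latest arrival, but a sharp edge
    (90.0, 60.0),   # earlier arrival, but a slow edge
]

print(gba_arrival(candidates))  # 160.0
print(pba_arrival(candidates))  # 150.0
```

The 10 ps difference is slack that only exists in the graph-based view,
because no real path has both the 100 ps arrival and the 60 ps slew.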

However, the shift to path-based (PBA) sign-off comes with a cost:

    - All current commercial physical design (PD) tools rely on GBA for
      optimization.  The virtue of GBA sign-off was that it correlates
      well with GBA physical design results.  

    - PBA will rarely produce the same path ordering or slack values as
      GBA because of condensing (not derate factors as some were misled
      to believe).  This makes the feedback loop between sign-off and PD
      much more complicated.  If you find a problem in PBA, and try and
      fix it in GBA, you may actually make timing worse!  NOT GOOD!!!

    - This means that the final PBA sign-off and optimization loop will be
      very different from the core PD GBA optimization runs.  The fact
      is that most PD tools were already miscorrelating with sign-off
      because of the sheer number of corners, derates, delay models and
      signal integrity -- but this GBA/PBA mismatch makes it that much worse.

2. Liberty Variance and the emergence of full load/slew/arc derates

Until recently STA tools were limited to either OCV or AOCV.  AOCV has had
a known limitation, which is that it only allows 8 values per cell
(rising/falling, early/late, clock/data) regardless of the timing arc, load,
or slew.  So while AOCV is an improvement over OCV, it is still either
pessimistic or optimistic depending on which arc/load/slew point is selected.

A new format, Liberty Variance, has been approved which adds full arc, load,
slew support.  This data can be used to generate a much richer and more
accurate derate set.  Reportedly both PrimeTime POCV and Tempus SOCV are
adding this capability.  AOCV can also be enhanced to take advantage of this
capability, and eliminate the 8 value limitation.  However, as we noted
before, POCV and SOCV require global corner sign-off -- and this may not be
an option for you and your foundry.   
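
The difference in data richness can be sketched with two small data
structures.  The reading of "8 values" as every combination of edge,
early/late, and clock/data is an assumption, the tables are hypothetical,
and this is not Liberty syntax.  A real tool would interpolate in the table
rather than pick the nearest grid point.

```python
from itertools import product

# The 8 AOCV values per cell, read here as one derate for each combination
# of edge, analysis mode, and path type (an assumption).
aocv_keys = list(
    product(("rise", "fall"), ("early", "late"), ("clock", "data"))
)

# Hypothetical AOCV-style derates: nothing depends on arc, load, or slew.
aocv_derate = {key: (1.05 if key[1] == "late" else 0.95) for key in aocv_keys}

# Hypothetical LVF-style data for ONE timing arc: sigma in ps on a grid of
# input slew (rows) by output load (columns).
slew_axis_ps = [10.0, 50.0, 200.0]
load_axis_ff = [1.0, 5.0, 20.0]
sigma_ps = [
    [0.8, 1.1, 2.0],
    [1.0, 1.4, 2.6],
    [1.9, 2.5, 4.1],
]


def nearest_index(axis, value):
    """Index of the grid point closest to value (simplified lookup)."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def lvf_sigma(slew_ps, load_ff):
    """Sigma for this arc at the nearest tabulated slew/load point."""
    row = nearest_index(slew_axis_ps, slew_ps)
    col = nearest_index(load_axis_ff, load_ff)
    return sigma_ps[row][col]


print(len(aocv_keys))        # 8
print(lvf_sigma(45.0, 6.0))  # 1.4
```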

3. Handling analog effects in digital delays

As I noted in the first part of this discussion in ESNUG 534 #3, digital
delays are showing more analog effects.  Miller capacitances on receivers,
non-linear slews, clock tree noise, and low voltage operation can have a
material impact on timing accuracy.  TSMC was explicit at their last
OIP conference that low voltage timing accuracy is a concern.

4. The glaring gap for interconnect variance.

There is one glaring gap in the sign-off flow: BEOL variance.  There is
currently no way to model interconnect variance other than adding more
corners.  The Statistical SPEF (sSPEF) approach, which was applied at 40 nm
and above, is not functioning at 20 nm and below for multiple reasons (like
how to deal with dual patterning.)  More importantly, corners do not always
capture all the cases which could lead to missed timing violations.

             ----    ----    ----    ----    ----    ----

CHECK FOR WATER BEFORE YOU DIVE INTO THE MARGINING POOL

There is a lot of noise out there from Synopsys and Cadence about timing
tools and they make some very big claims about speed and performance.  It
all sounds impressive.

Along with that have come some new four letter acronyms like POCV and SOCV,
and claims THEIR WAY is the superior way to better margin and timing --
though Synopsys and Cadence will NOT give you a lot of explanation about how
POCV/SOCV really work, when they can be used, and what they really mean to
the end user for yield and performance.  

Every day I have to explain to some very smart people that everything is not
what it seems.  You cannot apply a global sigma based derate such as POCV or
SOCV to your sign-off corners SS and FF.  TSMC/Samsung/GlobalFoundries may
not agree to your signing-off at global corners -- SSG and FFG.  

And be careful what you wish for -- SSG-plus-3-sigma can often be much more
pessimistic than what you have today with SS.  

Your sign-off flow is about your yield.  

Be warned: Synopsys' and Cadence's yield interests aren't the same as yours.

In short, don't drink the SNPS/CDNS timing Kool-Aid.

    - Isadore Katz
      CLK Design Automation                      Littleton, MA

Copyright 1991-2014 John Cooley.  All Rights Reserved.