CadenceLIVE India – OnDemand

5G/RF

Accurate and Fast IC-Level EM Analysis Using Fully Automated Virtuoso RF Integrated EMX Flow

The Virtuoso® RF design flow brings together the schematic editor, layout implementation, parasitic extraction, EM analysis and RF circuit simulation, along with integrated layout versus schematic (LVS) and design rule checking (DRC) in a single flow. This flow also incorporates EM analysis using the Cadence EMX® Planar 3D simulator into the Virtuoso® and Spectre® platforms, providing a high level of automation and the ability to analyze the performance of RF circuits pre- and post-silicon.

Vipul Kumar Sharma, Semi-Conductor Laboratory
Prayes Jain, Cadence
Madan Rachaprolu, Cadence

Silicon-Validated RFIC/Package Co-Design Using Virtuoso RF Solution in Tower Semiconductor’s CS18 RF SOI Technology

Established and emerging wireless and wireline applications such as 5G, WiFi and 400-800G optical networking are driving the demand for highly optimized RFIC solutions. Typical RF/mmWave design flows rely on the use of multiple EDA tools, often sourced from multiple EDA vendors. This is inherently inefficient and often error prone, leading to delays in getting a product to market. In addition, there exist multiple combinations of design tools and flows that prevent a foundry from providing a golden reference flow that can be used by a large portion of the design community. In this paper, we present a silicon-validated unified RFIC design flow using the Virtuoso RF Solution. The design flow is based on a high-power SP4T switch design in Tower Semiconductor’s CS18QT9x1 foundry technology. RF SOI switch designs offer a useful test case for the Virtuoso RF design flow, as they require co-design and co-optimization of both the silicon and the package, which is a key strength of this design flow. The design flow will be used to present a consistent modeling and simulation methodology. A seamless hand-off between the PDK-provided model, metal interconnect extraction within the p-cell, metal interconnect modeling outside the p-cell using EMX and Clarity, and the flip-chip package will be presented, while maintaining a single unified database that is used for tapeout. Silicon validation of key small- and large-signal metrics will be presented, highlighting the importance of the tight interaction between the foundry Virtuoso PDK and package modeling using Cadence EMX and Clarity technologies.

Chris Masse, Tower Semiconductor
Samir Chaudhry, Tower Semiconductor

Cloud

Accelerating EDA Productivity: The Why, What, and How of the Journey to the Cloud

AI and IoT, in the cloud and at the edge, are driving a need for more rapid innovation in semiconductor devices. This talk presents examples and best practice architectures for cloud-based EDA, including use-cases inside and outside of Amazon. The talk will include an overview of how the development of next-generation products is enhanced through the use of cloud technologies. We will present performance optimization for computing, storage, and EDA workload orchestration, as well as covering how cloud enables secure collaboration in the semiconductor and electronics supply chain. We will also present examples of specific EDA workloads optimized and deployed on AWS, both internally at Amazon and at external customers.

David Pellerin, AWS

Cloud-Scale Productivity Without the Complexity—Have Your Cake and Eat It, Too!

Today, every design team is looking at Cloud with great interest to solve their compute capacity gap and accelerate project Turn Around Time (TAT). However, transitioning EDA and CAD flows to cloud can be complex, requiring thoughtful decisions about cloud architecture, data management, IT retraining, infrastructure setup, security, to name just a few.

This session will discuss the Cadence platform for overcoming cloud complexity. We’ll also uncover the industry’s newest breed of cloud products, which allow designers to keep their familiar on-prem design environment and yet enjoy all the benefits of a secure, scalable and agile cloud.

All the goodness of cloud without the effort and delays involved in adopting and optimizing the right cloud environment. Have your cake and eat it too!!

Ketan Joshi, Cadence

Designing Planet-Scale Video Chips on Google Cloud

Recently, Google announced a custom video chip to perform video operations faster for YouTube videos. Google’s TPU series of chips are also well-known for accelerating AI/ML workloads. These are just a couple of examples of chips designed by Google engineers. Google hardware teams have been leveraging Google Cloud for chip design for a few years, and this presentation highlights the benefits of using Google Cloud for chip design. Semiconductor customers can accelerate their time to market by leveraging the elasticity, scale and innovative AI/ML solutions offered by Google Cloud. Many large enterprises are choosing Google Cloud for their digital transformation. Google Cloud provides a highly secure and reliable platform, infrastructure modernization solutions, and best-in-class AI platform for the Semiconductor Industry. We will share relevant aspects of cloud (Compute, Storage, Networking, Workload Management, reference architecture) that enable high-performance chip design. We will discuss how typical verification and implementation flows on GCP can benefit by migrating to cloud, with specific examples for RTL simulation and full-chip analysis. We will also detail how customers can get started on their cloud journey. Every cloud journey is unique. We will share how we leverage Google Cloud internally for designing chips such as the Argos video chip. The Google Hardware team will share their journey with GCP with key learnings and benefits of migrating to Cloud. We will also share the challenges faced, and discuss verification/design workload migration and best practices.

Sashi Obilisetty, Google
Peeyush Tugnawat, Google
Jon Heisterberg, Google

Developing Scalable AI Inference Chip with Cadence Flow in Azure Cloud

d-Matrix, a cutting-edge startup, is building a first-of-its-kind, in-memory computing platform targeted for AI inferencing workloads in the datacenter. A pioneer in digital in-memory computing for the datacenter, d-Matrix is focused on attacking the physics of memory-compute integration using innovative circuit techniques, an approach that offers a path to massive gains in compute efficiency. With its focus on AI innovation, the team chose a full Cadence flow and, rather than setting up on-prem compute infrastructure, ran the design flow on Azure Cloud.

This session will discuss how d-Matrix set up a productive Azure cloud infrastructure running the Cadence flow, the lessons learned, and the key success factors that led to delivering its first AI chip within 14 months.

Farhad Shakeri, Azure-dMatrix
Andy Chan, Azure

Innovations to Create High Dynamic Range (HDR) Image Sensors

CMOS image sensors (CIS) are widely used in commercial, industrial and scientific applications. HDR sensors are useful for day-through-night imaging, capturing images in extreme lighting conditions from daylight through quarter-moon darkness. Such sensors are required for numerous applications like surveillance, security, and mobile applications. The dynamic range (DR) of an image sensor is limited by the temporal noise of the photodetector and readout signal chain. I will give an overview of some of our work in this domain. The innovations we will discuss provide alternatives to image intensifiers, electron bombardment, EMCMOS. The use of these highly responsive light sensors in applications like surveillance, security, reconnaissance UAVs, bio-luminescence and imaging systems in space platforms will be discussed. With remote working and a distributed workforce in a pandemic world, the Cadence Cloudburst interface has made it very easy and intuitive to bring semiconductor innovation and collaboration to a cloud-first work environment.

Vrinda Kapoor, 3rdiTech
Mukul Sarkar, 3rdiTech

Computational Fluid Dynamics

How Computational Fluid Dynamics Extends Cadence’s Multiphysics System Analysis and Design

What is Computational Fluid Dynamics (CFD) and how does it extend the Cadence multiphysics system analysis and design capacity?

Fluids are present everywhere; in our environment and in nearly all sectors of industry, as vehicles of transport and of thermomechanical and chemical energy conversion systems. The full complexity of fluid flows can only be approached through numerical CFD simulations. Examples will illustrate the potential of our current CFD systems and the variety of applications they cover.

CFD today is still facing major challenges related to flow turbulence and the industrial expectations for higher reliability and efficient exploitation of High-Performance Computing. Our response to these challenges will be addressed.

Charles Hirsch, Numeca/Cadence

Mesh Generation for Launch Vehicle Configurations

Simulation during system design, before constraints are frozen, is the most opportune time to optimize and differentiate your product. High-quality mesh generation is a key enabler of simulation-led design, which requires robust, efficient, and accurate simulations. The Cadence Pointwise mesh generation tool provides the flexibility and features needed to achieve faster system simulation.

Based on an aerospace heritage, the Pointwise meshing philosophy focuses on mesh quality and robustness while maintaining the flexibility to drop into a wide variety of workflows. The result is best-in-class geometry and meshing technologies which combine to provide the choice for CFD meshing.

Amit Sachdeva, VSSC

Omnis, from Meshing to Solving to Optimization, in One Single Multiphysics Environment

Engineers and designers today need many different tools for their CFD analyses. Omnis combines them all into one single environment, from meshing to solving to optimization. High-performing technology in a slick, easy-to-use interface streamlines the workflow of all its users. From the detailed analysis of a single component (e.g. IC chip) all the way to simulating a full system (e.g. entire car), with Omnis users can combine the different physics, scale fidelity to need and create as many designs as desired. It can be fully automated, driven by AI models and optimization algorithms, and is open to third-party software through powerful APIs.

Yannick Baux, Numeca/Cadence

The Choice for CFD Meshing

Simulation during system design, before constraints are frozen, is the most opportune time to optimize and differentiate your product. High-quality mesh generation is a key enabler of simulation-led design, which requires robust, efficient, and accurate simulations. The Cadence Pointwise mesh generation tool provides the flexibility and features needed to achieve faster system simulation.

Based on an aerospace heritage, the Pointwise meshing philosophy focuses on mesh quality and robustness while maintaining the flexibility to drop into a wide variety of workflows. The result is best-in-class geometry and meshing technologies which combine to provide the choice for CFD meshing.

Nick Wyman, Pointwise/Cadence

Custom and Analog Design: Implementation

Current Data Driven Analog Routing using Virtuoso SDR

Technology Update: Virtuoso Design Platform 20.1 Update

Yuval Shay, Cadence

Advanced Layout Implementation Utilizing Analog APR Flow

Analog and mixed-signal integrated circuits play an important role in emerging SoC design applications, dramatically increasing the demand for analog/mixed-signal ICs. SoC designs have traditionally been done in an automated fashion, but analog/mixed-signal IC design, and especially analog layout design with its complex needs and full-custom nature, continues to be a manual, time-consuming, and error-prone task that is not fully correct by construction. Design complexity is further increased by the FinFET architecture, local interconnects, multiple-patterning processes, base-layer density requirements, and complex design rules. This drives the need for a flow that enables features such as row-based layout placement to automatically meet density requirements, automatic placement of devices according to the analog constraints, FEOL fill generation using WSPs, and automatic routing of signals, all of which help layout designers speed up TAT.

The APR (Automated Placement and Routing) flow in the Virtuoso environment gives layout designers the flexibility to achieve faster design convergence using a step-by-step approach. The flow consists of multiple steps, starting with initialization of the layout from the schematic. The following steps are analog constraint generation for matching/grouping of critical devices; automatic row generation for structured row-based placement; placement of the devices into rows, aligned to the WSPs and the generated constraints; cell fill for filling empty spaces, along with base-layer insertion for density requirements; and finally trim insertion and routing of the nets using the automatic router. The steps in the APR flow can be used as standalone features in layout implementation or as part of a mix of interactive, assisted, and automated layout implementation.

We worked with Cadence to enable this flow for the latest foundry process nodes and aim to achieve faster design convergence. For specific design blocks, this flow may yield a 35% gain in the development cycle.

As next steps, we will work with Cadence to make it usable for a broad spectrum of design blocks and to incorporate the APR reuse flow to enhance the predictability of layout design.

Atif Mohd, Intel

Enhancing Design Productivity by Managing RV and Parasitic Data in Real Time During Layout Implementation

As we move towards advanced process nodes, layout design becomes very challenging due to complex design rules. Interconnect parasitics can have a profound impact on circuit performance and power, which may result in functional, electromigration, and IR drop issues. Before EAD, we had to wait for the complete LVS-clean layout design to understand these impacts, which resulted in re-spins and over-designing.

Virtuoso EAD gives layout engineers the flexibility to perform real-time analysis of RV issues and parasitics during layout implementation. Circuit designers can pass electrical constraints to the layout designer, which can be verified in real time, and feedback can be shared with circuit designers to ensure the design intent is met in the physical design. This enables a proper handshake between the circuit engineer and the layout engineer. We can also discover electromigration issues as soon as an interconnect is created, without waiting for post-layout extraction results. The partial layout re-simulation utility in EAD offers net-by-net extraction and recurring checks between electrical and physical data. We can fix EM issues hierarchically.

EAD enables faster resolution of possible problems, saves the cost of re-spins and over-designing, and communicates the design intent much more effectively, enhancing overall productivity.

Jatin Kakkar, Intel
Amitkumar Patil, Intel

Increase Layout Productivity Through Accurate Area Estimation for Schematic Hierarchy

The area estimator is a calculator that reads the schematic for device types and connectivity in order to account for diffusion sharing at the floor-planning stage. It refers to the physical layout and calculates the total area hierarchically. The main challenge we faced was identifying identical connectivity that can be shared between parameterized cells and custom template cells. We gained a high ROI from this area estimator during the floor-planning stage.
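
As a rough illustration of the hierarchical part of such a calculation, the sketch below sums device area bottom-up over a cell hierarchy. The `Cell` structure, the cell names, and the area values are hypothetical, and the diffusion-sharing adjustment described in the abstract is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    own_area_um2: float = 0.0  # area of devices placed directly in this cell
    children: list = field(default_factory=list)  # (child Cell, instance count) pairs


def total_area(cell: Cell) -> float:
    """Return the cell's own device area plus the area of every instantiated child."""
    return cell.own_area_um2 + sum(n * total_area(child) for child, n in cell.children)


# Hypothetical three-level hierarchy, for illustration only.
inv = Cell("inv", own_area_um2=1.5)
buf = Cell("buf", children=[(inv, 2)])
top = Cell("top", own_area_um2=4.0, children=[(buf, 3), (inv, 1)])
print(total_area(top))
```

A real estimator would reduce each cell's area wherever neighbouring devices can share diffusion, which is the step the authors identify as the main challenge.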

Su Theng Chua, Intel
Tze Chan Cheah, Intel
Chiew Sia Goh, Intel
Chin Hong Heah, Intel

Metal Stack Portability

Automated VDR Tagging and Verification Flow for Advanced Node Design

This paper explains a fully automated VDR flow that helps reduce the overall layout design and debug cycle. The flow uses a schematic-driven VDR approach to create layout labels that take the schematic net voltages as reference values, and an in-house wrapper cuts down the time needed for verification.

Danish Shaikh, Western Digital

CLE Layout Development Methodologies to Enhance Productivity of Full Custom I/O and Test-Chip Design

Design complexity is increasing significantly for full/semi-custom mixed signal designs and common standard methodologies are not able to address new design needs, improve time to market and enhance the quality of products.  

Depending on layout planning and design complexity, multiple designers work in parallel to meet critical project requirements and deadlines. In the existing layout flow, the design manager (DM) who handles the top-level layout partitions the layout and decides the free area for the other layout designers working on the same project. Each of those designers then works on a different cell view, completes the task in the allocated area within a locally created library, and later shares the library path with the DM. Because multiple designers cannot work together and edit a single layout cell view in the existing flow, it is very time-consuming for the DM to complete all the planning, work assignment, and integration tasks and to repeat them at various stages of layout development.

Validations (EMIR, DRC, DFM) are very challenging during the tapeout week of development cycles for test-chip projects, layout porting, modification-driven layouts, customer requests, and so on. At this validation stage, the existing layout methodology offers no way to selectively partition the layout and assign it to multiple designers in order to reduce the overall task and fix a massive number of validation errors using a divide-and-conquer approach.

These two major problem areas and design complexities are drastically reduced by a novel way of using the Cadence IC advanced utility, Concurrent Layout Editor (CLE), in IO/ANA/SOC layout development.

Virtuoso concurrent layout editing (CLE) provides an editing environment that lets several designers work on the same design at the same time, helping to cut long design cycles and improve productivity.

In this paper, two CLE-based layout methodologies and their productivity gains are discussed.

Varun Dwivedi, STMicroelectronics
Madhvi Sharma, STMicroelectronics
Manoj Kumar Sharma, STMicroelectronics
Priyanshi Shukla, STMicroelectronics

Concurrent Layout Editing with Improved DRC Verification Flow for Enhanced Productivity

Constraint-Driven Automated Full Custom Place and Route for Achieving Faster TAT and Accuracy

With the recent trend of bringing automation into layout design, we have seen a great impact on digital layouts, where area is the main constraint. By using an auto PR tool, we have managed to get good digital layouts that show better results and also help reduce total man-hours.

When it comes to analog designs, more constraints are involved, such as parasitics, speed, area, and power. These constraints need to be managed efficiently to get good results for any analog circuitry. Managing all these critical constraints also involves substantial manual calculation during layout, which is time-consuming. A lot of research is going into using auto PR tools for analog designs too, which led us to use the “Cadence Auto_Device_PR tool” for a few of our designs.

This paper explains the process we used to achieve a fully automated analog layout in advanced nodes. We ran the tool on a basic amplifier circuit, comprising a differential pair, a current sink, and an active/passive load, for a few of the lower technology nodes. This reduced the execution time needed to complete the critical analog layout tasks, i.e. aligning the pin positions across the PR boundary, placing the devices symmetrically, routing all the nets at the specified width, and getting DRC and LVS clean in very few iterations.

Ashok Kumar Mishra, STMicroelectronics
Pankaj Babu, STMicroelectronics
Ishita Dhawan, STMicroelectronics
Avinash Pratap Singh, Cadence

Design Intent: Tool to Enhance Productivity in Custom Memory Designs

Design Intent is a systematic platform that helps both the schematic designer and the layout designer communicate relevant information to each other comprehensively and efficiently. As technology shrinks over time, this flow of information plays an even more crucial role, since any miss in communication could lead to design failures and unanticipated delays in project completion.

It is rare for schematic designers not to share meaningful information with layout designers, and vice versa. For now, this information is shared verbally, over email, or as notes in the design, none of which is very efficient in terms of trackability and documentation for future use.

This is where “Design Intent” as an enhancement helps both counterparts share their perspectives and make better design decisions. With it, the schematic designer can introduce glyphs describing design instructions that need to be taken care of during layout design (for example: matching devices, shielding and other advanced layout effects). These instructions are broadly divided into three categories, corresponding to devices, nets and pins. In response to these instructions, the layout designer has the flexibility to share his/her view on each instruction back to the schematic designer. The actions can be categorized as in-progress, issues, failed and complete. This information can easily be fetched from a top-level report that shows all the relevant information.

With minimal effort, this feature helps a designer share information with other designers and preserve it for future use. It also helps the layout designer maintain design flexibility as instructed by the schematic designer and respond in case of any concerns, eventually leading to fewer design changes or a quick fix, since the concerns were already communicated. For example, the input nodes of a sense amplifier (in memory design) should be precisely matched in layout; if this is not communicated properly, it leads to design bottlenecks and unanticipated delays. Design Intent as a feature will help circumvent such issues, which have led to silicon bugs in the past.

For further development, we are working on better management of hierarchy in the Design Intent tool, as it will give additional flexibility to users, specifically in hierarchical designs. Smooth reuse of all these features while porting designs across different technologies will be explored further. Integration of Design Intent in future projects would play a crucial role in reducing rework and hence the overall project cycle.

Anuj Dhillon, STMicroelectronics
Ashvani Kumar Mishra, STMicroelectronics
Shafquat Jahan Ahmed, STMicroelectronics
Hitesh Chawla, STMicroelectronics
Tarun Handa, STMicroelectronics
Ishita Dhawan, STMicroelectronics

Electrically Aware Layout Design for High Quality and Increased Reliability Early in the Design Cycle

EM-Aware Routing with Enhanced SDR Flow

Custom and Analog Design: Verification

Technology Update: Accuracy, Performance and Capacity: Finding the Right Balance for Memory/SoC Design and Verification

The complexity of circuitry created by the combination of mixed-domains, advanced nodes, and impossible schedules has pushed the all-important verification stage of design to its breaking point. Engineers are forced to make so many compromises in terms of what parts of the design can be tested, how extensive those tests can be, and when the results can be returned in time to be useful that there becomes a real risk of critical parts “slipping through.” In this session, we will unlock the latest methodical secrets for reducing risk during your custom verification with a powerful combination of Virtuoso and Spectre platform tools and flows.

Steve Lewis, Cadence

Advances to AMS Co-Simulation Enabling Efficient Pre-Silicon Verification Signoff of Low-Power Mixed-Signal RF SoC

AMS co-simulation, a critical component of pre-silicon SoC verification, involves careful scoping and simulation runtime management. This presentation will highlight the challenges with legacy design artefacts like VHDL support, connect modules for sensitive signal paths, simulating dynamic temperature variations, low power entitlement, comprehensive functional verification, and performance bottlenecks with waveform handling. We will present custom solutions to work around known tool limitations and novel scalable CM solutions, and highlight Cadence tool offerings including dynamic simulation parameter control, dynamic checks, and Spectre X Simulator.

Lakshmanan Balasubramanian, Texas Instruments
Avinash Chaudhary, Texas Instruments
Krithika Nanya, Texas Instruments
Russelle Carvalho, Texas Instruments

Smart Regression Using Reference and Merge History Enabling Incremental Simulation to Reduce Design Verification Cycle

Application of Joules and Voltus Tools in Standard Cell Library Qualification

Check Analog Fault Simulation with Actual ATE Program

Analog fault simulation is used to check the coverage of design defects (shorts and opens within the design). A design defect should cause the ATE final test program to fail rather than pass; otherwise, the fault coverage is insufficient and can lead to quality issues. The analog fault simulation (AFS) tool is provided within the Cadence Virtuoso Maestro setup, and AFS users can take different approaches with it to achieve different goals. Some questions remain unaddressed. First, can we run the actual C++ ATE test program with AFS? Second, how do we manage a large number of fault injections, which would overwhelm the computing resources? Lastly, how do we collaborate across multiple disciplines to carry out the AFS simulation runs and analysis? This paper addresses all of these issues. By using DPI-C in SystemVerilog, we are able to support the C++ test program. We then choose to run sub-block-level simulations with C++ stimulus to optimize the computing resources. Next, the checker results are parsed from xrun.log by using a function named “CCSgrepCurrLogFile()”, dumped into the Maestro output, and used as pass/fail criteria for the AFS simulation. Lastly, the results are reviewed by design engineers, with a particular focus on undetected faults. This paper shows an innovative way to use the Cadence AFS tool to emulate ATE tests through a team effort across the DV, ATE, and design functions.
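
The pass/fail bookkeeping behind fault coverage can be sketched as follows. This is a minimal illustration, not the Maestro flow: the fault names and the `results` mapping are hypothetical, and the step that extracts checker results from xrun.log is assumed to have already happened.

```python
def fault_coverage(results):
    """Compute coverage from a mapping of fault id -> True if the test program failed.

    A fault counts as detected when the injected defect makes the test fail.
    Returns (coverage as a fraction, sorted list of undetected faults).
    """
    if not results:
        raise ValueError("no faults simulated")
    undetected = sorted(f for f, detected in results.items() if not detected)
    coverage = 1.0 - len(undetected) / len(results)
    return coverage, undetected


# Hypothetical outcome of four injected faults.
results = {
    "short_M1_d_s": True,
    "open_R3": True,
    "short_C2": False,  # the test still passed, so this defect escapes
    "open_M7_g": True,
}
coverage, undetected = fault_coverage(results)
print(f"coverage = {coverage:.0%}, review: {undetected}")
```

The list of undetected faults is the part a design engineer would review, since each entry is a defect the test program does not catch.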

Jun Yan, Texas Instruments
George Fleming, Texas Instruments
Jerry Chang, Texas Instruments
Chanakya K V, Texas Instruments

Cost-Effective Characterization Using Liberate Trio Characterization Suite on AWS Using Arm Neoverse-Based Graviton2 Instances

Digital to Analog Converter Meets Its Accuracy with the High-Performing SPICE Engine, Spectre X Simulator

Designing analog IPs that keep pace with advancing technology nodes and faster digital circuits has always been a challenge. One of the major factors in the design cycle time of complex analog IPs is simulator accuracy and speed; usually, one is traded for the other. The verification of analog circuits requires fast, high-precision simulators. Exhaustive post-layout simulations are becoming mandatory for assuring the quality of the IPs in first silicon. Many times, these IPs go directly into the product without a test vehicle in between, which further necessitates very detailed verification at the design level itself.

Traditional tools like Spectre and Spectre APS are good in terms of accuracy, but they lose speed as the number of circuit nodes grows. Spectre X Simulator, the new high-performance SPICE simulator from Cadence, has helped to resolve this issue to a good extent.

Our case study is a 12-bit, 15 MSPS digital-to-analog converter with an R-2R architecture, split between an 8-bit binary and a 4-bit thermometric array. Because it is an R-2R-based DAC with a high speed requirement, the post-layout netlist for verification must be RCc with very minimal filtering in the post-layout options. This results in a PLS file with a large number of internal nodes.

Simulating this file with a ramp across all 4096 codes is important for measuring various static parameters like DNL, INL, TUE etc. 
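
For reference, DNL and INL can be computed from the ramp results as sketched below. The endpoint-fit definition used here is one common convention and is an assumption, not necessarily the one used in this work; the `levels` data is synthetic, with one code deliberately perturbed, and TUE is not computed.

```python
def dnl_inl(levels):
    """Endpoint-fit DNL and INL, in LSB, from measured output levels (one per code)."""
    n = len(levels)
    lsb = (levels[-1] - levels[0]) / (n - 1)  # average step between the endpoints
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n - 1)]
    inl = [(levels[k] - levels[0]) / lsb - k for k in range(n)]
    return dnl, inl


# Synthetic 12-bit ramp in units of one ideal LSB, with code 2048 shifted by 0.25 LSB.
levels = [float(k) for k in range(4096)]
levels[2048] += 0.25

dnl, inl = dnl_inl(levels)
print(max(dnl), min(dnl), max(abs(x) for x in inl))
```

Because every one of the 4096 codes contributes a point, the transient ramp has to settle at each code, which is why the runtime of this check is so sensitive to simulator speed.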

As traditional tools are time-consuming, the designer is forced to run only a few selected checks. Further, if iterations are required, the design cycle time can increase significantly.

With Spectre X, simulations were found to be faster for both the schematic and the post-layout netlist. We tried various accuracy options of Spectre X (e.g. MX, LX, VX) and Spectre APS++, with and without post-layout optimizations, to reach the accuracy required by the design. However, due to the strict accuracy requirement, we had to choose +preset=mx, as it accurately matched the theoretical calculation.

Up to 4X faster transient simulations were observed.

Anubhuti Chopra, STMicroelectronics
Sudip Sarkar, STMicroelectronics

Fast and Accurate Method to Check Ageing Impact in SRAM Memories

Managing Circuit Design Migration from One Process to Another

Migrating circuits from one process to another is often a resource-intensive and cumbersome task. When a block is already designed and verified in one process, it is often desirable to maintain the same architecture and similar performance as the original source design in the new process node.

VALE Schematic Migration provides a methodology to migrate designs from one process node (source) to a different process node (target). The flow uses a spreadsheet-based approach that holds the device/cell mapping from the source PDK to the target PDK. It also provides the flexibility to map device CDF parameters between the source and target PDKs. Target device parameters can be calculated using equations, conditional expressions, or literal values.
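
The idea of a mapping table whose rows carry equations, conditional expressions, or literal values can be sketched as below. This is only an illustration of the concept: the device names, parameter names, and scaling rules are hypothetical, and the actual spreadsheet format of the flow is not shown in the abstract.

```python
# Hypothetical mapping table: one row per source device, as a spreadsheet might hold it.
MAPPING = {
    "nch_src": {
        "target": "nmos_tgt",
        "params": {
            "w": lambda p: p["w"] * 0.5,                              # equation
            "l": lambda p: 0.03 if p["l"] <= 0.06 else p["l"] * 0.5,  # conditional expression
            "nf": lambda p: p["nf"],                                  # copied unchanged
            "model": "svt",                                           # literal value
        },
    },
}


def migrate(device, params):
    """Return (target device, target parameters) for one source instance."""
    row = MAPPING[device]
    new_params = {
        name: (rule(params) if callable(rule) else rule)
        for name, rule in row["params"].items()
    }
    return row["target"], new_params


print(migrate("nch_src", {"w": 2.0, "l": 0.06, "nf": 4}))
```

Keeping the rules in a table rather than in code is what makes a migration reviewable row by row.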

A translation database is maintained from source to target to enable migration review through a results browser. 

This methodology has been tested across multiple process nodes, and we are able to migrate circuits successfully with better productivity.

Srinivasu Parla, Intel

Scalable Analog Regression and Verification Using Virtuoso ADE Verifier

This presentation describes an analog regression and verification methodology using Virtuoso ADE Verifier to reduce the complexity around PDK updates in the regression cycle. We discuss a solution for evaluating the performance drift caused by updates to the model files or the extraction database. One possible way to handle dependent simulations and process the calibrated codes within testbenches through ADE Verifier is also discussed. Handling all these processes without having to maintain a parallel infrastructure is an added advantage. With this methodology, we were able to cut down the number of days needed for the overall regression process.
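
A drift check between two regression runs can be sketched as below. The metric names, values, and the 2% tolerance are hypothetical, and the sketch assumes results have already been exported from the regression environment; it does not use any ADE Verifier interface.

```python
def drifted(baseline, updated, rel_tol=0.02):
    """Return {metric: relative change} for metrics that moved by more than rel_tol.

    Assumes every baseline value is nonzero, since the change is taken relative to it.
    """
    report = {}
    for name, old in baseline.items():
        change = (updated[name] - old) / abs(old)
        if abs(change) > rel_tol:
            report[name] = change
    return report


# Hypothetical results before and after a model-file update.
baseline = {"gain_db": 40.0, "bw_mhz": 100.0, "offset_mv": 2.0}
updated = {"gain_db": 40.2, "bw_mhz": 95.0, "offset_mv": 2.0}
report = drifted(baseline, updated)
print(report)
```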

Niveditha B S, Intel
Sunil Mehta, Intel

SimVision Mixed-Signal Debug Option: Solving Analog Mysteries Inside a Digital Cockpit

For top-level mixed-signal design verification, locating the source of abnormal current consumption is difficult and tedious. Visualizing and tracing the text-based analog content in a complicated mixed-signal design is impossible with the tools used so far. The current analog waveform viewer has no debug capability to link an individual waveform to its netlist or source code, while the existing digital/mixed-signal debug tool is unable to trace the text-based analog content. This leads to very time-intensive simulation debugging: connectivity has to be traced on the netlist outside the waveform viewers, and the designer has to switch between different tools.

Cadence SimVision Mixed-Signal Debug Option, a new digital/mixed-signal simulation cockpit integrated in SimVision Debug, is targeted at TI's needs and presented as a novel solution. With this solution, terminal currents can be viewed interactively and traced down to the leaf level of every node; Verilog-AMS/Spectre/SPICE text contents are shown in the Schematic Tracer as schematics, which makes it easy to review connectivity and current distribution; and Spectre and SPICE source files can be cross-probed between the Source Browser and the Schematic Tracer.

SimVision MS Debug Option provides new solutions for mixed-signal debugging. There are features for debugging issues related to connect modules (CMs)/Interface Elements (IEs) such as busy nets, wrong supply voltage, IE profiling, IE source code debugs, etc. 

Current distribution in SimVision does not require static current probes and hence saves disk space. Current distribution information is available in the browser's currents sidebar. Cross-selection between SimVision and Virtuoso Schematic Editor speeds up the debug process. It dramatically reduces top-level mixed-signal debug time, especially for testbenches related to current or SPICE/Spectre netlists. An average 4x reduction in debug time relative to traditional SimVision is observed in top-level test cases.

Jim Godwin R S, Texas Instruments
Jerry Chang, Texas Instruments
Angelika Keppeler, Texas Instruments
Iris Wang, Texas Instruments
Lalit Mohan, Texas Instruments

Digital Front-End Design

Augmenting Formal Equivalence Verification Advanced Checks and Machine Learning Approach for Ensuring High-Quality SoC Tapeouts

FSDB-Based Self-Gating Technique for Power Saving and FEV Verification Approaches

Physical-Aware Premask/Postmask ECO Implementation for a Better Optimised Patch

Resolving Gate-to-Gate Aborts with Datapath Restructuring in LEC for Netlists Generated with New-Gen Synthesis Tools

Root Cause Analysis of Power Switch in Deadlock State Using IEEE 1801 (UPF)

This presentation discusses a root cause analysis of a power switch that does not come out of the OFF state, applying IEEE 1801 (UPF), a widely known industry-standard power intent format. It covers learnings derived from one of NXP Semiconductors' SoCs in three parts. The first part discusses the problem statement: during the power-up sequence, the power switch is stuck in a deadlock state because its enable is driven by a switchable (generated) supply. The second part discusses the root cause analysis, carried out by performing various experiments in low-power signoff tools using IEEE 1801. The third part describes enhancements to the IEEE 1801 power intent writing style and signoff methodology to mitigate such issues in the future.

Sagar Patel, NXP Semiconductors
Sid Jain, NXP Semiconductors
Sorabh Sachdeva, Ex-NXP Semiconductors

STA Constraints Through the Litmus Test

Timing constraint development and validation play an important role in achieving first-pass silicon success. Traditionally, check_timing and RAC (Report Analysis Coverage) are used for constraint validation. But with the increasing complexity of MCU designs and timing constraints, constraints must be validated over a broader scope, with extra rule checks such as syntax, usage and structural checks, to achieve completeness in constraint validation. In this paper, we describe how the wide portfolio of lint and policy check rules provided by Cadence Conformal Litmus helped in checking the correctness, completeness and design impact of constraints, and discuss the issues that were caught during validation and fixed earlier in the flow.

Gaurav Patil, Texas Instruments
Tejas Salunkhe, Texas Instruments
Siddharth Sarin, Texas Instruments
Ashwini Kulkarni, Cadence

Digital Implementation

A Differentiated PD Methodology for 4K Multi-Stream Encoder in AI SoC in 6nm

Today's deep sub-micron semiconductor technology has enabled large-scale integration of multi-million gates. The design of such IPs has introduced several challenges in terms of increased design complexity in the areas of timing closure, physical design, signal integrity and PDN. This tutorial will discuss a methodology that is based on the successful design of one such high-speed media encoder (video processor). We increased the performance by 25%, from 800MHz to 1GHz (partially coming from technology scaling), and reduced the area by a large margin while keeping similar power numbers. We will review the different methodologies that we followed to meet the PPA goals of this design. The following topics will be covered, with examples to explain the design challenges and the approaches used to address them: synthesis, floorplanning, APR and design closure. We increased the design utilization by more than 15% over the previous generation. We faced many PDN and routing challenges due to this utilization and explored different design techniques and tools to overcome them. Clock cells were sized to improve EM violations, and keep-out regions and checkered blockages were used for dynamic and grid violations. A lot of focus was put on power reduction, as this is a battery-operated device. We used a vector-based synthesis flow to minimize power. There was a conscious effort at each step of implementation to achieve the required QoR. With all these techniques, we achieved benchmark stats of standard cell utilization (80%), multibit (>90%), CGC efficiency (99%) and vector-based power optimization (10% reduction). Along with these numbers, we achieved smooth and predictable closure of this complex design. Keywords: floorplanning, utilization, Innovus, Genus, interface timing, routability, power optimization.

Ashwani Sharma, Intel
Pankaj Chafekar, Intel
Appala Naidu Ponnada, Intel

Reducing TAT and Improving QOR by Using Innovus Mixed Placer Technology

SoC designs are getting more complex, making it harder to reduce TAT and improve QoR. The floorplan has always been vital in determining how much performance and area optimization can be achieved, and it is becoming even more so at lower technology nodes. A good floorplan considerably reduces overall TAT and the manual effort of QoR convergence.

Challenges faced due to a sub-optimal floorplan in macro-intensive and complex designs are:

  • Congestion - Results in high DRC count.
  • Timing - Difficult to converge due to high negative slack.
  • Power - Extra buffering leads to additional power and increased IR drop.
  • TAT - Increased runtime for QoR convergence.

Cadence provides a utility called “Mixed Placer” in Innovus versions 18.xx and above, which can place macros and logic concurrently to achieve better performance. Congestion handling is enhanced, wire lengths are better optimized, and a better floorplan can be achieved in fewer iterations.

We successfully used Mixed Placer in our recent tapeout. Using our custom settings for Mixed Placer, we achieved a 51% reduction in congestion, a 56% reduction in TNS, a 16% reduction in runtime and a 9% reduction in total power compared to a manual floorplan.

Ramya L R, Marvell Semiconductor
Sachin Revannavar, Marvell Semiconductor
Amit Lawand, Marvell Semiconductor
Roshan Kumar, Marvell Semiconductor
Wendy LIU

Robust Techniques Used to Implement Hierarchical Multi-Million Inst Count Design with High Pin Density

Synthesis to Post-Layout Predictability and QoR Improvement with Genus iSpatial Technique

As we move towards shrinking technology nodes and increasing complexities of the design, the need for more robust approaches in synthesis and physical design becomes evident. Working continually to reduce the cycle-time from RTL to GDS, accompanied with enhanced precision through the various design stages, parameters like interconnect delay, cell density and routing congestion become more critical and play a crucial part in determining the performance of the design. Early modelling of the buffer-tree, logic placement and the associated delay degradation in layout would enable us to analyze and optimize the critical timing paths which typically pop-up in later stages during synthesis itself and would greatly reduce iterations between the physical design and RTL team. In this paper we demonstrate the QoR improvement and better correlation achieved from synthesis to post-layout in terms of timing, area and leakage power using Genus iSpatial technique, thus bridging the gap between Genus Synthesis Netlist and Innovus preCTS netlist. The attached flowchart (Figure-1) in the “Proposed Solution” section of the supporting document illustrates the difference between logical, physical and iSpatial synthesis flows and the various steps involved.

Using the iSpatial flow to run synthesis enables us to use the GigaPlace and GigaOpt engines from the Innovus tool directly during synthesis. This approach gives a better estimate of the logic placement, buffer tree, routing congestion and interconnect delay seen at post-layout, and enables the tool to come up with better logic restructuring based on criticality, thus saving time spent in preCTS iterations and also improving the QoR.

When the tool is provided with physical information about the design at the initial stage, it has better access to mapping libraries and more options for architecture selection, which helps in making better trade-offs between area, power and timing. When Innovus is invoked during “syn_opt” in iSpatial synthesis, it allows the tool to physically restructure the design to obtain much better optimization. With better access to cell locations and path routes, the tool can calculate delays along datapaths with much better accuracy right at the syn_opt stage, which helps in flagging critical paths early in the design cycle. Even though the same inputs are given to the tool during physical synthesis, for the same design the results obtained from the iSpatial run correlated much better with preCTS results than those from physical synthesis, and hence iSpatial was picked for further analysis.

The following basic settings are needed to enable iSpatial synthesis for a design:

  • Provide the path to the Innovus executable. This determines the version of Innovus that is invoked when starting the iSpatial run.
  • Set the “library_setup_ispatial” attribute to true. This ensures that the libraries are treated similarly between Genus and Innovus.
  • Set the “opt_spatial_effort” attribute to extreme. This option enables iSpatial synthesis.
  • Set the “invs_postload_script” attribute to the path of postload.tcl, which is sourced once Innovus is invoked. This file contains all the settings that are mimicked from the preCTS setup.
  • To enable a fair comparison between the results, ensure that the don’t_use and don’t_touch attributes for cells, libraries, instances, etc. remain the same.

Results:

The proposed iSpatial flow was used on a design with ~70K flops and around 25 macros, operating at 100MHz. Logical synthesis of the design showed no R2R violations, and synthesis was flagged as clean. But when iSpatial was run on the same design, a few extra paths started violating. These were the same paths that were flagged only after the floorplan and preCTS steps completed on the logical netlist. Furthermore, high-fanout buffering was also done based on the logic placement.

For the same design the results obtained from iSpatial run had much better correlation with preCTS results as compared to physical and logical synthesis, and hence was picked for further analysis. The attached diagram (Figure-2) in “Results” section in the supporting document highlights the placement of various hierarchies when the design is taken through logical synthesis + preCTS flow and the iSpatial flow. Table-1 shows the correlation achieved with iSpatial technique in terms of area, timing, leakage and active power. With almost similar QoR between the iSpatial and pre-CTS post-layout runs, the advantages of iSpatial flow over traditional synthesis flow can be clearly seen. It is able to flag and optimize critical paths during synthesis which would have otherwise been possible only after preCTS stage, thus saving design iterations and cycle time.

Kavithaa Rajagopalan, Texas Instruments
Murtaza Tankiwala, Texas Instruments
Abhishek Sahoo, Texas Instruments

Cerebrus Intelligent Chip Explorer - Revolutionizing Chip Design with Machine Learning

This presentation introduces Cerebrus, Cadence’s new Machine learning platform that is revolutionizing digital design implementation.

Learn how Cerebrus can help designers achieve huge PPA and productivity improvements by leveraging reinforcement learning technology.

Sreeram Chandrasekhar, Cadence

Conquering Timing and Signoff Challenges in 7nm Multi-Million SoC Design Using Cadence Implementation and Signoff Solutions

In lower technology nodes like 7nm, achieving timing, PV and RV closure becomes very challenging, with each block/sub-block having its own criticality. In this paper, we present distinct categories of blocks with their criticality, an efficient way to tackle timing challenges early in the PnR design cycle and overcome timing miscorrelation, and implementation tweaks to tackle IR. We also survey the major signoff accelerators in Cadence tools that helped us along the way to a swift tapeout.

Kartik Koul, eInfochips (An Arrow Company)
Ruchita Shah, eInfochips (An Arrow Company)
Niharika Modi, eInfochips (An Arrow Company)

Effective Ways of Design Closure to Achieve Optimal PPA with Reduced ECO Cycles

Introduction

As technology nodes shrink and design complexity increases, every aspect of design closure becomes important: power, performance, area and, last but not least, ECO turnaround time. These aspects play a major role as we approach design closure. To achieve all of them optimally, we tried a few limited-access features and the signoffOptDesign flow, which helped us achieve our targets easily with fewer ECO cycles than our traditional signoff design closure flow. In this paper, we present the effectiveness of the Tempus SOD flow during signoff.

Features used:

We explored the features below, which helped us achieve better PPA goals.

  • The innovusGenusRestructuring feature helped optimize and restructure logic using the Genus restructuring algorithm, which helped achieve better PPA numbers.
  • The Cts_virtual_delay_optimization_hold feature helped reduce memory hold violations significantly by adjusting memory latencies based on available skew, without impacting setup closure.
  • The signoffOptDesign flow enabled Quantus and Tempus to produce PBA timing reports and apply fixes for DRV/setup/hold and power optimization. This helped reduce the number of signoff ECO cycles with optimal fixes and brought significant leakage power benefits.

Anil Kumar, Samsung
Sandeep Anantarao Jadhav, Samsung
Nagabharana Teeka, Samsung
Praveen Kumar Gontla, Cadence

Genus iSpatial + Innovus Solutions: A Winning Combination for Best-in-Class QoR Predictability, Realization, and Fastest PPA Closure

High-Speed Signal and Timing Qualification Methodology Using SPICE Within the P&R Environment

Incorporating Multi-Million Gate Design Change Using an ECO Approach in Innovus Implementation

In this paper, we present the techniques that have been successfully used to carve out a new partition in an existing top-level design using an ECO design approach. Using this method, we could contain the design implementation work inside the new partition without touching the top level at all.

Ashish Kumar, Samsung
Aniket Khandekar, Samsung
Pradeep VS, Samsung
Ravneet Singh, Samsung

Novel Adaptive Power Switch (A-PSW) Scheme for Low-Power Designs

Technology Update: Digital Design and Implementation 2021 Update

Digital Signoff

Technology Update: Digital Design and Implementation 2021 Update

Latest innovations from the Digital Design and Implementation group relating to power savings, advanced node coverage, machine learning and multi-chipset flows will be presented.

Vinay Patwardhan, Cadence
Rob Knoth, Cadence

A Comprehensive Methodology to Analyze, Track, and Manage the Power Consumption Trends Along a SoC Lifecycle Using Joules and Voltus Solutions

With the increasing gate count and complexity of an SoC and the ever-increasing need for low-power designs, it becomes imperative to keep track of the power consumed at every stage of the design cycle. From the very beginning, i.e., the RTL level, it makes sense to ensure that all power-hungry blocks are manually clock gated to avoid needless power wastage. To meet all the key targets from an ultra-low-power design perspective, we have evolved a three-stage power estimation methodology using Joules and Voltus.

In the first stage, the designer can make use of the RTL source code and the IP-level simulation trace data (from Cadence’s SimVision) to estimate the power of their designed blocks. Joules is employed for this task.

Once the designers have ensured highly optimized clock and data gating for their IPs, the next stage in the flow is to analyze the power consumed by the entire SOC. Joules provides a unique option of reporting power based on the device modes present in the CPF file. 

At the last stage, Voltus is used for power analysis signoff. In the earlier two steps we analyzed and optimized only the power of the logical design blocks, but the clock tree (CT) cells and the buffers inserted as part of synthesis have a significant impact on the complete SoC power.

The reports generated from the above power flows are then analyzed with the help of the Spotfire tool, which strengthens the overall analysis. It is a handy tool from TIBCO that simplifies data interpretation by providing a wide range of options for data visualization.

Design upgrades for power optimization at a later point in time are generally a huge effort with a long turnaround time, which can significantly affect the overall project cost and timeline. Early power estimation saves both time and debugging effort compared to bug fixes at a later point in the design cycle. The three-stage power flow described above makes our design more robust in terms of power saving.

Key benefits of the Joules power flow: 

Useful Joules features such as clock efficiency and data gating efficiency help save a significant amount of time in narrowing down to the exact flop/logic cone that consumes the most power. Reporting the power efficiency highlights all the blocks/cones that can be optimized. The profiler mode can be used to track how the power consumed within the IP varies at different stages of its operation, and when different features are enabled.

At SOC level, the RTL power analysis provides an insight on rail mapping, power consumed per rail and the validity of .libs being used. In case of any unexpected connections the reported power shows anomalies, and can be used to pinpoint the exact cause of the issue.

Cells within the library consuming higher current/power are also identified from this flow and are then replaced with more efficient cells to optimize the power.

The power mode flow available with Joules makes it possible to estimate power per device mode: power can be estimated when the device is in active, sleep, standby modes, etc. This reduces the load on manual power estimation in such cases, which is usually error-prone.

Key benefits of the Voltus power flow: 

Gate level simulation using Voltus provides feedback on the power consumed by the clock tree structure. All high-power consuming clock routing cells can be placed more efficiently in a data path (moving buffers before a gating cell to after the gating cell in the same path). 

Active paths with high interconnect cap which consume a lot of power can be identified as part of this flow. This acts as a feedback to placement and routing team for optimizing the routing network to achieve low power dissipation.  

This three-stage flow has been successfully deployed in the current project. In earlier projects we used Voltus alone at a later stage to estimate power, which left very few options for design modification. Incorporating the three-stage power analysis flow in the ASIC design cycle improves overall chip quality in terms of power consumption. The lower the power wastage, the longer the operational life of the device, which increases the desirability of the solution in the market. This flow also shows promising results for large designs with gate counts in the millions. In our current project, we carried out multiple iterations of IP power analysis on roughly 27 IPs, which resulted in a power saving of 30-50% for most of the IPs. Similarly, multiple iterations were carried out at the SoC level, which resulted in an additional saving of 3-5% with each iteration. It has also helped us make decisions related to area vs. power tradeoffs. We also plan to walk through the correlation of the power numbers obtained from Joules and Voltus and how they correspond with actual silicon power data.

Sudhanshu Surana, Texas Instruments
Ruchi Shankar, Texas Instruments
Anuvrat Srivastava, Texas Instruments
Aniruddha P N, Texas Instruments

DSTA for Multi-Million Designs

Dynamic IR Drop Reduction Techniques for Tensilica Processors

The semiconductor industry has continued its efforts toward advanced technology nodes throughout the last decade. Technologies are striving for higher speed, lower area and lower power, making timing and power signoff a crucial stage in the EDA flow. Hence, it has become critical to limit the dynamic IR drop of an IC, considering its impact on the speed of the cells and the overall performance of the IC. Tensilica processors, with configurability and extensibility at their core, are adapting to highly complex designs in the fields of audio, vision, AI/ML, radar/lidar/wireless and energy-efficient low-power applications. This presentation provides more insight into the implementation of dynamic IR drop reduction techniques and the dynamic IR drop improvement trends observed for Tensilica processors using Cadence’s Voltus IC Power Integrity tool.

Abhishek Parab, Cadence
Ashlesha Vikrant Karandikar, Cadence
Akshay Hindole, Cadence
Eliot Gerstner, Cadence

Efficient Dynamic IR Drop Closure Using Innovus In-Design Power Integrity Solutions

PVS Flow for Tensilica Processors

Tensilica® processors are well known for their versatility and configurability across a wide spectrum of applications. This paper mainly focuses on the implementation of the LVS and DRC flow for Tensilica processors and the correlation of the verification capabilities of Cadence® Pegasus™ with the Innovus™ Implementation System tool, tested on various Tensilica DSP cores. It also incorporates various automation techniques that help reduce the manual effort and time spent fixing violations. This is made possible by the automated Engineering Change Order (ECO) supported by the Pegasus and Innovus tools for complete physical verification signoff.

Faisal Belwadi, Cadence
Harshad Angal, Cadence
Ashlesha Karandikar, Cadence

Voltus-XP and Voltus-XM for Power Integrity Signoff for Monster-Size Chips

With increasing chip complexity and compute, reticle-sized chips have become very common. Handling chips of this size is very challenging from an implementation and signoff perspective because of compute and turnaround time. Power integrity in particular is a daunting task, with billions of power nodes to be analyzed. This paper focuses on handling power integrity signoff on monster chips using Cadence Voltus-XP (Extensive Parallelism) and a hierarchical modeling approach using Voltus-XM flows. Voltus-XP mode distributes processing effectively over a large number of machines and provides a high-performance power signoff solution with reduced processing time while maintaining accuracy. The Voltus-XM flow facilitates parallel closure of the block and top levels of large designs. Very good correlation is seen between the XP and XM flows.

In this paper, we will share details of the XP and XM flows on the following parameters:

a) CPU usage

b) Memory and runtime

c) Accuracy aspects (EIV and tap/battery current profile)

d) Benchmarking results between XP and XM runs

Rakesh Reddy, Marvell Semiconductor
Karatholuvu Suryanarayanan, Marvell Semiconductor
Ratnakar Bhatnagar, Marvell Semiconductor

IP/ Subsystem Verification: Performance and Smart Bug Hunting

Challenges in Verification of Ever-Evolving MIPI Standards: A Case Study on Verification of Unipro 2.0 and M-PHY 5.0 IP Using Cadence VIP

With the arrival of 5G technology to help usher in the next wave of seamless and immersive mobile experience, flash storage devices are moving to UFS 4.0 solutions, which will need the latest Unipro 2.0 and M-PHY 5.0 as application and PHY layers. In this session, we will talk about the newly added features in the Unipro 2.0 and M-PHY 5.0 IPs, the challenges faced and the DV approaches taken to overcome them. Further, we will explore the testbench details and feature updates in the VIP, and how it helps with error scenarios and coverage closure.

Arnab Ghosh, Samsung
Piyush Tankwal, Samsung
Piyush Agnihotri, Samsung
Eldhose P M, Samsung
Kuntal Pandya, Samsung
Vishnu Prasad K V, Samsung

Comprehensive Verification of USB4 Sub-System using Cadence USB4 VIP and USB4-TripleCheck IP validator

Formal in Simland

Formal Verification at Subsystem Level: A Framework Based on Experience with Accelerator Subsystems

Formal Verification of Fine-Clock Gated RTL vs Un-Gated RTL Using JasperGold Sequential Equivalence Checking App

With ASIC size and complexity going up, the need to reduce ASIC power has become more important. Fine clock gating is an easy and widely used technique for reducing dynamic ASIC power. At Juniper, we use a third-party tool to efficiently introduce fine clock-gating structures at the RTL level. The tool provides complex structures to implement fine clock gating at various sequential depths. The generated RTL needs to be sequentially equivalent to the original un-gated RTL. JG-SEC is used to verify the equivalence between the two RTLs. We set up automation that uses the JG-SEC clock-gating-style strategy on 40+ designs with an average design size of ~400K flops. It helps in early bug detection; at the same time, JG-SEC formal techniques achieve 100% validation coverage and save the regression time that would otherwise be required to prove equivalence. The automation provides push-button convergence for most designs, but for a few complex designs we use advanced SEC techniques such as abstractions, stopats, cutpoints, blackboxing and special engines to achieve convergence.
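
To make the notion of sequential equivalence concrete, the following is a minimal sketch in Python. It is not how JasperGold SEC works internally (that relies on formal engines); it simply enumerates every input sequence up to a small depth and compares outputs cycle by cycle. The three register models and the `bounded_sec` helper are hypothetical illustrations, not taken from the presentation.

```python
from itertools import product

def ungated(state, en, d):
    """Reference model: free-running clock, enable as a recirculating mux."""
    q_next = d if en else state
    return q_next, q_next  # (next state, output)

def gated(state, en, d):
    """Clock-gated model: the flop only sees a clock pulse when en is 1."""
    gated_clk_pulse = (en == 1)
    q_next = d if gated_clk_pulse else state
    return q_next, q_next

def gated_buggy(state, en, d):
    """Faulty gating (illustrative bug): the enable is applied one cycle late."""
    q, en_prev = state
    q_next = d if en_prev else q
    return (q_next, en), q_next

def bounded_sec(ref, ref_reset, impl, impl_reset, depth):
    """Return the first input sequence whose outputs differ, or None."""
    for n in range(1, depth + 1):
        # Each cycle applies one (en, d) pair; try all sequences of length n.
        for seq in product(product((0, 1), repeat=2), repeat=n):
            s_ref, s_impl = ref_reset, impl_reset
            for en, d in seq:
                s_ref, o_ref = ref(s_ref, en, d)
                s_impl, o_impl = impl(s_impl, en, d)
                if o_ref != o_impl:
                    return seq
    return None

ok = bounded_sec(ungated, 0, gated, 0, depth=4)
cex = bounded_sec(ungated, 0, gated_buggy, (0, 0), depth=4)
print("gated vs ungated:", "equivalent up to depth 4" if ok is None else ok)
print("buggy counterexample (en, d) per cycle:", cex)
```

Exhaustive enumeration like this grows exponentially with depth and input width, which is one reason formal techniques such as abstractions and cutpoints are needed on designs of the size described above.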

Ashish Khicha, Juniper Networks
Arghajit Basu, Juniper Networks
Bathri Subramanian, Juniper Networks
Ketki Gosavi, Cadence

Getting the Most Out of JasperGold Clock Domain Crossing App for Efficient Design Signoff

Indago Python API Mathematical Signal Analysis

As debug becomes a bottleneck in the verification flow, there is a growing need to automate more and more debug flows. Cadence’s next-generation Indago Debug introduces a new Python API interface to its debug databases, allowing users to automate debug tasks like value fetching, pattern finding, static design analysis, etc. Because Python has an incredibly large community of developers and library providers, you can use other libraries in conjunction with your debug automation scripts to gain even more powerful capabilities in debug. This talk focuses on the application of Python mathematical analysis packages, such as numpy and scipy, alongside the Indago Python packages to achieve advanced mathematical analysis that was previously only available with expensive third-party tools.
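
As a rough illustration of the kind of analysis numpy makes possible, the sketch below finds the dominant frequency of a sampled signal with an FFT. The Indago Python API calls are deliberately not shown, since the abstract does not describe them; the sketch assumes the signal values have already been fetched into an array, and it uses a synthetic waveform as a stand-in. The function name `dominant_frequency` is hypothetical.

```python
import numpy as np

def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC spectral component."""
    samples = np.asarray(samples, dtype=float)
    # Remove the mean so the DC bin does not dominate the spectrum.
    spectrum = np.abs(np.fft.rfft(samples - samples.mean()))
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / sample_rate_hz)
    return float(freqs[np.argmax(spectrum)])

# Synthetic stand-in for values fetched from a debug database (assumption):
# a 50 Hz tone plus a weaker 120 Hz tone on a DC offset, sampled at 1 kHz.
sample_rate_hz = 1000.0
t = np.arange(1000) / sample_rate_hz
signal = 0.5 + np.sin(2 * np.pi * 50.0 * t) + 0.2 * np.sin(2 * np.pi * 120.0 * t)

peak_hz = dominant_frequency(signal, sample_rate_hz)
print(f"dominant frequency: {peak_hz:.1f} Hz")
```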

Yuval Gilad, Cadence

PCB and System Design and Analysis

48V 250A FET PCB Thermal Profile - A Thermal Simulation Using Celsius Thermal Simulator's 2D and CFD Airflow Analysis

Sigrity Aurora and Advanced SI Capabilities Evaluation to Solve SI and PI Issues on High-Speed System

This presentation highlights the usage and efficiency of design analysis tools for post-layout analysis of high-speed boards in various scenarios. By using design analysis tools efficiently during the product design lifecycle, we were able to avoid design rework, eliminate extra design margins (which reduced BoM cost) and produce a design that is now in prototype manufacturing.

Shubham Pandey, Thales
Diptar Banik, Thales
Arumugam Thangapandi, Thales

Technology Update: System Design Platform – Pervasive Performance and Productivity

Saugat Sen, Cadence

Voltus Package-Based Dynamic IR Drop Analysis

This paper focuses on the ease of using package models in the Voltus-Sigrity Package Analysis flow to include package information during rail analysis and understand its effect on the die. In traditional flows, package information was added as lumped values for IR drop analysis.

In this paper, we will share details on the methodology, tips and tricks, issues faced and how they were addressed.

Amol Harkare, NXP Semiconductors
Mayank Mittal, NXP Semiconductors
Ratnakar Bhatnagar, NXP Semiconductors

DesignTrue DFM in Cadence

Impact of Amplitude Noises and PDN Noise on PAM4 Signalling for 400Gig Ethernet and PCIe 6.0 Applications

Prabhakaran Palaniappan, Mobiveil
Buvaneshwaran C, Mobiveil
Srividhya Mitran, Mobiveil

InFO_oS Design Flow Using Allegro Package Designer Plus

This paper gives a brief introduction to an advanced wafer-level technology named Integrated Fan-Out (InFO), which provides high speed, low cost, low power consumption, a small form factor and energy efficiency. It explains the various steps involved in implementing an InFO design in the Allegro Package Designer tool from Cadence, such as taking in the constraints and padstacks, die placement, bump placement, assignment and routing. Finally, the generation of reports for post-routing checks such as net length, parasitics and open pins, and GDSII stream-out for PDV from the Allegro Package Designer Plus tool, is discussed in detail.

Jagadeeswari Yadavalli, OpenFive
Diksha, OpenFive

IPC-2551, "Digital Twin" - Elucidated in Design and Manufacturing

The IPC-CFX standard has achieved a great deal, breaking down barriers to data exchange and eliminating the massive waste of time and money across the industry related to bespoke, customized machine interface connections, which collectively would have cost the industry many billions of dollars. In manufacturing, we now look toward the next phase of standardization. The challenge is for us to extend interoperability and dataflow beyond simple shop-floor communication.

The new IPC-2551 digital twin standard currently in development seeks to create an environment of interoperability that allows solution and technology providers to collaborate, exchanging information, together bringing an order of magnitude greater value than could be achieved through any single disconnected source. The IPC digital twin standard sets out the top-down hierarchical structure, through which applications can identify and communicate active and useful elements of digital twin data, in any depth of detail, using standard formats.

The intention is to utilize existing IPC standards, such as IPC-2581 (Digital Product Model Exchange, DPMX), IPC-2591 (IPC-CFX), and IPC-1782 (traceability). Non-IPC standards, such as JEDEC JEP-30 3D component data standards, are also being considered. This is expected to fast-track adoption of the IPC digital twin standard, without having to reinvent current standards and radically change data flow practices.

The IPC digital twin represents either a single product design instance or a range of closely related product variants and revisions. 

Information includes: 

a. The design intent, including specification, requirements, use-case metrics, environmental limitations, etc. 

b. The mechanical and electronic design, including 3D representations.

c. The intended bill of materials, including vendor selection, engineering change history, variant definition,

Amba Prasad, Tejas Networks

Layout Potency Challenges Using Cadence Allegro Technology

PCB Electrical Constraints Automation - Allegro Constraint Compiler

PCB electrical constraint entry is mostly manual, time-consuming and error-prone. Automation of PCB electrical constraints is limited to individual organizations, each for its own use and in its own way, which limits the reuse of constraints across designs and organizations. In this session, we will discuss how PCB design constraints stored in CSV format can be shared across designs and organizations to help meet product schedules. We will also discuss how the Constraint Compiler can be used effectively to translate electrical design constraints in CSV format directly into Constraint Manager. Verified constraints in CSV format together with the Constraint Compiler help achieve 67% faster constraint entry with utmost quality.

Vimal Cyril, Intel
Aravind Krishnan, Intel

PCIe Gen5 SERDES Compliance - End to End Analysis

PCIe Gen5 is the latest SERDES interface, offering twice the bit rate of its predecessor at 32 GT/s. With a x16 link, the transfer rate scales up to approx. 128 GBps (32 GT/s x 16 lanes / 8 bits-per-byte x 128/130 encoding x 2 for duplex). In the industry, PAM4 modulation (Nyquist frequency = 1/4 of the data rate) is usually used for SERDES speeds greater than 30 GT/s. However, PCIe Gen5 continues to use the NRZ signaling scheme, which makes the Nyquist frequency half of the data rate (higher frequency means higher attenuation). The signal attenuation caused by the channel insertion loss (IL) is the biggest challenge of PCIe 5.0 technology.
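As a quick check of the arithmetic quoted above, the following minimal Python sketch evaluates the bandwidth expression and the Nyquist frequencies for NRZ and PAM4. The function names are illustrative only. Evaluated exactly, the expression gives about 126 GB/s once the 128/130 encoding overhead is included; the round 128 figure corresponds to the raw rate before encoding.

```python
def pcie_x16_bandwidth_gbytes(rate_gt_s=32.0, lanes=16, encoding=128 / 130, duplex=2):
    """Aggregate bandwidth in GB/s, using the expression quoted in the abstract."""
    return rate_gt_s * lanes / 8 * encoding * duplex


def nyquist_ghz(rate_gt_s, bits_per_symbol):
    """Nyquist frequency = symbol rate / 2, with symbol rate = bit rate / bits per symbol."""
    return rate_gt_s / bits_per_symbol / 2


bandwidth = pcie_x16_bandwidth_gbytes()
nrz = nyquist_ghz(32.0, bits_per_symbol=1)   # NRZ carries 1 bit per symbol
pam4 = nyquist_ghz(32.0, bits_per_symbol=2)  # PAM4 carries 2 bits per symbol

print(f"x16 duplex bandwidth: {bandwidth:.1f} GB/s")
print(f"Nyquist at 32 GT/s: NRZ {nrz:.0f} GHz, PAM4 {pam4:.0f} GHz")
```

At the same bit rate, NRZ places the Nyquist frequency at twice that of PAM4, which is why channel insertion loss dominates the Gen5 compliance budget.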

In the PCIe Gen5 compliance flow, we will cover the following:

1. S-parameter extraction of PCIe links with a highly accurate 3D solver.

2. Stitching the S-parameters in Sigrity TopXp and performing PCIe Gen5 compliance analysis.

3. Analyzing the compliance results and making changes if required.

Gyan Nirmale, Blaize Semiconductor
Karthik R, Blaize Semiconductor

Reimagining 3D FEM Extraction with Clarity 3D Solver

Learn and apply the latest innovations in the full-wave Cadence Clarity™ 3D Solver to analyze your next-gen system design. Deep dive with us into the fully distributed architecture of the Clarity 3D Solver that enables you to extract large and complex packages and PCBs using hundreds of cores in the cloud or your on-premises farm — all while taking as little as 8GB memory per core.

Robert Myoung, Cadence

SoC Verification: Advanced Verification Methodology

Technology Update: Driving Verification Throughput with Cadence Verification Flow

Yogesh Goel, Cadence

A Novel Approach to In-System SRAM Repair Verification in Embedded SoC to Bridge Gap Between Test and Functional Mode

Embedded memories are a huge part of any modern SoC and play a vital role in the performance of the design. The purpose of memories in systems is to store massive amounts of data. Since memories do not include logic gates and flip-flops, various fault models and test algorithms are required to test them. Testing the memories in a chip on automated testing equipment involves applying external test patterns as a stimulus and having the tester analyse the device’s response by comparing it against the golden data stored as part of the test pattern data. This is a complex and time-consuming procedure that involves external equipment. MBIST makes this easier by placing all these functions within the test circuitry surrounding the memory itself. It implements a finite state machine (FSM) to generate stimulus and compare the actual response with the expected response. This self-testing circuitry acts as an interface between the high-level system and the memory. The challenges of testing embedded memories are minimized by this interface, as it provides direct observability and controllability. The FSM generates the test patterns for memory testing and reduces the need for external test patterns to be stored. Since the MBIST design is now responsible for the performance of vital memories, it is imperative to verify the MBIST design with complex scenarios in a methodical manner. This paper discusses the approach to verify MBIST-based SRAM repair in an embedded SoC.

Memory Repair is implemented in two major steps: the first step is to analyse the failures diagnosed by the MBIST controller during the test for repairable memories, and the second step is to determine the repair signature to repair the memories. The repair register generated by the MBIST model is then electrically fused (eFused) in a non-programmable memory on the chip. This repair signature is finally shifted to the corresponding memories using a complex memory system. The process of memory repair in our SoC is presented in Figure 2. Modern SoCs have hundreds of repairable memories embedded in various blocks. The memory repair becomes complicated to verify, as there are numerous possible combinations. Hence we attempt to bridge the gap between DFT block level verification in test mode and DV SoC level In-System verification in functional or mission mode by adopting functional and test coverage driven co-verification.
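The two repair steps described above (diagnose failures, then derive a repair signature) can be illustrated with a small behavioural sketch in Python. The abstract does not say which test algorithm or repair scheme the authors use; March C- and row-based spare allocation are assumed here purely for illustration, and `FaultyMemory`, `march_c_minus` and `repair_signature` are hypothetical names.

```python
class FaultyMemory:
    """One-bit-per-address memory model with optional stuck-at faults (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def march_c_minus(mem, size):
    """Run the March C- elements and return the sorted failing addresses."""
    failures = set()
    up, down = range(size), range(size - 1, -1, -1)

    def element(order, expect, write_value):
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failures.add(addr)
            if write_value is not None:
                mem.write(addr, write_value)

    element(up, None, 0)    # write 0 everywhere
    element(up, 0, 1)       # ascending: read 0, write 1
    element(up, 1, 0)       # ascending: read 1, write 0
    element(down, 0, 1)     # descending: read 0, write 1
    element(down, 1, 0)     # descending: read 1, write 0
    element(up, 0, None)    # final read 0
    return sorted(failures)


def repair_signature(failing_addrs, cols_per_row, spare_rows):
    """Map each failing row to a spare row; return None if spares run out."""
    bad_rows = sorted({addr // cols_per_row for addr in failing_addrs})
    if len(bad_rows) > spare_rows:
        return None  # memory is not repairable with the available spares
    return {row: spare for spare, row in enumerate(bad_rows)}


mem = FaultyMemory(16, stuck_at={5: 1, 6: 0})
fails = march_c_minus(mem, 16)
signature = repair_signature(fails, cols_per_row=4, spare_rows=2)
print(fails, signature)
```

In this toy case both faulty cells fall in the same row, so a single spare row repairs the memory. A real repair signature would also have to encode column spares and the eFuse format, which are not modelled here.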

Harshal Kothari, Samsung
Eldin Ben Jacob, Samsung
Sriram Kazhiyur Sounderrajan, Samsung
Somasunder Kattepura Sreenath, Samsung

Accelerating Design for Testability (DFT) Simulations

Accelerating Verification Productivity by Harnessing Xcelium Multi-Snapshot Incremental Elaboration Methodology

Elaboration time can be a major issue in the verification of large system-level designs. This paper presents best practices for reducing elaboration time using the multi-snapshot incremental elaboration (MSIE) methodology. This presentation will include an overview of how the (re-)elaboration time can be reduced through use of the MSIE methodology. We will present different MSIE flows that can be used in RTL, GLS and DFT flows, as well as use cases matched to design and verification requirements. We will also present results of specific elaboration workloads optimized and deployed in ADI internal projects.

Kiran Ankem, Analog Devices
Vinay Rawat, Analog Devices
Vijaykumar Sankaran, Cadence

Centralized Regression Optimisation Toolkit (CROT) for Expediting Regression Closure with vManager and Xcelium Performance Optimization

A few of the main requirements for DV closure are clean regression and coverage closure with a shorter turnaround time for every design label. Smart Centralized Regression (SCR)[1] is a utility where all the DV tests from each IP/subsystem/SoC are run centrally to ensure a 100% pass rate, qualifying a bug-free DUT with each design release label, and to gather the requisite coverage metrics. The simulation time depends on the size and complexity of the design, testbench and simulator performance, and LSF/compute availability. In addition to RTL simulations, with X-propagation, power-aware simulations and gate-level simulations being mandatory for DV closure, the turnaround time for regression closure has increased exponentially. With the increasing number of subsystems and IPs in an SoC, the number of tests increases exponentially, causing a surge in net run times and license requirements unless overall performance is optimized. The CROT addresses each of these concerns by managing the regression process efficiently without compromising on the quality of verification, saving up to 60% in time and resources.

Typically, during the SoC DV lifecycle, CR is run for 50 to 75 iterations, traversing the RTL development, PARTL and RTL freeze stages through to unit-delay and timing GLS. The current SoC taken up for the regression had a total of 3500+ tests, and the individual sims had a mean run time of about 4 hours with RTL. With 100 licenses in parallel, it would take around 6-7 days to get initial results. The bug fixes and their reruns would make the regression turnaround time exceed 10 days in RTL and reach about 30 days in GLS. All of the above activities are time- and bandwidth-consuming manual processes. With time to market and first-pass silicon becoming key differentiators, it is imperative to strategically reduce this cycle time, which often consumes a lot of engineering bandwidth and results in productivity loss. The CROT employs 3 key aspects to expedite the CR process:

• Live tracking via HTML dashboard: Improved status reporting, failure analysis, rerun, and automated incremental coverage merging.

• Simulation speed optimization: Profile, Analyze, Perf Knobs, Save and Restore (SnR).

• LSF and compute optimization: Tracking and projecting the LSF scheduling and license acquisition mechanism.
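The turnaround figures quoted above follow from simple arithmetic, sketched below in Python. The model assumes ideal packing of jobs onto licenses with no queueing, failures or reruns, and the 60% reduction is applied to wall-clock time only as an illustration of the "up to 60%" claim; neither assumption comes from the paper.

```python
def regression_days(num_tests, mean_hours, licenses):
    """Ideal wall-clock days for one regression pass (no queueing or reruns)."""
    total_sim_hours = num_tests * mean_hours
    wall_hours = total_sim_hours / licenses
    return wall_hours / 24


baseline_days = regression_days(num_tests=3500, mean_hours=4, licenses=100)
optimized_days = baseline_days * (1 - 0.60)  # illustrative: best-case 60% saving

print(f"baseline:  {baseline_days:.1f} days per pass")
print(f"optimized: {optimized_days:.1f} days per pass (assuming a 60% saving)")
```

The ideal baseline of just under 6 days is consistent with the 6-7 days reported for initial results once scheduling overhead is taken into account.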

Harshal Kothari, Samsung
Eldin Ben Jacob, Samsung
Pavan M, Samsung
Sriram Kazhiyur Sounderrajan, Samsung
Somasunder Kattepura Sreenath, Samsung

Challenges In Power-Aware Verification with Hardware Power Controller and Novel Approach to Harness Xcelium Low-Power Functional Coverage for Complex SoC

The complexity of System-on-Chip (SoC) designs is increasing exponentially with advancements in artificial intelligence, machine learning and IoT technologies. With the diminishing dependency on software to take control of the hardware, the complexity of the hardware is increasing further. All devices are expected to consume the least power possible, resulting in maximum operation time for the end user. In keeping with the latest trends in the ASIC domain, advanced low-power design is a major requirement in high-performance SoCs. As the number of transistors increases, it becomes imperative to keep the always-on portion of the SoC as small as possible while employing rigorous power-saving techniques for logic as well as memories. The management of power on shrinking technology nodes becomes even more intricate, and so does the verification of power-related features on the simulator, since power-aware simulations consume more simulation time than non-power-aware simulations. This calls for a methodical approach to building the power-aware verification plan to check complex scenarios, and for employing IEEE 1801 Unified Power Format (UPF) functional-coverage-driven simulations to identify and close holes arising from corner cases, so as to ensure completeness of verification and first-pass silicon. This paper presents a strategy for power-aware verification and discusses the steps to be followed in the run-up to bringing up power-aware simulations, the scenarios to target for gate level with and without timing annotation, and the simulation optimizations that achieved up to 52% higher performance and efficiency in project execution without any compromise on verification quality.

Our SoC consisted of 75 power domains, 79 clock domains and 37 voltage domains, resulting in an astronomical number of power domain switching combinations to handle and qualify. The following systematic approach was adopted to achieve an efficient and smooth verification flow for low-power simulations of large SoCs. Power-aware verification flow (Figure 1):

• Analysis of power/voltage/clock domains: Defining verification scope

• One-time compile: Loading the power information of the design

• PMIC controller: Mimic the behavior in the test bench

• Power operations (isolation, level shifters, retention): Checkers to reinforce DV

Harshal Kothari, Samsung
Eldin Ben Jacob, Samsung
Sriram Kazhiyur Sounderrajan, Samsung
Somasunder Kattepura Sreenath, Samsung

Effective and Well-Organized Way of Tracking and Managing MDV Project

This paper introduces a consolidated flow for planning, managing, and tracking the verification activity of an RFIP through sign-off with a “User tuned Metric-Driven flow”. The activity was executed using Cadence’s vManager, illustrating the use of all its centers (Planning, Regression, Analysis and Tracking) in the project management cycle. With verification results contributing to a delighted customer, this platform provides a crucial link among Design Requirements, Test Specifications, and “How to Validate”, giving a single platform for the whole verification view.

The proposed flow made extensive use of the “Planning, Management and Tracking” tool, resulting in improved verification efficiency and easy maintainability while managing the project and meeting verification requirements.

Mansi Chadha, STMicroelectronics
Shivam Mittal, STMicroelectronics

Scaling Sequential Equivalence Formal Verification

In this presentation, the authors introduce an additional application of sequential equivalence verification called shmoo FV, which can be used to verify timing-critical units and expose corner-case bugs. The presentation also discusses how the use of sequential equivalence can be scaled hierarchically beyond unit level using a divide-and-conquer approach, achieving results much faster than traditional methods.

Vichal Verma, Intel
Achutha KiranKumar M V, Intel
Bindumadhava SS, Intel
Smruthi Balaji, Intel

SoC, System Verification and Beyond!

This presentation discusses how designs are evolving and how verification tools and methodologies are trying to catch up. It covers the critical areas of verification, the appropriate methodology to use, and how to achieve time-to-market goals with a good-quality design.

Garima S, Samsung

Strategy for Accelerating the Code Coverage Closure

With advances in technology, designs are becoming extraordinarily large and complex. As a result, verification of these designs has become more and more challenging and time-consuming. Code coverage closure is one of the crucial metrics for DV closure. In this talk, we discuss a strategy to reduce the man-months (MM) required to close code coverage using Cadence's Jasper UNR and Resilience features.

Kotragoud Hg, Samsung
Sarath Yadav Saginala, Samsung
Naga Satya Sai Ponnam, Samsung
Naresh Raghav Rachamani, Samsung
Somasunder Kattepura Sreenath, Samsung

System Design and Verification: Emulation and Prototyping

A Novel and Scalable Solution for IP and Subsystem Level Emulation

Automation of SoC Boot ROM Validation on QT

Enhanced UVM Acceleration for Early Closure of Low Power

Introduction to Dynamic Duo of Emulation and Prototyping

Many design teams have used some form of hardware verification throughout their verification cycle for years now. Some engineering teams prefer to use emulation, some prefer to use prototyping, and some even use both. Why would engineering teams invest in both platforms?

Join this session to understand why you should consider bridging emulation and prototyping into a continuous verification environment to speed up your verification throughput and for early software validation and real world testing.

Michael Young, Cadence
Juergen Jaeger, Cadence

Validating Real Use Case in Pre-Silicon Phase

This talk presents the benefits of enabling silicon-like, real-world use cases to prove the robustness of the SOC design and to estimate performance and dynamic power early in the SOC design phase. We will present how Palladium Z1 and its infrastructure enabled testing of various use cases. We will also discuss how reduced verification time, debug capabilities, and the flexibility and reusability of the test case environment in post-silicon helped us move towards “Shift Left” in SOC design.

Shruti Maheshwari, Maxlinear
Liang Xu, Maxlinear

System Design and Verification: Flows

A Flexible and Integrated Solution for Fault Injection Driven by Functional Safety Requirements

The semiconductor industry serving safety-critical applications follows functional safety standards such as ISO26262 (Automotive) and IEC61508 (Industrial). These standards can require verification of the safety metrics, which is often performed using fault injection techniques. Unlike manufacturing test, the safety verification methodology is still quite diverse across the semiconductor companies providing devices for safety-critical applications. Therefore, flexible solutions are essential to support the capability to derive inputs, execute, review, and feed the fault campaign results back into the functional safety analysis domain.

In this paper, we present how the Cadence Safety solution replaces our in-house custom fault injection flow: to achieve grading of STL (Software Test Library), the solution introduces several optimization techniques and executes with different simulation engines, maintaining scalability and performance. The Cadence Fault Campaign Manager (FCM) is connected to the Cadence FMEDA capability, which provides the FS verification plan. Cadence FCM is then leveraged for the demanding task of using design and application knowledge to identify safe faults and guide the execution of the fault campaigns. Structural and test-based analysis steps integrated in the Fault Campaign Manager flow are enabled to mark design/application-aware safe faults as well as to limit the scope of faults to simulate for each test. Results of the safety metrics verification are then automatically annotated back in the FMEDA, including the possibility to introduce expert judgement. These and other automated capabilities, some yet to be explored, are discussed in this paper.
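To make the role of safe faults in grading concrete, here is a deliberately simplified Python sketch that computes a detection ratio over a fault list after safe faults are excluded. It is not the Cadence FCM flow and not the failure-rate-weighted metrics defined in ISO 26262; the fault classes (`safe`, `detected`, `undetected`) and the function name are assumptions made for illustration.

```python
from collections import Counter


def detection_ratio(fault_results):
    """Fraction of non-safe faults that the test library detected.

    fault_results maps a fault name to 'safe', 'detected' or 'undetected'.
    Safe faults are excluded from the denominator, which is why marking them
    up front both shrinks the campaign and raises the resulting ratio.
    """
    counts = Counter(fault_results.values())
    dangerous = counts["detected"] + counts["undetected"]
    if dangerous == 0:
        return 1.0  # nothing left that needs detecting
    return counts["detected"] / dangerous


campaign = {
    "f0": "safe",
    "f1": "detected",
    "f2": "detected",
    "f3": "undetected",
    "f4": "safe",
    "f5": "detected",
}
ratio = detection_ratio(campaign)
print(f"detected {ratio:.0%} of non-safe faults")
```

In this toy campaign, two of six faults never need to be simulated, and the remaining four yield a ratio of 3 out of 4.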

Sanjay Singh, Texas Instruments
Mohammed Arif, Texas Instruments
Mangesh Pande, Cadence
Vinay Rawat, Cadence

Advances to CPF-Based Low-Power Mixed-Signal Integration and Verification

Delivering designs that serve a large number of applications with the lowest power consumption, on time, is key to winning the market. Hierarchical power-intent-based integration has proved effective for complex power-managed mixed-signal SoCs. This presentation highlights the challenges and solutions for low-power design integration, including RTL-instantiated level shifters and switches, hierarchical low-power rules, enabling early verification for hard IP, and power supply connectivity for legacy hard IP with Boolean abstraction of behavioural models and VHDL RTL.

Lakshmanan Balasubramanian, Texas Instruments
Ruchi Shankar, Texas Instruments
Sooraj Sekhar, Texas Instruments
Nidhi Singh
Ankhitha M
Subhadeep Aich

Complex Interconnect Verification and Performance Analysis Using Interconnect Workbench and Arm AMBA ATP

A memory controller SOC comprises many subsystems connected to an interconnect with a large number of masters and slaves. This interconnect routes transactions between different subsystems over different protocols. The complexity introduced into the interconnect by multiple subsystems and IPs, and their advanced configurability, contributes to verification challenges. It’s necessary to ensure that we aren’t starving IP functions of the bandwidth they need and that requests from latency-critical IPs are processed on time. This highlights the importance of performance parameters like latency and bandwidth.

In the past, building a verification environment with modelling and integration of a large number of slaves and masters has been a time-consuming and cumbersome task. For our interconnect with 30 masters and 40 slaves, it would easily take four to five months to complete the entire verification process. With IWB, however, testbench generation and sanity bring-up take a week, and the whole verification process can be completed in just a month. We test datapath, performance, boundary and error scenarios. IWB provides an editable user-directory folder, which gives us flexibility and control because regenerating the TB retains the user directory.

To stress-test the interconnect, multi-master, multi-slave tests were executed. This was achieved using AMBA Adaptive Traffic Profiles (ATP). With ATP enabled and readily integrated into the IWB environment, we gain the ability to generate exhaustive performance data for any kind of master-slave combination with very little sequence-writing effort. Varying just a few parameters allows us to control the transactions. The rate controls the bandwidth and latency. The address pattern controls address progression, the start/end address selects the particular slave, and the agent ID selects the initiating master. The data and the elaborate graphical overview obtained from the performance analyzer have also enabled the designer to create custom combinations to explore bandwidth/latency limitations further.
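The way a handful of profile parameters shapes the generated traffic can be sketched in Python as follows. This is a toy model, not the AMBA ATP specification or the IWB API: the `TrafficProfile` class, its field names, and the `SLAVE_MAP` address ranges are all hypothetical and chosen only to mirror the parameters named above (rate, address pattern, start/end address, agent ID).

```python
from dataclasses import dataclass


@dataclass
class TrafficProfile:
    agent_id: int                # selects the initiating master
    start_addr: int              # start/end addresses select the target slave
    end_addr: int
    bytes_per_txn: int
    rate_bytes_per_cycle: float  # issue rate, which sets the offered bandwidth
    stride: int                  # address pattern: step between transactions

    def transactions(self, count):
        """Yield (issue_cycle, agent_id, address) tuples."""
        span = self.end_addr - self.start_addr
        cycles_per_txn = self.bytes_per_txn / self.rate_bytes_per_cycle
        for i in range(count):
            addr = self.start_addr + (i * self.stride) % span
            yield (int(i * cycles_per_txn), self.agent_id, addr)


# Hypothetical address map: slave name -> [low, high) address range.
SLAVE_MAP = {"ddr0": (0x0000_0000, 0x4000_0000), "sram": (0x4000_0000, 0x4010_0000)}


def decode_slave(addr):
    """Return the name of the slave whose range contains addr, or None."""
    for name, (lo, hi) in SLAVE_MAP.items():
        if lo <= addr < hi:
            return name
    return None


profile = TrafficProfile(agent_id=3, start_addr=0x4000_0000, end_addr=0x4000_1000,
                         bytes_per_txn=64, rate_bytes_per_cycle=16.0, stride=64)
txns = list(profile.transactions(4))
targets = {decode_slave(addr) for _, _, addr in txns}
print(txns, targets)
```

Halving `rate_bytes_per_cycle` doubles the spacing between issue cycles, and moving the start/end window retargets the same profile at a different slave without touching any sequence code.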

A routing issue in the interconnect that would take up to two days to reproduce and debug in full-path verification can be identified much earlier with IWB, thereby reducing the verification time and effort. ATP functionality has allowed us to refine the architecture in the initial stage itself. This paper showcases the use of IWB and ATP to overcome the challenges presented in verifying a complex interconnect.

Vaibhavi Rastogi, Samsung
Gaurav Goyal, Samsung
Srinivasa B N, Samsung
Girish Kumar Gupta, Samsung

Innovative Approach to Address System-Level Verification Challenges of a Highly Complex Data Server SoC

Optimizing and Maximizing Verification Regression Throughput (Up to 5X) Using Machine Learning

The ever-increasing complexity of SoC designs and of the IPs that make up these SoCs poses tremendous challenges not only in the design phase but also in the verification flow. Mobile devices and smartphones have shown this trend, as predicted by the International Technology Roadmap for Semiconductors (ITRS) 2.0.

The need to deliver an IP or a SoC right the first time and in the shortest time to market is one of the biggest concerns of any verification team. Companies would lose hundreds of millions of dollars in revenue and their market competitive advantage if schedule slips happen. Quick yet reliable verification is what is needed. Simulation is still the preferred verification mechanism due to the prohibitive costs of emulators. However, the simulator is inherently slower and hence leads to larger verification TAT and slower closure.

Less time for verification with complete coverage closure is the need of the verification community today. Moreover, due to the random nature of the test sequences, there is never a guarantee that all specified coverage goals will be achieved by simulation. To improve the likelihood of hitting desirable coverage numbers and mapping the corner cases, randomized test stimuli are often simulated repeatedly during regression testing. This approach is not effective, as it results in a significant increase in the total simulation time. Since the amount of simulation one can perform is always limited, it is prudent to apply the test vectors in the most efficient way possible on the design instances that make the most difference.

However, is there an automated way to improve the regression efficiency: run fewer tests, spend less runtime on the regression runs, and achieve the same coverage at the end?

The answer is YES!!!

This paper introduces “Xcelium ML”, or what is officially known as the “Xcelium Parallel Logic Simulation with Machine Learning Technology”, which achieves regression efficiency in a structured way. Experiments on a live project at Samsung show 3-4X overall regression optimization, which translates to saving at least 3-5 person-days of effort per regression run for that IP block.
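The goal stated above (run fewer tests yet reach the same coverage) can be illustrated with a classical greedy set-cover selection, sketched here in Python. This is a swapped-in textbook technique for illustration only; it is not how Xcelium ML works internally, and the test names and coverage bins are made up.

```python
def select_tests(coverage_by_test):
    """Greedy set cover: keep picking the test that adds the most new bins.

    coverage_by_test maps a test name to the set of coverage bins it hits.
    Returns a reduced test list whose combined coverage equals that of all tests.
    """
    target = set().union(*coverage_by_test.values())
    covered, selected = set(), []
    while covered != target:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(coverage_by_test),
                   key=lambda t: len(coverage_by_test[t] - covered))
        selected.append(best)
        covered |= coverage_by_test[best]
    return selected


runs = {
    "t1": {"a", "b"},
    "t2": {"b"},
    "t3": {"c", "d", "e"},
    "t4": {"a", "c"},
    "t5": {"e"},
}
picked = select_tests(runs)
print(f"{len(picked)} of {len(runs)} tests reach full coverage: {picked}")
```

Here two of the five tests reproduce the full coverage of the original set. With randomized stimuli the coverage of a test is not known before it runs, which is the gap that a learned model, rather than this after-the-fact selection, is meant to close.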

Arun K R, Samsung
Preetham Lakshmikanthan, Samsung
Ashwani Aggarwal, Samsung
Sundararajan Ananthakrishnan, Samsung

Reusing Stimuli Is Easier and Closer to Reality Than You Think

Portability of stimulus across horizontal and vertical platforms is a long-cherished dream. A lot of effort has gone into achieving this by defining DPIs, custom automation structures, etc. TTR and high competition have mandated a structured, well-defined, standards-based methodology, leading to the birth of “Portable Stimulus”. This talk showcases the work done on a live project where Portable Stimulus models developed at the block level were truly portable across Block, SubSystem, SOC and FPGA. At the time of this talk, the chip had taped out using the Portable Stimulus methodology.

Priyadarshini Dixit, Analog Devices
Suguna Rajappa, Analog Devices
Keerthi Manjunath, Analog Devices