CadenceLIVE China – OnDemand

AI and Big Data Analytics

Cadence AI Technology Innovation and Future Development

Erick Chao, Cadence

Enhancing Chip Performance with Innovus Machine Learning-Based Delay Prediction

As machine learning technology continues to develop, it plays an increasingly important role in chip design, and various new solutions are being explored to improve the performance, power consumption, and area (PPA) of chips. This paper proposes an Innovus machine learning-based solution to improve PPA. Using supervised learning, the solution learns how cell and net delays change from CTS to PostRoute during the P&R stage, and fully integrates the ML-predicted delays into PreRoute optimization to guide the behavior of the Innovus tool in the OPT stage. The advantage of this method is that the ML-predicted delay can be computed for each timing arc (net or cell) and adjusted dynamically throughout the PreRoute optimization stage. To verify its effectiveness, we conducted a series of experiments; the results show that on some designs the method improves PreRoute-to-PostRoute delay and timing correlation, thereby improving chip performance. We also analyzed the method in detail and discuss its advantages and disadvantages. Because it uses machine learning to learn delay models automatically, it not only improves chip performance but also saves design time and cost, and dynamic adjustment helps optimize performance and PPA. However, the method requires time to train the model before it produces accurate predictions, and it needs substantial computing resources to complete training and prediction in a short time, which may limit its practical application.

In summary, this paper introduces an Innovus machine learning-based solution to improve chip PPA. The method uses supervised learning to learn cell and net delays from CTS to PostRoute in the P&R stage and fully integrates the ML-predicted delays into PreRoute optimization, guiding the behavior of the Innovus tool in the OPT stage. Experimental results show that the method can improve PPA and PreRoute-to-PostRoute timing correlation, thereby improving chip performance. We believe the method will be more widely used and will provide more solutions for chip design.
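The supervised-learning idea in this abstract can be illustrated with a toy example. The sketch below is only an assumption-laden illustration, not the Innovus implementation: it fits a one-feature least-squares line that maps a pre-route delay estimate to the post-route delay, using made-up data, and compares the error of the raw pre-route estimate with the error of the fitted prediction. The names `fit_line`, `preroute_ps`, and `postroute_ps` are hypothetical, and the fit is evaluated in-sample for brevity.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(predicted, actual):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Synthetic training data (assumption): pre-route delay estimates in ps and
# the post-route delays they turned into, with a small alternating error.
preroute_ps = [float(d) for d in range(10, 110, 10)]
noise_ps = [0.5, -0.5] * 5
postroute_ps = [1.2 * d + 5.0 + e for d, e in zip(preroute_ps, noise_ps)]

# "Training": learn the pre-route to post-route mapping from labeled arcs.
slope, intercept = fit_line(preroute_ps, postroute_ps)

# "Prediction": corrected delays that an optimizer could use before routing.
predicted_ps = [slope * d + intercept for d in preroute_ps]

mae_raw = mean_abs_error(preroute_ps, postroute_ps)
mae_ml = mean_abs_error(predicted_ps, postroute_ps)
print(f"raw pre-route error: {mae_raw:.2f} ps, predicted error: {mae_ml:.2f} ps")
```

A production flow would presumably use many features per timing arc and a held-out validation set; the point here is only that a model trained on post-route labels can correlate better with post-route delay than the raw pre-route estimate does.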

李雨钊, NVIDIA

Deep Enhancement of PPAC Based on ML Model and Generality of Innovus + Cadence Cerebrus Flow

As the productive force of the digital economy, computing power places new demands on high-performance computing, which in turn raises the bar for chip design. Chip designs must meet higher performance, power, area, and cost (PPAC) requirements to serve diverse applications, and design complexity keeps increasing as a result. This poses great challenges for the traditional IC back-end design process, which depends heavily on experimentation and experience during manual iterations and therefore needs many experienced engineers and a long time to arrive at reasonable design parameters. Achieving a project's PPAC goals within a limited schedule has undoubtedly become a difficult task. In recent years, machine learning (ML) has shown great potential in chip design. ML can quickly generate a variety of layout candidates and quickly redo layouts after major upstream design changes, and some advanced agents can efficiently create layouts that have never been tried before. Cerebrus, an intelligent and efficient design space exploration platform, has demonstrated this. There are already successful cases of applying the Cerebrus flow to improve PPA, achieving better performance, lower power consumption, more optimized area, and shorter iteration time. Here, we apply the Cerebrus flow at different process nodes and obtain optimal PPAC in each case, which shows the generality of the flow. Moreover, starting from the ML model of the first optimization run, in the scenario that gave the best results, we tried multi-round, deeper optimization. The results show that using the ML model as the starting point for the next optimization run can further enhance PPAC.
Our results indicate that a warm start, and even a cold start, using the ML model from a previous optimization flow is applicable in the Cerebrus flow, which benefits both deeper PPAC enhancement and runtime reduction in chip design.

邓轩, Sanechips
王娅, Sanechips

High-Speed SerDes Package Design Optimization Based on Optimality and Clarity 3D

The vigorous development of AI technologies such as GPT and other large language models has created greater demand for underlying technologies such as high computing power and high I/O throughput. The OIF launched the CEI-224G project in 2020. If 224G continues to use PAM4 signaling, as 112G SerDes does, the Nyquist frequency will be as high as 56GHz, which poses a great challenge to package design. Such a large bandwidth is very difficult to achieve with traditional design and simulation processes. This paper proposes using Clarity 3D Workbench for automatic parametric modeling, combined with the Optimality Explorer intelligent optimization engine, to optimize impedance-discontinuity structures such as package vias and ball pads. Compared with traditional manual optimization, which usually takes 2 to 3 weeks, the Optimality AI optimization workflow based on automatic parametric modeling takes only 1 to 2 days and explores the design space quickly and efficiently. The paper also discusses the key packaging process parameters that affect 224G SerDes signal quality, including layer stackup, materials, solder ball size, pad size, and anti-pad size. The final design achieves return loss below -15dB, insertion loss within -0.5dB, and TDR impedance within 93 ohms ±7% across the whole 56GHz frequency band. Finally, the paper studies different pinmap/ballmap patterns and analyzes their impact on crosstalk.
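The 56GHz figure follows from simple arithmetic: PAM4 carries 2 bits per symbol, so a nominal 224Gbps lane runs at 112GBaud, and the Nyquist frequency is half the symbol rate. The short Python sketch below works through this with nominal rates, ignoring any coding overhead; the function name `nyquist_ghz` is illustrative and does not come from the paper.

```python
import math


def nyquist_ghz(bit_rate_gbps: float, levels: int) -> float:
    """Nyquist frequency in GHz of a PAM-N lane: half the symbol rate."""
    bits_per_symbol = math.log2(levels)            # PAM4 -> 2 bits per symbol
    symbol_rate_gbaud = bit_rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2.0


f_224g = nyquist_ghz(224, 4)   # nominal 224G PAM4 lane
f_112g = nyquist_ghz(112, 4)   # nominal 112G PAM4 lane
print(f_224g, f_112g)
```

Doubling the bit rate while keeping PAM4 doubles the Nyquist frequency, which is why the package channel needs roughly twice the usable bandwidth of a 112G design.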

马世能, Sanechips
向令, Sanechips
余斌, Sanechips
庄哲民, Cadence

Auto-Optimization in High-Speed SI Simulation with New Powerful Tool

High-speed SI simulation usually takes a lot of lead time in PCB design projects, especially when optimization is required to meet customer targets, which can take many optimization rounds. It is also rather difficult, even for experienced simulation engineers, to optimize TDR and IL/RL to meet targets when there are many vias along the signal trace. Cadence has released a new simulation tool, Optimality, built on its existing 3D simulation tool, Clarity 3D Workbench, which achieves auto-optimization through parameter scanning and distributed computing technology. The Teradyne CHDC HW team tested and verified whether the tool can be applied to simulation projects and how much improvement it brings, and then delivered high-speed simulation projects with it. In tests and real projects, we found that Optimality is very helpful in SI optimization, especially for all kinds of via optimization. In projects supported by the Cadence China MSA Service AE Team, the maximum TDR impedance improved by 12.2 to 23.4 ohms with Optimality, and RL improved by 8.9dB (up to 120Gbps). By comparing Clarity 3D Workbench (Optimality) results with VNA measurements, we concluded that Optimality also has very good accuracy and is very reliable. With this new tool, simulation engineers with or without experience can work out an optimization plan in a single night, saving about 50% of simulation optimization time on projects with strict specs.

费佳成, Teradyne
尚昊, Teradyne
王胜利, Cadence
方永新, Cadence

Accelerating Verification Coverage Convergence Using Xcelium Machine Learning Technology

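As designs grow more complex, constrained-random verification has become the mainstream verification methodology. In general, stimulus should be as random as possible without violating the conditions described in the spec, so that verification explores the state space more thoroughly. This, however, makes functional coverage convergence very challenging. To address the problem, Cadence was the first to introduce machine learning in a simulator, Xcelium Machine Learning, which uses machine learning to converge functional coverage quickly and greatly improves simulation efficiency. This paper describes the Xcelium Machine Learning usage flow and compares results before and after applying machine learning in the same simulation environment. Finally, it looks ahead to the prospects for machine learning in simulation-based verification.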

植玉, Sanechips

Automotive Solution and IP

Designing Automotive SoCs Using Unified Safety Intent (USF)-Driven Cadence Full Flow

蔡军辉, Sanechips
李英男, Sanechips

Cadence Tensilica IP Solutions for Autonomous Driving

王伟, Cadence

Opportunities and Challenges of Deploying Large Models on Edge Devices

唐琦, Axera

Sharing Experience with the Cadence Tensilica Vision DSP

薛长宇, MulticoreWare

Spatial Computing Enables a New Experience of Immersive Perception and Interaction

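Xvisiotech is an important software partner for the Cadence Tensilica Vision DSP, especially in vSLAM applications. In this talk, we share how SLAM is deployed on the Tensilica Vision DSP to achieve good performance and energy efficiency.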

John Lin, Xvisio Technology

Interface IP Solution for LLMs AI SoC

Arno Li, Cadence

Embracing the New Future of Intelligent Automotive Systems with Cadence Automotive DIP

Juan Du, Cadence

Cadence 24G GDDR6 Accelerates AI/ML Training and Inference

Kenny He, Cadence

Custom/Analog Design

New Virtuoso Studio and Spectre Update

Zhong Fan, Cadence

Improving Layout Design Efficiency with the Virtuoso Device-Level APR Flow

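With advanced process nodes, layout design is becoming increasingly difficult and time-consuming, and the trend toward using tools to improve efficiency is increasingly clear. Taking layout efficiency improvement as its starting point, this paper focuses on the major contributions of the Row Template and WSP features. It first analyzes the pain points and difficulties that layout engineers encounter while drawing layouts, as well as the features they hope tools will provide in the future. It then summarizes the areas where layout efficiency gains were largest in FinFET-process projects and compares them against the related placement and routing features in Virtuoso. It concludes that Row Template and WSP significantly improve efficiency in layout placement and routing, respectively.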

李小芳, Sanechips
孙航, Sanechips

Several Techniques About Post-Layout Simulation in RF/Analog Design

After finishing the layout in RF/analog integrated circuit design, we have a circuit model that is closer to the actual product than the schematic. As semiconductor technology scales down, the layout carries more and more parasitic and interaction effects into the physical chip. It is therefore important for designers to run post-layout simulation, which helps check the function and performance of their circuit block. Post-layout simulation is a quality evaluation of both circuit and layout design, and it helps designers gain a deeper understanding of their work. Designers can optimize their design according to post-layout simulation results, which closes the loop with pre-layout simulation in analog circuit design, and the chip can then be sent to tapeout. Proficiency in post-layout simulation is therefore an essential skill for RF/analog designers. These slides present several techniques for running post-layout simulation efficiently in RF/analog design. We can create a new cell view and put the netlist in it, pack multiple post-layout netlists into a model library, or use a config view to switch between the schematic and the post-layout netlist. With these techniques, designers can work more efficiently and with less effort. Each method has its own advantages, and designers can choose according to their own preferences and habits.

郭昶, AICXTEK

Application of the One-Stop Voltus-XFi EMIR Analysis Solution in Custom Analog IC Design

With the continuous reduction of process size and the increase in design complexity, performance degradation and chip failure caused by IR drop and electromigration (EM) have become more common in chip design. To address these problems, EMIR analysis needs to be carried out in the post-layout simulation signoff stage of the analog design process. Unlike general post-layout simulation in analog design, EMIR analysis usually requires special simulation and analysis tools. These tools fall into three main categories: design input, simulation, and result analysis. In traditional EMIR solutions, the design input stage is quite complex, and designers spend a long time sorting out design input files and the various simulation and analysis options that affect the results. The simulation itself generally takes a long time, because EMIR is a type of post-layout simulation and is usually run at higher levels of the design, so simulation speed must be balanced against accuracy. For result analysis, EMIR simulation usually generates a large amount of data, so how quickly the tool loads the data and whether it provides effective analysis methods are also very important. Voltus-XFi differs from traditional EMIR solutions in that it makes great improvements in all three aspects. It provides a one-stop solution that lets designers keep a use model similar to general analog circuit simulation, greatly reducing the time they spend on the EMIR tool itself and helping them focus on solving EMIR problems. This article mainly introduces the usage of Voltus-XFi.

陈思雨, Sanechips
黄亚平, Sanechips
胡劼, Sanechips
曾义, Cadence

Comprehensive and Efficient SerDes Circuit Verification with the Next-Generation FastSPICE Simulator Spectre FX

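As electronic products continue to innovate, application demands on SerDes keep rising. In recent years SerDes circuits have trended toward higher complexity, faster data rates, more advanced process nodes, and more complex interaction between analog and digital signals. These trends make comprehensive verification of SerDes circuits more challenging and call for stricter, more accurate verification of timing, functionality, and power. The Spectre FX simulator meets these requirements for comprehensive and efficient SerDes verification well. Spectre FX is the next-generation FastSPICE simulator in the Cadence Spectre simulation platform. Built on a scalable, innovative FastSPICE architecture, it delivers up to 3X simulation speedup while maintaining accuracy. Beyond accuracy, it offers faster simulation, larger capacity, and broader flow support, helping customers shorten product development cycles while ensuring thorough verification. Ease of use is another clear advantage: Spectre FX provides four easy-to-use presets for quickly trading off performance, capacity, and accuracy, which avoids the tedious iterative tuning of traditional FastSPICE simulators. Spectre FX is also fully compatible with the underlying technology of the Spectre platform and integrates seamlessly with Virtuoso ADE and AMS Designer, so it is consistent and compatible with the other products in the Spectre simulation platform and users get up to speed quickly. This paper briefly introduces the features and usage of Spectre FX and shares experience from using it in projects to verify and optimize designs.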

刘广涛, UniSoc
颜学超, UniSoc

A Solution for Accelerating Analog Simulation, Verification, and Analysis Based on Quantus-Reduce

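As semiconductor technology advances, process geometries keep shrinking and the parasitic effects that must be considered in chip design and manufacturing grow more complex. This places higher demands on designers and makes circuit simulation more laborious. Striking a good balance between simulation speed and accuracy, so as to work efficiently and shorten turnaround time (TAT), is a challenge every designer faces. This paper discusses how to use Quantus, the Cadence parasitic extraction tool, for post-layout parasitic extraction and how to use the Quantus Qreduce feature to reduce the post-layout netlist, shrinking its size and speeding up simulation. Qreduce mathematically replaces RC networks with equivalent networks that have fewer nodes, which reduces netlist size without a significant loss of accuracy. The paper covers the degree of netlist reduction, the impact on simulation accuracy, simulation speed, and memory consumption, and presents key comparison metrics.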

李嘉欣, Sanechips
黄亚平, Sanechips
胡劼, Sanechips
凌秋婵, Cadence
杨晓晨, Cadence

Utilizing Cadence Virtuoso PDK Cockpit to Help with PDK Validation in GlobalFoundries 22FDX MRAM IP Design

This paper introduces PDK Cockpit, a new Virtuoso feature, and its application in GlobalFoundries MRAM design based on the 22nm FDX platform.

王连祝, GlobalFoundries
许思凡, GlobalFoundries
朱爱洲, GlobalFoundries

LVF Characterization of SRAM Across Multiple PVT Corners Based on the Cadence Liberate Mx Trio Flow

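As semiconductor processes reached the 28nm node, the traditional OCV model could no longer be used to guide timing constraints; at the 7nm and 5nm nodes in particular, the pessimism of the OCV model causes power and timing losses of more than 50%. In contrast, the LVF model better characterizes delay and power for multi-stage circuits at nanometer nodes. However, obtaining an LVF model with 3σ accuracy requires on the order of ten million Monte Carlo simulations. For large-scale designs such as SRAM, whose critical timing paths consist of a large number of transistors, even if data accuracy is compromised and acceleration techniques are applied, the runtime cost is still 2 to 5 times that of the traditional OCV model, not to mention the time needed to verify data correctness. This paper uses the Cadence Liberate Mx Trio tool to perform LVF characterization of SRAM and to shorten the cycle for generating LVF data across multiple PVT corners. The tool integrates memory library characterization, variation modeling, and library validation; it supports multi-PVT library characterization with a unified flow, accurate characterization data, and an efficient process. Cadence Liberate Mx Trio is the industry's first tool to provide a high-accuracy LVF extraction solution for memory circuits. In addition to memory, it can perform library characterization and LVF modeling for custom blocks and mixed-signal blocks. It automatically partitions the full netlist, automatically finds probe nodes on critical paths, and works with Cadence fast simulators to generate timing libraries that meet the user's accuracy requirements in the shortest possible time.

Drawing on a real project, this paper evaluates the feasibility of generating multi-PVT LVF models for SRAM with Liberate Mx Trio in terms of data accuracy, data iteration cycle, and tool flow complexity.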

潘任豪, Sanechips
刘琦兵, Sanechips
崔大勇, Sanechips
黄强, Sanechips
王瑾瑜, Sanechips

Application of the AMS-Spectre FX Simulator in Large-Scale Circuit Simulation and Verification

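The Cadence® Spectre® FX simulator is a new member of the Cadence Spectre simulator family and a next-generation transistor-level FastSPICE simulator. It is mainly used for design verification of large-scale circuits such as memory and SoC. Spectre FX balances accuracy and speed well, makes full use of hardware resources, and is accurate, efficient, and simple to operate, meeting modern requirements for large-scale circuit simulation. By partitioning the circuit and using different time steps for different partitions, it can greatly shorten simulation time for large circuits. In mixed-signal simulation with AMSD, using Spectre FX as the analog engine greatly improves efficiency. For a real top-level circuit from our company, this paper uses AMSD-FX for mixed-signal simulation and verifies that it significantly shortens simulation time and speeds up iteration without affecting simulation accuracy.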

易睿佳, Sanechips

Digital Design and Signoff

Engines Matter: Delivering Superior Designs at Advanced Nodes

Qinghua Liu, Cadence

Application of an xReplay-Based Power Estimation and Optimization Flow in a High-Performance Processor Core

In recent years, with the explosive growth in demand for computing power and the promotion of green development, cloud computing centers have set higher requirements for high-performance computing (HPC) chips, which must run faster while consuming less power. Power efficiency has become a key criterion for HPC chips, and leading chip design companies have shifted their optimization strategy from improving performance to optimizing power efficiency. Power can be reduced at every stage of chip development. In this paper we focus on power optimization in the physical implementation of a high-performance processor core by employing Cadence's latest toggle prediction technology throughout the physical implementation flow. Xreplay can predict toggle rates that are highly consistent with the actual workload at every optimization step from RTL to GDS, guiding the tool to optimize power. Compared with the traditional power optimization flow that uses a uniform toggle rate and with the Joules-replay power optimization flow, the Xreplay power optimization flow achieves about a 6% power benefit. Xreplay can also generate a simulation waveform that is highly consistent with post-simulation once postroute optimization finishes; in our comparison, the power difference between Xreplay and post-simulation is within 0.5%. Because there is no dependence on the long-running post-simulation flow, reasonable IR results can be obtained earlier in the project, which leaves more time to build a more robust power/ground structure and fix IR-drop violations.

李峄, Jaguar Microsystems
吴驰, Jaguar Microsystems
杨超, Jaguar Microsystems

Power Optimization and Evaluation Flow Based on Joules

As chip complexity increases, power is becoming one of the important factors affecting design signoff. Obtaining accurate power numbers at a relatively early stage and reducing power through optimization are becoming key technical issues. Joules Xreplay can generate gate-level waveforms for power analysis, which truly shifts gate-level power analysis left with relatively high accuracy: the gap is about 2% compared with the power computed from post-sim waveforms. Joules Xreplay and the CGLAR feature are also invoked in the synthesis stage to optimize power, with impressive results: about 5% power saved at the cost of only 0.5% area. Together, these methods and this flow help greatly reduce power and deliver accurate power numbers at an early stage, which reduces project iteration time and shortens part of the overall chip development process.

罗心月, OMNIVISION
印鼎, OMNIVISION
罗锋, Cadence

Smart Optimization and Timing Closure for Automotive Designs-Block Level to Full-Chip

As the demand for automotive designs increases, so does the complexity of the designs themselves. With increasing design complexity, timing closure poses significant challenges at both the block and full-chip levels to taping out in a timely manner with high-performing products in terms of PPA (power, performance, and area). In this paper, we present the challenges we have encountered over the years in timing closure for automotive designs, along with the solutions we have employed to meet our tapeout schedule on time, every time, with proven silicon.

Tuan Nguyen, Renesas

Comprehensive Fastest Signoff Closure from Block-Level to Full-Chip with Tempus Timing Solution and Cadence Certus Closure Solution in Innovus Implementation System

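As process nodes advance, chip dimensions shrink while design cycles grow, and closing interface timing between blocks in the top-level design becomes correspondingly harder. Efficient, high-quality interface timing optimization strategies have therefore become an urgent need. This paper uses an ILM (Interface Logic Model) flow based on logic reduction and a Certus flow based on distributed ECO (Engineering Change Order) to optimize top-level interface timing at the PR (Placement and Route) stage. Test results show that the ILM flow reduces the time needed to load the design data, fixes timing violations to within a controllable range, and produces post-optimization results that are a usable reference. The Certus flow not only effectively shortens interface timing optimization time but also fixes a large share of timing violations.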

赵子瑨, Sanechips
刘元龙, Sanechips
刘宇峥, Cadence

Challenges and Solutions to Achieving Overnight Chiplet Signoff Closure

In this presentation, we will discuss the criteria and results of our evaluation of the Cadence Certus Closure Solution for advanced nodes (7nm, 5nm, 3nm), in the context of both the Smart Hierarchical flow for large CPUs and full-chip/chiplet timing closure flows. As we deploy Certus for production use on our next-gen Total Compute CPU and GPU cores, we will cover the methodology that can account for scalability and productivity while maintaining best-in-class PPA.

We will provide information on the benefits of using Cadence Full Flow for our IPs with Genus, Innovus, Tempus ECO/Certus for block-level closure, Certus Closure Solution, Quantus extraction and Tempus Signoff Solution for chiplet/full chip signoff.

Avinash, Arm
Jessica Zhang, Cadence

Confidently Optimizing and Signing off Automotive Designs with Tempus Timing Solution

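In this paper, we share how we use Tempus ECO for timing optimization and Tempus STA for final signoff; all results shared have been validated in silicon. The seamless integration of Tempus ECO with the Innovus Implementation System lets us close block-level timing faster while achieving optimal PPA at the full-chip level. In addition, using Tempus for final STA gives us better timing correlation with the PR tool, more accurate performance estimates, and more efficient machine utilization, helping us meet the planned time to market.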

Jing Shao, SemiDrive

Chip Area Reduction Techniques with Diesize Doctor in Innovus

Chip die size is always critical in chip implementation: a smaller die means lower chip cost. This paper introduces different ways to save chip area during implementation. “Diesize Doctor” was developed to support die size analysis and reduction, and it provides two functions. The first is the “Routing Metric” report, which evaluates routing health. It includes regular metrics from Innovus (e.g., overflow, hotspot, total wire length) and a newly developed track-utilization utility for routing analysis. With “Routing Metric”, different design databases (e.g., different libraries, netlists, floorplans, tool flows, user setups) can be compared and selected easily. The second function is “Diesize Inspection”, which inspects chip status and tool options to find potential opportunities for area improvement. The inspections include 1) chip area utilization and routing ingredient analysis, e.g., on-track power stripes, memory/analog channel cost, physical and spare cell cost, low-power design cost, clock tree cost, DFT cost; 2) recommended tool version, flow, and configuration checks; 3) hotspot diagnosis and corresponding suggestions, e.g., padding, corner/channel congestion, unbalanced routing resources, design architecture; 4) other checks such as pad-limited reminders and MBFF ratio. Different chip area reduction methods were implemented on 40nm MCUs. The proposed flow is 1) library/metal stack selection; 2) floorplan optimization; 3) power stripe optimization; 4) initial run; 5) chip area diagnosis with “Diesize Doctor”; 6) seeking opportunities for smaller size and resolving hotspots. With this flow, a smaller chip area is achieved with better area utilization and higher routing efficiency.

戈喆, NXP Semiconductors
何其, NXP Semiconductors
闫旭, NXP Semiconductors
高婷, NXP Semiconductors

Application of Fully Automated Optimization Based on PG Network in High-Performance CPU Core

As the integration of high-performance computing chips increases and process technology advances, metal wires become narrower and narrower. When resistance in the on-chip power network increases and high-density logic gates toggle at the same time, voltage drop (IR drop) occurs on the power network, causing timing problems in the chip and possibly even functional failure of logic gates. Based on the flash PG flow of Innovus, the Cadence implementation tool, this paper completes the full implementation and rapid iteration of the PG network, uses auto reinforce pg and trim pg to trade off dynamic voltage drop against timing in a high-performance CPU core from two directions, and completes whole-process optimization of the PG network from floorplan to the PR (Placement and Route) stage. The results show that, with the same machine resources, the flash PG flow speeds up powerplan by up to 10 times, especially for top-level designs, which effectively saves PG mesh exploration time early in the design. Auto reinforce pg and trim pg fix 48% of IR drop violations by reinforcing the PG in IR-weak areas and trimming redundant PG, respectively, and free up more routing resources for the design so that timing and DRC (Design Rule Check) do not deteriorate.

姜姝, Jaguar Microsystems
杨超, Jaguar Microsystems
吴驰, Jaguar Microsystems

Application of the Concurrent Multi-Die Optimization Physical Implementation Solution

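As chip manufacturing processes approach physical limits, 3D-IC chiplet designs that stack multiple dies have become one of the best ways to extend Moore's Law. The Integrity 3D-IC platform integrates design planning, physical implementation, and system analysis in a single management interface, providing a complete solution for 3D design. The traditional die-by-die flow implements the two dies separately in 2D after the 3D structure is built; the tool also offers a concurrent multi-die optimization physical implementation flow. In this work, on a real project, we used the Cadence Integrity 3D-IC tool to build a dedicated concurrent multi-die PnR flow in which the two dies share a single database for concurrent placement, location optimization of 3D structures (hybrid bonding bumps/TSVs), clock tree synthesis, and routing. The co-optimized 3D PnR approach delivers better overall design results than the 2D and die-by-die approaches.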

黄彤彤, Sanechips
陈昊, Sanechips
武辰飞, Sanechips
周国华, Sanechips
欧阳可青, Sanechips
许立新, Cadence
徐国治, Cadence
李玉童, Cadence

How Has Socionext Shortened STA Schedule in Developing 5nm Large-Scale Design – Tempus DSTA Case Study

Akihiro Nakamura, Socionext

PCB, Package Design, and System SI/PI/Thermal Simulation - 1

Accelerating Multiphysics System Simulations

Over the years, Cadence has developed significant processes for advancing multiphysics system analysis. There are multiple innovative products coming to this field, including Cadence's Clarity, Celsius, Sigrity X, Optimality, and Fidelity solutions, that deliver remarkably greater performance than existing technologies in the market. To provide a comprehensive multiphysics system analysis product portfolio, we acquired the best-in-class simulation technologies from Integrand, Pointwise, Cascade, and Future Facilities, spanning the multi-dimensional domains that cover signal and power integrity (SI/PI), electromagnetics, thermal management, and computational fluid dynamics (CFD). As we continuously innovate and excel in this fast-growing field, our team provides revolutionary technologies to accelerate the simulation process for our users and customers.

Focusing on generative AI technology, last year we announced Optimality Intelligent System Explorer, the industry's first AI-driven system design optimization solution, which has been highly integrated with our proven Clarity and Celsius technologies to reduce design respins that help our customers achieve magnificent productivity gains. Already providing tremendous design optimization benefits for our customers, the Optimality technology is now in the process of integrating with Cadence's Allegro design platform to further revolutionize the workflow for PCB and IC package designs.

We are dedicated to adopting heterogeneous hardware technologies to accelerate simulations by porting our simulation products to different CPU/GPU platforms and are excited to bring significant advancements to optimize design productivity.

Ben Gu, Cadence

Cadence Advanced Packaging EDA Tools Efficiently Enable CoWoS-S Silicon Interposer Design and Signoff

With Moore's Law slowing down, chiplet-based packaging solutions have become increasingly appealing for advanced computing applications. After more than a decade of development and evolution, CoWoS-S technology has become one of the most popular 2.5D integration packaging solutions, given its advantages in high-bandwidth signal transmission, low-latency connections, smooth communication between logic dies and high-bandwidth memory (HBM), and, most importantly, its numerous high-volume manufacturing successes. However, challenges remain for designers, who must deliver a high-density redistribution layer (RDL) silicon interposer design with a shorter design cycle and satisfactory power integrity (PI) and signal integrity (SI) performance. As a pioneer in the design tool industry, Cadence has put continuous effort into developing new design methodologies and efficient design tools to empower 2.5D package design. In this article, we present a systematic design flow that uses Cadence tools to enable a more robust interposer design process. We also illustrate methods to achieve the desired PI performance for core areas with high current demand, and the desired SI performance for connections between the logic die and memories under given crosstalk and capacitive parasitic effects. The article thereby shows how critical advanced Cadence tools and design methodology are to solving the challenges of the latest CoWoS-S technology.

谷雨, MetaX
徐兴隆, MetaX
陈恺立, MetaX
刘华宝, MetaX
孙晨, MetaX
王海三, Cadence
祁芮, Cadence
徐国治, Cadence

A New Generation CFD Platform – Fidelity

Jane Jia, Cadence

A Study of DDR5 Simulation Accuracy and Its Application in Memory Upgrades

With data rates slated to reach 6400MT/s, the improved performance of DDR5 comes with strings attached. Improved performance usually requires improved signal integrity (SI), which in turn requires improved power integrity. For signal integrity, four types of signals require special attention in our simulation, namely clock/address/command/control (CAC) and data. Because the transmission rates of these signals are relatively high, in this paper we run signal quality and timing simulations to ensure that the entire bus meets the requirements of the DDR5 protocol. DDR5 offers ODT capabilities on the DRAM's CAC signals. In multi-load topologies there are a large number of possible ODT setting combinations, so finding the optimal settings becomes important. Furthermore, unlike data bits, address bits cannot be individually delayed at the controller to optimize their arrival at each DRAM with respect to the clock. The only way to do this is to adjust the trace lengths between the DRAMs on the bus, which increases the need for SI analysis.

On a similar theme, the DQ receivers will now support DFE equalization on DDR5 DRAM. With 4 taps of DFE and a gain bias, there are a total of 377,856 combinations of settings just for the DRAM equalization. Maximizing the eye-opening further requires adjusting the ODT of the DRAM. Simulations are also needed to handle such conditions.

黄刚, EDADOC

A Via Design Method Based on Back-to-Back Mounted QSFP-DD Connectors

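In today's switch market, 56G-PAM4 has a growing market share and has become the mainstream. Against this background, how to build on existing technology to deliver the required functionality while lowering cost has become an important topic. This paper uses Clarity, the advanced Cadence 3D electromagnetic simulation tool, to analyze models of the QSFP-DD connector and PCB pad pins, and arrives at a new fan-out via method for belly-to-belly mounted QSFP-DD connectors at the lowest PCB design and fabrication cost. Simulation results show that the method meets signal integrity requirements with only a small SI penalty, which yields an economic benefit.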

柳雷, Zenosic

Quick Solution Verification in 112Gbps Serdes Loadboard Design Based on Clarity 3D Layout

In ATE loadboard design, as SerDes speeds increase and DUT BGA pitch shrinks, fabrication risk rises and tight SI specs become harder to meet. Each time, we have to try many solutions, run a 3D simulation on each of them, and then pick one with good SI performance and low fabrication risk. Because 3D simulation is time-consuming, our project schedules do not leave much time for experiments. Clarity 3D Layout, with its embedded Clarity 3D and hybrid solvers, now makes this possible. This paper first introduces two traditional solutions. 1. The PTH via backdrill solution can be used when the DUT BGA pitch is 0.8mm or larger, where it easily passes the simulation spec, but it is not suitable for DUT pitches of 0.5mm or smaller because return loss is too poor. 2. The blind via solution gives good simulation results because a blind via has no stub, but it leads to high board fabrication risk and low yield. To overcome this dilemma, we propose a new solution that combines the advantages of both: laser via plus PTH backdrill. Laser vias allow us to route the loopbacks in the outer ring of the BGA on one layer, so that the loopbacks in the inner ring can easily be fanned out with PTH vias whose long stubs are then backdrilled. Because DUT fanout vias should be filled with epoxy to keep good contact with the DUT socket pins, this paper also discusses the SI effect of air over the stub versus epoxy over the stub after backdrilling. It is verified that air over the stub leaks hardly any energy and gives better insertion and return loss, which means we should apply epoxy filling first and backdrill last. The final results show good SI performance and good fabrication yield, and simulation efficiency improves by 20%.

毛剑峰, Teradyne

Insights and Analysis of High-Speed Technology

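With 5G network expansion and 6G technology development advancing rapidly, the industry continues working to roll out 112Gbps systems while turning its attention to the next-generation 224Gbps standard. The foreseeable growth in data center and mobile user traffic is driving demand for higher data rates. This paper presents related research on 224G high-speed link channels.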

黄健, Sanechips
朱代山, Sanechips
杨智伟, Sanechips

SI Design and Simulation Flow for GDDR6 High-Speed System

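This paper shares a GDDR6 design and simulation flow based on Cadence EDA tools. It covers crosstalk optimization among the data signals (DQ, DBI, EDC) and signal quality optimization of the clamshell topology for the control signals (CA, CABI). The simulation flow consists of two parts: frequency-domain parameter extraction and time-domain eye diagram simulation. Clarity 3D Layout is used to extract S-parameters for the high-speed signals of the chip package and the PCB, and the S-parameters of the package, PCB, and memory device are then cascaded to obtain the S-parameters of the full link. Because the return path in the cascading method deviates from the actual one and the effect of the solder balls is counted twice, the paper also extracts S-parameters from a merged package-and-PCB model and compares the result with the cascading method. Time-domain eye diagrams are obtained with TopXp; the channel simulator can account for asymmetric rising and falling edges of single-ended signals, and the software supports multi-core parallel computation, so the eye diagrams of the two channels are obtained quickly.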

吴凯, Enflame
秦征, Enflame
杨向飞, Enflame
刘珍黎, Enflame
宫建徽, Enflame
邱雪松, Enflame
方永新, Cadence
王海三, Cadence
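
The cascading step mentioned in the GDDR6 abstract above can be illustrated with a minimal sketch. It is not the authors' flow or any Cadence API: it only applies the textbook formula for cascading two 2-port S-matrices at a single frequency, assuming both share the same reference impedance and that port 2 of the first network drives port 1 of the second. The function name `cascade_two_port` and all numeric values are hypothetical.

```python
def cascade_two_port(a, b):
    """Cascade two 2-port S-matrices given as ((s11, s12), (s21, s22)).

    Assumes equal reference impedance and that port 2 of `a` is
    connected to port 1 of `b`.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    denom = 1 - a22 * b11  # multiple reflections at the junction
    s11 = a11 + a12 * b11 * a21 / denom
    s12 = a12 * b12 / denom
    s21 = a21 * b21 / denom
    s22 = b22 + b21 * a22 * b12 / denom
    return ((s11, s12), (s21, s22))


# Hypothetical single-frequency values, not taken from the paper.
package = ((0.10 + 0.02j, 0.95 - 0.05j), (0.95 - 0.05j, 0.08 + 0.01j))
pcb = ((0.05 - 0.01j, 0.90 - 0.10j), (0.90 - 0.10j, 0.06 + 0.02j))
thru = ((0j, 1 + 0j), (1 + 0j, 0j))  # ideal lossless, matched connection

channel = cascade_two_port(package, pcb)
print("cascaded S21:", channel[1][0])
```

A cascade of this kind treats each block as an isolated network, which is why the abstract also compares it against extraction from a merged package-and-PCB model.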

PCB, Package Design, and System SI/PI/Thermal Simulation - 2

Allegro X AI: Generative Design for PCB and Packaging

Adam Fuchs, Cadence

System-Level LVS Checking of Heterogeneous Integration Packaging Based on Integrity 3D-IC

In recent years, as silicon process dimensions have reached the single-digit nanometer level, it has become increasingly difficult to sustain Moore's Law. Taking cost, yield, power consumption, and other factors into account, single-digit nanometer processes will no longer be competitive. Advanced packaging solutions with heterogeneous integration, such as 2D MCM, 2.5D, and 3D, will continue to meet market requirements for miniaturization, high performance, and low cost, and have thus become the main direction for extending Moore's Law. They also present new challenges, especially for system-level LVS checking. Because heterogeneous integration packages are complex in structure and large in scale, a mistake in any step can have a huge impact, so a final system-level check of the whole packaging system is necessary. However, the parts of a heterogeneous integration come from different process nodes, different fabs, and different EDA design tools, which makes bringing them together for system-level LVS checking extremely challenging. In this paper, the Cadence Integrity 3D-IC tool is adopted to address these challenges and implement system-level LVS checking of heterogeneous integration packaging, which verifies the effectiveness and practicality of the tool and guarantees the reliability of the system solution.

张成, GlobalFoundries
赵佳, GlobalFoundries
李晴, GlobalFoundries

PCB Design Based on In-Design Analysis

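With the development of the automotive industry's "new four modernizations," vehicles are rapidly entering the electronics race, and automotive electrical/electronic (E/E) architecture keeps changing accordingly, evolving from distributed and domain-controller architectures toward a centralized architecture. A centralized E/E architecture consists mainly of body sensors, a central computing unit, and several zone controllers. The volume of vehicle and environmental data transmitted among sensors, processors, and domain controllers is very large and the signal rates are high, which poses a serious challenge to the signal integrity of high-speed signals and demands particular attention from designers. The traditional design methodology is to build a product prototype according to the requirements and then test and debug it. Today, time to market is as important as product performance and cost, and the traditional approach is very inefficient. If signal integrity is not considered at the initial stage, first-pass success is hard to achieve. Designers therefore need to apply mature design rules accumulated from engineering experience wherever possible and make full use of quantitative methods to estimate the expected product performance. To this end, the Cadence In-Design Analysis flow invokes the Sigrity engine from within PCB Editor to analyze impedance, crosstalk, coupling, reflection, return path, and DC voltage drop directly on the PCB data, and presents the results quickly and accurately as color maps and tables, so that PCB layout engineers can modify the design more easily. This accelerates the design while also ensuring product quality.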

沈子尧, NIO

2.5D Package Organic Interposer Design and Validation with Allegro Package Designer Plus and Pegasus Verification System

An organic RDL interposer is used in a 2.5D package structure to enable high interconnect density with 2um/2um line width/spacing. The presentation covers RDL design implementation with Allegro Package Designer Plus (APD+) and physical verification with Pegasus DRC. System integration with Orbit IO and substrate design with APD+ are also necessary for a complete package product design cycle. To optimize the performance of a 2.5D package, electrical, mechanical stress, and thermal simulations can also be conducted. Co-design of silicon, interposer, and substrate is critical to meeting requirements while minimizing cost. An example of an HBM2e-to-CPU interconnect demonstrates that finer trace width, smaller spacing, and smaller via pads can reduce the layer count.

杨程, JCET

VRM Model and Loadline Simulation

With the development of electronic technology, server CPU power consumption keeps increasing, along with current amplitude and slew rate. Transient voltage droop is an increasing challenge, and power supply techniques for mitigating it keep emerging. Traditional PI analysis focuses only on power supply quality at the die on the nanosecond scale, while system voltage droop on the microsecond scale is under-analyzed. This paper uses SystemPI for VRM modeling and feedback loop construction to evaluate the system voltage droop at the microsecond level and the voltage regulation behavior of the buck circuit itself. Simulation results for a loadline circuit with voltage and current feedback are obtained and agree well with test results.

张媛瑗, Zhaoxin
石百仟, Cadence

Application of Celsius PowerDC-Based Electro-Thermal Co-Simulation in Chip Thermal Design

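In today's chip design, electro-thermal issues are an important consideration, because a chip's operating temperature affects its performance and reliability. Conventional thermal simulation often treats the chip's power consumption as the only heat source, which makes simulated parameters such as junction temperature deviate from reality, because Joule heating from current flowing in the package also contributes to the chip's temperature field. To account for this Joule heating and perform electro-thermal co-simulation more accurately, this presentation uses the Cadence Celsius PowerDC tool, which combines DC IR-drop simulation with thermal simulation based on finite element analysis (FEA). By setting up the chip power consumption, the current and voltage sources at the corresponding locations, and the temperature dependence of the materials' electrical parameters, several sets of electro-thermal co-simulations were completed. The results were compared with those of a mainstream thermal simulation tool and with measurements, and agreed well with both. The presentation also details the procedure for running electro-thermal co-simulation with PowerDC; promoting this flow can help the relevant teams improve their capability for accurate thermal simulation. In summary, simulation with Cadence Celsius PowerDC helps package designers better understand the electro-thermal characteristics of a chip and optimize the design to improve its electro-thermal performance and reliability.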

朱瀚翔, UniSoc
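
The coupling described in the electro-thermal abstract above (Joule heating depends on resistance, resistance depends on temperature, temperature depends on total heat) can be sketched as a fixed-point iteration. This is a lumped, single-node illustration only, not the FEA method used by Celsius PowerDC; the function name, the single thermal resistance `theta_c_per_w`, and every numeric value are assumptions made for the example.

```python
def electrothermal_steady_state(p_die_w, current_a, r0_ohm, alpha_per_k,
                                t_ref_c, t_amb_c, theta_c_per_w,
                                tol_c=1e-6, max_iter=100):
    """Iterate a lumped electro-thermal loop to a self-consistent temperature.

    r(T) = r0 * (1 + alpha * (T - t_ref))   temperature-dependent resistance
    p_joule = I^2 * r(T)                    Joule heat in the package path
    T = t_amb + theta * (p_die + p_joule)   single-node thermal model
    """
    temp_c = t_amb_c
    for _ in range(max_iter):
        r_ohm = r0_ohm * (1 + alpha_per_k * (temp_c - t_ref_c))
        p_joule_w = current_a ** 2 * r_ohm
        new_temp_c = t_amb_c + theta_c_per_w * (p_die_w + p_joule_w)
        if abs(new_temp_c - temp_c) < tol_c:
            return new_temp_c, p_joule_w
        temp_c = new_temp_c
    raise RuntimeError("electro-thermal iteration did not converge")


# Hypothetical inputs: 5 W die power, 10 A through a 2 mOhm copper path.
t_coupled_c, p_joule_w = electrothermal_steady_state(
    p_die_w=5.0, current_a=10.0, r0_ohm=0.002, alpha_per_k=0.0039,
    t_ref_c=25.0, t_amb_c=25.0, theta_c_per_w=10.0)
t_die_only_c = 25.0 + 10.0 * 5.0  # same model with die power as the only source
print(f"die-only: {t_die_only_c:.2f} C, with Joule heating: {t_coupled_c:.2f} C")
```

Even in this toy model the die-only estimate is lower than the coupled one, which is the kind of deviation the abstract attributes to ignoring package Joule heating.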

DDR5 Simulation Analysis

After the JEDEC committee released the DDR5 standard, mainstream DDR5 manufacturers rapidly released their own DDR5 products. Higher DDR5 data rates bring better performance, but also more challenges in signal integrity design. Based on a product design simulation case, this paper proposes a DDR5 simulation channel model. Using the Sigrity SystemSI (TopXp) tool on the Cadence platform and comprehensively considering the impact of crosstalk, jitter, and noise, risks in the DDR5 design link can be identified and signals optimized, effectively supporting DDR5 development requirements and facilitating product implementation.

孟利强, Sanechips
于明坤, Sanechips

3D-IC Thermal Co-Simulation Solution Based on Voltus and Celsius

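With its advantages of high performance, small area, small volume, and low cost, 3D-IC technology is gradually becoming the preferred choice for high-performance chips. As 3D stacking grows more complex, heat has also become an important factor affecting PPA. In PI verification, the temperature used is usually a fixed value. The key to this problem is how to perform thermal analysis on a 3D design and feed realistic temperatures back to the back end for iterative PI verification. The combined Voltus and Celsius flow brings the temperatures simulated by Celsius into Voltus as a temperature map, so that PI simulation runs with more accurate temperatures, removes the excess pessimism caused by temperature uncertainty, and guides more precise P&R implementation, thereby optimizing PPA. In addition, the flow can be applied at an early design stage to optimize the chip design and the cooling solution and to eliminate iteration risk as early as possible.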

程华, Sanechips
丁萍, Sanechips
陈昊, Sanechips
张海亮, Sanechips
周国华, Sanechips
陈利斌, Cadence

Verification - 1

HW-Assisted Solution and Technology Update

Michael Young, Cadence

Verisium AI-Driven Verification Platform Leads Verification

王正算, Cadence

System VIP Speed-Up Verification for Interconnect

The interconnect is a key component of any SoC. It acts as the core and determines the performance of the SoC. Interconnect verification faces the challenge of verifying both correct data routing and performance requirements. This paper describes how to deploy Cadence's System VIP product to speed up interconnect verification. STG (System Testbench Generation) is used for verification environment generation, SVD (System Scoreboard) for data integrity checking, and SPA (System Performance Analysis) for performance analysis. With these tools, interconnect verification can start as soon as the design is ready, and whenever the design is updated, the verification environment can be regenerated in a few minutes. The tool-generated environment already has SVD integrated, and the SPA server generates performance analysis data as well-composed figures. We use a simple 5x5 NIC to try out these tools. The NIC requires some customization to work: for example, more constraints are added because not all burst types are supported, and address translation is required for correct routing. The paper also shows how these customizations are achieved. Cadence System VIP is a powerful tool for speeding up interconnect verification with some user adjustment.

张晓明, Huixi Technology
沙燕萍, Cadence

Designing the World's Most Advanced Photonic Computing Engine Based on Integrity 3D-IC and the Palladium Emulation Methodology

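Lightelligence's next-generation photonic computing engine integrates a photonic matrix multiplication engine and the corresponding electronics in a single multi-die 3D system-in-package (SiP). This design enables the engine to deliver unprecedented performance in artificial intelligence (AI) and non-AI applications. The SiP design includes an electronic integrated circuit (EIC) containing digital processing elements, analog circuits, and high-speed I/O, and a separate photonic integrated circuit (PIC) containing couplers, modulators, waveguides, and photodiodes. This complex design brings many new design and verification challenges. The design process is a formidable task; the novelty of this paper lies in its use of the Cadence Integrity 3D-IC platform and its seamless co-design methodology. In addition, Universal Verification Methodology (UVM)-based verification and a Palladium emulation environment are used to accelerate verification and the hardware/software co-design process. The Integrity 3D-IC tool reduces complexity, improves robustness, and simplifies the multi-die physical design flow. As a result, the high-performance PIC and its companion EIC can be designed holistically and efficiently to meet the target specifications. It not only ensures accurate stacking of µBumps and through-silicon vias (TSVs) but also shortens the design cycle. The Integrity 3D-IC platform, Innovus, Voltus, Pegasus, and other powerful EDA point tools work together seamlessly to optimize implementation, power integrity, and system-level LVS signoff. Furthermore, the complex heterogeneous photonic architecture and the underlying multi-die system create considerable verification challenges, and traditional verification methods alone can hardly guarantee product schedule and quality. This paper proposes an effective approach to improve the quality of heterogeneous photonic and electronic designs and reduce overall verification time. A Palladium-based emulation platform was developed to safeguard project quality and accelerate the schedule, applying a "shift-left" approach to shorten the time to tapeout for the silicon and photonic designs.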

胡永强, Lightelligence
冯亮, Lightelligence
韩福强, Lightelligence
张文, Lightelligence

Reusable Assertion Architecture at Interface Granularity

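Assertions (SystemVerilog Assertions, SVA) are a technique widely used in IC verification. They describe expected behavior and states, as well as the properties the design under test should satisfy under specific conditions, so that specific timing can be checked and verified. Embedding assertions in the verification environment helps verification engineers better understand the behavior of the design. This not only improves verification completeness but also speeds up fault localization, helping verification converge. However, assertions face some difficulties in practice. One of the most significant is poor reusability: because each design has different characteristics and timing conditions, it is hard to copy assertions from one design to another, so assertions must be rewritten and debugged for each design at considerable cost in time and effort. Writing assertions is also a challenge: high-quality assertions require extensive verification experience and skill as well as a deep understanding of design details. Facing these challenges, Sanechips (ZTE Microelectronics) has actively explored new solutions. To address reusability, it developed a reusable assertion architecture whose granularity is the interface through which modules interact; applying the assertions of one interface to both the upstream and downstream modules achieves horizontal reuse. This approach effectively improves assertion reusability, reduces repeated writing and debugging, and speeds up verification convergence. Horizontal reuse of interface assertions also helps keep the understanding of an interface consistent between upstream and downstream, preventing bug escapes caused by different verification engineers interpreting the same interface differently. Beyond horizontal reuse at the module level, interface assertions can also be integrated upward to the subsystem and system levels for vertical reuse, which greatly reduces the time needed to write and debug system-level assertions. To address the challenge of writing assertions, Sanechips built on the open-source OVL (Open Verification Library) and, taking into account the characteristics of wireline communication chips, developed its in-house ZVL (ZTE Verification Library). ZVL is provided to verification engineers as parameterized macros; passing in the specified parameters generates the corresponding assertion. To automate the reusable assertion architecture and ZVL, Sanechips developed the sva_gen automation platform. Based on the source and destination information contained in the names of the design's input and output signals, the platform automatically groups signals by inter-module interface and generates a module block for each; within each module block it distinguishes signal categories by the interface signal naming and automatically calls ZVL to generate assertions. For a given module, sva_gen automatically instantiates the assertion module blocks of all its interfaces in a single module file and generates the corresponding bind file. Verification engineers embed these in the verification environment to run the interface assertion checks. The reusable assertion architecture and ZVL give verification engineers an efficient, flexible, and extensible solution for automatically generating interface assertions. Supported by the Cadence Xcelium simulator and Indago debugger, sva_gen has been used at Sanechips to write the assertions for four past projects and is applied in several ongoing projects. On average, each project achieves horizontal reuse for 50% of its interfaces and 100% vertical reuse. Within one group of interfaces, 60% of the basic common assertions can be generated automatically by sva_gen calling ZVL. In real projects, sva_gen reduces assertion writing time by about 50% and shifts assertion work left, contributing significantly to fast verification convergence.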

商思航, Sanechips
徐加山, Sanechips
王瑞甫, Sanechips
姜珂, Sanechips
贺志强, Sanechips
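
The grouping step described in the reusable assertion abstract above (splitting a module's ports by interface, using the source and destination encoded in each signal name) might look roughly like the following sketch. The abstract does not give the actual naming rule or any sva_gen internals, so the `<src>2<dst>_<signal>` pattern, the function name `group_by_interface`, and the sample port names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical naming rule: <source>2<destination>_<signal>, e.g. "dma2axi_valid".
SIGNAL_RE = re.compile(r"^(?P<src>[a-z]+)2(?P<dst>[a-z0-9]+)_(?P<name>\w+)$")


def group_by_interface(signals):
    """Group port names by (source, destination) taken from the name itself."""
    groups = defaultdict(list)
    for sig in signals:
        match = SIGNAL_RE.match(sig)
        if match is None:
            continue  # e.g. clk or rst_n carry no interface information
        groups[(match["src"], match["dst"])].append(match["name"])
    return dict(groups)


ports = ["clk", "rst_n", "dma2axi_valid", "dma2axi_data", "axi2dma_ready"]
interfaces = group_by_interface(ports)
for (src, dst), names in interfaces.items():
    print(f"{src} -> {dst}: {names}")
```

Each resulting group would correspond to one assertion module in the scheme the abstract describes; generating the assertions themselves is outside this sketch.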

Streamlining Fault Campaign Management with Verisium Manager for Digital FuSa Verification in Automotive Design

The automotive industry is experiencing a rapid shift with the emergence of technologies such as ADAS, autonomous driving, sensor fusion, and edge processing. While these advancements present new opportunities for automotive SoC design, such as high-performance computing, power management, and networking, they also bring significant challenges of reliability, safety, and security that require digital FuSa verification work. Cadence offers a comprehensive solution for digital FuSa verification across its products, from the safety analysis of Midas to the fault campaign management of Verisium Manager, the safety simulation of Xcelium Safety, and the Jasper FSV App. This FMEDA-driven safety verification flow enables high efficiency in verifying safety mechanisms, achieving diagnostic coverage, and attaining the ASIL level. This paper focuses on the usage of the XFS/Jasper/VM triplet, including a case study of building the DFI flow for a projectA block. We detail the steps, from pure Xcelium good/fault simulation to enabling the Save and Restart feature and leveraging the power of Jasper. As a result, injected fault nodes are reduced from 12,294 to 1,517, and the overall time is shortened from 7.26 hours to 1.43 hours. Additionally, diagnostic coverage is improved by using the Jasper FSV App to analyze the UU faults and solve the insufficient stimulus problems. The Cadence digital FuSa verification flow offers an efficient, full-stack solution that streamlines the process of ensuring safety and achieving ASIL levels in the automotive industry.

张天晓, Analog Devices
Siri Rajanedi, Analog Devices
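
The figures quoted in the fault campaign abstract above can be turned into relative numbers with a few lines of arithmetic. The inputs are taken directly from the abstract; the variable names are only illustrative.

```python
# Figures quoted in the abstract.
faults_before, faults_after = 12_294, 1_517
hours_before, hours_after = 7.26, 1.43

# Fraction of injected fault nodes removed, and overall runtime speedup.
fault_reduction = (faults_before - faults_after) / faults_before
speedup = hours_before / hours_after

print(f"fault nodes reduced by {fault_reduction:.1%}")
print(f"campaign runtime speedup: {speedup:.2f}x")
```

In other words, the reported flow injects roughly one eighth of the original fault nodes and finishes about five times faster.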

Verification - 2

FMEDA-Driven Digital Safety Analysis and Verifications

Deyin Zhang, Cadence

Advanced PCIe Gen5 and CXL2.0 Verification Solution

For high-speed data transmission, the PCI Express interface is undoubtedly a good choice. PCIe is widely used in data centers and servers, and more and more large-scale SoCs, such as GPUs and DPUs, are also based on the PCIe interface. However, as PCIe transfer rates keep rising, the PCIe protocol has become increasingly complex, which brings great challenges to both design and verification. For PCIe Gen5, there are multiple types of PCIe devices in a server, including RC, EP, Switch, and Retimer. Meanwhile, CXL Type 3 memory expansion devices, which are based on PCIe Gen5, are gradually being adopted. This means that DV engineers at PCIe IP vendors need to develop a large number of test vectors to ensure coverage. From the perspective of delivering mature products, software/firmware development and verification for PCIe/CXL devices are as important as RTL in the pre-silicon process. With traditional DV methods it is difficult to achieve system-level hardware/software verification at Gen5 speed. To address these PCIe/CXL verification challenges, INNOSILICON has, with the help of Cadence tools, developed an advanced and efficient solution that meets signoff standards. The solution consists of four parts: Triple-Check simulation, QEMU co-simulation, Z1 emulation, and FPGA prototype verification. The Cadence PCIe VIP and Triple-Check test vectors save a lot of testcase development work while ensuring high coverage of basic functions. With Cadence Palladium Z1 and FPGA, software/firmware development and system verification can be completed. Z1 and FPGA have different advantages and cannot completely replace each other. At the same time, QEMU co-simulation, a new verification method developed in-house by INNOSILICON, makes it possible to run system-level PCIe verification without hardware restrictions. This solution covers the verification of all the different types of PCIe devices, including CXL Type 3 devices. This paper provides a detailed introduction based on several typical cases.

贾仪彬, Innosilicon

Testing Small-Scale Digital Circuit Chips Based on the FCM Flow

As chip process technology advances, the scale of digital chips has increased sharply, and the cost of testing has risen further. Advanced DFT techniques, including scan path design, JTAG, and ATPG (Automatic Test Pattern Generation), are used on large-scale SoCs. However, for some small-scale integrated circuits (analog front-end chips, for example), inserting test circuits such as scan chains increases chip area and adds power consumption. For this kind of chip, test patterns generated from functional simulation cases can be used to detect manufacturing defects and failures. A methodology is therefore needed to verify whether the coverage meets the goal, especially for automotive chips. We solved this problem using the Cadence Verisium Manager Safety Client, relying on the core engines of the Xcelium Fault Simulator and the Jasper Functional Safety Verification (FSV) App. It provides a credible coverage figure for ATE (Automated Test Equipment) patterns. Key words: DFT; Coverage; Verisium Manager; Xcelium Fault Simulator; Jasper

崔震, 3PEAK
周立阳, 3PEAK
刘萌, 3PEAK
赵禹, 3PEAK
王学德, 3PEAK

A Virtual Ethernet Test Solution and Practice Based on the Palladium Accelerator and Spirent Tester

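The information and communications industry is currently developing rapidly and with high quality. The industry continues to consolidate a "new foundation" built on 5G networks, gigabit optical networks, and computing power networks; to forge "new capabilities" by advancing 5G evolution, compute-network convergence, and AI breakthroughs; and to release the "new momentum" generated by emerging services such as cloud computing, the Internet of Things, and big data, thereby driving the industry itself and other industries toward a digital "new blue ocean." As these services develop, they place more demands on the underlying bearer network, which needs high bandwidth as well as low latency. Ethernet chips are among the basic chips for Ethernet transmission, and with the explosive growth in data volume their market has momentum for continued growth. At the same time, verification of the Ethernet interface has become central to chip success. It is very important to determine how to verify the many bandwidths of 800G, 400G, 200G, 100G, 50G, 40G, 25G, 10G, and 1G, how to run multi-channel tests on a unified platform, and how to analyze and verify design performance while meeting functional verification needs. In addition, as Ethernet chips keep growing in complexity and scale and product iteration cycles keep shrinking, system-level chip verification and software verification have become new challenges, and traditional verification methods can no longer fully meet system-level verification needs. Cadence provides a virtual Ethernet test solution based on the Palladium accelerator which, combined with the Spirent virtual standard Ethernet traffic generator (Spirent Test Center), offers a complete, robust, and flexible test platform for Ethernet interface testing, on which users can run functional and performance tests of the chip. The virtual Ethernet tester allows applications such as the standard Ethernet traffic generator Spirent Test Center (STC) to be used with the Palladium emulator: through the Ethernet tester interface, STC features generate and run Ethernet traffic to verify the DUT in the Palladium-based accelerated simulation environment. Traffic properties can be defined, or scripts from the physical test environment reused, which bridges the gap between simulation and post-silicon validation and provides higher efficiency and better debug capability. The virtual test solution is also more flexible and convenient for channel expansion. The solution uses Cadence's virtual test environment, Ethernet Accelerated VIP (AVIP), to provide hardware acceleration. The virtual solution connects the Spirent Test Center (STC) virtual tester to the Cadence Palladium emulator. The STC software has two parts, a front end and a back end. The front end can be the STC GUI (installed on a Windows machine) or automation scripts (which can be launched from a Linux or Windows machine). The back-end software is hosted in a QEMU virtual machine on the host Linux workstation that runs simulation acceleration (SA). Cadence provides the Ethernet Virtual Tester Interface Adapter to bridge STC and the DUT plus Cadence Ethernet Accelerated VIP running on the Palladium emulator. The Palladium accelerator offers simulation acceleration and strong debug capability. The standard Ethernet streams generated by the Spirent virtual Ethernet tester support flexible configuration, traffic shaping, user-defined packet length and content, and error injection, and the returned Ethernet frames can be statistically analyzed and checked. The Palladium + STC virtual Ethernet test solution greatly improved the efficiency of the project's simulation-based verification, shortened the verification cycle, and provided strong assurance for the project's tapeout in both schedule and quality. In short, the virtual Ethernet test solution based on the Palladium accelerator and Spirent tester provides a flexible and complete Ethernet interface test platform and resolves the pain point of standardized, tester-based verification of high-speed interfaces such as Ethernet. It laid a solid foundation for the successful tapeout of the network processor.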

李娴, Sanechips
贺志强, Sanechips
刘江伟, Cadence
郝淼, Cadence

Accelerating Clock Verification with the Verisium Debug Python API

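As chip scale keeps increasing, modern chip designs are becoming more and more complex. Developers therefore need to make the most of existing tools to simplify and accelerate the verification flow. Cadence's Verisium Debug tool provides a powerful solution that makes verification easier for engineers. Verisium Debug offers a very friendly debug interface, one of its most notable advantages, which can greatly improve debug efficiency. The interface lets engineers quickly locate errors and determine their root cause, so debugging finishes sooner. In addition, Verisium Debug provides a Python API through which engineers can access and analyze RTL code and quickly obtain information such as module instances and signal names. This information can be used to develop custom scripts and automation tools that improve productivity. In summary, Verisium Debug is a valuable tool that simplifies hardware design verification, streamlines the debug flow, and improves engineering productivity. Thanks to its user-friendly interface and powerful Python API, Verisium Debug has become a preferred choice for engineers in the industry.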

高勇, AICXTEK
王兴耀, AICXTEK
沙燕萍, Cadence

Application of Helium in Large Software-Defined SoCs

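In a large software-defined SoC, the maturity of software verification is a necessary condition for the chip's success, so the point at which software testing begins during chip development directly affects the tapeout date. Software testing, however, is itself constrained by chip development progress, because software cannot be verified extensively while the chip's code is still immature. The traditional chip development flow is therefore: RTL design -> RTL verification with EDA tools -> software testing. The obvious drawback is that the start of software testing depends entirely on chip development and verification progress; the whole flow runs serially, which greatly delays the final tapeout (and this does not yet count the regression time from a software-detected issue to the RTL fix). Fortunately, our company uses the Helium verification platform from Cadence. With this platform, some of the chip's RTL modules can be modeled in advance, so that software can begin debugging part of its code before the actual RTL is ready. The platform changes the design flow so that EDA verification of the RTL proceeds in parallel with software testing, which brings software testing in much earlier, saves time spent on code iterations, and ultimately shortens the overall chip development cycle considerably. Moreover, on this virtual platform software can boot the OS quickly compared with using EDA simulation, Palladium, or Protium, which greatly shortens software test and verification time itself.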

梅明, Jaguar Microsystems
李志, Jaguar Microsystems
翟璐璐, Jaguar Microsystems
吴志伟, Jaguar Microsystems

Application of C2RTL in GPU Verification

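Data-path verification in formal verification technology can be used to verify the correctness of a GPU's floating-point, integer, and fixed-point arithmetic, as well as its image processing units. In our company's GPU project, the JasperGold C2RTL App was used to verify various floating-point/integer operators, vector multiplication, image processing, and other units. Corresponding workflow scripts were developed to generate the verification environment automatically, with a Makefile provided for quick and convenient formal verification and regression. This paper presents our experience and methods in using C2RTL for GPU verification.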

毛维, MOORE THREADS
吴晓波, MOORE THREADS