Tensilica Fusion G DSP Family

Multi-purpose, fixed- and floating-point DSP with exceptional out-of-the-box performance

The scalable Cadence Tensilica Fusion G DSP Family includes two high-performance, general-purpose, fixed- and floating-point vector processors: the Fusion G3 DSP and the Fusion G6 DSP. This easy-to-program DSP family, with its highly advanced auto-vectorizing compiler and optimized DSP library, provides exceptional out-of-the-box performance for both DSP and control code. Its comprehensive instruction set, along with extensive integer and floating-point data type support, makes it ideal for compute-intensive signal processing in automotive, consumer, mobile, internet of things (IoT), and industrial applications.

Overview

As devices strive to become smarter, they need to interact more extensively with the world around them. This includes continuously innovating with newer and more sophisticated algorithms to process real-world sensor or derived data. The rapid cycle of innovation is more easily enabled by the high-performance Fusion G DSP Family, with its rich and orthogonal instruction set and high-precision fixed- and floating-point data types. Software development is simplified with programming in C/C++, an auto-vectorizing compiler, and an extensive DSP software library, all courtesy of the mature Cadence Tensilica Xplorer integrated development environment. Further gains in performance may be achieved with the simple addition of application-specific instructions that seamlessly integrate with the DSP’s instruction set and are fully supported by the optimized C/C++ compiler and SDK.
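To make the C/C++ programming model concrete, here is a minimal sketch of the kind of loop that auto-vectorizing compilers generally handle well. It is plain ISO C with no Tensilica intrinsics or library calls. The function name `saxpy` and its parameters are illustrative assumptions, not part of the Cadence DSP library, and whether a given toolchain actually vectorizes this loop depends on its configuration and optimization settings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel (not from the Cadence DSP library):
 * computes y[i] = a * x[i] + y[i] for i in [0, n).
 *
 * The restrict qualifiers promise the compiler that x and y do not
 * overlap, which removes a common obstacle to vectorizing the loop. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* Unit-stride accesses, no loop-carried dependence, no early exit:
     * each iteration is independent of the others. */
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The loop body is one multiply-add per element, so independent iterations can in principle be grouped into SIMD lanes without changing the source.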

The Fusion G DSP Family features an 11-stage, 4-slot very long instruction word (VLIW) pipeline and scales with single-instruction multiple-data (SIMD) data paths. The Fusion G3 DSP supports 4-way vectorization for 32-bit integer types and single-precision floating-point. The Fusion G6 DSP doubles the performance with 8-way vectorization. Further levels of vectorization are also supported with native 8-, 16-, 32-, and 64-bit data types. Dual load/store and load units, coupled with vector and wide accumulation vector register files, ensure maximum data throughput for the computational units.
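The vectorization levels follow from dividing the SIMD data-path width by the element width. The sketch below works through that arithmetic using the 128-bit (Fusion G3) and 256-bit (Fusion G6) widths given in the feature list. The helper name `simd_lanes` is a hypothetical illustration, not a toolchain function.

```c
#include <assert.h>

/* Hypothetical helper: number of SIMD lanes for a given data-path
 * width and element width, both in bits. Assumes the element width
 * divides the data-path width evenly. */
unsigned simd_lanes(unsigned datapath_bits, unsigned element_bits)
{
    assert(element_bits != 0 && datapath_bits % element_bits == 0);
    return datapath_bits / element_bits;
}
```

For example, `simd_lanes(128, 32)` gives 4 and `simd_lanes(256, 32)` gives 8, matching the 4-way and 8-way figures above, while 8-bit elements give 16 and 32 lanes respectively.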

Figure 1: Fusion G DSP Family Application Examples

The Fusion G DSP Family is built on the configurable Xtensa LX 32-bit scalar processor, with flexible memory and I/O subsystems, an optional integrated DMA controller (iDMA), and an optional memory protection unit (MPU). The high performance, low power, and small footprint make the Fusion G DSP Family ideal for highly integrated, embedded SoC designs with single or multicore processors running real-time control and DSP applications.

Figure 2: The Fusion G DSP Family block diagram

Fusion G DSP Family Features and Benefits

  • General-purpose, fixed- and floating-point processor with exceptional out-of-the-box performance
  • Extensive ISA with instructions and operations for multiple signal-processing applications
  • Quick and easy performance for many signal-processing applications, with support for a wide range of data types from 8- to 64-bit
  • Implements a 4-way VLIW and 128/256-bit SIMD architecture (for Fusion G3/G6 DSPs) delivering multiple operations per cycle
  • Optional IEEE 754-compliant single- and double-precision vector floating-point unit (VFPU) with four 32-bit single-precision floating-point MACs for the G3 and eight for the G6
  • Peak performance of up to 16/32 GFLOPS (for Fusion G3/G6 DSPs) at 1GHz
  • Half-precision conversion operations are available to save memory when storing floating-point data
  • Complete memory subsystem with support for instruction and data caches and instruction and data local memories (implemented as ROM or RAM), all with optional ECC and parity support
  • Optional integrated DMA controller for efficient data transfers to and from system memory
  • Optional memory protection unit
  • Configurable external bus interface supports AXI4, AXI3, ACE-Lite, and AHB-Lite interface standards with optional ECC support
  • Optimized auto-vectorizing compiler for exceptional out-of-the-box performance
  • M-way vector programming support, based on 32-bit vector elements for seamless code portability across the Fusion G DSP Family
  • Extensive software DSP library with over 550 signal-processing tasks and functions
  • Simple and powerful instruction-set extensibility for customer-specific instructions using the Tensilica Instruction Extension (TIE) language

Key Fusion G DSP Family Features

Rows with a single value apply to both the Fusion G3 DSP and the Fusion G6 DSP.

Family Features | Fusion G3 DSP | Fusion G6 DSP
Load/store | Dual 128-bit load/store and 128-bit load units | Dual 256-bit load/store and 256-bit load units
Instruction lengths (bits) | 128, 64, 24, 16
VLIW | 128 bits wide, 2- or 4-issue slots
SIMD | 2/4/8/16-way for integer, 2/4-way for floating point | 4/8/16/32-way for integer, 4/8-way for floating point
Data formats | 8/16/20/32/40/64/80 real integer, 32/64 real and complex floating-point
Vector register files | 32 entries x 128-bit, 4 entries x 320-bit (wide data) | 32 entries x 256-bit, 4 entries x 640-bit (wide data)
ALU operations/cycle | 2, 4, 8, 16 for 64, 32, 16, 8-bit | 4, 8, 16, 32 for 64, 32, 16, 8-bit
Multiply bit width | 8 x 8, 16 x 16, 32 x 32, 64 x 64
MACs/cycle | 1, 4, 8, 16 for 64, 32, 16, 8-bit | 2, 8, 16, 32 for 64, 32, 16, 8-bit
16-bit complex MAC throughput | 2 per cycle | 4 per cycle
Guard bits | 4/8/16-bit on 16/32/64 types (utilizing 20/40/80-bit accumulators)
Aligning load | Available in either of two slots
Addressing modes | Post-increment, reverse, circular, and general
Floating-point configuration options | IEEE-754 compliant, single-precision vector floating-point or single- and double-precision vector floating-point
Floating-point operations | Vector MAC, FMA, ADDSUB, ALU, and type conversion, including to/from half-precision
MAC/FMA/ADDSUB per cycle | 2, 4 for double, single-precision floating-point | 4, 8 for double, single-precision floating-point
Division/reciprocal/RSQRT | Included in ISA: integer division and floating-point division/reciprocal/RSQRT
Vector-programming support | M-way, 32-bit based
FFT acceleration | ADDSUB included for 2X FFT speedup (fixed- and floating-point)
Image processing | Scatter gather and histogram
Vector predication | ALU, MAC, LOAD/STORE, etc.
DMA support | Integrated DMA controller option and support for external DMA controller
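
The guard-bit entry in the table can be illustrated with a little headroom arithmetic. With g guard bits above an n-bit data type, an accumulator of n + g bits can absorb 2^g full-scale n-bit additions before it can overflow. The sketch below models this on a host machine with 64-bit integers. It is not Fusion G code, and the helper names are hypothetical. It also considers only sums of n-bit values and does not model how the hardware treats multiplier products.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of full-scale additions guaranteed to fit when the
 * accumulator has guard_bits extra bits above the data width. */
uint32_t safe_accumulations(unsigned guard_bits)
{
    assert(guard_bits < 32);
    return (uint32_t)1 << guard_bits;
}

/* True if v is representable as a signed two's-complement value
 * that is `bits` wide, i.e. -2^(bits-1) <= v < 2^(bits-1). */
bool fits_signed(int64_t v, unsigned bits)
{
    assert(bits >= 2 && bits <= 63);
    int64_t limit = (int64_t)1 << (bits - 1);
    return v >= -limit && v < limit;
}

/* Exact sum of `count` copies of a 32-bit sample, held in a 64-bit
 * host integer so the model itself cannot overflow for the small
 * counts used here. */
int64_t sum_of_repeats(int32_t sample, uint32_t count)
{
    return (int64_t)sample * (int64_t)count;
}
```

For 32-bit data with 8 guard bits, `safe_accumulations(8)` is 256. Summing 256 copies of `INT32_MAX` gives 2^39 - 256, which `fits_signed(..., 40)` accepts, while 257 copies exceed the 40-bit range and two copies already exceed a plain 32-bit range.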

Toolchain and Design Flow

The Fusion G DSP Family is delivered with a complete set of software tools, including:

  • High-performance C/C++ compiler with automatic bundling and vectorization to support the VLIW and SIMD capabilities
  • Linker, assembler, debugger, profiler, and graphical visualization tools
  • Comprehensive cycle-accurate instruction set simulator (ISS), which allows you to quickly simulate and evaluate performance
  • Fast, functional, instruction-accurate TurboXim simulator that runs 40X to 80X faster than the ISS, for efficient software development and functional verification when working with large systems or lengthy test vectors
  • Xtensa Modeling Protocol (XTMP) for system modeling in C and Xtensa SystemC (XTSC) for system modeling in SystemC provide for full-chip simulations. The pin-level XTSC model offers co-simulation of the SystemC model at the pin level for fast, cycle-accurate system simulations
  • All major back-end EDA flows are supported

Cadence Services and Support

  • Cadence Tensilica application engineers can answer your technical questions, and provide technical assistance and custom training
  • Cadence-certified instructors teach a series of courses on Tensilica IP and bring their real-world experience into the classroom
  • Internet Learning Series (iLS) online courses give you the flexibility to train at your own computer via the internet
  • The Cadence Tensilica IP support site gives you 24x7 online access to a knowledgebase of the latest solutions, technical documentation, software downloads, and more at support