What's New for the AMD Vitis™ Software Platform

2024.1

AMD Vitis™ Software Platform 2024.1 Release Highlights:

Enhancements for AMD Versal™ AI Engine DSP Designs

Enhanced DSP Library Functions for AMD Versal AI Core Series
- Time division multiplexed (TDM) FIR filter functions for SSR > 1
- FFT with 32-bit twiddle
- Mixed-Radix 3 & Mixed-Radix 5 FFTs
- Kronecker Matrix Product
- Householder-based QRD solver for improved stability
- DFT for SSR > 1
New DSP library functions for AMD Versal AI Edge Series with AIE-ML
- General Matrix Vector (GEMV) with SSR support
- General Matrix Multiply (GEMM) with SSR support
AIE API Enhancements
- Support Radix-3/Radix-5 FFTs
AIE Simulator Enhancements
- Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for control, interfaces, and processing system (CIPS) IP core
- AMD Vitis analyzer support for hardware emulation with 3rd party simulators such as VCS, Questa, Xcelium, and Riviera

Key improvements to Vitis Unified Software Platform

New device support: AMD Versal™ Premium VP1902 Adaptive SoC, AMD MicroBlaze™ V Processor
Enhanced embedded application development and BSP generation for Windows® environment
User-managed flow to debug embedded applications compiled externally
New Bootgen GUI
Enable incremental builds for platform project

Key improvements to AMD Vitis IDE (New GUI)

Added support for processing subsystem hierarchical debug
Added support for export and import of projects/workspace
Added support for Python interpreter and API
New feature preview page
New file change notification for embedded, AIE, platform projects

AMD Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in AMD Vitis software platform 2024.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2024.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

Enhanced DSP Library Functions for AMD Versal AI Core Series

Time division multiplexed (TDM) FIR filter functions for SSR > 1
FFT with 32-bit twiddle
Mixed-Radix 3 & Mixed-Radix 5 FFTs
Kronecker Matrix Product
Householder-based QRD solver for improved stability
DFT for SSR > 1
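For readers unfamiliar with the Kronecker Matrix Product named in the list above, the operation can be sketched in plain scalar C++. This is only a reference for the mathematics, assuming small integer matrices stored as nested vectors. The `Matrix` alias and the `kronecker` function are illustrative names and are not the DSP library's AI Engine API.

```cpp
#include <cstddef>
#include <vector>

// Illustrative type, not a DSP library type: a row-major matrix of ints.
using Matrix = std::vector<std::vector<int>>;

// Kronecker product: each element a[i][j] scales a full copy of b, and that
// scaled copy is placed at block position (i, j) of the result.
// Assumes both inputs are rectangular (all rows the same length).
Matrix kronecker(const Matrix& a, const Matrix& b)
{
    const std::size_t a_rows = a.size();
    const std::size_t a_cols = a.empty() ? 0 : a[0].size();
    const std::size_t b_rows = b.size();
    const std::size_t b_cols = b.empty() ? 0 : b[0].size();

    Matrix out(a_rows * b_rows, std::vector<int>(a_cols * b_cols, 0));
    for (std::size_t i = 0; i < a_rows; ++i) {
        for (std::size_t j = 0; j < a_cols; ++j) {
            for (std::size_t p = 0; p < b_rows; ++p) {
                for (std::size_t q = 0; q < b_cols; ++q) {
                    out[i * b_rows + p][j * b_cols + q] = a[i][j] * b[p][q];
                }
            }
        }
    }
    return out;
}
```

A 2x2 input multiplied with another 2x2 input yields a 4x4 result, which is why the output size grows quickly with the operand sizes.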

New DSP library functions for AMD Versal AI Edge Series with AIE-ML

General Matrix Vector (GEMV) with SSR support
General Matrix Multiply (GEMM) with SSR support

AIE API Enhancements

Support Radix-3/Radix-5 FFTs

AI Engine Simulator Enhancements

Cycle approximate simulation capabilities for AI Engine designs with PL, without the need for CIPS (Control, Interfaces, and Processing System IP Core).
Vitis analyzer support for hardware emulation with 3rd party simulators such as VCS, Questa, Xcelium, and Riviera

Export tables from Vitis analyzer to CSV format

New DSP functions supported for AIE and AIE-ML within AMD Vitis Model Composer
- Time Division Multiplexed (TDM) FIR Filter functions
- For building polyphase channelizers @ 1 GSPS and higher throughput
- DFT/IDFT – with SSR support
- Optimized transforms for throughput/latency on small sizes
- FFT/IFFT – with extended support for CINT32-bit twiddle
- Mixed-Radix FFT/IFFT – with AIE-ML support
Ease-of-use improvements to Model Composer Hub block
Enhancements to Hardware Validation flow
OS and MATLAB® version support added with v 2024.1:
- RHEL 9
- MATLAB R2023a and R2023b

New example designs available on GitHub.

A new stencil pragma simplifies HLS C++ code for image and video filters
New library function wizards tap into the AMD Vitis libraries GitHub repo
- Create “Solver” and “Vision” (OpenCV compatible) IPs for AMD Vivado design tool
- Run the available library examples
Pragma for memory interface (ap_memory) can now bundle ports for AMD Vivado IP Integrator
New HLS component comparison displays side-by-side metrics for 2 or more components
Support for user-provided RTL code to replace a C++ function (black-box flow)
Code Analyzer can now disaggregate C++ struct members to fine-tune performance analysis
New user control for HLS global FSM encoding and selection of safe state
Access to Clang sanitizers during C-Simulation to perform address and initialization checks

2023.2

Vitis™ Software Platform 2023.2 Release Highlights:

Enhancements for Versal™ AI Engine DSP Designs

New DSP library functions
New API support for DSP functions
New features in AI Engine compiler and simulators

New Standalone Vitis Embedded Software

A smaller standalone installer for designers writing C code for the Arm® embedded subsystem
All embedded features are provided, including utilities such as Bootgen and XSCT

New Vitis Unified Integrated Design Environment

Consistent GUI and CLI across all Vitis workflows
Next-generation, Eclipse Theia-based GUI provides better flexibility and user-friendly features for enhanced work efficiency

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis software platform 2023.2. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.2 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

New DSP library functions for AI Engines

Mixed Radix FFT
Discrete Fourier Transform (DFT)
General Matrix-Vector Multiply (GEMV)

New API support for DSP functions

FFT IP with cint32 twiddle data types
Support for cint16 for Radix-4 FFT APIs
Vectorized "fix2flt" and "flt2fix" implemented in API

New API support for AIE-ML

APIs now support int32/cint32 data types in sliding_mul() function
APIs now support <float> data types in sliding_mul() function
All AIE API routines required to support sparse matrix multiplication are provided

AIE compiler can now support 2D and 3D arrays as inputs or outputs
Vitis Analyzer now generates guidance report to adjust FIFO size
New support for multi-threaded simulator kernel and value change dump (VCD) analyzer speedup
External interfacing with MATLAB® environment & Python traffic generators
Enhanced AXI Stream model with support for empty/wait cycles in PLIO alignment
Enhanced Design Rule Checking

AI Engine trace offload via high-speed debug
NoC and hard DDRMC profiling support in the Vitis environment
Vitis tool now supports AIE-ML trace for VEK280 and Alveo™ V70 AI inference accelerator card

AI Engine block updates

Support for importing AIE-ML graphs as blocks into Vitis Model Composer
New DSPlib functions for AIE and AIE-ML implementation in Vitis Model Composer
Plotting of AIE simulator output for internal signals in the Simulink® tool

HLS Kernel block updates

Automatic test bench generation
Expanded data type support for HLS Kernel blocks

Integration of Vitis Model Composer and Vitis tool

Generation of .xo and libadf.a files directly from Vitis Model Composer

Other enhancements

MATLAB® tool version support: R2021a, R2021b Update 6, R2022a Update 6, R2022b
Additional topologies supported for the hardware validation flow
New example collaterals available from GitHub

New Vitis Unified IDE for HLS components
New Vitis HLS license requirements
New code analyzer feature for obtaining performance estimations before running C synthesis
Enhancements to AXI interface:
- Support for HLS AXI Stream side-channels
- Support for user-configurable AXI master caching
Other enhancements:
- New code complexity report to enable identifying design size issues during C synthesis
Compile time improvements: Average compile time improvement of 20% in 2023.2 compared to 2023.1*

*Based on testing on August 10, 2023, across 1000 Vitis L2/L3 code library designs, with Vitis HLS release 2023.2 vs. Vitis HLS 2023.1. System configuration during testing: Intel Xeon E5-2690 v4 @ 2.6GHz CPU, 256GB RAM, RedHat Enterprise Linux 8.6. Actual performance will vary. System manufacturers may vary configuration, yielding different results. -VGL-04

2023.1

Vitis Software Platform 2023.1 Release Highlights:

New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays

DSP library functions – more FIR filter configurations
Solver library functions – enhancements for higher performance

Design Flow Enhancements for Versal AI Core and AI Edge Series

AIE compiler support for 2D and 3D arrays as inputs/outputs
AIE simulator guidance support for FIFO sizing to avoid deadlock conditions
AIE status reporting enhancements
New default GUI for the Vitis analyzer

Support for Vitis environment export to the Vivado™ environment

Enables Vitis and Vivado tool development teams to work in parallel based on a common interface checkpoint

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis software platform 2023.1. For information on supported platforms, changed behavior, and known issues, please refer to the Vitis software platform 2023.1 Release Notes for the Application Acceleration Flow and Embedded Software Development Flow.

DSP Library - FIR Filters

Enhanced Fractional Resampler FIR, Single Rate FIR, Half Band FIR, and Rate Change FIR to support coefficient bit widths larger than data bit widths
Fractional Resampler FIR also supports SSR operation using multiple AIE tiles and incorporates coefficient reload feature

Solver Library

Enhanced API performance with high-performance streaming designs (~300 tiles)
QR and Cholesky Decomposition support for 4D data mover functions to help read or write data from AIE arrays

AIE compiler can now support 2D and 3D arrays as inputs or outputs in addition to 1D.
AIE compiler supports graph-within-graph constructs (subgraphs) and conditional port constructs.
New AIE CINT-to-CFLOAT data conversion APIs.

AIE status reporting enhancement to generate a file that includes information about tiles, events, and additional registers on AIE-ML and AIE tiles in the design.
Offloading of AIE event trace over high-speed differential pairs (HSDPs) instead of storing it in memory on Versal devices.
NoC and hard DDR MC profiling support in the Vitis environment.
AIE windowed event trace for inspecting a specific part of an application.

Guidance with FIFO sizing to avoid deadlocks.
Ability to select nodes that are reported by the AIE simulator to reduce the size of the simulator VCD file and speed up simulation.
AIE simulator now generates a report (that can be viewed in the Vitis analyzer) that shows which AIE has memory access violations and how these correspond to lines in the graph C code.
Trace view data visualization now supports the AIE-ML array as well.

New data type support for FIR filter configurations that target Versal AI Engines
Two new floating-point functions optimized for DSP58s in Versal adaptive SoCs
Faster response time for all Vitis Model Composer library functions targeting Versal AI Engines
Other enhancements:
- Enhancements to HLS kernel blocks
- Enhancements to the Vitis Model Composer Hub
- Support for MATLAB tool versions R2021a, R2021b, R2022a

Performance improvements*: Average latency improvements of 5.2% in 2023.1 compared to 2022.2
Easy way to download, view, and instantiate L1 libraries functions in the Vitis HLS tool
Enhanced support for AXI transactions and burst reporting within the Vitis HLS tool

Footnotes:
*The benchmark tests were performed on all 1208 Vitis L1 library C-code designs as of February 12th, 2023. All designs were run using a system with 2P Intel Xeon E5-2690 CPUs with CentOS Linux, SMT enabled, Turbo Boost disabled. Hardware configuration not expected to affect software test results. Results may vary based on software and firmware settings and configurations. -VGL-03

2022.2

Vitis Software Platform 2022.2 Release Highlights:

New Vitis™ Library Functions for Versal™ AI Engine (AIE) Arrays

DSP library functions – enhanced features
Solver library functions
Vision library functions
Ultrasound library functions

Design Flow Enhancements for Versal AI Core and AI Edge Series

Control relative placement of kernels in the AI Engine array – higher performance and better utilization
AIE x86 simulator enhancements - improved modeling of deadlock conditions in x86 simulator
AIE API enhancements - Radix 3/5 FFT and Matrix ‘x’ Vector APIs added
Enhanced profiling and debugging capabilities for Versal designs – deadlock detection, larger trace data collection, RTL/Python testbench support
New simulation options for heterogeneous designs in Vitis

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis software platform 2022.2. For information on supported platforms, changed behavior and known issues, please refer to the Vitis software platform 2022.2 Release Notes for Application Acceleration Flow and Embedded Software Development Flow.

DSP library functions

Super sample rate (SSR) FIR filter implementation on AI Engine now supports coefficient reload feature and dynamic point size
Added FFT windowing element to the FFT function that targets the AI Engine array

Solver library functions

QR decomposition
Cholesky decomposition

Vision library functions

Four new video functions targeting the AI Engine array

Ultrasound library functions

Various functions to help build medical ultrasound designs

Ability to add constraints to control relative placement of kernels in the AI Engine array - this allows users to get higher performance and better utilization
Improved modeling of AIE deadlock conditions in x86 simulator
New AIE API added - Radix 3/5 FFT and Matrix ‘x’ Vector APIs added

Generation of AI Engine profiling reports in HW Emulation

Deadlock detection using XSDB (AMD System Debugger) for both AI Engine and PL-based designs

Xilinx Runtime (XRT) controlled continuous offloading of AI Engine event trace over PLIO

Supports PS application on x86 host machine for SW emulation
Allows SystemC functional models for HW emulation instead of RTL
Allows users to simulate the AI Engine kernel with a simple RTL test bench or Python script-based traffic generator
AI Engine status can be analyzed during HW emulation with the Vitis™ analyzer

Vitis environment 2022.2 new simulation options: Processor system x86 simulation and AI Engine x86 simulation: Programmable logic simulation can be performed using the x86 simulator.

Features for Versal AI Engine Design

Ability to add graph constraints to AI Engine DSP library blocks designs – better utilization and performance
New capability for cycle approximate simulation for AI Engine designs
AI Engine Graph Import block automatically detects Run Time Parameter (RTP) ports
Enhancements and additions to the DSP Library blocks

General Features

Hardware validation flow supported for heterogeneous system designs that use PL and AIE array
Vitis Model Composer Hub block updated to support heterogeneous design
- Automatic detection of valid AI Engine, HDL, and HLS subsystems
Hardware validation flow enhanced for HDL only designs and HDL → AI Engine → HDL designs for Versal platforms

Improved 'task level parallelism' coding style support
- Enables faster C simulation and better QoR

Additional performance and timing enhancements
- Improved burst inference
- Automatic inference of Unroll, Pipeline, Array_Partition, and inline pragmas for better performance
- Improved timing accuracy resulting in better timing closure at higher frequencies

Other features
- Analysis and debug: printf inserted in C-code now supported even after synthesis in the RTL
- Ease of use: new performance pragma to automatically achieve a given transaction interval
- HLS::stream interfaces now supported by FFT and FIR IP

2022.1

Vitis Software Platform 2022.1 Release Highlights:

Vitis™ Flow Enhancement for Versal™ ACAP and AI Engine

Supports AMD base DFX platform with one static region and one DFX region
AIE profiling supports stall/deadlock detection, generates AI Engine status (including error events) view reports in Vitis Analyzer
External Traffic Generators in x86sim, AIEsim, and SW emulation are much more flexible and can be inserted very easily in Simulation and Emulation flows
Vitis Model Composer supports Hardware Validation, Linux and HW emulation

Vitis for DC and Vitis HLS

Vitis provides additional reporting support for the dynamic region generation process; flow reporting enhancements include three new or updated reports
Vitis improves PL profiling with the choice of offloading trace to memory resources (preferred) or to a FIFO in the PL for better performance
A new Timeline Trace Viewer, available after simulation, shows the runtime profile without leaving the Vitis HLS GUI
Vitis HLS now supports a higher-level type of "smart" construct via the new performance pragma or the set_directive_performance command
Vitis Graph Library with L3 API enhancements for performance (1 ms saved per kernel call)

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis 2022.1. For information on Supported Platforms, Changed Behavior & Known Issues, please refer to Vitis 2022.1 Release Notes for Application Acceleration Flow and Embedded Software Development Flow.

New Genomics Accelerator Library added (L1, L2, and L3)
Graph Library, L3 enhancements for performance
Vitis Database Library, GQE Multi-Functional Kernel
New functions added in Vision Library
Additions and enhancements to the Vitis AIE Vision Library
Vitis AIE DSP library, FIR resampler supersedes FIR fractional interpolator
Vitis Codec Library new APIs, API jxlEnc, API ‘leptonEnc’, API ‘resize’, API ‘WebpEnc’

ZLIB Compress Improvement, Customized Octa-Core compression for 8KB solution
ZLIB Decompression Improvement, Customized IP for 8KB file size

Platform Capability Query Improvement
HBM Ease-of-Use Improvement, Ability to choose a specific S_AXI entry point to the HMSS for a kernel M_AXI, RAMA insertion supported from the configuration files

AI Engine Automated Stall/Deadlock Detection & Analysis in Hardware
Analyzing the Automated Status Output
Analyzing the Automated Status Output – Buffers
Analyzing the Manual Status Output in Hardware
Analyzing the Manual Status Output
AI Engine Event Trace Enhancements
External traffic generators AIEsim
AI Engine Profiling Improvements on HW
AI Engine support for Broadcast windows
Vitis AI Engine Compiler Enhanced Graph Programming Model
Vitis AI Engine Compiler - PLIO/GMIO in ADF Graphs

Analysis Enhancements, New Timeline Trace Viewer
Coding Style Enhancements, Array Partition support for Stream of Blocks type
Pragma Abstraction, New Performance Pragma (and directive)
Vitis Core “one liner”, Vitis HLS - New Timeline Trace Viewer, new PERFORMANCE pragma, Stream of Blocks support windows
New Viewer introduced
- Shows the runtime profile of all surviving functions in your design - i.e., those that get converted into modules
- Especially useful to see the behavior of dataflow regions after Co-simulation
- Native to Vitis HLS - No need to launch the xsim waveform viewer anymore (external tool)

Vitis Analyzer Improvement, Save/Restore Timeline Customization
Reporting Enhancement, report_qor_assessment, xclbin Clocking Information, Vivado Automation Summary
Profiling Enhancement, New PL profiling infrastructure enabled, Multiple trace_memory options can be added to insert multiple memory monitors (HW Only), Sample config file for v++ linker to offload trace data for all CUs in SLR0 to DDR0 and same for all CUs in SLR1 to DDR1

Updated Bootgen GUI for Versal
Toolchain Update
XSCT, Support STAPL, Add Linker script generation command
System Compile Flow, Refer to system compile doc

Add Software Emulation support for Auto-restart and mailbox support for always running kernels
Free running kernel doesn’t need while(1) for sw-emu
Add Software Emulation support for external traffic generator
Hardware Emulation can use HLS C source code function model for Streaming IP.

Add API xrt::system for Probing number of devices
Add API xrt::message for Logging messages
XRT Native API host code now requires -std=c++17 or above
Add experimental xrt::queue APIs for asynchronous execution of synchronous operations
xbutil can show AIE FIFO counters that help to debug AIE deadlock scenarios
xbutil --legacy option is removed.
xclbinutil --info provides clock information for embedded platforms
xbutil on ARM can load SOM images
xbtop standalone utility to show Linux top-like output (replacing legacy xbutil -top)
XRT Utilities supports auto-completion in Bash with tab key.

Alveo Platform Updates, Platform Updates for improved stability, Card Management Updates, SC Firmware Update Tool
Embedded Platform, New VCK190 DFX Platform: xilinx_vck190_base_dfx_202210_1, Embedded Platforms are now installed with Vitis, Vivado adds a new Customizable Example Design: Vitis Platform for MPSoC

Major overhaul of the Vitis Model Composer hub block for scalability and ease of use
Hardware validation flow now supports Linux in addition to bare-metal
"AIE to HDL" and "HDL to AIE" blocks no longer include the HDL gateway blocks
2022.1 now ships with a snapshot of the examples for customers who do not have access to the internet. The tool will prompt the user to download a new revision of the examples from GitHub if available
For ease of use, utility blocks that are not part of code generation are now presented with a white background color
Enhanced and reorganized the library browser for ease of use
RHEL 8.x support
MATLAB Support - R2021a and R2021b

2021.2

Vitis Software Platform 2021.2 Release Highlights:

New domain specific development environments
- Vitis™ Video Analytics SDK on Kria™ SOM, Alveo™ U30/U50, and VCK5000 Versal™ development card
- Vitis Blockchain solution on Varium™ C1100 card with Vitis libs
Full end to end flow support for VCK5000 and Varium C1100 cards
Enhanced core tool features
- Vitis AI Engine Compiler C/C++ high level abstraction API, Auto Pragma Inference, Area Group Constraints
- Vitis AI Engine x86simulator enhancements: Trace Report, Memory Access Violation and Deadlock Detection
- Vitis HLS EoU, Timing and QoR enhancement, HLS APIs for user-controlled burst inferencing
- Enhanced Vitis Analyzer for better timeline trace report, data visualization, stall analysis
- Vitis XRT for AI Engine Multiple Process and Multi Thread Support for AI Engine graph control
- Vitis IDE & Emulation support AI Engine Trace, SW Emulation for AI Engine applications
39 new C/C++ library functions in diverse domains, including DSP, Data Analytics, Vision, Compression, Database, Graph, Security, …, for a total of over 1000 library functions
Vitis Model Composer
- 3x compile/simulation time, 7x compilation time reduction with Parallel Compilation
- New Hardware Validation Flow and Enhanced Functional Co-simulation

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis 2021.2. For information on Supported Platforms, Changed Behavior & Known Issues, please refer to Vitis 2021.2 Release Notes for Application Acceleration Flow and Embedded Software Development Flow.

Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.

Library	2021.1	2021.2	New functions in 21.2
xf_blas	167	167	0
xf_codec	3	3	0
xf_DataAnalytics	33	36	3
xf_database	62	65	3
xf_compression	78	93	15
xf_dsp	94	96	2
xf_graph	53	59	6
xf_hpc	37	37	0
xf_fintech	116	116	0
xf_security	135	140	5
xf_solver	11	11	0
xf_sparse	11	11	0
xf_utils_hw	55	57	2
xf_opencv	147	150	3
total	1002	1041	39

Note: For vision, just count the number of sub folders in L*/tests, because each API has multiple tests for different types

Programmable Logic (PL)

End-to-end Mono Image Processing (ISP) with CLAHE TMO
RGB-IR along with an RGB-IR Image Processing (ISP) pipeline
Global Tone Mapping (GTM) along with an ISP pipeline using GTM

New Features	Cat	Customer/Strategic	Segments	Description
RGB-IR	ISP	Seeing Machines	Automotive, ISM	•Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera
Mono (CCCC)	ISP	Strategic	Automotive, ISM, A&D	•Machine vision •Low light applications
Global Tone Mapping (GTM)	ISP	Strategic	Automotive, ISM, A&D	•Improved dynamic range and contrast •Lower cost version compared to local tone mapping (LTM)
Dense Optical Flow TV-L1	CV	NTT	ISM	•Improved robustness (against illumination, noise, occlusions) for optical flow

AI Engine (AIE)

BlobFromImage
Back to back filter2D with batch size three support

New Features	Cat	Customer/Strategic	Segments	Description
RGB-IR	ISP	Seeing Machines	Automotive, ISM	•Support 4x4 RGB-IR demosaicking •Primarily for in-cabin monitoring system •Low light surveillance camera
ML+X	ISP	Strategic	Automotive, ISM, A&D	•ML inference pre-processing
Gaussian Pyramid	CV	Strategic	Automotive, ISM, A&D	•Fundamental for multi-scale image processing
Box Filter	CV	Strategic	Automotive, ISM, A&D	•Fundamental for smoothing, low pass filter

Vitis Blockchain Solution based on Vitis libraries

Out-of-Box Mining solutions for Ethereum
Open-Source & easy to use and deploy with Vitis Libs using C++
Flexible & Scalable with Vitis Libs
Be flexible to mine multiple coins
Customize and compile into hardware
Highly optimized design

Adding CSV parser API into library

The CSV parser parses comma-separated value files and generates an object stream that can easily be connected to the DataFrame APIs

New L2 libraries added
Louvain with renumber
Renumbering
The ‘weight’ feature is supported for Cosine Similarity

GQE now supports asynchronous input/output, along with multi-card support.
- Asynchronous support allows the FPGA to start processing as soon as part of the input data is ready.
- Multi-card support allows identifying multiple Alveo cards that are suitable for the work.

ZSTD Multi-Core Compression
- Created new ZSTD multi-core architecture and provided >1GB/s throughput using quad-core.
ZSTD Decompress optimization
- ZSTD decompress optimized for performance (increased by 20%) and resource (reduced < 30%)
GZIP/ZLIB Stream Core Improvement for IBM
- Customized Static & Dynamic compress streaming IP (4KB & 8KB)
- Added functionality to provide compressed size in TUSER port
GZIP/ZLIB Decompress Improvement for IBM
- Optimized Huffman decoder to reduce latency to < 1.5K cycles
- Reduced resources significantly, to 6.9K (previously > 9K)
- Added ADLER32 Checksum Functionality
GZIP System Compiler PoC
- Created a System Compiler PoC for GZIP Compress solution and benchmarked against OpenCL Host.
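The checksum functionality mentioned for GZIP/ZLIB decompression presumably refers to Adler-32, the checksum that zlib-format streams carry. As a software reference only, a minimal byte-at-a-time sketch is shown below. The function name `adler32_reference` is illustrative, and this is not the library's hardware implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Reference Adler-32: two running sums modulo 65521, the largest prime
// below 2^16. 'a' accumulates the bytes, 'b' accumulates successive values
// of 'a'. The checksum packs b into the high 16 bits and a into the low 16.
std::uint32_t adler32_reference(const unsigned char* data, std::size_t len)
{
    constexpr std::uint32_t kModulus = 65521;
    std::uint32_t a = 1;
    std::uint32_t b = 0;
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % kModulus;
        b = (b + a) % kModulus;
    }
    return (b << 16) | a;
}
```

For the nine ASCII bytes of "Wikipedia", this returns 0x11E60398, which makes a convenient check value when comparing against a hardware result.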

DSPLib on GitHub since 2021
Fast Fourier Transform (FFT/iFFT)
- Point size increase to 32k (data type dependent)
- Support for stream API as well as window API.
- Parallel Power (0-4)
  - Allows higher throughput and extends range of supported point sizes

FIR Filters
- Initial Stream support for Single Rate asymmetric / symmetric FIR

DDS/Mixer
- New library unit in 2021.2

KECCAK-256 (hash function) and CRC32C (checksum function) are released

Two Data-Mover implementations are added for debugging hardware issues.

LoadDdrToStreamWithCounter: For loading data from PL’s DDR to AI Engine through AXI stream and recording the count of data sent to AI Engine.
StoreStreamToMasterWithCounter: For receiving data from AI Engine through AXI stream and saving it to PL’s DDR, as well as recording the count of data sent to DDR.

AI Engine API

Implemented as a C++ header-only library that provides types and operations that get translated into efficient AI Engine intrinsics.
Provides parametrizable data types that enable generic programming
Implements most common operations in a uniform way for different data types
Transparently translates higher-level primitives into optimized AI Engine intrinsics
Improves portability across AI Engine architectures

AI Engine API will be the lead method for AI Engine kernel programming

High Level Optimizations

AI Engine compiler optimization options

--xlopt=0, no optimization applied.
--xlopt=1, automatic computation of heap size, guidance generation from LLVM IR analysis.
--xlopt=2, automatic inlining, loop peeling for unrolled loops, pragma insertion.

Introducing --xlopt=2 to improve performance, default remains --xlopt=1

Automatic inline
- Automatically inlines functions if it is practical and possible to do so, even if the functions are not declared as __inline or inline
Automatic pragma insertion
- Inserts pragmas into kernel code automatically

Pragma Inference

Necessary for optimizing the kernels

Relieves users of the responsibility of adding effective and correct chess pragmas

Support to auto-infer five pragmas in 2021.2

for performance:
- chess_prepare_for_pipelining for innermost loop, and outer loops with known trip count
- chess_loop_range for loops with known trip count
- chess_unroll_loop/chess_flatten_loop for innermost loops with known trip count
for correctness:
- chess_unroll_loop_preamble when trip count is not a multiple of unroll factor
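To make the list above concrete, the sketch below shows roughly what a hand-annotated kernel loop looks like when two of the performance pragmas are written explicitly on a loop with a known trip count. It is an illustration under assumptions: the kernel `dot64` is hypothetical, the `__chess__` guard is assumed to be the macro the AI Engine compiler predefines, and outside that toolchain the annotations are defined away so the code builds with an ordinary host compiler.

```cpp
#include <cstdint>

// Host-side stand-ins (assumption): when not building with the AI Engine
// compiler, the annotations carry no meaning and expand to nothing.
#ifndef __chess__
#define chess_prepare_for_pipelining
#define chess_loop_range(...)
#endif

// Hypothetical kernel body: dot product over exactly 64 samples.
// The trip count is a compile-time constant, which is the case the list
// above describes for chess_loop_range and chess_prepare_for_pipelining.
std::int32_t dot64(const std::int16_t* a, const std::int16_t* b)
{
    std::int32_t acc = 0;
    for (int i = 0; i < 64; ++i)
        chess_prepare_for_pipelining
        chess_loop_range(64, 64)
    {
        acc += static_cast<std::int32_t>(a[i]) * b[i];
    }
    return acc;
}
```

With pragma inference enabled, annotations of this kind would not need to be written by hand. The sketch only shows where they sit relative to the loop header.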

Updated Graph Programming Model PLIO and GMIO

Model Changes Include:

Changes to usage of “simulation::platform”
Interaction with PLIO/GMIO objects in the graph, position determines input/output.
Changes of global PLIO/GMIO objects in the graph.
Changes around graph connect<> statements.

PLIO/GMIO in ADF Graphs

Current

Write PLIO, GMIO, simulation::platform, and connections at global scope

GMIO gm0("GMIO_In0", 64, 1);
GMIO gm1("GMIO_In1", 64, 1); … GMIO gm7("GMIO_In7", 64, 1);
PLIO pl0("PLIO_Out0", plio_32_bits, "data/output0.txt", 250.0);
PLIO pl1("PLIO_Out1", plio_32_bits, "data/output1.txt", 250.0); … PLIO pl7("PLIO_Out7", plio_32_bits, "data/output7.txt", 250.0);
simulation::platform<8,8> plat(&gm0, &gm1, …, &gm7, &pl0, &pl1, …, &pl7);
subgraph g;
connect<> net0(plat.src[0], g.in[0]);
connect<> net1(plat.src[1], g.in[1]); …
connect<> net7(plat.src[7], g.in[7]);
connect<> net8(g.out[0], plat.sink[0]);
connect<> net9(g.out[1], plat.sink[1]);
…
connect<> net15(g.out[7], plat.sink[7]);

Alternative method

Create a top-level graph and move PLIO, GMIO, and connections inside
Allow managing connections within for loop

class topgraph : public graph
{
public:
  input_gmio gm[8];
  output_plio pl[8];
  subgraph sg;

  topgraph()
  {
    // Create each GMIO/PLIO and wire it to the subgraph inside one loop
    for (int i = 0; i < 8; i++)
    {
      gm[i] = input_gmio::create("GMIO_In" + std::to_string(i), 64, 1);
      pl[i] = output_plio::create("PLIO_Out" + std::to_string(i), plio_32_bits, "data/output" + std::to_string(i) + ".txt", 250.0);
      connect<>(gm[i].out[0], sg.in[i]);
      connect<>(sg.out[i], pl[i].in[0]);
    }
  }
};

topgraph g;

Area Group Constraints Improvements

Ability to use flags in the ADF graph or constraints file to control the mapper and router

-contain_routing – when specified true ensures all routing, including nets between nodes contained in the nodeGroup, is contained within the area group.
-exclusive_routing - when specified true ensures all routing, excluding nets between nodes from the nodeGroup, is excluded from the area group.
-exclusive_placement - when specified true prevents all nodes not included in the nodeGroup from being placed within the area group bounding box.

Snapshots

Snapshots are text files containing comments and data relative to all kernel ports

streams, packet streams, cascade streams
windows, buffer
RTP

Also includes all platform ports

PLIO, GMIO, RTP

Allows users to inspect data traffic at kernel ports without using the debugger and without requiring instrumentation of kernel code

Deadlock Detection

Detects deadlocks in x86 simulations whether this situation arises from insufficient input data, or an imbalanced FIFO depth on a re-convergent path
The stop-on-deadlock feature must be enabled during x86 simulation by specifying option --stop-on-deadlock
If the simulation is stopped because of a deadlock, the error message indicates that you should rerun with option -trace --timeout

Memory Access Violation Detection

Integration with Valgrind for Memory Access Violation Detection

Detect
- out-of-bounds read and write
- read of uninitialized memory
No specific flag required for compilation
Simulation flags can be either
- --valgrind : simulation runs as usual and Valgrind displays a report
- --valgrind-gdb : same as above, with GDB debugging enabled at the same time

Trace report

A deadlock produces little useful simulation output, which makes the origin of the bug difficult to analyze

The x86 simulation trace option allows the simulator to log various timestamped information:

Start/End of Kernel iterations
Start/End of Stream stalls
Start/End of lock stall

Timestamps differ between x86 simulation and AI Engine simulation

User Controlled Burst Inference

For use cases that do not satisfy the automatic burst inference of the Vitis HLS tool, users can adopt the newly introduced manual burst optimization
A new class, 'hls::burst_maxi', supports manual control of burst behavior. New HLS APIs are provided for use together with the new class
Users need to understand the AMBA AXI protocol and hardware transaction-level modeling in HLS designs
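The request/transfer/response pattern behind manual burst control can be sketched on the host without the HLS headers. In the sketch below, `sketch::burst_maxi_model` is a hypothetical stand-in class, not the real `hls::burst_maxi`; its method names (`read_request`, `read`, `write_request`, `write`, `write_response`) are chosen to mirror the ones documented for `hls::burst_maxi`, and the single-outstanding-request rule enforced by the asserts is a simplifying assumption of this model.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical host-side model of a burst-capable AXI master port.
// It only checks the call ordering; it does not model AXI timing.
template <typename T>
class burst_maxi_model {
public:
    explicit burst_maxi_model(T* base) : base_(base) {}

    // Announce a read burst of `len` elements starting at `offset`.
    void read_request(std::size_t offset, std::size_t len) {
        assert(rd_left_ == 0);  // assumption: one outstanding read burst
        rd_ptr_ = base_ + offset;
        rd_left_ = len;
    }
    // Consume one beat of the announced read burst.
    T read() {
        assert(rd_left_ > 0);
        --rd_left_;
        return *rd_ptr_++;
    }
    // Announce a write burst of `len` elements starting at `offset`.
    void write_request(std::size_t offset, std::size_t len) {
        assert(wr_left_ == 0 && !awaiting_response_);
        wr_ptr_ = base_ + offset;
        wr_left_ = len;
        awaiting_response_ = true;
    }
    // Send one beat of the announced write burst.
    void write(const T& value) {
        assert(wr_left_ > 0);
        --wr_left_;
        *wr_ptr_++ = value;
    }
    // Wait for the write burst to be acknowledged.
    void write_response() {
        assert(awaiting_response_ && wr_left_ == 0);
        awaiting_response_ = false;
    }

private:
    T* base_;
    T* rd_ptr_ = nullptr;
    T* wr_ptr_ = nullptr;
    std::size_t rd_left_ = 0;
    std::size_t wr_left_ = 0;
    bool awaiting_response_ = false;
};

}  // namespace sketch

// Copy `size` elements as one read burst and one write burst.
void copy_burst(sketch::burst_maxi_model<int> in,
                sketch::burst_maxi_model<int> out, std::size_t size) {
    in.read_request(0, size);
    out.write_request(0, size);
    for (std::size_t i = 0; i < size; ++i) {
        out.write(in.read());
    }
    out.write_response();
}
```

The point of the pattern is that the burst length is stated explicitly by the designer in the request calls rather than inferred by the tool from loop structure.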

Timing and QoR Enhancements

Provide support for user to input high level throughput constraints
Improve HLS timing estimation accuracy. When HLS reports timing closure, RTL synthesis in Vivado should also be expected to meet timing

EoU Enhancements

Add interface adaptors report in the C synthesis reports

Users need to know the resource impact that interface adaptors have on their design
Interface adaptors have variable properties that impact design QoR
Some of these properties have associated user controls which should be reported to users
Text version of bind_op and bind_storage reports are provided

Add new section in synthesis report to show list of pragmas and warnings on pragmas

Users can easily see which of the pragmas they added have issues.

Analysis and Reporting Enhancements

The Function Call Graph Viewer has some new features

New mouse drag based zoom in and out capability
New Overview feature that shows the full graph and allows the user to zoom in on parts of the overall graph
All functions and loops are shown along with their simulation data

A new Timeline Trace Viewer is now available after simulation. This viewer shows the runtime profile of your design and allows the user to remain in the Vitis HLS GUI.

Link Summary Enhancement

Provide clock frequency information for the AI Engine, platform and compute units
Provide a new table called Clocks in system diagram and platform diagram

Platform Export Enhancement

XSA export from Vivado does not require source files to be local to the project
XSA export from Vivado does not change the project structure
Package the IPs that are used in the hardware platform project instead of packaging the whole IP repo

AI Engine application emulation enhancements

Provide support for external testbench integration with aiesimulation
Provide support for external testbench integration with x86simulation
Support for GDB debugging with x86simulation
Provide support for snapshots of the data between kernels in a graph for x86simulation
Provide support for access violation checking to x86sim
Provide support for stop on deadlock to x86sim

Support AI Engine Trace

Support SW Emulation for AI Engine applications

Support external traffic generator in Verilog / System Verilog

Extend Profiling Monitor insertion to Monitor Memory

Currently, the profiling monitor logic can be inserted on a kernel/CU port basis. This feature gives users the option to insert monitor logic directly on the memory interface
The visualization of memory bandwidth achieved directly on the memory interfaces can be reflected in profile summary report
DDR memory and PLRAM are supported
Hardware flow is supported
To enable this feature, both the linking phase and XRT must be configured
- memory=all
- data_transfer_trace=coarse|fine or
- opencl_device_counter=true

A vadd example that enables memory interface monitoring
- A new table ‘Memory Bank Data Transfer’ is included

Vitis Analyzer Enhancements

Generic profile summary report generated for non-OpenCL applications

Provide the same level of support for XRT API and HAL API applications.
Users select which types of reports they want to create, and the tool automatically generates and visualizes them in Vitis Analyzer

Add OpenCL commands to PL event timeline

Profiling adds overhead. XRT can dump the OpenCL events to the timeline trace without that overhead.
Vitis Analyzer can process the XRT output and show it in the timeline trace view.
xocl_debug=true must be set in xrt.ini.

Flatten signal hierarchy in timeline trace report

By default, the timeline trace report displays the signal trace hierarchically
Vitis Analyzer can flatten the hierarchy when the “Flatten Signal” symbol is toggled
Waveform comparison is supported for the flattened timeline trace

Vitis Analyzer – Data Visualization

Display input/output data to AI Engine kernels in an AI Engine design
- Helps debug AI Engine designs to show input/output data along with timeline
Works with aiesimulator
Supports
- Window/stream/cascade data types
- Packet streams
- Templated kernels
- data-dump utility

Vitis Analyzer – AI Engine Stall Analysis

Vitis Analyzer provides visualization capabilities that help users identify the root cause of stalls
Support
- Performance Metrics
- Lock Stall Analysis
- Stream Stall Analysis
- Cascade Stall Analysis
- Memory Stall Analysis
Support Flow
- aiesimulator
- HW emulation

Xilinx Runtime Library (XRT): www.xilinx.com/xrt

XRT API
- The XRT native API supports user managed kernel control with xrt::ip
XRT Utilities
- The new xbutil and xbmgmt tools are now the default
  - To use the legacy utilities, please use xbutil --legacy or xbmgmt --legacy with legacy sub-commands
- New utility, xball
  - Apply xbutil or xbmgmt commands to all or a filtered part of the installed data center cards. Check xball --help for details
- A new command, xbutil configure
  - Allows you to enable, disable, or configure the PCIe Host Memory and PCIe Peer to Peer features. See the XRT documentation for more details
- All XRT utilities now globally support the --force option to skip user interactive confirmation
Profiling
- A profile summary report is generated when any profiling option is enabled.
- All applicable summary tables and guidance are generated based on the profiling options enabled in the xrt.ini file
- New data transfer summary table for aggregate information on a memory resource when monitors are added to memory resources in the design
- New AIE profiling metric sets to count different AIE events including (1) floating point exceptions in AIE, (2) tile execution counts, and (3) stream puts and gets
Embedded
- zocl memory manager improvements to support any sptag

Vitis XRT for AI Engine Multiple Process Support

C and C++ APIs to define access modes for multiple processes to share access to the same AI Engine array and graphs.
- Protect AI Engine array & graphs from unwanted access.
Three modes are supported for opening AI Engine array & graphs
- Exclusive Mode (prevents any other process from gaining access)
- Primary Mode (only allows other processes nondestructive access)
- Shared Mode (performs only nondestructive access)
Consider the access mode whenever multiple-process support is needed. For example:
- Prevent others from accessing the AI Engine array (exclusive access)
- Multiple users controlling different graphs separately (multiple application support)
- One primary user controlling a graph while allowing others to probe the running status (primary & shared access)

Vitis XRT for AI Engine Support Status

C and C++ APIs

C version API
- For AI Engine array:
  - xrtAIEDeviceOpenExclusive (Exclusive mode)
  - xrtAIEDeviceOpen (Primary mode)
  - xrtAIEDeviceOpenShared (Shared mode)
- For AI Engine graph:
  - xrtGraphOpenExclusive (Exclusive mode)
  - xrtGraphOpen (Primary mode)
  - xrtGraphOpenShared (Shared mode)
C++ version API
- The xrt::aie::device class supports an access mode in its constructor
  - enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2 };
- The xrt::graph class supports an access mode in its constructor
  - enum class access_mode : uint8_t { exclusive = 0, primary = 1, shared = 2, none = 3 };
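The way the three modes interact can be modeled with a few lines of host code. This is a minimal sketch, not XRT code: `AccessArbiter` is a hypothetical class, the enum is copied from the xrt::aie::device listing above, and two rules are assumptions that the text does not state explicitly (only one primary holder at a time, and exclusive access is refused while any other handle is open).

```cpp
#include <cassert>
#include <cstdint>

enum class access_mode : std::uint8_t { exclusive = 0, primary = 1, shared = 2 };

// Hypothetical model of who may open an AI Engine array or graph.
class AccessArbiter {
public:
    // Returns true if a new handle with mode `m` would be granted.
    bool try_open(access_mode m) {
        if (exclusive_held_) return false;  // exclusive locks everyone out
        switch (m) {
            case access_mode::exclusive:
                // Assumption: refused while any other handle is open.
                if (primary_held_ || shared_count_ > 0) return false;
                exclusive_held_ = true;
                return true;
            case access_mode::primary:
                // Assumption: at most one primary (controlling) holder.
                if (primary_held_) return false;
                primary_held_ = true;
                return true;
            case access_mode::shared:
                ++shared_count_;  // nondestructive access can coexist
                return true;
        }
        return false;
    }

    // Releases a handle that was previously granted with mode `m`.
    void close(access_mode m) {
        switch (m) {
            case access_mode::exclusive:
                assert(exclusive_held_);
                exclusive_held_ = false;
                break;
            case access_mode::primary:
                assert(primary_held_);
                primary_held_ = false;
                break;
            case access_mode::shared:
                assert(shared_count_ > 0);
                --shared_count_;
                break;
        }
    }

private:
    bool exclusive_held_ = false;
    bool primary_held_ = false;
    unsigned shared_count_ = 0;
};
```

With this model, a primary holder plus any number of shared observers corresponds to the "primary & shared access" use case, while a granted exclusive handle makes every later `try_open` fail until it is closed.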

Access latest Vitis Target Platforms for Alveo Cards:

www.xilinx.com/alveo and refer to the Getting Started section of the Accelerator Card
www.xilinx.com/download and refer to the Alveo Packages section

Refer to UG1120 - Alveo Data Center Accelerator Card Platforms User Guide

AI Engine DSP Library – New Blocks

AIE DDS
AIE Mixer

Parallel Compilation

Reduced times vs. 2021.1 (As an example, the following numbers are for the 200 MHz TX Chain):

Time to compile and simulate reduced by factor of 3
Compilation times reduced by a factor of 7
Dead time after simulation reduced from 25s to ~0s

Constraint Editor Enhancement

2021.2 Improved Navigation

To Fixed Size Improvements

To Variable Size Block Improvements

Enhanced Functional Co-simulation Capabilities

Export Matlab data for AI Engine input – xmcVitisWrite
Import AI Engine Data into Matlab – xmcVitisRead

Others

Import an AI Engine or HLS Kernel block with no input (Source block)
New Data Type Support
- the Simulink native int64 and uint64 for AI Engine development instead of AMD data types, x_sfix64 and x_ufix64.
- accfloat and caccfloat for AI Engine Development
Support for Ubuntu 20.04
Support for MATLAB R2020a, R2020b, and R2021a (no support for MATLAB R2021b)
Addition of new examples
- Dual stream SSR filter example with 64 kernels
- Pseudo inverse(64x32) – commslib example.
Use xmcLibraryPath command to point to a custom DSPLib location.
Many more enhancements and bug fixes

2021.1

Vitis Software Platform 2021.1 Release Highlights:

AMD Kria System-on-Modules (SOMs) KV260 vision AI starter kit support. The full Vitis flow for ML (DPU inference engine) + X (RTL kernel and Vitis HLS based computer vision kernels).
Support for new C/C++ Vision, DSP, Graph (Louvain Modularity), Codec in image processing, compression (GZIP, Facebook ZSTD, ZLIB whole application acceleration) performance-optimized libraries on FPGA and/or Versal ACAP over CPU/GPUs
Enhanced Vitis™ core development kit design flow on Versal ACAP devices: visualization improvements for AI engine design trace report, AI engine event tracing via GMIO, incremental recompile, new boot image wizard, and encrypted AI engine source file support
The new Vitis Model Composer tool enables rapid design exploration and verification within the MathWorks MATLAB and Simulink® environment, enabling co-simulation of blocks targeting AI Engines and Programmable Logic, code generation, and test bench creation.
New Vitis HLS Flow Navigator GUI for quick access to flow phases and reports. Merge synthesis, analysis, and debug views into a general default context

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis 2021.1. For information on Supported Platforms, Changed Behavior & Known Issues, please refer to Vitis 2021.1 Release Notes for Application Acceleration Flow and Embedded Software Development Flow.

Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.

AIE DSP

DSPLib is published as part of the Vitis Acceleration Library set on GitHub

DSPLib contains common parameterizable DSP functions used in many advanced signal processing applications. All functions currently support window interfaces with streaming interface support.

FIR Filters

Function	Namespace
Single rate, asymmetrical	dsplib::fir::sr_asym::fir_sr_asym_graph
Single rate, symmetrical	dsplib::fir::sr_sym::fir_sr_sym_graph
Interpolation, asymmetrical	dsplib::fir::interpolate_asym::fir_interpolate_asym_graph
Decimation, halfband	dsplib::fir::decimate_hb::fir_decimate_hb_graph
Interpolation, halfband	dsplib::fir::interpolate_hb::fir_interpolate_hb_graph
Decimation, asymmetric	dsplib::fir::decimate_asym::fir_decimate_asym_graph
Interpolation, fractional, asymmetric	dsplib::fir::interpolate_fract_asym::fir_interpolate_fract_asym_graph
Decimation, symmetric	dsplib::fir::decimate_sym::fir_decimate_sym_graph

FFT/iFFT - The DSPLib contains one FFT/iFFT solution. This is a single-channel, single-kernel, decimation-in-time (DIT) implementation with configurable point size, complex data types, cascade length, and FFT/iFFT function.

Function	Namespace
Single Channel FFT/iFFT	dsplib::fft::fft_ifft_dit_1ch_graph
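For readers unfamiliar with the decimation-in-time structure, a textbook radix-2 DIT FFT in plain C++ is sketched below. It is a generic reference written for illustration, not the DSPLib kernel: the function name `fft_dit`, the use of `std::complex<double>`, and the 1/N scaling of the inverse are assumptions of this sketch, and the point size must be a power of two.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// In-place radix-2 decimation-in-time FFT.
// inverse == true computes the iFFT, scaled by 1/N.
void fft_dit(std::vector<std::complex<double>>& data, bool inverse) {
    const std::size_t n = data.size();
    assert(n > 0 && (n & (n - 1)) == 0);  // power-of-two point size
    if (n < 2) return;

    // DIT step 1: reorder the input into bit-reversed index order.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(data[i], data[j]);
    }

    // DIT step 2: combine sub-transforms of length 2, 4, ..., n.
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double angle = (inverse ? 2.0 : -2.0) * pi / static_cast<double>(len);
        const std::complex<double> wlen(std::cos(angle), std::sin(angle));
        for (std::size_t start = 0; start < n; start += len) {
            std::complex<double> w(1.0, 0.0);  // running twiddle factor
            for (std::size_t k = 0; k < len / 2; ++k) {
                const std::complex<double> even = data[start + k];
                const std::complex<double> odd = data[start + k + len / 2] * w;
                data[start + k] = even + odd;            // butterfly, upper
                data[start + k + len / 2] = even - odd;  // butterfly, lower
                w *= wlen;
            }
        }
    }

    if (inverse) {
        for (auto& x : data) x /= static_cast<double>(n);
    }
}
```

Calling `fft_dit(x, false)` followed by `fft_dit(x, true)` returns the original samples up to rounding error, which is a convenient sanity check for any FFT/iFFT pair.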

Matrix Multiply (GeMM) - The DSPLib contains one Matrix Multiply/GEMM (GEneral Matrix Multiply) solution. This supports the matrix multiplication of two matrices, A and B, with configurable input data types, resulting in a derived output data type.

Function	Namespace
Matrix Mult / GeMM	dsplib::blas::matrix_mult::matrix_mult_graph
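The idea of an output type derived from two configurable input types can be illustrated in plain C++. The sketch below is not the DSPLib graph: `mm_out_t` and `matrix_mult` are hypothetical names, the matrices are stored row-major in `std::vector`, and the derivation rule used here (the C++ type of `TA * TB`) is only a stand-in for whatever rule the library actually applies.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Output element type derived from the two input element types.
// Assumption: the ordinary C++ result type of TA * TB stands in for
// the library's own derivation rule.
template <typename TA, typename TB>
using mm_out_t = decltype(std::declval<TA>() * std::declval<TB>());

// C (m x n) = A (m x k) * B (k x n), all matrices row-major.
template <typename TA, typename TB>
std::vector<mm_out_t<TA, TB>> matrix_mult(const std::vector<TA>& a,
                                          const std::vector<TB>& b,
                                          std::size_t m, std::size_t k,
                                          std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n);
    using out_t = mm_out_t<TA, TB>;
    std::vector<out_t> c(m * n, out_t());
    for (std::size_t row = 0; row < m; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            out_t acc = out_t();
            for (std::size_t i = 0; i < k; ++i) {
                acc += a[row * k + i] * b[i * n + col];
            }
            c[row * n + col] = acc;
        }
    }
    return c;
}
```

With this rule, multiplying an `int` matrix by a `float` matrix yields `float` results, and two `short` matrices yield `int` results because of integer promotion.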

Widget Utilities - These widgets convert between windows and streams on the input to the DSPLib function, and between streams and windows on the output of the DSPLib function where desired. An additional widget converts between real and complex data types.

Function	Namespace
Stream to Window / Window to Stream	dsplib::widget::api_cast::widget_api_cast_graph
Real to Complex / Complex to Real	dsplib::widget::real2complex::widget_real2complex_graph

DSP Library functions are supported in Vitis Model Composer, enabling users to easily plug these functions into the MATLAB/Simulink environment to ease AI Engine DSP Library evaluation and overall AI Engine ADF graph development.

The Vitis HPC Library release introduces HLS primitives, prebuilt kernels, and software APIs for HPC applications on FPGAs. These applications are:
- 2D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel and backward kernel
- 3D Acoustic RTM (Reverse Time Migration) FDTD (Finite Difference Time Domain) algorithm, including forward kernel
- MLP (Multi-Layer Perceptron) components: activation functions and fully connected network kernels
- PCG (Preconditioned Conjugate Gradient) Solvers for both dense matrix and sparse matrix

First release of selected vision functions for Versal AI Engines:
Functions available
- Filter2D
- absdiff
- accumulate
- accumulate_weighted
- addweighted
- blobFromImage
- colorconversion
- convertscaleabs
- erode
- gaincontrol
- gaussian
- laplacian
- pixelwise_mul
- threshold
- zero
xfcvDataMovers : Utility datamovers to facilitate easy tiling of high resolution images and transfer to local memory of AI Engine cores. Two flavors are available:
- Using PL kernel : higher throughput at the expense of additional PL resources.
- Using GMIO : lower throughput than the PL kernel version, but uses the Versal NoC (network on chip) and no PL resources.
New Programmable Logic (PL) functions and features
ISP pipeline and functions:
- Updated 2020.2 Non-HDR Pipeline
  - Support for changing a few of the ISP parameters at runtime: gain parameters for the red and blue channels, the AWB enable/disable option, gamma tables for R, G, and B, and the percentage of pixels used to compute min and max for AWB normalization.
  - Gamma Correction and Color Space conversion (RGB2YUYV) made part of the pipeline.
- New 2021.1 HDR Pipeline : 2020.2 Pipeline + HDR support
  - HDR merge for 2 exposures which supports sensors with digital overlap between short exposure frame and long exposure frame.
    - Four Bayer patterns supported: RGGB, BGGR, GRBG, GBRG
  - HDR merge + ISP pipeline with runtime configurations, which returns RGB output.
  - Extraction function : the HDR extraction function is a preprocessing function that takes a single digital overlapped stream as input and returns the two output exposure frames (SEF, LEF).
- 3DLUT : provides input-output mapping to control complex color operators, such as hue, saturation, and luminance.
- CLAHE: Contrast Limited Adaptive Histogram Equalization is a method which limits the contrast while performing adaptive histogram equalization so that it does not over-amplify the contrast in near-constant regions. It also reduces the problem of noise amplification.
Flip : Flips the image along the horizontal and vertical lines.
Custom CCA : Custom version of the Connected Component Analysis algorithm for defect detection in fruits. Apart from computing the defective portion of the fruit, it computes the defective pixels as well as the total fruit pixels
Canny updates : Canny function now supports any image resolution.

Library Related Changes

All tests have been upgraded from using OpenCV 3.4.2 to OpenCV 4.4
Added support for Versal Edge series (VCK190)
A new benchmarking section with benchmarking collateral for selected pipeline/functions published.

The 2021.1 release provides Two-Gram text analytics:
- Two Gram Predicate (TGP) is a search of the inverted index with a term of 2 characters. For a dataset that established an inverted index, it can find the matching id in each record in the inverted index.
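A software view of a two-character search against an inverted index might look like the following. This is a minimal host-side sketch, not the library kernel: `TwoGramIndex`, `build_index`, and `lookup` are hypothetical names, and the index simply maps each two-character substring to the sorted set of record ids that contain it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Inverted index: two-character term -> ids of records containing it.
using TwoGramIndex = std::map<std::string, std::set<std::size_t>>;

// Build the index by sliding a two-character window over each record.
TwoGramIndex build_index(const std::vector<std::string>& records) {
    TwoGramIndex index;
    for (std::size_t id = 0; id < records.size(); ++id) {
        const std::string& rec = records[id];
        for (std::size_t pos = 0; pos + 1 < rec.size(); ++pos) {
            index[rec.substr(pos, 2)].insert(id);
        }
    }
    return index;
}

// Return the ids of all records that contain the two-character term.
std::vector<std::size_t> lookup(const TwoGramIndex& index, const std::string& term) {
    assert(term.size() == 2);
    const auto it = index.find(term);
    if (it == index.end()) return {};
    return std::vector<std::size_t>(it->second.begin(), it->second.end());
}
```

For the records "alveo", "versal", and "vitis", looking up "ve" returns ids 0 and 1, while "vi" returns only id 2.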

Community Detection: Louvain Modularity
2-Hop Search

Adds double-precision SpMV (Sparse Matrix dense Vector multiplication) implementation with L2 kernels

In the 2021.1 release, GQE receives early-access support for the following features:
- 64-bit join support: the gqeJoin kernel and its companion gqePart kernel have been extended to 64-bit key and payload, so that a larger scale of data can be supported.
- Initial Bloom-filter support: the gqeJoin kernel now ships with a mode in which it executes Bloom-filter probing. This improves efficiency on certain multi-node flows where minimizing data size in the early stage is important.
- Both features are currently offered as L3 pure software APIs. See the corresponding L3 test cases.
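Why Bloom-filter probing shrinks data early can be seen in a few lines of host code. The sketch below is a generic Bloom filter, not the gqeJoin implementation: `BloomFilter`, `bloom_probe`, and the `mix64` hash (a splitmix64-style mixer) are choices made for this illustration, as are the 64-bit keys and the use of two hash probes per key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// splitmix64-style bit mixer, used here as an illustrative hash.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

class BloomFilter {
public:
    explicit BloomFilter(std::size_t nbits) : bits_(nbits, false) {
        assert(nbits > 0);
    }
    // Set the two bit positions derived from the key.
    void insert(std::uint64_t key) {
        bits_[index(key, 0)] = true;
        bits_[index(key, 1)] = true;
    }
    // False means "definitely absent"; true means "possibly present".
    bool may_contain(std::uint64_t key) const {
        return bits_[index(key, 0)] && bits_[index(key, 1)];
    }

private:
    std::size_t index(std::uint64_t key, std::uint64_t seed) const {
        const std::uint64_t h = mix64(key ^ (seed * 0x9e3779b97f4a7c15ULL));
        return static_cast<std::size_t>(h % bits_.size());
    }
    std::vector<bool> bits_;
};

// Keep only the probe keys that might have a join partner.
std::vector<std::uint64_t> bloom_probe(const BloomFilter& filter,
                                       const std::vector<std::uint64_t>& keys) {
    std::vector<std::uint64_t> survivors;
    for (std::uint64_t key : keys) {
        if (filter.may_contain(key)) survivors.push_back(key);
    }
    return survivors;
}
```

A filter built from the keys of the smaller table never rejects a key that is really present, so rows dropped by `bloom_probe` can safely be skipped before the join itself; a small fraction of false positives may still pass through.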

GZIP Multi Core Compression:
- New GZIP Multi-Core Compress Streaming Accelerator, a purely stream-only solution (free-running kernel). It comes in several variants supporting block sizes of 4KB, 8KB, 16KB, and 32KB.
Facebook ZSTD Compression Core:
- New Facebook ZSTD Single Core Compression accelerator with block size 32KB. Multi-core ZSTD compression is in progress (for higher throughput).
GZIP low latency Decompression:
- A new version of GZIP decompress with improved latency for each block, fewer resources (35% lower LUT, 83% lower BRAM), and improved FMax.
ZLIB Whole Application Acceleration using U50:
- L3 GZIP solution for the U50 platform, containing 6 compression cores to saturate the full PCIe bandwidth. It is provided with an efficient GZIP software solution that accelerates the CPU libz.so library and provides seamless inflate and deflate API-level integration into end-customer software without recompiling.
Versal platform support.

Add AIE Support - See above

The 2021.1 release provides support for:
- RIPEMD160
- Initial support for BLS (not complete)

In the 2021.1 release, Data-Mover is added to this library. Unlike other C++ based APIs, this addition targets people who are less experienced in HLS-based kernel design and just want to test their stream-based designs. The Data-Mover is actually a kernel source code generator that creates a list of common helper kernels to drive or validate designs, such as those on AIE devices.

Produce QoR metrics (Vitis QoR Generation API)
- Cycles taken by the application kernel
- Stall cycles (computed from VCD file)
- Measure overhead cycles in the wrapper (time spent in other functions than the kernel itself)
- Throughput
3 levels of optimization XLOPT=0, 1 (default), 2
New functionalities for xlopt=2:
- loop fusion, flatten single iteration outer loops, enhance loop peeling heuristics
Analyze "__restrict" usage and give guidance
Incremental recompile: when the graph does not change, recompile only kernels that have been modified
Packet Switched data → up to 32-split (was limited to 4)
New DMA FIFO location constraint (mapper/router changes between releases do not impact performance)
Use mapping solution as a constraint in the new compilation: prevent future mapping variations that impact performance
Bring x86sim feature support to aiesim level
Start of deprecation of PL kernels in ADF graphs (complete deprecation in 2021.2)

New “Flow Navigator” in GUI for quick access to flow phases and reports. The contextual "synthesis, analysis, debug" views are merged into a general default context
New synthesis report section for the BIND_OP and BIND_STORAGE directives
A new post-synthesis text report reflects the information provided in the GUI synthesis report
The IP export and Vivado implementation run widgets have been redesigned with options to pass settings and constraint files to Vivado
New function call graph viewer to visualize functions and loops which can be highlighted with an optional heatmap to detect II, latency, or DSP/BRAM utilization hot spots
Versal timing calibration and new controls for DSP block native floating-point operations (the -precision option for config_op)
The Vitis HLS Migration guide (former UG1391) is now a chapter in UG1399
New methodology sections in user guide (UG1399 and web)
Alternate flushable pipeline option has been improved (free-running pipeline aka "frp")
In Vitis, a top port pointer can now simply be mapped onto the axi-lite adapter rather than a global memory
The aggregate directive now provides a "-compact bit" option for maximum packing
Adds back a "Leave Feedback" entry in Help menu with optional survey
Fixed bug for "Man Pages" tab not displaying information on some Linux systems
In Vitis, reshaping m_axi interfaces should be done via the hls::vector types
New customization options for s_axilite and m_axi data storage, which can be "auto", "uram", "bram", or "lutram", allowing you to tweak RAM utilization in your design
In Vitis, a new continuously running (aka "never-ending") mode for kernels is introduced
The axi_lite secondary clock option has been re-instated

Enhance support for RTL kernel packaging in Vivado IP packager
- public and productized feature with proper methodology and documentation.
- XRT managed kernel is the default flow.
Support encrypted AIE source files as input
- AIE compiler can accept encrypted AIE source file and v++ supports the rest of the flow.

Add Create Boot Image Wizard support for Versal devices
Multiple improvements for AI Engine programming and debugging
- Being able to turn on and off micro code labels
- Static Cross-probing between the source code and the microcode
- Full view of the microcode
- Bringing the last PC in the visible area whenever Pipeline view updates the data
- Aligning the Instruction data in Pipeline view
- Adding "Single Instruction Mode" action to disassembly view.
Be able to generate a default BIF file for a platform project
Program Flash for SD and eMMC adds raw mode support
In-context help messages are added to AI Engine development flow
Upgraded GCC toolchain version to 10.2

Users can emulate AXI-MM master/slave through an external process such as Python / C++. This may help users to emulate design with quick design time of AXI Master / Slave, without investing resources in developing AXI Master or VIP. AXI-MM Inter-process communication can also help to emulate the Chip-to-Chip connection between two FPGAs.
Enabling compilation of Versal models for VCS.
Platform developers can run hardware emulation on the platform with standalone applications to test the platform in the early stage.

User range profiling information and user event information are aggregated into profile summary report
Vitis Analyzer shows a critical timing path.
- Vitis Analyzer will display a simplified version of the Vivado GUI timing report, without the need to open a Vivado project or netlist. This allows users to quickly navigate to the failing timing path.
Vitis Analyzer multiple strategies support
- Results from multiple strategies run can be visualized in Vitis Analyzer.

New xrt.ini switches for profiling and debug
Reduce memory and loading time for large applications
- The new profile tool uses fewer resources when processing large CSV files, which reduces loading time and the likelihood of crashes.
PL continuous trace offloading improvement
- Use DDR or HBM as memory resource to store trace data
- Circular buffer support for large data offloading
- Trace buffer size and offloading interval can be set in xrt.ini
Improvements to the visualization of AIE design’s trace report
- All AIE inputs will be displayed (window, stream, cascaded stream, etc.)
- Support all IO data types

Stable native XRT API, with C++ APIs for AIE graph control and execution, Software Emulation and tracing support.
XRT provides new helper APIs to help users to move from OpenCL API to XRT native API in $XILINX_XRT/include/CL/cl2xrt.hpp.
The new XRT API xrt::device::get_info() can extract device properties
Greatly improved next generation xbutil and xbmgmt utilities are now the default.
xbutil can report power status
xbmgmt supports runtime clock scaling and setting a user power threshold to protect the board and server.
sysfs, xbmgmt and xbutil can report MAC address of Alveo board
The KDS scheduler in xocl has been refactored to significantly improve the throughput across hundreds of processes exercising multiple compute units across multiple devices concurrently. For legacy shells, you may notice a small percentage of throughput degradation. Please see the AR for the proper solution.
XRT driver debug trace support through debugfs /sys/kernel/debug/xclmgmt/ and /sys/kernel/debug/xocl/

Access the latest Vitis Target Platforms for Alveo Accelerator cards at www.xilinx.com/alveo. Please refer to the Getting Started section of the accelerator card you want to deploy your applications on.

Please refer to UG1120 - Alveo Data Center Accelerator Card Platforms User Guide for more details and to keep up-to-date on the latest Vitis Target Platform releases, as they become available.

New Platforms

Alveo U200 Gen3x16 XDMA 1RP
- Name: xilinx_u200_gen3x16_xdma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, DDR Self-Refresh
Alveo U50 Gen3x16 noDMA 1RP
- Name: xilinx_u50_gen3x16_nodma_1_202110_1
- Features: Slave Bridge, P2P, GT Kernel, Clock Throttling

VCK190 Base Platform enables ECC on DDR and LPDDR; constraints become concise.
MPSoC base platforms increased CMA size to 1536M. All Vitis-AI models can run with this CMA size.
Embedded platform creation flow gets simplified: Device Tree Generator can automatically generate a ZOCL node; XSCT can generate BIF files. Base platform source files are reduced.

Support for Kubernetes (K8s) clusters: Xilinx FPGA Resource Manager (XRM) can now be used together with Kubernetes to run and manage compute units (CUs) across a pool of multiple Alveo accelerator cards attached to a server and scale applications to multiple servers with Alveo cards.

A comprehensive constraint editor enables users to specify any constraint for AI Engine kernels in Vitis Model Composer. The generated ADF graph will contain these constraints.
Addition of AI Engine FFT and IFFT blocks to the library browser.
Users now have access to many variations of AI Engine FIR blocks in the library browser.
Ability to specify filter coefficients using input ports for FIR filters.
Addition of two new utility blocks "RTP Source" and "To Variable Size".
Enhanced AIE Kernel import block now also supports importing templatized AI Engine functions.
Ability to specify AMD platforms for AI Engine designs in the Hub block.
Through the Hub block, users can relaunch Vitis Analyzer at any time after running AIE Simulation.
Users can now plot cycle approximate outputs and see estimated throughput for each output using Simulink Data Inspector.
Enhanced usability to import a graph as a block using only the graph header file.
Revamping of the progress bar with cancel button
Usability improvement during importing an AI Engine kernel or simulation of a design when MATLAB working directory and model directory are not the same.
New TX Chain 200MHz example.
New 2d FFT examples showcasing designs with HLS, HDL, and AI Engine blocks.

Simulation speed enhancement for SSR FIR (more than 10x improvement), and SSR FFT.
Simulation speed enhancement for memory blocks like RAMs, and FIFOs
Questa Simulator updated with VHDL 2008 in the Black-box import flow

Vitis Model Composer now contains the functionality of AMD System Generator for DSP. Users who have been using AMD System Generator for DSP can continue development using Vitis Model Composer.
MATLAB Support - R2020a, R2020b & R2021a

2020.2

Vitis Software Platform 2020.2 Release Highlights:

Vitis 2020.2 supports application acceleration and embedded software development for Versal ACAP Platforms
Vitis Core Development Kit now includes the AI Engine Compiler to compile C/C++ applications for Versal AI Engines. AI Engine, part of Versal AI Core Series, is a vector processor for compute-intensive applications
Vitis HLS is default for both accelerated-kernel compilation (Vitis) and C/C++ to RTL IP creation flow (Vivado)
600+ FPGA-accelerated functions across 13 performance-optimized libraries. 2020.2 introduces the new Vitis HPC library for accelerating high-performance computing applications and several enhancements & additions to the Data Analytics, Graph, BLAS, Sparse, Security & Database libraries
Support for evaluating multiple implementation strategies for final FPGA binary creation & enhancements for easier RTL-kernel integration within Vitis applications
Other enhancements in this release include support for AI Engine application profiling, Git version control for Vitis projects, Vitis AI profiler data integration within Vitis Analyzer, and enhancements for emulation modes.
Add-on for MATLAB® and Simulink® : Unification of AMD Model Composer and System Generator for DSP. AI Engine is a new domain in Add-On for MATLAB and Simulink.

Vitis What's New by Category

Expand the sections below to learn more about the new features and enhancements in Vitis 2020.2. For information on Supported Platforms, Changed Behavior & Known Issues, please refer to Vitis 2020.2 Release Notes for Application Acceleration Flow and Embedded Software Development Flow.

Note: Vitis Accelerated Libraries are available as a separate download. They can be downloaded from GitHub or directly from within the Vitis IDE as well.

FPGA-accelerated library for HPC workloads. Initial release focuses on Seismic Imaging & Geophysics Simulation use-cases
- Reverse Time Migration (RTM) – Seismic imaging technique for accurate representation of subsurface
- High-precision Multi-layer Perceptron (MLP) - Reconstruction of subsurface properties using seismic reflection data (Seismic Inversion)
Optimized for single precision floating point data types (FP32) which is a key requirement within HPC applications
Version 1 of the library offers the following:
- L1 Stencil primitive, L1 MLP activation functions including Sigmoid, Relu, and Prelu
- L2 2D RTM forward kernel, 2D RTM backward kernels, and 3D RTM forward kernel
- L3 2D RTM APIs for supporting shot parallelism

New Functions and Features

2020.2 ISP Pipeline example design supports pixel depths up to 16 bits
Local tone mapping
Auto Exposure Correction
Quantization & Dithering
Color Correction Matrix
Black Level Correction
Lens Shading correction
Brute Force Feature Matching
Mode Filter
blobFromImage
Laplacian Operator
Distance Transform

Library Infrastructure & Other Enhancements

All library functions support Alveo U50 platform
GUI support for both Edge and Data Center platforms
Color Conversion : Supporting RGBX or fourth channel support
Line Stride support in Data Converters
Removed xf_axi_sdata.hpp file. Axiconverter functions now use the HLS ap_axi_sdata.h file instead.

Ready-to-Evaluate Apps in New AMD App Store

The following FPGA-accelerated applications, developed using the Vitis Vision library, are now available on the new AMD App Store as containers for easy evaluation and deployment on Alveo accelerator cards on the Nimbix cloud or on-premises

Image Classification using ML-inference engine from Vitis AI Library and Vitis Vision Pre Processing Function
Image Sensor Processing (ISP) Pipeline
Stereo Block Matching

Text Processing APIs: Two major APIs are included, regular expression match and geo-IP lookup. The former can be used to extract content from unstructured data such as logs, while the latter is often used when processing web logs to annotate entries with geographic information by IP address. A demo tool that converts Apache HTTP server logs in batch into a JSON file is provided with the library.
DataFrame APIs for in-memory Data Abstraction: DataFrame is widely used for in-memory data abstraction in the data analytics domain. The DataFrame write and read APIs should make it easier for data analytics kernel developers to store temporal data or interact with open-source software that uses Apache Arrow DataFrame.
Tree Ensemble Method. Random forest is extended to include regression. Gradient boost tree, based on boosting method, is added to support both classification and regression. Support for XGBoost on classification and regression is also included to exploit 2nd order derivative of loss function and regularization.
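
The log-to-JSON use case mentioned for the Text Processing APIs can be illustrated in plain software. The sketch below uses Python's standard `re` and `json` modules, not the accelerated library; the pattern, the field names, and the assumption that the input follows the Apache Common Log Format are all illustrative choices.

```python
import json
import re

# Assumed input: Apache Common Log Format, e.g.
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /a.gif HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def log_line_to_json(line: str):
    """Return the matched fields as a JSON string, or None if no match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return json.dumps(match.groupdict(), sort_keys=True)
```

Named groups keep the mapping from regular expression to JSON keys explicit, so adding a field only requires extending the pattern.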

Single-Source Shortest Path API (singleSourceShortestPath): 2020.2 version now supports the Alveo U50 platform and provides a new output ‘pred32’ for the shortest path information.
Page Rank APIs: The 2020.2 version now supports the Alveo U50 platform and includes two APIs, both named ‘pageRankTop’: one leverages a single memory channel and the other uses multi-bank memories.
Similarity APIs: Three new APIs cover different applications: ‘denseSimilarityKernel’ for dense graph applications, ‘sparseSimilarityKernel’ for sparse graph applications, and ‘generalSimilarityKernel’ for both types of applications with a single kernel.
The following APIs now support Alveo U50 platform:
- Breadth-first search API (bfs)
- Degree calculation API (calcuDegree)
- Connected component API (connectedComponents)
- CSC-to-CSR format conversion API (convertCsrCsc)
- Label propagation API (labelPropagation)
- Strongly connected component API (stronglyConnectedComponents)
- Triangle count API (triangleCount)
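
As an illustration of what a predecessor-style shortest-path output conveys, the sketch below runs a plain software Dijkstra over a CSR graph and returns both distances and a predecessor array. It is not the library's kernel or host API. The function names, the CSR argument names, and the assumption that an output like ‘pred32’ holds one predecessor vertex per vertex are hypothetical.

```python
import heapq


def sssp_with_predecessors(offsets, columns, weights, source):
    """Dijkstra on a CSR graph; returns (dist, pred) with pred[v] == -1 if unset."""
    n = len(offsets) - 1
    dist = [float("inf")] * n
    pred = [-1] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for e in range(offsets[u], offsets[u + 1]):
            v = columns[e]
            nd = d + weights[e]
            if nd < dist[v]:
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


def rebuild_path(pred, source, target):
    """Walk the predecessor array backwards; returns [] if target is unreachable."""
    path = [target]
    while path[-1] != source:
        p = pred[path[-1]]
        if p == -1:
            return []
        path.append(p)
    path.reverse()
    return path
```

With a predecessor array, the host can recover any shortest path in time proportional to its length, without storing whole paths per vertex.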

New L2 GEMM Kernel
For FP32 data types, the L3 GEMM performance has been improved from 280 GFLOPS to 340 GFLOPS

Introduced an FP32 L2 CSCMV kernel (sparse matrix-vector multiplication for matrices in CSC, Compressed Sparse Column, format) that uses 16 HBM channels on the Alveo U280 accelerator card.
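
To make the CSC operation concrete, here is a minimal software reference for sparse matrix-vector multiplication with a CSC matrix. It is a sketch of the arithmetic only, not the L2 kernel; the function and argument names are illustrative.

```python
def csc_matvec(n_rows, col_ptr, row_idx, values, x):
    """Compute y = A @ x where A is stored in Compressed Sparse Column form.

    col_ptr[j]..col_ptr[j+1] delimits the non-zeros of column j;
    row_idx and values hold their row indices and values.
    """
    y = [0.0] * n_rows
    for j in range(len(col_ptr) - 1):
        xj = x[j]
        for k in range(col_ptr[j], col_ptr[j + 1]):
            # Each non-zero A[i, j] contributes A[i, j] * x[j] to y[i].
            y[row_idx[k]] += values[k] * xj
    return y
```

Unlike the CSR form, CSC scatters into `y` column by column, so each element of `x` is read once while `y` is updated at irregular positions.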

The 2020.2 release brings major enhancements and updates to the General Query Engine (GQE) kernel design, and brand-new Level 3 APIs for JOIN and GROUP-BY AGGREGATE.
- Columns as Input Buffers: The GQE kernels treat each column as an input buffer, simplifying data preparation in the host code. Additionally, allocating multiple buffers on the host side reduces out-of-memory issues compared to big contiguous memory allocations, especially when the server is under heavy load.
- Command Classes for Generating Configuration Bits: The L2 layer now provides command classes to generate the configuration bits for GQE kernels. Developers no longer have to dive into the bitmap table to understand which bit(s) to toggle to enable or disable a function in the GQE pipeline. As a result, the host code can be more maintainable and less error-prone.
- New Level-3 APIs: New experimental L3 APIs for JOIN and GROUP-BY AGGREGATE are built to scale the problem size that GQE can handle. They can break the tables down into parts based on hash and call the GQE kernels over multiple rounds in a well-scheduled fashion. The execution strategy is separated from the execution itself, so database experts can fine-tune the strategy based on table statistics without touching the OpenCL execution part.
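
The idea of breaking tables into parts by hash and joining them over several rounds can be sketched in plain Python. This is an illustration of the general technique only, not the L3 API; all function names, the row representation (dictionaries), and the partition count are assumptions.

```python
from collections import defaultdict


def partition_by_hash(rows, key, num_parts):
    """Split rows so that equal keys always land in the same partition."""
    parts = [[] for _ in range(num_parts)]
    for row in rows:
        parts[hash(row[key]) % num_parts].append(row)
    return parts


def hash_join(left, right, key):
    """Classic build/probe hash join on one partition pair."""
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    return [{**l, **r} for r in right for l in table.get(r[key], [])]


def partitioned_join(left, right, key, num_parts):
    """Join partition i of left with partition i of right, one round each."""
    out = []
    left_parts = partition_by_hash(left, key, num_parts)
    right_parts = partition_by_hash(right, key, num_parts)
    for lp, rp in zip(left_parts, right_parts):
        out.extend(hash_join(lp, rp, key))
    return out
```

Because matching keys hash to the same partition index, each round only needs its own pair of partitions in memory, which is what lets the problem size exceed what a single pass can hold.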

LIBZ Library Acceleration using Alveo U50
- Seamless acceleration of libz standard APIs: deflate, compress2 and uncompress
- Ready-to-use libz.so library to accelerate any host code without any code change
- xzlib standalone executable for both gzip/zlib compress & decompress
ZSTD Decompression: New implementation of the Facebook ZSTD algorithm is available.
Snappy Dual Core Kernel: New implementation of the Google Snappy dual-core decompression algorithm achieves a 2x throughput improvement for single-file decompression.
GZIP Compress Kernel: New GZIP quad-core compress kernel (with built-in LZ77, TreeGen, and Huffman encoder) is available, with more than 20% reduction in overall resources and 50% reduction in DDR bandwidth requirement.
GZIP Compress Streaming Kernel: Fully standard-compliant GZIP implementation (including header and footer) is available as streaming, free-running kernels.
GZIP/ZLIB L3 Application on Alveo U50: The GZIP/ZLIB application is available as an L3 API, optimized for Alveo U50 (HBM) and Alveo U250 cards. A single FPGA binary (xclbin) supports both zlib and gzip formats for compress and uncompress.
Support for Alveo U50: Library functions (LZ4, Snappy, GZIP, ZLIB) ported to support the Alveo U50 platform.
Low Latency GZIP/ZLIB Decompress : Initial decompression latency reduced from 5K to 2.5K for 4KB/8KB/16KB block sizes
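
For reference, the standard software behavior of the libz calls named above (deflate, compress2, uncompress) and the difference between the zlib and gzip container formats can be exercised through Python's `zlib` module, which wraps the CPU libz. The sketch below involves no FPGA acceleration; the helper names are illustrative.

```python
import zlib


def roundtrip_zlib(data: bytes, level: int = 6) -> bytes:
    """One-shot compress and decompress, analogous to compress2/uncompress."""
    return zlib.decompress(zlib.compress(data, level))


def compress_stream(chunks, wbits=zlib.MAX_WBITS):
    """Streaming deflate over several chunks.

    wbits=15 (MAX_WBITS) emits a zlib container; wbits=31 emits a gzip
    container with its own header and trailer around the same deflate data.
    """
    comp = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=wbits)
    out = [comp.compress(chunk) for chunk in chunks]
    out.append(comp.flush())
    return b"".join(out)
```

Passing `wbits=47` to `zlib.decompress` lets the decoder detect either container automatically, a software analogue of a single decompressor that handles both formats.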

APIs revised to fully support Vitis HLS compiler

New Signature Generation and Verification Algorithms: DSA, ECC, ECDSA(secp256k1) and EdDSA(ed25519)
New Checksum Algorithms: Adler32 and CRC32.
Verifiable delay function (VDF) evaluation and verification: Pietrzak's VDF and Wesolowski's VDF.
Commercial Cryptography constituted by CAS: SM2, SM3 and SM4.
Stream Cipher: XChaCha20.
Optimization on RSA, GMAC, AES-GCM and SHA3 to improve their performance and resource utilization.
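
The two checksum algorithms listed above are simple enough to state in a few lines. The following is a bit-level software reference, checked against Python's `zlib` module; it is a sketch of the standard definitions, not the library's hardware implementation, and the function names are illustrative.

```python
import zlib


def adler32(data: bytes) -> int:
    """Adler-32: two running sums modulo 65521, packed as (b << 16) | a."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a


def crc32(data: bytes) -> int:
    """CRC-32 (reflected polynomial 0xEDB88320), processed one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

Production implementations typically replace the inner bit loop with table lookups or wider parallel logic; the result is the same.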

Argument parser (Beta): Parses the options and flags passed from command line and offers automatic help information generation enabling developers to create unified experience on test cases and user applications.
FIFO multiplexer: This module wraps around a FIFO (implemented through hls::stream in kernel code) to enable passing data of different types through the same hardware resource. When the data is wider than the FIFO, it is automatically transferred over multiple cycles. This module is expected to make dataflow code more compact and readable.
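
The multi-cycle transfer of wide data described for the FIFO multiplexer can be modeled in software as splitting a value into FIFO-width words and reassembling them. This is a behavioral sketch only; the function names and the least-significant-word-first ordering are assumptions, since the source does not state the ordering the module uses.

```python
def split_into_beats(value: int, value_bits: int, fifo_bits: int):
    """Split a value_bits-wide value into FIFO-width words, low word first."""
    n_beats = -(-value_bits // fifo_bits)  # ceiling division
    mask = (1 << fifo_bits) - 1
    return [(value >> (i * fifo_bits)) & mask for i in range(n_beats)]


def join_beats(beats, fifo_bits: int) -> int:
    """Reassemble the original value from words produced by split_into_beats."""
    value = 0
    for i, beat in enumerate(beats):
        value |= beat << (i * fifo_bits)
    return value
```

A 40-bit value through a 16-bit FIFO takes three words, while any value that already fits takes one, so narrow and wide types can share the same channel.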

ADF: Adaptive Data Flow

Compiler:
- Event tracing on PLIO or GMIO
- Event tracing also on Hardware
- Heat Map generation: %utilization of all AI Engines
- Supports different PL frequencies for PL kernels and PLIOs
Vitis IDE for AI Engines
- Pipeline view
- Vector register view
- Internal memory views: East, West, North, South
- External memory

Vitis HLS replaces Vivado HLS in Vivado (it was already default for Vitis and C based kernel compilation in 2020.1)
- Adds array reshape and partitioning pragmas for top function ports
The tool is now installed in its own directory ./Vitis_HLS/2020.2 alongside Vitis and Vivado
HLS design migration information has been updated in UG1391
The Vitis HLS user guide is UG1399; the full content is also available in HTML
Updated design examples are on GitHub; they can also be loaded automatically from the Vitis HLS GUI (from the "Git Repositories" sub-window) for direct access
Support for SIMD programming
Support for on-chip block RAM ECC flags via the bind_storage pragma (Vivado flow only) to monitor error correction logic generated by the RAM blocks
GUI has a simplified toolbar icon layout, new reporting sections for interfaces and AXI4 including bursts
Non-default options can be filtered for quick review in "Solution Settings"→"General" then "Show only non-defaults" tick mark
User can create and open a project in the GUI directly starting from Tcl using the -p option and passing the Tcl file as an argument: vitis_hls -p <file>.tcl
Interactive FIFO depth sizing in GUI
Constrained random testing for AXI interfaces now visible in the GUI

Versal Only Features

Vitis HLS now infers the dedicated single clock cycle accumulation for floating point (adder or multiplier) of the DSP58 block to implement efficient high throughput accumulation
Timing libraries updated for Versal production target devices

Improved RTL-Kernel Integration: Enhancements for packaging & integrating RTL IPs as kernels within Vitis applications, including support for user-managed RTL kernels (not controlled by XRT APIs) and improvements to IP Packager within Vivado to support this flow.
Multiple Implementation Strategies for Timing Closure: Vitis compiler & linker (v++) now supports launching & running multiple Vivado implementation strategies at the same time during hardware builds. This enables users to explore & assess all results and select the best strategy for final FPGA binary (xclbin) creation.

Versal Only Features

In 2020.2, as long as the hardware design stays the same, aiecompiler only recompiles and updates the software when the AIE program is modified. The v++ linking stage is not re-run; the flow goes directly to the package step. This allows users to iterate quickly on the AIE program after the hardware has been fixed.
System Level template will be provided which includes AIE, PL and PS design files.
AIE tools features integrated into Vitis IDE, such as displaying pipeline information, storage view, parallel compilation etc.

Version Control for Vitis Projects: Integration with Git version control for Vitis Projects enables collaboration across multiple developers and teams.
Improvements to Project Hierarchy: Acceleration kernel and host applications are now separate projects under top-level System Project enabling a user to compile the host application and hardware kernels separately.
Improvements to Board Support Package (BSP) Build Times: For platform projects with standalone domains, the Board Support Package (BSP) drivers compile in parallel to speed up application build time.
Ease-of-Use for Host Application Debug: Processing System registers can now be exported as a file from the Vitis GUI for debug.
Profiling System Projects: Top-level System Projects now offer more control over specifying profiling features via the Vitis GUI for the Vitis application acceleration flows.

Improved Support for Platform Creation with Hardware Emulation: In addition to the Block Diagram as the top level, Hardware emulation mode now also supports RTL sources in the platform as the top module, or RTL referenced inside the block diagram without packaging. You can add an RTL testbench as in Vivado. This offers more flexibility for validating designs before deployment.
Save Signals during Emulation for Debug: Save signals to Xilinx Simulator (XSIM) waveform file during emulation. User can pass -wcfg-file-path to launch_hw_emu.sh when rerunning hardware emulation.
Emulation Support for Slave Bridge Feature (Alveo Platforms) : Please refer to the Alveo Platform Documentation for more details on Slave Bridge features.
Python/C++ APIs for emulating AXI Stream IOs : Mimic data streaming through IO ports on platform using simple Python or C++ APIs while emulating AXI Stream kernels enabling you to emulate and debug complete system with programmed traffic patterns much earlier in the design cycle
Questa Simulator support for U250 Alveo Platform: In addition to the Xilinx Simulator (XSIM), hardware emulation in Vitis for U250 Alveo platforms now also supports Questa. Setup is done via V++ configuration files or Vitis IDE.
HLS kernel deadlock detection: Deadlock or livelock code in HLS kernel can be detected during hardware emulation by compiling HLS kernel with v++ config param=compiler.deadlockDetection=true

Versal Only Features

3rd party simulator support ( Questa, Xcelium, VCS) : In addition to the Xilinx Simulator (XSIM), hardware emulation in Vitis for Versal embedded platforms now also supports 3rd party simulators like Questa and Xcelium on Linux. VCS is supported in Early Access stage. Setup is done via V++ configuration files or Vitis IDE.

Vitis AI Profiler Data Integration: For applications that use the Deep Learning Processing Unit (DPU) for AI inference, you can access Vitis AI profiler information including DPU throughput, DDR read/write rates and timeline trace information within Vitis Analyzer to assess end-to-end application acceleration.
View Package Summary Report: View the Package Summary Report within Vitis Analyzer for an overall view of application’s status from a performance and optimization perspective. The package summary is created by v++ command after linking to build a package that can be run for software or hardware emulation or can be booted and run on the hardware device.
Integrated Host & Kernel Profiling: Vitis 2020.2 adds the capability to provide user event API profiling. Beyond the profiling capabilities inherently available for accelerated kernels, you can call Xilinx Runtime Library (XRT) APIs in your host code to profile arbitrary sections of the design and make decisions on overall application performance optimization.
Other Enhancements: Global Search across all reports accessible within Vitis Analyzer, flexibility to save/restore custom user layouts for viewing performance reports, Intuitive grouping of guidance messages to view related information in one place, Improvements to utilization reports enabling visibility into statistics on a per Super Logic Region (SLR) basis for deeper insight.

Versal Only Features

The profile summary report has a dedicated AIE design entry. More AIE-related data is shown in the compile/run summary reports, such as the AIE heat map, which displays the kernel active/stall cycles running on hardware.

Improved Visibility for Debug: AXI-S Transaction-level view available in the Xilinx Simulator (XSIM) Transaction Viewer for System-C portions of hardware emulation designs, providing better visibility into the design at a transaction level for debug.
View FIFO Status in Live Waveform Viewer: The status of user-level FIFOs (denoted as hls::stream in kernel code) can be viewed in the Live Waveform Viewer during Hardware Emulation, providing visibility into static FIFO depths, FIFO elements, and FIFO usage to identify performance bottlenecks for acceleration kernels.

Versal Only Features

Event trace enhancements: Vitis 2020.2 incorporates a couple of enhancements to the AIE event trace features, such as support for offloading by XRT, enhanced support for multiple trace streams, and the ability to monitor the PL/AIE boundary even when the PL kernel is defined in the graph. In addition, the PL, PS, and AIE event traces are combined into a common timeline to provide better visualization of the whole design.

Note: Xilinx Runtime Library (XRT) is available as a separate download. Please refer to the Getting Started information for download and install instructions.

Improved Support for HBM-enabled Platforms: Leverage the benefits of high-bandwidth memory (HBM) enabled platforms by specifying kernel port connections to HBM banks through v++ --sp HBM[#:#]. Xilinx Runtime Library (XRT) APIs can also automatically assign the HBM banks and enable the host application to allocate arbitrary-sized buffers spanning one or more HBM segments (256MB+) (on HBM segment bounds).
Next Generation AMD Board Management Utilities (Preview): Next generation AMD Board Management utilities (xbutil, xbmgmt) are available for preview. They can enable the Slave Bridge and DDR retention features for AMD platforms that support them. Note: Current generation of board management utilities will be moved to maintenance mode in 2021.1 & new features will only be added to next generation utilities.

Versal Only Features

AIE support is added, covering RTP, error handling, full array reconfiguration, and the graph API.

Access the latest Vitis Target Platforms for Alveo Accelerator cards from the Alveo Packages Download Tab

Please refer to UG1120 - Alveo Data Center Accelerator Card Platforms User Guide for more details and to keep up-to-date on the latest Vitis Target Platform releases, as they become available

U200/U250 XDMA Platforms

Alveo Platform U200 XDMA 2RP - Production
- Features: ERT, CMC, PLRAM, DRM capable floorplan, XDMA, 2RP, P2P, M2M, GT Kernel, PCIe Slave Bridge, DDR Self-Refresh
Alveo Platform U250 XDMA 2RP - Production
- Features: ERT, CMC, PLRAM, DRM capable floorplan, XDMA, 2RP, P2P, M2M, GT Kernel, PCIe Slave Bridge, DDR Self-Refresh

Shell Upgrade DFX - 2RP ( 2 Reconfiguration Partitions)

Small size of static region: Base
- PCIe functionality
- In-band FPGA partial reconfiguration
New reconfiguration partition: Shell
- Update DMA and utility functions
- Dynamic swapping between platforms without rebooting the server
2nd reconfiguration partition: User Logic
- Accelerator kernel functions

AXI Slave Bridge

Direct host memory access by the kernel
DMA bypass capability with a 512-bit AXI-Slave interface; users can provide their own data mover

Data Retention - DDR4 self-refresh

Data context retained in FPGA memory using DDR4 self-refresh during reconfiguration
Eliminates copying to host RAM as a temporary storage for different XCLBINs
Minimizes movement of large data sets

Note: Vitis Target Platforms for Embedded Platforms (including pre-built linux kernels, root file system and sysroot) are available as a separate download on Vitis Embedded Platforms Tab

ZYNQ 7000 and ZYNQ UltraScale+ MPSoC base platform functions are kept the same, but the platform source code has been restructured. Directories are renamed for easier understanding, and common source files across multiple platforms are grouped together. This makes it easier to reuse the platform source code and port it to a new platform.
When building a platform from source code, in addition to compiling PetaLinux from scratch, a new end-to-end build method is available for users who have downloaded the common software components. Users can point to those components and skip the PetaLinux compilation when building a platform.

The VCK190 platform has flexible DDR + LPDDR memory subsystem and supports 63 interrupts for acceleration kernels. It is available for use with the Vitis core development kit, for both application acceleration and embedded processor software development, as described in Versal AI Engine Programmers Guide (UG1076). The platform enables development of designs that include:

AI Engine graphs and kernels
Programmable Logic kernels
Host application targeting Linux or a bare-metal OS running on the Arm processor in the Versal device.
Please refer to Getting Started with Vitis and Versal ACAP platforms to learn more.

Support for Kubernetes (K8s) clusters: Xilinx FPGA Resource Manager (XRM) can now be used together with Kubernetes to run and manage compute units (CUs) across a pool of multiple Alveo accelerator cards attached to a server, and to scale applications to multiple servers with Alveo cards.

2020.1

What's New in 2020.1

Function	Namespace
Single Channel FFT/iFFT	dsplib::fft::fft_ifft_dit_1ch_graph

Function	Namespace
Matrix Mult / GeMM	dsplib::blas::matrix_mult::matrix_mult_graph

Function	Namespace
Stream to Window / Window to Stream	dsplib::widget::api_cast::widget_api_cast_graph
Real to Complex / Complex to Real	dsplib::widget::real2complex::widget_real2complex_graph
