leveraging data-mover ips for data movement in zynq-7000 ap … · 2019-10-10 · wp459 (v1.0)...
TRANSCRIPT
WP459 (v1.0) January 13, 2015 www.xilinx.com 1
© Copyright 2015 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and used under license. All other trademarks are the property of their respective owners.
Moving large quantities of data, both off-chip and on-chip, requires careful selection of the interface technology best suited to the task. Data-mover IPs can help improve performance.
White Paper: Zynq-7000 AP SoC
WP459 (v1.0) January 13, 2015
Leveraging Data-Mover IPs for Data Movement in
Zynq-7000 AP SoC SystemsBy: Srikanth Erusalagandi
ABSTRACT
System-on-a-Chip (SoC) designs face increasingly aggressive power and performance system requirements. Systems designers are challenged to f ind the right architecture for their SoCs that satisfy these requirements. One of the most common operations in SoC-based design is the eff icient implementation of data transfer functions, which can be critical to low-power or high-performance systems.
This white paper deals with some of the challenges that customers face while moving data in SoCs. Comprised of both a Processing System (PS) and Programmable Logic (PL), the Zynq®-7000 AP SoC architecture can leverage data-mover IP for eff icient data movement within the PS, between the PS and PL, and within PL.
This white paper also describes some of the reference designs associated with data-mover IPs, guiding the designer to the best reference design for the envisioned application. This can help the designer reduce development time, associated design risks, and improve performance.
WP459 (v1.0) January 13, 2015 www.xilinx.com 2
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Data Movement ChallengesFigure 1 illustrates the paths described in this section that impact data movement performance.
Data movement in a System-on-a-Chip (SoC) can be on-chip from one functional block to another, or off-chip using a standard interface. Data transfers on-chip are normally done between standard peripherals or memory interfaces.
The key challenge a designer faces is f inding a suitable data transfer method from a peripheral to a memory subsystem. While small on-chip data movement can be accomplished using software instructions, large data transfers can be done more eff iciently using special data transfer resources.
Many SoCs use Direct Memory Access (DMA) controllers for this purpose, minimizing CPU overhead and increasing system performance. Many high-speed peripherals on SoCs (USB, Ethernet, etc.) include their own dedicated DMA engines that operate independently from the CPU. The CPU is involved only in setting up the transfer parameters and is interrupted when the transfer is completed. The CPU can perform other tasks while the data is being transferred, or it can be put into a low-power mode.
It is important that each peripheral on the SoC has the right bus protocol and data widths to optimally use the on-chip bus structures that minimize silicon infrastructure while supporting high performance, low power on-chip communications.
Another key element that impacts data movement performance is the on-chip bus structure that connects the peripheral to the memory subsystem. Today's SoCs also include advanced on-chip bus structures that help support multiple data transfers to occur simultaneously. Simultaneous data transfers not only enable higher data transfer bandwidths, but also enable lower power consumption when compared to doing sequential data transfers.
X-Ref Target - Figure 1
Figure 1: Logical Diagram of Data Movement on an SoC
WP459_01_101014
Latency
Data Path
MemoryController
Latency
SOC
Cache
PeripheralInterconnect
CPU
Memory
PeripheralData
TransferResource
Latency
Latency
WP459 (v1.0) January 13, 2015 www.xilinx.com 3
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Apart from data movement, another key factor that impacts system performance is the data processing time, which indirectly affects the data movement operations in the system. Designers are looking for multi-processor solutions to increase system performance. The ideal solution involves not just multiple processors, but different compute engines for performing different tasks. Since all these engines do not work completely on their own, they need to pass data between them so each engine can perform its tasks and then hand the data set on to the next engine to perform its task. This particular data sharing mechanism mandates the data to be coherent between these multiprocessing engines. Data coherency can be managed using software, but doing so incurs the cost of executing cache maintenance routines whenever a processing engine updates the shared data. This has a direct impact on system performance.
Subsequent sections in this white paper describe how these challenges can be addressed while designing with the Zynq-7000 AP SoC.
Enable High-Performance Zynq-7000 AP SoC Systems with AXI
The Zynq-7000 AP SoC architecture comprises two sections, the Processing System (PS) and Programmable Logic (PL). The PS is an optimized silicon element consisting of a dual-core ARM® Cortex™-A9 MPCore™ with integrated peripherals. The PL enables designers to expand their hardware or accelerate the software by choosing either off-the-shelf IP or their own custom IP blocks that complement pre-built IP selections. Figure 2 illustrates the Zynq-7000 AP SoC block diagram.
X-Ref Target - Figure 2
Figure 2: Zynq-7000 AP SoC Block DiagramWP459_02_110714
ENET 1
Flash MemoryInterfaces
I/O Peripherals
SMC TimingCaluculation
ENET 0USB 1USB 0SD 1SD 0GPIO
CAN 1CAN 0L2C 1L2C 0
−
−
−−
−SP1 1SP1 0
I/OMUX(MIO)
Bank1MIO
(53-16)
Bank0MIO
(15:0)
GeneralSettings
ClockGenerationResets
32b GPAXI
MasterPorts
32b GPAXI
SlavePorts
64bAXIACPSlavePorts
High PerformanceAXI 32b/64bSlave Ports
ConfigAES/SHA
XADC0 1 2 30 1 2 3
DMABChannel
OCMInterconnect
ProgrammableLogic to Memory
Interconnect
Memory Interfaces
256 KBSRAMCoreSight
ComponentsCentral
Interconnect
TTCSWOT
−
G&C Snoop Control Unit
512 KB L2 Cache and Controller
System LevelControl Regs
ARM Cortex™A9CPU
Application Processor Unit (APU)
ARM Cortex™A9CPU
DAP
Processing System (PS)
DEVC DDR2/3, LPDOR2Controller
12 12 12 128 9 10 114 5 6 70 1 2 3
DMA Sync
DMAChannelsPS-PL
Clock PortsExtended
MIO (EMIO)
IRQ
QUAD SPINANDSPAMINOR
−
UART 1UART 0
−
Programmable Logic (PL)
WP459 (v1.0) January 13, 2015 www.xilinx.com 4
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Xilinx has adopted the ARM AMBA™ AXI3 interface to facilitate communication between the PS and a peripheral or processor subsystem on the PL.
The PS-PL interface includes the following AMBA AXI3 interfaces for primary data communication:
• Two 32-bit AXI master interfaces (M_AXI_GP)
• Two 32-bit AXI slave interfaces (S_AXI_GP)
• Four 64-bit/32-bit configurable buffered AXI slave interfaces with direct access to DDR memory and OCM, referred to as high-performance (HP) AXI ports (AXI_HP)
• One 64-bit AXI slave interface (ACP port) for coherent access to CPU memory
The General Purpose (GP) AXI ports provide a simple means to move data between the PS and the PL. Access to the PL is provided through the two GP AXI Master ports (M_AXI_GP), which each have a memory address range. From a software perspective, any instruction that performs a read or write to this memory range will originate PL AXI transactions on this interface.
The two GP AXI slave ports (S_AXI_GP) allow a PL AXI master to access the PS memory or any of the PS peripherals on the PS subsystem.
The GP AXI master and slave ports are the low-bandwidth interfaces contributed by the 32-bit data width and the central interconnect switch in the PS subsystem that adds latency to a PS or PL peripheral access.
The two highest-performance interfaces between the PS and the PL for data transfer are the high-performance (HP) AXI ports and the ACP interface.
The high-performance AXI (AXI_HP) ports provide high-bandwidth PL slave interface to the OCM and DDR3 memories of the PS. A data-intensive PL master should be connected to these HP interfaces to achieve higher throughputs. Each AXI_HP contains control and data FIFOs to provide buffering of transactions for larger sets of bursts, making it ideal for workloads such as video frame buffering in DDR. This additional logic and arbitration does result in higher minimum latency than other interfaces. Any data that is transferred by the PL master to the DDR is not coherent with the Cortex-A9 CPUs and requires software instructions to make the data coherent with the processor.
The AXI ACP interface provides a similar user IP topology as the HP (AXI_HP) ports. The ACP differs from the HP performance ports due to its connectivity inside the PS. The ACP connects to the CPU L2 cache, allowing ACP transactions to interact with the cache subsystems. This potentially decreases the total latency for data to be consumed by a CPU. These optionally cache-coherent operations can obviate the need to invalidate and flush cache lines, improving application performance.
WP459 (v1.0) January 13, 2015 www.xilinx.com 5
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Choosing the Right IP Interface for Optimal Power and PerformanceXilinx has adopted AXI version 4 protocol for Intellectual Property (IP) cores that enable Plug-and-Play IP to use high-performance, device-optimized, on-chip interconnects. This is described in detail in Vivado Design Suite: AXI Reference Guide. [Ref 1]
AXI4 supports variable data widths up to 256 bits, enabling large data movements in burst modes. The AXI4 protocol supports three kinds of interfaces optimized for different application requirements.
• AXI4-Full — The AXI-Full interface is a high-performance, memory-mapped interface that allows bursts of up to 256 data transfer cycles with just a single address phase. This interface is recommended for data-intensive tasks that support both single-beat transfers for simple register read/write operations and burst transfers for large memory transactions.
• AXI4-Lite — The AXI4-Lite interface is a subset of the AXI4 protocol intended for communication with simpler, smaller control-register style interfaces in components. The AXI4-Lite interface supports single-beat transfers, ideal for writing control information or reading status information to/from an IP in the PL. Though the AXI4-Full interface supports single-beat data, the AXI4-Lite interface is more power-efficient because of its lower pin count, and as a result it consumes less routing and logic resources.
• AXI4-Stream — The AXI4-Stream interface provides point-to-point high-speed streaming data from a high-speed interface that supports unlimited data burst sizes. This interface is recommended for transferring high-bandwidth streaming data like video.
The Create or Package IP Wizard in the Vivado® Design Suite provides templates for AXI4-Full, AXI4-Lite, and AXI-Stream interfaces, facilitating development of custom hardware IP. Designers can modify these templates to add their own custom logic, then package the design as an IP and add it back to the Vivado IP Integrator tool’s block design.
More information on the Create or Package IP Wizard is available in Chapter 4 of the “Migrating to AXI4 Protocols” section of the Vivado Design Suite: AXI Reference Guide. [Ref 1]
Leverage Data Mover and AXI IP to Obtain Ideal ResultsAfter an interface has been chosen, the next step is to find a suitable data-mover IP that can be used to transfer the data from the peripheral to the system memory. The Xilinx IP Catalog contains and describes all the supported IP infrastructure that enables connections to the PS subsystem.
Subsequent sections in this white paper describe the recommended interface on the PS and the corresponding data-mover and AXI infrastructure IPs for moving the data between the PS and PL to achieve the best performance. Also illustrated are some key reference designs that can be used as templates for new designs.
Understanding the breadth and depth of the surrounding ecosystem support can help reduce development time and associated design risks.
WP459 (v1.0) January 13, 2015 www.xilinx.com 6
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Basic OverviewIn its most basic form, an AXI4 interface connects one master device to one slave device. For device and system architectures that require more than one master or more than one slave, Xilinx provides AXI interconnect and infrastructure IPs to manage this type of communication requirement.
The PS–PL interface is based on AXI Version 3 protocol. AXI interconnect IP is the key infrastructure that performs protocol conversion and enables data movement between the PS and PL interfaces.
When connecting one master to one slave, the AXI interconnect can optionally perform address range checking. It can also perform any of the normal data width, clock-rate, pipelining, and protocol conversions. It also has built-in data-width conversion enabling the designer to easily hook up a 32-bit interface to a 64-bit HP port and ACP port of the optionally has built-in data FIFOs to manage the data rates to or from the PS and PL.
Xilinx's AXI4 Interconnect is also highly configurable to allow users to balance and tune for their design goal, e.g., area, timing, throughput, and latency. The Xilinx white paper Maximize System Performance Using Xilinx Based AXI4 Interconnects [Ref 2] provides recommendations on how to utilize the AXI interconnect IP for better performance.
Simple Register Read/WriteOne of the most basic communication or data movements between the PS and PL is a simple read or write operation to a register available in the PL IP. These simple read/write transactions can be enabled via providing an AXI4-Lite or AXI4 interface on the IP. The AXI4 IP can be connected to GP AXI ports via the AXI Interconnect Infrastructure IP.
Most of the IPs in the Xilinx IP catalog have either an AXI4-Full or an AXI4-Lite interface to facilitate plug and play infrastructure capability using the Vivado IP Integrator tool. The IP registers can be accessed from the PS via the GP AXI ports, as shown in Figure 4.
X-Ref Target - Figure 3
Figure 3: AXI Interconnect UsageWP459_03_120114
M_AXI_GP0 M_AXI_GP1 S_AXI_GP0/GP1 HP1/HP2/HP3HP0 ACP
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI-3 Interface
AXI-4 Interface
AXI-364-Bit Interface
AXI-432-Bit Interface
AXI Interconnect
Custom IP1
AXI-364-Bit Interface
AXI-464-Bit Interface
AXI Interconnect
Custom IP2
AXI Interconnect
Custom IP3
WP459 (v1.0) January 13, 2015 www.xilinx.com 7
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
A Xilinx application note, System Monitoring using the Zynq-7000 AP SoC Processing System with the XADC AXI Interface[Ref 3], demonstrates a simple AXI-Lite register read/write interface. In this application note, a simple AXI-XADC IP is connected to the Master GP AXI Port 0 of the PS (Figure 5). The AXI4-Lite interface is used to configure the XADC for system monitoring applications.
For more information on this design, refer to the application note. [Ref 3]
X-Ref Target - Figure 4
Figure 4: AXI4-Lite Simple Register Read Write
X-Ref Target - Figure 5
Figure 5: System Monitoring Application using the Zynq-7000 AP SoC PS with the XADC AXI Interface
WP459_04_120114
M_AXI_GP0 M_AXI_GP1 S_AXI_GP0/GP1 HP2/HP3HP0/HP1 ACP
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI-3 Interface
Control/Status Path
AXI Interconnect
Register Bank
Control Logic
WP459_05_103014
DDR3Memory
GP0 IRQ_F2P
32-Bit at 100 MHz
Ethernet
External PC
Alarm Interrupts
Ethernet
Programmable Logic
Processing System
DDR Controller
Zynq-7000 AP SoC Processing System
XADC Wizard IP
WP459 (v1.0) January 13, 2015 www.xilinx.com 8
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
AXI4-Memory Map to Streaming FIFOThe AXI4-Stream FIFO core converts AXI4/AXI4-Lite transactions to and from AXI4-Stream transactions, and can be used in applications that use packet communication.
The AXI4-Stream FIFO has three AXI4-Stream interfaces: one for transmit data, one for transmit control, and one for receive data. The AXI4-Stream transmit control interface can support the transmit protocol of AXI Ethernet cores.
The AXI4-Stream FIFO can be operated like a micro-DMA. This enables the core to be used to interface to AXI4-Stream IPs, similar to the LogiCORE™ IP AXI Ethernet core, without requiring a full DMA solution.
The registers of the AXI4-Stream FIFO can be accessed from the AXI4-Lite interface as well as the AXI4-Full interface. For data-intensive applications, it is recommended that:
• The AXI4-Stream FIFO register and datapaths be isolated
• AXI4-Lite be used for register access
AXI4-Full be used for the datapath. See Figure 6 for a block diagram of this implementation.
X-Ref Target - Figure 6
Figure 6: AXI Streaming FIFO Block Diagram
AXI-Stream FIFO
AXI4-Lite Interface (32-Bit)
Slave
Interrupt Controller
AXI4-Lite Interface (32 or 64 Bit)
Slave
Register Space
Transmit FIFO
Transmit Control
32/64-BitAXI_STR_TxC
32/64-BitAXI_STR_TxD
Receive FIFO
Receive Control
32/64-BitAXI_STR_RxD
Data Interface Option
AXI4-Lite Datapath AXI4 Datapath
AXI4-Lite (32-Bit)
AXI4(32/64Bit)
WP459_06_102914
WP459 (v1.0) January 13, 2015 www.xilinx.com 9
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Figure 7 illustrates the recommended interconnection method for the AXI4-Stream FIFO.
The AXI4-Stream FIFO IP is recommended for medium-performance solutions, especially where light-weight packet transactions are required.
For more details on AXI4-Stream FIFO, see LogiCORE IP AXI4-Stream FIFO Product Guide[Ref 4].
AXI DMAOne of the key data-mover IPs in the Xilinx IP catalog is the AXI DMA IP core. The AXI Direct Memory Access (AXI DMA) IP provides high-bandwidth direct memory access between the AXI4 memory-mapped and AXI4-Stream IP interfaces. Designers who choose to use the DMA IP core should verify that they have the AXI-Stream interfaces for their IP.
The AXI DMA controller, moving primary high-speed DMA data between system memory and the stream target, routes through the AXI4 memory-mapped-to-stream master (MM2S); the AXI4 stream routes to the memory-mapped slave interface (S2MM) channels. These channels operate independently. The MM2S channel supports an additional AXI control stream for sending user application data to the target IP; an additional AXI status stream is provided for the S2MM channel for receiving user application data from the target IP. The DMA registers, accessed through the core’s AXI4-Lite slave interface, should be connected to the PS GP master ports.
The optional scatter/gather port reads descriptors from system memory; it can be connected either to the GP slave ports or to HP ports.
This core block diagram is shown in Figure 8. For more information on AXI DMA, refer to the LogiCORE IP AXI-DMA Product Guide[Ref 5].
X-Ref Target - Figure 7
Figure 7: Recommended Connection to AXI4-Stream FIFOWP459_07_111814
MAXI GP0 HP/ACP port
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI-4 LiteControl/Status
AXI-4 Streaming Control
AXI-4 Streaming Data
AXI-4 Streaming DataCustom IP AXI Streaming
FIFO
AXI-364-Bit Interface
AXI Interconnect AXI Interconnect
WP459 (v1.0) January 13, 2015 www.xilinx.com 10
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
When the AXI DMA core is used in a block design, the memory-mapped master port is connected to either of the PS AXI slave port interfaces. The slave interfaces of the PS include slave GP ports, HP ports, and the ACP port.
The slave GP ports provide access to PS peripherals and memory, but they have high latency and provide lower performance because of the data width and central interconnect delays. The PS ACP and high-performance AXI ports provide direct access to the PS memory, and provide lower latency than the GP ports. Therefore, it is recommended that the AXI DMA core memory-mapped master ports be connected to the ACP or to HP ports on the PS to achieve higher throughputs. See Figure 9.
X-Ref Target - Figure 8
Figure 8: AXI DMA Block Diagram
X-Ref Target - Figure 9
Figure 9: Recommended AXI DMA Connections to Processing System
DataMoverAXI4 Stream Master (MM2S)AXI4 Memory Map Read
MM2S Cntl/Sts LogicAXI4 Control Stream (MM2S)
Scatter/Gather
AXI4 Memory Map Write/ReadRegisters
S2MM Cntl/Sts LogicAXI4 Stream (S2MM)
DataMoverAXI4 Stream Slave (S2MM)
AXI4 -Lite
AXI4 Memory Map Write
WP459_08_111814
WP459_09_111714
HP/ACPport
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI-4 FullScatter/GatherInterface
Custom IP AXI DMA
AXI-364-Bit Interface
Descriptions inDDR Memory
AXI Interconnect
AXI-4 LiteControl/Status
AXI Interconnect AXI Interconnect
MAXIGP0
HP0/SAXI GP0
AXI-4 Stream Control
AXI-4 Stream Data
AXI-4 Stream Data
WP459 (v1.0) January 13, 2015 www.xilinx.com 11
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The AXI DMA can operate in two different modes.
1. Register Direct Mode (Simple DMA)
2. Scatter/Gather Mode
Register Direct Mode (Simple DMA)
The AXI DMA in simple DMA mode can be used especially in cases where a high-speed transfer from a peripheral in PL is required. The AXI DMA in simple DMA mode provides configuration for doing simple DMA transfers that requires less FPGA resource utilization. In this mode, all the transfers are initiated by writing to the DMA control, source or destination address, and length registers. When the transfer is completed, an interrupt is asserted for the associated channel and, if enabled, generates an interrupt out.
There are multiple reference designs from Xilinx that illustrate the usage of AXI DMA in simple DMA mode.
Implementing Analog Data Acquisition using the XADC AXI Interface
A Xilinx application note, Implementing Analog Data Acquisition using the Zynq-7000 AP SoC Processing System with XADC AXI Interface[Ref 6], demonstrates a design in which the XADC data is transferred directly to system memory using the DMA IP core instead of by simple read/write operations.
X-Ref Target - Figure 10
Figure 10: Implementing Analog Data Acquisition using the PS with XADC AXI Interface
Zynq-7000 AP SoC Processing System
GP0 GP1 UART
DDR3Memory
AXI DMA
AXI4 Memory Mapped
GUI
Adapter Block
AXI4-Stream
XADC Wizard IP
AXI4-Stream
WP459_10_111814
32-Bit at 100 MHZ32-Bit at 100 MHZ
WP459 (v1.0) January 13, 2015 www.xilinx.com 12
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
For design and software implementation details, refer to the application note and related documentation.
Zynq-7000 AP SoC Accelerator for Floating-Point Matrix Multiplication using Vivado HLS
The AXI DMA IP core[Ref 5] also becomes a key infrastructure IP where the designer chooses to accelerate key software operations to hardware using high-level synthesis techniques. If the designer must move large amounts of data from the accelerated hardware implemented in PL to system memory, then DMA can serve the purpose. A Xilinx application note, Zynq-7000 All Programmable SoC Accelerator for Floating-Point Matrix Multiplication using Vivado HLS[Ref 7], demonstrates one such use case.
In this design (see Figure 11), AXI DMA performs all the burst formatting, address generation, and scheduling of memory transactions for communicating with the floating-point matrix multiplication accelerator IP core.
Refer to the application note to learn how a software program is modif ied to include AXI streaming interface, converted to an IP core, and connected to the PS using the DMA IP core[Ref 7].
Zynq-7000 AP SoC Spectrum Analyzer Reference Design
The Zynq-7000 AP SoC Spectrum Analyzer (see Figure 12) demonstrates a single-chip reference design for data acquisition and digital signal processing implementing a spectrum analyzer to demonstrate the capability of the Zynq-7000 AP SoC on the ZC702 development board.
X-Ref Target - Figure 11
Figure 11: Accelerator for Floating-Point Matrix Multiplication using Vivado HLSWP459_11_111814
Processing System
Zynq-7000 AP SoC
Programmable Logic
ARM CPUand
L1-Caches
L2-Cache
ACP DMAAXI-Full AXI-Str
AXI-LiteTimer
HLSIP
Core
GP0
Mem
ory
Con
trol
ler
WP459 (v1.0) January 13, 2015 www.xilinx.com 13
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
In this design, one DMA IP core is used to transfer XADC samples directly to system memory via the ACP port, and another DMA IP core is used to transfer XADC samples from system memory to the Xilinx Complex FFT core to perform the complex FFT windows function.
Since the design demonstrates multiple technical references for building the spectrum analyzer design, the original design is split into multiple technical tips and can used for reference. The details and technical tips for this design are available in the technical articles section of the Xilinx Wiki page.
Scatter/Gather Mode
As described in the AXI DMA introduction, AXI DMA can be optionally configured to support the scatter/gather mode (see Figure 13). Unlike the simple DMA mode, the scatter/gather mode enables higher-performance DMA transfers at the cost of additional FPGA resource consumption.
The AXI DMA has an optional AXI4 scatter/gather read/write master interface through which the scatter/gather engine fetches and updates buffer descriptors from system memory. Unlike the simple DMA mode, the scatter/gather engine requires optional descriptors queued in the system memory. This offloads DMA management work (writing the address, length, and control registers) from the CPU, and instead fetches and updates the information automatically from the descriptor table, thus maximizing primary data throughput.
X-Ref Target - Figure 12
Figure 12: Zynq-7000 AP SoC Spectrum Analyzer Reference DesignWP459_12_111814
System Gates,DSP, RAM
1080p60Video
ProgrammableLogic:
ARM CoreSight Multi-core & Trace Debug® ™
512 KB L2 Cache
General Interrupt Controller DMA Configuration
256 KB On-Chip MemoryTimer Counters
Snoop Control Unit (SCU)
Cortex -A9 MP Core32/32 KB I/D Caches
™ ™ Cortex -A9 MP Core32/32 KB I/D Caches
™ ™
NEON / FPU Engine™ NEON / FPU Engine™
AMBA Switches®2x SPI
2x 12C
2x CAN
2x UART
GPIO
2x SDIOwith DMA
2x USBwith DMA
2x GigEwith DMA
I/OMUX
XADC
Analog Input
DDR3
AXI4 (Lite)AXI4DMA FFT
4096AXI4DMA
XylonDisplay
Controller
AXI4Inter-
connectAMBA Switches®
Processing SystemDynamic Memory Controller
DDR3, DDR2, LPDDR2Static Memory ControllerQuad-SPI, NAND, NOR
AMBA Switches®
ACP
HP
GP
WP459 (v1.0) January 13, 2015 www.xilinx.com 14
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The operation of the scatter/gather mechanism and descriptor formats is explained in the LogiCORE IP AXI DMA Product Guide (PG021).[Ref 5]
The following reference designs from Xilinx illustrate the usage of AXI-DMA in Scatter/Gather Mode.
Zynq-7000 AP SoC PL Ethernet Performance and Jumbo Frame Support
A Xilinx application note, PS and PL Ethernet Performance and Jumbo Frame Support with PL Ethernet in the Zynq-7000 AP SoC[Ref 8], demonstrates high-speed data movement between AXI Ethernet IP and system memory using the AXI DMA IP core.
The application note focuses on multiple aspects of the Ethernet peripheral usage in the Zynq-7000 AP SoC, like connecting the gigabit Ethernet MAC (GEM) through the extended multiplexed I/O (EMIO) interface with the 1000BASE-X physical interface using high-speed serial transceivers in PL.
X-Ref Target - Figure 13
Figure 13: Recommended Connections to AXI DMA CoreWP459_13_120114
HP/ACPPort
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI4 FullScatter/GatherInterface
Custom IP AXI DMA
AXI364-Bit Interface
Descriptions inDDR memory
AXI Interconnect
AXI4 LiteControl/Status
AXI Interconnect AXI Interconnect
M_AXI_GP0 HP0/S_AXI_GP0
AXI4 Stream Control
AXI4 Stream Data
AXI4 Stream Data
WP459 (v1.0) January 13, 2015 www.xilinx.com 15
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
In the PL implementation of Ethernet, the design consists of the AXI Ethernet, AXI DMA, and AXI Interconnect IP cores. The design uses the HP port for fast access to the PS-DDR memory; however, the GP slave port can also be used if the HP port is occupied with other peripherals. The application note also demonstrates the effective use of AXI-DMA control and status ports, which transfers the critical control and status information between the Ethernet MAC Controller and system memory required for the Ethernet device operation.
For design and software implementation details, refer to the application note.
AXI-Video DMAHigh-performance video systems can be created using the Zynq-7000 AP SoC and available Xilinx LogiCORE IP AXI Interface cores. AXI interconnects and AXI video DMA IP cores enable the design of video systems capable of handling multiple video streams and multiple video frame buffers, sharing a common DDR3 SDRAM via the AXI3 ports on the Zynq-7000 AP SoC.
The AXI Video Direct Memory Access (AXI VDMA) core is designed to allow for efficient high-bandwidth access between AXI4-Stream video interface and AXI4 interface that can be connected to the HP ports of the PS interface.
X-Ref Target - Figure 14
Figure 14: Zynq-7000 AP SoC Ethernet Performance and Jumbo Frame Support with PL Ethernet
APU
GIC
32-Bit GP AXI-Master 64-Bit HP AXI-Slave
PL to Memory Interconnect
Memory Interface DDR3
AXI Interconnect AXI Interconnect
AXI-DMA
AXI-Ethernet
RX TX
S2MM MM2S
125 MHz GT Reference Clock
UART
Central Interconnect
Clock Generation
Processing System
Programmable Logic
1000BASE-X Interface
1000BASE-X PCS-PMA
Si5324 SFP
75 MHz (FCLK1)
GMII 8-Bit at 125 MHz
GTX Transceiver
WP459_16_102914
32-Bit at 125 M
Hz
Datapath
WP459 (v1.0) January 13, 2015 www.xilinx.com 16
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The AXI VDMA operation is similar to the AXI-DMA configured in simple DMA mode but is optimized for video protocol. The core provides eff icient two-dimensional DMA operations with independent asynchronous read and write channel operation. Initialization, status, interrupt, and management registers are accessed through an AXI4-Lite slave interface.
Figure 15 illustrates the blocks of the AXI VDMA IP core.
For more information on the AXI-VDMA IP, refer to the LogiCORE IP AXI Video Direct Memory Product Guide[Ref 9].
The following reference designs from Xilinx illustrate the usage of AXI-VDMA Core.
High-Performance Video Systems Using IP Integrator
A Xilinx application note, Designing High-Performance Video Systems with the Zynq-7000 All Programmable SoC Using IP Integrator[Ref 10], demonstrates the AXI VDMA core’s video capability using the on Zynq-7000 SoC ZC702 development board. This design uses four AXI VDMA cores to simultaneously move eight streams (four transmit video streams and four receive video streams), each in 1920 x 1080p format at a 60 Hz refresh rate, with up to 24 data bits per pixel.
X-Ref Target - Figure 15
Figure 15: Video DMA IP Core
AXI4-Lite
AXI4-Stream
Registers
DataMover Line Buffer
Control and Status
WP459_15_101314
Logic
AXI4 Memory Map
WP459 (v1.0) January 13, 2015 www.xilinx.com 17
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The design is built using the Vivado Design Suite and the Vivado IP integrator feature. The design also includes software built using the Xilinx Software Development Kit (SDK). The complete IP integrator project and SDK tool project f iles are provided with this application note, allowing the designer to examine and rebuild this design or use the f iles as a template for starting a new design.
Zynq-7000 AP SoC Base Targeted Reference Design
The Zynq Base Targeted Reference Design[Ref 11], also built on the same principle, demonstrates the AXI VDMA usage and its connections. The Base TRD consists of two processing elements: the Zynq-7000 AP SoC PS and a video accelerator built in PL.
X-Ref Target - Figure 16
Figure 16: Designing High-Performance Video Systems Using the Vivado IP Integrator
USB
Clock Generation Reset SWDT
TTC FPU and NEON Engine
I/O Peripherals
USBGigEGigE
SD SDIO
SD SDIOGPIOUARTUARTCANCANI2CI2CSPISPI
SRAM/NOR
Memory Interfaces
ONFI 1.0 NANDQSPI CTRL
Central Interconnect
System Level
Control Registers
MMU ARM Cortex A9 CPU
32 KBI-Cache
32 KBD-Cache
FPU and NEON Engine
MMU ARM Cortex A9 CPU
32 KBI-Cache
32 KBD-Cache
DMA 8 Channel
Snoop Controllers, AWDT, Timer
512 KB L2 Cache & Controller
OCMInterconnect
GIC
256K SRAM
Application Processor Unit
DDR2/3, LPDDR2 Controller
Memory Interfaces
CoreSight Components
Programmable Logic to Memory Interconnect
DAP
DevC
AXIVDMA
AXIVDMA
AXIVDMA
AXIVDMA
V_OSD
ASI4S_VID_ OUT
V_TC
V_TPG V_TPG V_TPG V_TPG
HP0 HP1 HP2 HP3
High-Performance Ports
Programmable LogicConfig AESSHA
General-Purpose Ports
FSYNC
AXI PERF MON
GP0
HDMI
IRQDMA Sync
MIO
EMIO
IRQ
2x SD2x GigE2x USB
Processing System
ACP
WP459_16_111814
WP459 (v1.0) January 13, 2015 www.xilinx.com 18
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
X-Ref Target - Figure 17
Figure 17: Zynq-7000 AP SoC Base TRDWP459_17_110614
Bank0MIO
(15:0)
I/O Peripherals
Flash MemoryInterfaces
ClockGeneration
SPI 0SPI 1I2C 0I2C 1CAN 0CAN 1UART 0UART 1GPIOSD 0SD 1
USB 0USB 1Enet 0Enet 1
SRAM/NORNAND
Quad SPI
Bank1MIO
(53:16)
Input Clockand Freq
ExtendedMIO (EMIO) PS to PL
Clock Ports
PS I2C-1(EMIO)
I/OMUX(MIO)
Reset
Processing System (PS)
DAP
DEVC
IRQ
ProgrammableLogic to Memory
Interconnect
High PerformanceAXI 32b/64bSlave Ports
Memory Interfaces
DDR2/3, LPDDR2Controller
SWDTTTC
DMA 8Channel
GIC Snoop Control Unit
MMU
NEON/FPU Engine
Application Processor Unit (APU)
32 KB ICache
32 KB DCache
Cortex-A9MPCore
CPUMMU
NEON/FPU Engine
32 KB ICache
32 KB DCache
Cortex-A9MPCore
CPU
CoreSightComponents
SystemLevel
ControlRegs
CentralInterconnect
AXI Interconnect AXI Interconnect
64bAXI ACP
SlavePort
OCMInterconnect
255 KB OCM
512 KB L2 Cache and Controller
BootROM
32b GPAXI
MasterPorts
32b GPAXI
SlavePorts
12 13 14 158 9 10 114 5 6 70 1 2 3
0 1 2 3
0 1 2 3
S
M M M M M M M M Slave Slave
AXI Interconnect
Slave Slave
MasterMaster
PerfMon
DMA Sync
LOGICVC_0TPG VDMA
MM2S S2MM
Sobel VDMA
Sobel Filter
HDMI_INVideoIn1080p Video
Out1080p
Local Pcore
EDK IP
Third Party IP
CORE Generator EDK IP
AXI Interface
Video Interface
AXI Streaming Interface
S2MM
M
VTC_0VIDEO_MUX_0
Video Sync Signals
Monitor
TPG_0VID_IN_AXI4S
Programmable Logic (PL)
WP459 (v1.0) January 13, 2015 www.xilinx.com 19
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The video accelerator in PL implements the AXI VDMA core which redirects video traff ic generated by the Test Pattern Generator core and the Sobel Engine to DDR3 memory via the HP ports.
The AXI video DMA core in this design contains an AXI4-Lite slave interface with which the software configures the core for 1080p, where each horizontal line is set for 5,760 bytes (1,920 x 3 bytes per pixel), and each vertical line is set for 1,080 pixels by the processor. There are twelve frame buffers in this design (three frame buffers x four video pipelines).
Designers can leverage both the above reference designs to build a video application using the Zynq-7000 AP SoC.
The design f iles for this reference design can be downloaded from Xilinx Wiki Site at http://www.wiki.xilinx.com/Targeted+Reference+Designs#Zynq.
Zynq-7000 AP SoC PS DMA Controller (PL330 DMA)The PS subsystem includes a DMA controller (DMAC) block that can be used to transfer data from a PS peripheral or PL peripherals to the system memory. The DMAC provides a flexible DMA engine that can provide moderate levels of throughput with little PL logic resource usage.
The DMAC supports up to eight channels. However, this flexible programmable model could increase software complexity relative to simple CPU transfer or specialized a DMA block implemented in PL.
The DMAC interface to the PL is through the GP AXI master interfaces, and a peripheral request interface also allows PL slaves to provide status to the DMAC on buffer state. This enables the PL peripheral to prevent stalled transactions due to interconnect and DMAC buffer management, as shown in Figure 18.
WP459 (v1.0) January 13, 2015 www.xilinx.com 20
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
The PL330 DMA driver is integrated into the standard Linux DMA API framework. The hardware DMA components in the Zynq-7000 device are controlled through the standard Linux DMA API.
AXI-CDMA
The AXI CDMA is another key data-mover IP in the Xilinx IP catalog that performs provides high-bandwidth direct memory access between a memory-mapped source address and a memory-mapped destination address using the AXI4 protocol. The AXI CDMA IP provides an optional Scatter Gather (SG) feature to off-load control and sequencing tasks from the System CPU. The Initialization, status, and control registers of the CDMA are accessed through an AXI4-Lite slave interface.
The AXI CDMA is an ideal IP to move blocks of the data from buffers in the PL to the Zynq DDR memory via the Zynq PS-PL Interfaces. Chapter 6 of the Zynq-7000 All Programmable SoC: Concepts, Tools, and Techniques (CTT) guide provides a design example to configure the CDMA block to move data between the PS-PL interfaces.
For more information on AXI CDMA refer to the LogiCORE IP AXI CDMA Product Guide (PG034).
X-Ref Target - Figure 18
Figure 18: Connections for PL330 DMAWP459_18_111714
MIO
ClockGeneration
2x USB
Reset
2x GigE
2x SD
IRQ
SWDT
TTC
SystemLevel
ControlRegs
FPU and NEON Engine
MMU
32 KBI-Cache
32 KBD-Cache
ARM Cortex-A9CPU
Application Processor UnitProcessing System (PS)
FPU and NEON Engine
MMU
32 KBI-Cache
32 KBD-Cache
ARM Cortex-A9CPU
XADC12-Bit ADC
EMIO ACPGeneral-PurposePorts
High-Performance PortsDMASync
IRQ
Programmable Logic to MemoryInterconnect
AXI Infrastructure IPs
Xilinx/Custom IPs Custom IP
AXI Interconnect
CentralInterconnect
CoreSightComponents
ConfigAES/SHA
DAP
DAP
OCMInterfonnect
256KSRAM
MemoryInterfaces
DDR2/3, 3L,LPDDR2Controller
DDR3
GIC Snoop Controller, AWDT, Timer
512 KB L2 Cache & ControllerDMA 8Channel
I/OPeripherals
MemoryInterfaces
USB
USB
GigE
GigE
GPIOUARTUARTCANCANI20I20SPISPI
SRAM/NOR
ONFI 1.0NAND
SDSDIO
SDSDIO
Q-SPICTRL
Programmable Logic (PL)
WP459 (v1.0) January 13, 2015 www.xilinx.com 21
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
DMA for PCI Express Based DesignsTo implement a PCIe Express® based design using Zynq-7000 AP SoC, Xilinx offers a couple of reference designs that use DMA to transfer data between the PCI Express core and system memory.
The PCI Express core in the PS can be used as Root Complex as well as Device Endpoint Block. The reference designs below illustrate PCI Express as an Endpoint Block.
PCI Express Endpoint – DMA Initiator Subsystem
A Xilinx application note, PCI Express Endpoint-DMA Initiator Subsystem[Ref 12], demonstrates a subsystem for Endpoint-initiated DMA data transfers through PCI Express. The provided subsystems target the indicated devices to initiate data transfers between DDR3 memory and an externally connected PCI Express Root Complex.
The block diagram for this reference designs and its connection to the device is shown in Figure 20.
X-Ref Target - Figure 19
Figure 19: Connections to AXI CDMA IPWP459_19_111814
SAXIGP/HP/ACP Port
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI Infrastructure IPs
Xilinx/Custom IPs
AXI4 FullScatter/GatherInterface
CDMA
AXI364-Bit Interface
Descriptions inDDR memory
AXI Interconnect
AXI4 LiteControl/Status
AXI Interconnect AXI Interconnect
MAXIGP0 SAXI GP0
AXI4Full Interface
Custom IP
WP459 (v1.0) January 13, 2015 www.xilinx.com 22
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
To accomplish this, a scatter/gather-capable DMA engine is paired with the PCI Express IP along with AXI Interconnect IP cores This design demonstrates how to map AXI Memory to PCI Express IP to perform high-throughput data transfers over a PCI Express link. The DMA engine enables the system to manage data transfer over the PCI Express link to increase throughput and decrease processor utilization on the Root Complex side of the PCI Express link. This approach is important specif ically for high-throughput PCI Express applications, which can include using the Zynq-7000 SoC PS high-performance ports or double-data-rate (DDR) memory.
For design and software implementation details, refer to the application note.
PCI Express Targeted Reference Design (TRD)
The Zynq-7000 AP SoC ZC706 Evaluation Kit Getting Started Guide (ISE Design Suite 14.7)[Ref 13] provides a reference solution to implement a PCI Express Base Targeted Reference Design (TRD), which expands the Zynq-7000 Base Targeted Reference Design[Ref 11] by adding PCI Express communication with a host system communicating at x4 Gen2 speed.
X-Ref Target - Figure 20
Figure 20: PCI Express Endpoint-DMA Initiator Subsystem Reference Design
WP459_20_111814
AXI Infrastructure IPs
Xilinx/Custom IPs
Zynq-7000 AP SoC Processing System
Programmable Logic
AXI-CDMA
PCIe Subsystem
PCI Express Interface
HP0
AXI-364-Bit Interface
AXI Interconnect
AXI Interconnect
proc_sys_reset
TranslationBlock RAM
Block RAMCTLR
AXI-PCIe
WP459 (v1.0) January 13, 2015 www.xilinx.com 23
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
In this reference design, Xilinx utilizes a third-party, multi-channel, bus-mastering scatter/gather PCIe Express DMA from Northwest logic. The Northwest Logic DMA Core provides high-performance, scatter-gather DMA operation in a flexible fashion.
The Northwest Logic DMA core provides an AXI4-Stream interface for data and an AXI4 interface for register space access. The Northwest Logic DMA core also provides independent transmit and receive channels for each channel, allowing full-duplex operation.
The PCI Express DMA translates the stream of PCI Express video data packets into AXI-Stream data, which is connected to a Video DMA (VDMA). Software running in the PS on the Cortex-A9 processor manages the AXI VDMA and transfers the raw video frames into the PS DDR3 memory.
Information regarding the Northwest Logic DMA core is available on the Xilinx website at http://www.xilinx.com/products/intellectual-property/1-8DYF-1689.htm.
The design f iles for this reference design can be downloaded from the Xilinx Wiki Site page at http://www.wiki.xilinx.com/Targeted+Reference+Designs#Zynq.
X-Ref Target - Figure 21
Figure 21: PCI Express TRD
Host Control
& Monitor
Interface RA
W D
RIV
ER
PC
Ie D
MA
DR
IVER
GTX
PCIe
x 4
Gen
2 Li
nk
Inte
grat
ed E
ndpo
int B
lock
for P
CI E
xpre
ss
AXI
-ST
Basi
c W
rapp
er
Multi-channel DMA for PCIe
AXI Master
Cha
nnel
-0C
2S
S2C
Cha
nnel
-1C
2S
S2C
PerformanceMonitor
User Space Register
AXI IC
FIFO VDMA
Sobel Filter VDMA
AXI
ICA
XIIC
LogiCVC
GP0
64 x
25
0MH
z
64 x
15
0MH
z
64 x
15
0MH
z32
x
150M
Hz
64 x
15
0MH
z12
8 x
150M
Hz
Raw Data Generator
64 x
25
0MH
z
Raw Data Checker
Video Out
HP0
HP2
Perf Monitor
Perf Monitor
64 x
150
MH
z
Interface Blocks in FPGA Xilinx IP Third Party IP Custom Logic Host PCWP459_21_111814
Zyn
q-70
00 A
P S
oC P
roce
ssin
g S
yste
m
WP459 (v1.0) January 13, 2015 www.xilinx.com 24
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
SummaryTable 1 summarizes the recommended interface on the PS and the corresponding data-mover IP for moving the data between the PS and PL. In using Table 1, f irst locate a body cell containing a description of the task and the performance required. Then refer to the row of headings above and the column of headings to the left to identify the best PS interface and the best data-mover IP for those requirements, respectively. If these interface results do not look quite right or are clearly inappropriate for the project at hand, try to find a similar task/performance description. The best interface models shown in the headings now might be more in line with the rest of the design.
Table 1: Zynq-7000 SoC PS Interface Usage vs. Data-Mover IP
Zynq-7000 SoC PS Interface Usage
Data-Mover IP GP AXI Master Port GP AXI Slave Port HP AXI Port ACP (Accelerated Coherency Port)
Simple Register Read/Write
Control and interrupt data to AXI PL peripheral.
Control and interrupt data to PS. — —
AXI FIFO MM2S Control and interrupt data or low to medium performance data transfer between PS and streaming PL IP.
Control and interrupt data, or low to medium performance data transfer between streaming PL IP and PS memory.
Low to medium performance data transfer between streaming PL IP and PS memory.
Low latency data transfer between PL IP and PS Memory; data results to be in L2 cache, coherent with processors. Improves application performance.
DMA(Simple)
Control and interrupt data to PL. —
Medium performance data transfer between streaming PL IP and PS memory.
Low latency data transfer between PL IP and PS Memory; data results to be in L2 cache, coherent with processors. Improves application performance.
DMA (Scatter/Gather)
Control and interrupt data to PL. —
High performance data transfers between streaming PL IP and PS memory.
Low latency data transfer between PL IP and PS Memory; data results to be in L2 cache, coherent with processors. Improves application performance.
PL330 DMA Control and interrupt data to PL, or low to medium performance data transfer between memory-mapped PL IP and PS memory.
Low to medium performance data transfer between memory Mapped PL IP and PS memory or PS peripherals.
— —
WP459 (v1.0) January 13, 2015 www.xilinx.com 25
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
ConclusionThis white paper lists the most popular data-mover IPs from Xilinx with the appropriate reference designs to which designers can refer to implement a wide range of systems. The design examples and recommendations provided in this document help designers choose the best data-mover IP much earlier in the design phase. Faster time to market is a competitive advantage; choosing Xilinx FPGAs, Targeted Design Platforms, AXI4 IP, and industry-standard AXI4 support provides that competitive edge.
Video DMA Control and interrupt data to PL. —
High performance video pixel information between video IP in PL and PS memory.
—
PCI Express DMA Control and interrupt data to PL. —
High performance data transfers between PCI Express core in PL and PS memory.
Low latency data transfer between PCI Express Core in PL and PS Memory; data results to be in L2 cache, coherent with processors. Improves application performance.
Table 1: Zynq-7000 SoC PS Interface Usage vs. Data-Mover IP (Cont’d)
Zynq-7000 SoC PS Interface Usage
Data-Mover IP GP AXI Master Port GP AXI Slave Port HP AXI Port ACP (Accelerated Coherency Port)
WP459 (v1.0) January 13, 2015 www.xilinx.com 26
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
References1. Vivado Design Suite: AXI Reference Guide (UG1037)
2. Maximize System Performance Using Xilinx Based AXI4 Interconnects (WP417)
3. System Monitoring using the Zynq-7000 AP SoC Processing System with the XADC AXI Interface (XAPP1182)
4. LogiCORE IP AXI4-Stream AXI FIFO Product Guide (PG080)
5. LogiCORE IP AXI DMA Product Guide (PG021)
6. Implementing Analog Data Acquisition using the Zynq-7000 AP SoC Processing System with XADC AXI Interface (XAPP1183)
7. Zynq-7000 All Programmable SoC Accelerator for Floating-Point Matrix Multiplication using Vivado HLS (XAPP1170)
8. PS and PL Ethernet Performance and Jumbo Frame Support with PL Ethernet in the Zynq-7000 AP SoC (XAPP1082)
9. LogiCORE IP AXI VDMA Product Guide (PG020)
10. Designing High-Performance Video Systems with the Zynq-7000 All Programmable SoC Using IP Integrator (XAPP1205)
11. Zynq Base Targeted Reference Design (UG925)
12. PCI Express Endpoint-DMA Initiator Subsystem (XAPP1171)
13. Zynq-7000 AP SoC ZC706 Evaluation Kit Getting Started Guide (ISE Design Suite 14.7) (UG961)
Further Reading1. Zynq-7000 All Programmable SoC Technical Reference Manual (UG585)
2. Xilinx Design Tools: Vivado Design Suite, Embedded Development Kit (XPS)
WP459 (v1.0) January 13, 2015 www.xilinx.com 27
Leveraging Data-Mover IPs for Data Movement in Zynq-7000 AP SoC Systems
Revision HistoryThe following table shows the revision history for this document:
DisclaimerThe information disclosed to you hereunder (the “Materials”) is provided solely for the selection and use of Xilinxproducts. To the maximum extent permitted by applicable law: (1) Materials are made available “AS IS” and with all faults,Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOTLIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE;and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability)for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including youruse of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including lossof data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) evenif such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinxassumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or toproduct specif ications. You may not reproduce, modify, distribute, or publicly display the Materials without prior writtenconsent. Certain products are subject to the terms and conditions of Xilinx’s limited warranty, please refer to Xilinx’sTerms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos; IP cores may be subject to warranty andsupport terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safeor for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx productsin such critical applications, please refer to Xilinx’s Terms of Sale which can be viewed at http://www.xilinx.com/legal.htm#tos.
Automotive Applications DisclaimerXILINX PRODUCTS ARE NOT DESIGNED OR INTENDED TO BE FAIL-SAFE, OR FOR USE IN ANY APPLICATION REQUIRINGFAIL-SAFE PERFORMANCE, SUCH AS APPLICATIONS RELATED TO: (I) THE DEPLOYMENT OF AIRBAGS, (II) CONTROL OF AVEHICLE, UNLESS THERE IS A FAIL-SAFE OR REDUNDANCY FEATURE (WHICH DOES NOT INCLUDE USE OF SOFTWARE INTHE XILINX DEVICE TO IMPLEMENT THE REDUNDANCY) AND A WARNING SIGNAL UPON FAILURE TO THE OPERATOR, OR(III) USES THAT COULD LEAD TO DEATH OR PERSONAL INJURY. CUSTOMER ASSUMES THE SOLE RISK AND LIABILITY OFANY USE OF XILINX PRODUCTS IN SUCH APPLICATIONS.
Date Version Description of Revisions01/13/2015 1.0 Initial Xilinx release.