XC7Z020-2CLG400C Application Guide: From Datasheet to Working Circuit
When designing a real-time industrial machine vision system, the central challenge is processing high-bandwidth video streams with minimal latency while simultaneously managing network communication, user interfaces, and system control. The Xilinx XC7Z020-2CLG400C System-on-Chip (SoC) is well suited to this task. Its Programmable Logic (PL) fabric handles the parallel, high-throughput demands of image pre-processing, while its dual-core ARM Cortex-A9 Processing System (PS) runs a full-featured operating system for complex decision-making and connectivity.
Application Context: Where XC7Z020-2CLG400C Fits in the System
In a modern industrial inspection system, the XC7Z020-2CLG400C acts as the computational heart. Let's consider a system designed to inspect manufactured widgets on a conveyor belt for defects. The system block diagram places the Zynq SoC at the center, coordinating multiple peripherals and executing the core logic.
Input Stage: One or more high-resolution industrial cameras, often with a MIPI CSI-2 or LVDS interface, connect to the Zynq's PL I/O banks. LVDS can be received directly on the PL I/O, while MIPI CSI-2 needs an external D-PHY-compatible resistor network or a bridge device, because the 7-series I/O has no native D-PHY support. The Programmable Logic (PL) is ideal for implementing the deserializer logic for these high-speed interfaces. Once the raw pixel data is received, it can be processed immediately by a custom pipeline implemented in the PL. This pipeline might perform tasks like color space conversion (e.g., Bayer to RGB), lens distortion correction, and image scaling, all at line rate without burdening the CPU.
Processing Core: The XC7Z020-2CLG400C features a tightly coupled Processing System (PS) and Programmable Logic (PL).
- Programmable Logic (PL): This is where the massively parallel processing happens. For our vision system, we would implement a custom image processing pipeline using hardware description languages (VHDL/Verilog) or High-Level Synthesis (HLS). This pipeline could include FIR and 2D convolution filters for edge detection, blob detection algorithms, and feature extraction. The results, such as defect coordinates or object classifications, are then passed to the PS.
- Processing System (PS): This contains two ARM Cortex-A9 cores. Here, we would run a Linux distribution (like PetaLinux) or a real-time operating system (RTOS). The PS takes the pre-processed data from the PL, performs higher-level analysis (e.g., using OpenCV libraries), makes decisions (pass/fail), logs results, and communicates with the outside world. The dual-core architecture allows one core to handle the real-time control loop while the other manages network traffic and a web-based status dashboard.
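To make the PL side of this split concrete, here is a minimal software model of a streaming edge detector. It is a sketch only: the Sobel operator is chosen here as a familiar example and is not named in the text above, and the function name `sobel_stream` is hypothetical. The model consumes pixels one at a time in raster order and keeps two line buffers, which is roughly how an HDL or HLS implementation would be structured around block RAM.

```python
def sobel_stream(pixels, width):
    """Return clamped |Gx| + |Gy| for every 3x3 window of a row-major grayscale stream."""
    prev2 = [0] * width  # line buffer holding row y-2
    prev1 = [0] * width  # line buffer holding row y-1
    cur = [0] * width    # row currently arriving from the sensor
    out = []
    for i, p in enumerate(pixels):
        y, x = divmod(i, width)
        cur[x] = p
        if y >= 2 and x >= 2:
            # 3x3 window centred on pixel (x-1, y-1)
            w = [[r[x - 2], r[x - 1], r[x]] for r in (prev2, prev1, cur)]
            gx = (w[0][2] + 2 * w[1][2] + w[2][2]) - (w[0][0] + 2 * w[1][0] + w[2][0])
            gy = (w[2][0] + 2 * w[2][1] + w[2][2]) - (w[0][0] + 2 * w[0][1] + w[0][2])
            out.append(min(255, abs(gx) + abs(gy)))
        if x == width - 1:
            # End of row: rotate the line buffers; the oldest one is reused for the next row
            prev2, prev1, cur = prev1, cur, prev2
    return out


# A 4x4 test image with a vertical edge between columns 1 and 2
edge_image = [0, 0, 255, 255] * 4
print(sobel_stream(edge_image, 4))
```

A model like this is mainly useful as a golden reference: the same input frames can be fed to the HDL simulation and the outputs compared pixel by pixel.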
Memory and Storage: The Zynq interfaces with external DDR3/DDR3L SDRAM, which serves as the main system memory for the ARM processors and can also be used as a large frame buffer for the PL via AXI interconnects. For booting and non-volatile storage, the system typically uses a QSPI flash memory chip or a microSD card, with the boot mode selected by strapping resistors on a group of MIO (Multiplexed I/O) pins that are sampled at power-on reset.
Output and Control Stage: Based on the inspection results, the PS can control external hardware. It might send a signal via GPIO to a pneumatic actuator to reject a defective part from the conveyor belt. It communicates status and results over a Gigabit Ethernet connection to a central factory management system. The XC7Z020-2CLG400C provides dedicated peripherals for Ethernet, USB, UART, SPI, and I2C, simplifying integration with a wide range of devices.
In essence, the Zynq SoC replaces what would have traditionally been a multi-chip solution (e.g., a CPU/microcontroller + a separate FPGA). This integration reduces board space, power consumption, and perhaps most importantly, the latency between sensing, processing, and acting.
Core Specifications for This Application
For a machine vision application, not all datasheet parameters are equally important. The following specifications of the XC7Z020-2CLG400C are critical for design success.
| Parameter | Value | Application Relevance |
|---|---|---|
| Processing System (PS) | Dual-Core ARM Cortex-A9 MPCore | Provides the power to run a full OS (e.g., Linux) for networking, file systems, and complex algorithms (like OpenCV), while the dual-core nature allows for task partitioning (e.g., real-time vs. non-real-time). |
| Logic Cells | 85K | Determines the size and complexity of the image processing pipeline that can be implemented in the PL. 85K cells are sufficient for multiple camera interfaces and sophisticated real-time filtering and feature extraction. |
| DSP Slices | 220 | Each slice is a hardware multiply-accumulate block, essential for accelerating digital signal processing tasks. In vision systems, they are heavily used for FIR filters, FFTs, and other math-intensive operations on pixel data. |
| Block RAM | 4.9 Mb (140 × 36 Kb blocks) | Used for on-chip line buffers, small tile or region-of-interest buffers, and storing filter coefficients. Sufficient on-chip memory reduces reliance on external DDR, lowering latency for critical processing steps. |
| Max PS Frequency (Speed Grade -2) | 766 MHz | The clock speed of the ARM cores directly impacts the performance of the software stack, influencing how quickly decisions can be made based on data from the PL. |
| Package | CLG400 (400-ball Chip Scale BGA, 0.8 mm pitch) | This high-density package requires advanced PCB design skills. Its 17x17 mm footprint is compact, but requires careful routing, especially for the DDR3 interface. |
| PL I/O Pins | Up to 200 for the device (125 in the CLG400 package) | Defines the number of external signals the PL can interface with. Critical for connecting to cameras, actuators, communication transceivers, and other system components. |
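The block RAM figure is easier to judge with a quick budget calculation. The sketch below assumes ideal bit packing across 140 blocks of 36 Kb each (real designs lose some capacity to port-width granularity), and the helper names are hypothetical. It shows that a full 1080p frame does not fit on-chip, while a few hundred line buffers do.

```python
BRAM_BLOCKS = 140             # 36 Kb blocks in the Z-7020
BRAM_BLOCK_BITS = 36 * 1024   # bits per block, including parity bits


def bram_lines(width_px, bits_per_px, blocks=BRAM_BLOCKS):
    """Whole video lines that fit in the given number of blocks, assuming ideal packing."""
    return (blocks * BRAM_BLOCK_BITS) // (width_px * bits_per_px)


def frame_fits(width_px, height_px, bits_per_px):
    """True if an entire frame could be held in block RAM alone."""
    return bram_lines(width_px, bits_per_px) >= height_px


print(bram_lines(1920, 8))        # lines of 8-bit 1080p video that fit on-chip
print(frame_fits(1920, 1080, 8))  # a full 1080p frame needs external DDR
print(frame_fits(640, 480, 8))    # a VGA frame fits, if nothing else needs block RAM
```

In practice the pipeline's FIFOs, coefficient tables, and any soft IP also draw from the same pool, so the usable budget is smaller than this upper bound.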
Reference Circuit and Component Selection
Designing a stable and reliable board around the XC7Z020-2CLG400C requires careful attention to several key support circuits. A robust design goes far beyond simply placing the SoC on a PCB. The following areas are critical.
Power Supply and Sequencing: The Zynq-7000 family has a complex power architecture with multiple voltage rails:
- VCCINT (1.0V): Core voltage for the PL. Requires a high-current, low-noise supply.
- VCCPINT (1.0V): Core voltage for the PS.
- VCCAUX (1.8V): Auxiliary voltage for internal logic.
- VCCO (1.8V/2.5V/3.3V): Voltage for the I/O banks. Each bank can have a different voltage, enabling easy interfacing with various logic levels.
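- VCCBRAM (1.0V): Supply for the PL block RAM.
- VCCPAUX (1.8V) and VCCPLL (1.8V): Auxiliary and PLL supplies for the PS.
- VCCO_MIO0, VCCO_MIO1, VCCO_DDR: I/O supplies for the two PS MIO banks and the PS DDR interface.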
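The order in which these rails ramp matters as much as their voltages. As far as we can tell from the Zynq-7000 datasheet (DS187), the recommended power-on order is VCCPINT, then VCCPAUX and VCCPLL together, then the PS VCCO supplies on the PS side, and VCCINT, VCCBRAM, VCCAUX, VCCO on the PL side; confirm this against the current datasheet revision before relying on it. The sketch below checks oscilloscope measurements against that order. The function name and the timestamps are hypothetical.

```python
# Rail groups in recommended ramp order (assumption: taken from DS187, verify before use)
PS_GROUPS = [{"VCCPINT"}, {"VCCPAUX", "VCCPLL"}, {"VCCO_MIO0", "VCCO_MIO1", "VCCO_DDR"}]
PL_GROUPS = [{"VCCINT"}, {"VCCBRAM"}, {"VCCAUX"}, {"VCCO"}]


def sequence_violations(ramp_ms, groups):
    """Return (rail_that_rose_early, rail_it_should_follow) pairs.

    ramp_ms maps a rail name to the time in milliseconds at which it reached
    regulation. Rails missing from ramp_ms are skipped. Equal timestamps are
    accepted, which covers rails that ramp together.
    """
    violations = []
    for i, earlier in enumerate(groups):
        for later in groups[i + 1:]:
            for a in sorted(earlier):
                for b in sorted(later):
                    if a in ramp_ms and b in ramp_ms and ramp_ms[b] < ramp_ms[a]:
                        violations.append((b, a))
    return violations


# Hypothetical measurements from a first prototype
measured = {"VCCINT": 0.0, "VCCBRAM": 0.0, "VCCAUX": 1.2, "VCCO": 0.5}
print(sequence_violations(measured, PL_GROUPS))
```

Here the I/O bank supply came up before VCCAUX, which the check reports as a single violation.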
DDR3/DDR3L Memory Interface: The connection to external DDR memory is one of the most challenging aspects of the layout. This is a high-speed, source-synchronous interface. All data, address, and control lines must be impedance-controlled (typically 50Ω single-ended, 100Ω differential) and length-matched to within tight tolerances (often +/- 5 mils); the exact targets for this device are given in the Zynq-7000 PCB Design Guide (UG933). The clock and data strobe (DQS) signals must be routed as differential pairs. Plan for at least a 4-layer PCB, and more realistically 6 layers, with a solid ground plane directly adjacent to each signal layer to ensure a clean return path. It's highly advisable to follow a known-good reference layout, such as those of widely used Zynq-7000 development boards (e.g., ZedBoard, MicroZed).
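Length matching is easy to check mechanically once the routed lengths are exported from the layout tool. The following is a minimal sketch, assuming each byte lane is matched to its own DQS strobe and using the +/- 5 mil figure quoted above as the default tolerance. The net names and lengths are hypothetical.

```python
def lane_skew_report(lengths_mil, tolerance_mil=5.0):
    """For each byte lane, return (worst deviation from DQS in mils, within tolerance?)."""
    report = {}
    for lane, nets in lengths_mil.items():
        ref = nets["DQS"]  # the strobe is the reference for its byte lane
        worst = max(abs(length - ref) for name, length in nets.items() if name != "DQS")
        report[lane] = (worst, worst <= tolerance_mil)
    return report


# Hypothetical routed lengths exported from the layout tool, in mils
routed = {
    "lane0": {"DQS": 1200.0, "DQ0": 1203.0, "DQ1": 1198.0},
    "lane1": {"DQS": 1250.0, "DQ8": 1262.0, "DQ9": 1249.0},
}
for lane, (worst, ok) in lane_skew_report(routed).items():
    print(lane, worst, "OK" if ok else "FAIL")
```

A constraint manager in the layout tool enforces the same rule during routing; a script like this is a cheap second check before release to fabrication.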
Boot Configuration: The Zynq SoC must be told where to find its initial boot software. This is configured by setting logic levels on several MIO pins (MIO[2:8]), which the device samples as it comes out of power-on reset. These pins are typically connected to pull-up or pull-down resistors to select a boot mode like QSPI Flash, SD Card, or JTAG for debugging. For a production system, booting from on-board QSPI flash is common for its speed and reliability. An SD card slot is often included for development and field updates.
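A small decoder helps when reviewing the strapping resistors on a schematic. The table below reflects our reading of the boot mode table in UG585 (MIO[5:3] selects the boot device, MIO[2] the JTAG chain mode, MIO[6] PLL bypass); treat it as an assumption and check it against the TRM. MIO[8:7], which set the MIO bank voltage mode, are left out of this sketch, and the function name is hypothetical.

```python
# Keyed by the strapped levels of (MIO[5], MIO[4], MIO[3]); assumed from UG585, verify before use
BOOT_DEVICES = {
    (0, 0, 0): "JTAG",
    (0, 0, 1): "NOR",
    (0, 1, 0): "NAND",
    (1, 0, 0): "Quad-SPI",
    (1, 1, 0): "SD card",
}


def decode_boot_straps(mio):
    """Decode strapping levels; mio maps pin number (2..6) to 0 (pull-down) or 1 (pull-up)."""
    return {
        "jtag_chain": "independent" if mio[2] else "cascaded",
        "boot_device": BOOT_DEVICES.get((mio[5], mio[4], mio[3]), "reserved"),
        "pll": "bypassed" if mio[6] else "enabled",
    }


# Strapping intended for a production QSPI boot
print(decode_boot_straps({2: 0, 3: 0, 4: 0, 5: 1, 6: 0}))
```

Running the decoder on the values read from the schematic makes a swapped pull-up and pull-down obvious before the board is built.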
Clocking: The PS requires a stable clock input (PS_CLK), typically 33.333 MHz or 50 MHz, from a low-jitter crystal oscillator. This single clock is used by internal PLLs to generate all the necessary clocks for the ARM cores, DDR controller, and peripherals. The PL side can be clocked from this PS-generated clock or from its own dedicated oscillators connected to the PL's clock-capable I/O pins.
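The relationship between PS_CLK and the CPU clock is simple arithmetic: the PLL multiplies the input by a feedback divider, and a clock divisor then scales the result down. The sketch below searches for settings that reach a target frequency. The parameter ranges are placeholders assumed for illustration, not datasheet values, and the legal ranges (including the PLL's VCO limits) must be taken from UG585 and DS187.

```python
def candidate_settings(ps_clk_mhz, target_mhz, tol_mhz=1.0,
                       fdiv_range=range(13, 67), divisors=range(2, 9)):
    """Return (feedback_divider, output_divisor) pairs whose output is within tol_mhz of target.

    fdiv_range and divisors are assumed placeholder ranges for this sketch.
    """
    return [
        (fdiv, div)
        for fdiv in fdiv_range
        for div in divisors
        if abs(ps_clk_mhz * fdiv / div - target_mhz) <= tol_mhz
    ]


# Settings that take a 33.333 MHz PS_CLK to roughly 766 MHz for the ARM cores
print(candidate_settings(33.333, 766.667))
```

In a real project the Vivado PS configuration wizard computes these values, so a calculation like this is mainly a sanity check on what the tool generated.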
Design Pitfalls and How to Avoid Them
Many Zynq-based projects fail not because of logic errors but due to fundamental hardware design mistakes. Here are some common pitfalls and how to steer clear of them.
| Common Mistake | Symptom | Fix |
|---|---|---|
| Incorrect Power Sequencing | Device does not boot. JTAG chain is not detected. Device may be permanently damaged. | Use a dedicated PMIC or a carefully designed sequencing circuit. Verify the sequence with an oscilloscope on the first prototype board before populating or powering the Zynq. Cross-reference the sequence against the power-on/off sequencing section of the Zynq-7000 datasheet (DS187). |
| Poor DDR3 Layout | System fails memory tests. Unpredictable crashes or data corruption when the OS boots. System fails to boot past the First Stage Bootloader (FSBL). | Strictly adhere to length-matching and impedance control rules. Use a PCB layout tool with constraint management. Simulate the interface using a tool like HyperLynx. Start with a known-good reference design. |
| Incorrect Boot Mode Strapping | The device attempts to boot from the wrong medium (e.g., tries SD card when QSPI was intended). The boot process hangs. | Carefully check the schematic for the MIO[2:8] pull-up/pull-down resistors against the boot mode table in the TRM (UG585). Ensure the resistor values follow the TRM recommendation (20kΩ, to our knowledge) and that no peripheral sharing these pins overrides the strap level during reset. |
| Inadequate Thermal Management | System becomes unstable or crashes under heavy load (e.g., running complex vision algorithms). Die temperature read from the on-chip XADC approaches or exceeds the rated junction limit. | Perform a thermal analysis early in the design phase. For the CLG400 package, a simple heatsink with thermal interface material (TIM) is often required, especially if both ARM cores and a significant portion of the PL are active. |
Avoiding these pitfalls comes down to rigorous design discipline. The datasheets and technical reference manuals for the Zynq-7000 series are extensive and must be treated as the primary source of truth. Do not assume a "good enough" approach for power or high-speed interfaces. A single misplaced via or an incorrect resistor value can render a board non-functional. It is highly recommended to perform a thorough schematic and layout review with a senior engineer before sending a design for fabrication. Using the Xilinx Power Estimator (XPE) spreadsheet early in the design cycle can also prevent thermal and power delivery surprises later on.
Performance Optimization Tips
Once the basic circuit is functional, the focus shifts to optimizing performance, power, and reliability.
Thermal Management: The XC7Z020-2CLG400C's power consumption is highly dependent on application. A design running a simple Linux prompt will consume far less power than one performing real-time video processing. Use the Xilinx Power Estimator (XPE) to get a realistic estimate of power draw. For the CLG400 package, ensure there is a low-impedance thermal path from the top of the chip to a heatsink. The BGA balls also conduct heat into the board, so connect the ground balls in the center of the ball array to the inner ground planes with vias placed as close to the balls as possible.
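A first-order junction temperature estimate shows whether a heatsink is needed before any detailed simulation. The sketch below uses the standard relation Tj = Ta + P × θJA. The 85°C limit corresponds to the commercial temperature grade indicated by the trailing "C" in the part number; the θJA values and power figures are placeholders, and real values should come from the package thermal data and XPE.

```python
def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Estimated junction temperature from ambient, dissipated power, and theta-JA."""
    return ambient_c + power_w * theta_ja_c_per_w


def max_power_w(ambient_c, tj_max_c, theta_ja_c_per_w):
    """Largest dissipation that keeps the junction at or below tj_max_c."""
    return (tj_max_c - ambient_c) / theta_ja_c_per_w


TJ_MAX_C = 85.0  # commercial-grade junction limit

# Placeholder numbers: 50 C enclosure ambient, 2 W total, theta-JA of 15 C/W
print(junction_temp_c(50.0, 2.0, 15.0))
print(max_power_w(50.0, TJ_MAX_C, 14.0))
```

If the estimate lands within a few degrees of the limit, the margin is too thin for an industrial enclosure and a heatsink or airflow should be planned from the start.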
Power Integrity: For optimal performance, the power delivered to the Zynq must be clean and stable. Use a Power Delivery Network (PDN) analysis tool to ensure that voltage ripple and droop are within the specifications listed in the datasheet under all load conditions. This involves careful selection and placement of bulk and decoupling capacitors. A common strategy is to use a hierarchy of capacitors (e.g., 100µF, 10µF, 1µF, 0.1µF) to provide current over a wide frequency spectrum.
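The reason for mixing capacitor values is that each one is only effective near its self-resonant frequency. A minimal sketch using the series R-L-C model of a real capacitor illustrates this. The ESR and ESL values below are placeholders assumed for illustration; use the manufacturer's data for the actual parts.

```python
import math


def cap_impedance_ohm(freq_hz, c_farad, esr_ohm, esl_henry):
    """Impedance magnitude of a capacitor modelled as series R, L, and C."""
    w = 2 * math.pi * freq_hz
    reactance = w * esl_henry - 1 / (w * c_farad)
    return math.hypot(esr_ohm, reactance)


def self_resonance_hz(c_farad, esl_henry):
    """Frequency at which the capacitive and inductive reactances cancel."""
    return 1 / (2 * math.pi * math.sqrt(esl_henry * c_farad))


# Placeholder parasitics: 10 milliohm ESR and 1 nH ESL for every part
for c in (100e-6, 10e-6, 1e-6, 0.1e-6):
    f0 = self_resonance_hz(c, 1e-9)
    print(f"{c * 1e6:g} uF: resonates near {f0 / 1e6:.2f} MHz, "
          f"|Z| at 10 MHz = {cap_impedance_ohm(10e6, c, 0.01, 1e-9):.3f} ohm")
```

Above its self-resonance a capacitor behaves as an inductor, which is why the small values must sit closest to the power balls, where the added trace inductance is lowest.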
Signal Integrity: For high-speed interfaces beyond just DDR (like Gigabit Ethernet or high-speed camera links), signal integrity is paramount. Note that the XC7Z020 has no multi-gigabit serial transceivers, so interfaces such as PCIe are not available on this device. Ensure controlled impedance routing for all differential pairs and high-speed single-ended lines. Avoid stubs, sharp corners, and excessive vias. Use termination schemes as recommended in the interface standard or the Zynq documentation. A solid, uninterrupted ground plane under these traces is non-negotiable.
PL-PS Interface Tuning: The AXI interconnect is the highway between the PL and PS, and it can become the system bottleneck. Use the appropriate AXI variant for the job: AXI4-Lite for simple control registers, AXI4-Stream for streaming data (like video) inside the PL, and memory-mapped AXI masters on the PS High Performance (HP) ports for access to DDR from the PL. Because the PS exposes only memory-mapped ports, a stream reaches DDR through a DMA engine in the PL (for example, the AXI VDMA IP for video). Configure the data widths and clock frequencies of these interfaces to match the bandwidth requirements of your application; over-design wastes PL resources.
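Matching the interface to the application starts with a bandwidth estimate. The sketch below compares the raw bandwidth of a video stream with the theoretical peak of one 64-bit port. The 150 MHz port clock is an assumed example value, and sustained throughput will be lower than the peak because of DDR arbitration and refresh.

```python
def video_bandwidth_mb_s(width, height, fps, bytes_per_px):
    """Raw bandwidth of an uncompressed video stream in MB/s (1 MB = 1e6 bytes)."""
    return width * height * fps * bytes_per_px / 1e6


def port_peak_mb_s(data_width_bits, clock_mhz):
    """Theoretical peak of a port moving one beat per clock cycle, in MB/s."""
    return data_width_bits / 8 * clock_mhz


stream = video_bandwidth_mb_s(1920, 1080, 60, 3)  # 1080p60, 24-bit RGB
port = port_peak_mb_s(64, 150)                    # 64-bit port at an assumed 150 MHz

# A frame buffer is written once and read once, so DDR traffic is twice the stream rate
print(f"stream {stream:.1f} MB/s, frame-buffer traffic {2 * stream:.1f} MB/s, port peak {port:.0f} MB/s")
```

With these numbers a single port has headroom for one 1080p60 frame buffer, but adding a second camera or a wider pixel format would call for spreading traffic across more than one HP port.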
Related Components and Accessories
A successful XC7Z020-2CLG400C design relies on a well-chosen ecosystem of supporting components. For our machine vision application, key components include:
- DDR3L SDRAM: Low-voltage DDR3L chips (e.g., from Micron or ISSI) are preferred for their lower power consumption. A common configuration is to use two 16-bit wide chips to create a 32-bit interface.
- QSPI Flash: A high-speed Quad SPI NOR Flash memory (e.g., from Micron or Winbond) is essential for fast booting. A capacity of 256Mb (32MB) or 512Mb (64MB) is typical.
- Ethernet PHY: To use the Gigabit Ethernet MAC in the Zynq PS, an external PHY transceiver (e.g., a Marvell Alaska or Texas Instruments DP83867) is required. This connects to the Zynq via an RGMII interface.
- Crystal Oscillators: A high-quality, low-jitter 33.333 MHz or 50 MHz oscillator is needed for the PS_CLK input. Additional oscillators may be needed for the PL or other peripherals.
- Power Management IC (PMIC): An integrated solution like the Infineon IRPS5401 or a similar device from TI can greatly simplify the complex power requirements.
Frequently Asked Questions (XC7Z020-2CLG400C FAQ)
How do I partition tasks between the ARM Processing System (PS) and the Programmable Logic (PL)?
The general rule is to use the PL for tasks that are massively parallel, repetitive, and have strict real-time deadlines. This includes high-speed signal processing, custom interface logic, and pixel-level image manipulation. Use the PS for tasks that are sequential, complex, and require rich software services. This includes running an operating system, managing file systems, handling network protocols (TCP/IP), and executing complex decision-making algorithms that may not be efficient to implement in hardware.
What is the best way to boot the XC7Z020-2CLG400C for a field-deployed product?
For a reliable production system, booting from on-board QSPI NOR Flash is the most common and robust method. The boot image, containing the First Stage Bootloader (FSBL), the bitstream for the PL, and the main application (e.g., U-Boot and the Linux kernel), is programmed into the QSPI flash. This provides a fast and non-volatile boot solution. While booting from an SD card is excellent for development and easy field updates, the physical connector and card itself can be a point of failure in harsh industrial environments.
What are the key considerations for the DDR3 memory interface layout?
The three most critical considerations are impedance control, length matching, and power integrity. All data (DQ), strobe (DQS), and address/command/control lines must be routed with a specific characteristic impedance, typically 50 ohms. The traces within each byte group must be length-matched to each other, and the clock/strobe signals must be length-matched to their associated data lines. Finally, the power supply for the memory and the Zynq's DDR controller must be exceptionally clean and well-decoupled.
Can I run a real-time operating system (RTOS) on the ARM cores?
Yes. The dual-core ARM Cortex-A9 is well suited to running an RTOS such as FreeRTOS or a commercially supported alternative. You can run the RTOS on its own (no Linux) for maximum determinism and low latency. A popular and powerful approach is Asymmetric Multiprocessing (AMP), where one core runs Linux for high-level tasks and networking, while the other core runs an RTOS or bare-metal code for hard real-time control, with communication between them managed via shared memory.
How do I debug issues that span both the software (PS) and hardware (PL)?
This is the classic challenge of SoC design. Xilinx provides a powerful tool, the Vivado Logic Analyzer, which allows you to insert an Integrated Logic Analyzer (ILA) core into your PL design. You can then set up triggers based on hardware events in the PL and capture signal states. Crucially, you can cross-trigger between the software debugger (like Vitis or GDB) running on the PS and the ILA in the PL. For example, you can have the software debugger halt when a specific hardware event occurs, or have the ILA start capturing data when the CPU hits a certain breakpoint, allowing you to correlate software execution with hardware behavior.
Alan Carter
Senior Hardware Engineer & Component Specialist
Alan has over 15 years of expertise in embedded systems design, FPGA architecture, and global semiconductor supply chains. He specializes in component cross-referencing, lifecycle management, and helping OEMs navigate supply shortages.



