hardware/software system-on-chip co …eprints.utm.my/id/eprint/53896/1/teohongyapmfke2015.pdftime...
TRANSCRIPT
HARDWARE/SOFTWARE SYSTEM-ON-CHIP CO-VERIFICATION PLATFORMBASED ON LOGIC-BASED ENVIRONMENT FOR APPLICATION
PROGRAMMING INTERFACING
TEO HONG YAP
A project report submitted in partial fulfilment of therequirements for the award of the degree of
Master of Engineering (Electrical - Computer and Microelectronic System)
Faculty of Electrical EngineeringUniversiti Teknologi Malaysia
JUNE 2015
iii
Dedicated, in thankful appreciation for support, encouragement and understanding to
my beloved mother, father and sister....
iv
ACKNOWLEDGEMENT
First and foremost, thank God for giving me the strength to complete thisproject. My sincere thanks to my supervisor, Assoc. Prof. Dr. Muhammad Nadzirbin Marsono for his generous guidance, advice and motivation throughout this study.His critical advices and comments help me to develop the study project are highlyappreciated. I would like to sincerely thank Universiti Teknologi Malaysia (UTM) forproviding the facilities and equipment for this research. I would also like to take thisgolden opportunity to express my appreciation to my family, especially my parents,for supporting, inspiriting, and encouraging me during the study and completion ofthis thesis. Lastly, I would like to express my gratitude to those who support me to getthrough all the difficulties during the completion of the study.
Teo Hong Yap
v
ABSTRACT
System-on-chip (SoC) is a single-chip that integrates hardware and softwarecomponents. Hardware/software co-design and co-verification are crucial steps toensure functional correctness of SoC design. Hardware/software co-verificationtechnique is needed to test and decide ways to partition software and hardwarecomponents for an optimized system. Recently, field-programmable gate array(FPGA) prototyping has been proposed as a method that provides a rapid prototypingplatform of SoC development and verification. SoC FPGA prototyping involvesmultiple cross-platform asynchronous clock domains that make SoC verificationprocess becomes more challenging. This project implements an asynchronous first-in-first-out (FIFO) based data transfer between two hardware components whichare operating in different clock domains. This implementation operates in actualFPGA and makes use of Logic-based Environment for Application Programming(LEAP) infrastructure such as communication mechanism to allow communicationbetween hardware and software models or components. A study related to executiontime characterization is done to understand the effects of hardware/software taskspartitioning on hardware/software communication, hardware execution and softwareexecution time. Resource analysis is done on asynchronous FIFO implementationand it shows a logarithmic relationship between the logic elements and FIFO entries.An approximately linear relationship between two-way average latency and data sizeare shown by passing data from FPGA to host and return back the data from host toFPGA. MPEG-2 Audio Layer III (MP3) decoder case study shows with an optimumhardware/software partitioning, the co-verification platform is able to achieve acommunication time of approximately 30 million cycles with 99.99 percent of the timespent originated from hardware/software communication. This result clearly showsthat bidirectional communication between hardware and software plays a significantrole in affecting the total communication time spent for particular application whichhas tasks running in both hardware and software.
vi
ABSTRAK
Sistem-atas-Cip (SoC) adalah litar bersepadu (IC) yang mengintegrasikansemua komponen komputer atau litar elektronik ke dalam satu mikrocip.Pembangunan dan pengesahan SoC telah menjadi langkah-langkah yang penting untukmemastikan ketepatan fungsi reka bentuk SoC. Satu teknik pengesahan diperlukanbagi menguji sistem yang merangkumi tugas perkakasan dan perisian. Teknik ini jugamembolehkan partition komponen perisian dan perkakasan dijalankan untuk mendapatsatu sistem yang optimum. Baru-baru ini, tatasusunan get boleh atur cara medan(FPGA) telah dicadangkan sebagai satu kaedah bagi menyediakan satu platform untukpembangunan dan pengesahan SoC. Prototaip FPGA seperti reka bentuk SoC yangmelibatkan beberapa sistem jam tak segerak telah menjadikan proses pengesahanSoC semakin mencabar. Projek ini bertujuan untuk mereka bentuk logik masukdulu keluar dulu (FIFO) tak segerak untuk membolehkan pemindahan data antaradua komponen perkakasan yang beroperasi dengan system jam yang berbeza. Rekabentuk logik ini harus dapat beroperasi dalam FPGA dan menggunakan infrastrukturLogic-based Environment for Application Programming (LEAP) seperti mekanismekomunikasi untuk membolehkan komunikasi antara tugas perkakasan dan perisian.Kajian yang berkaitan dengan masa perlakuan telah dilakukan untuk memahamikesan pembahagian tugas perkakasan/perisian atas masa komunikasi, masa perlakuanperkakasan dan masa perlakuan perisian. Sumber analisis bagi pelaksanaan FIFOtak segerak menunjukkan hubungan logaritma antara unsur logik dan FIFO saizalamat. Kajian dua hala hantaran data menunjukan hubungan linear antara puratadua cara kependaman komunikasi dan saiz data. Kajian kes penyahkod MPEG-2Audio Layer III (MP3) menunjukkan pembahagian perkakasan/perisian yang optimumboleh mencapai masa komunikasi 30 juta kitaran di mana 99.99 peratus daripadamasa komunikasi itu adalah berasal daripada komunikasi bentuk perkakasan/perisian.Keputusan ini jelas menunjukkan bahawa komunikasi dua hala antara perkakasandan perisian memainkan peranan penting yang memberi kesan kepada jumlah masakomunikasi yang digunakan untuk aplikasi tertentu yang mempunyai tugas berjalandalam kedua-dua perkakasan dan perisian.
vii
TABLE OF CONTENTS
CHAPTER TITLE PAGE
DECLARATION iiDEDICATION iiiACKNOWLEDGEMENT ivABSTRACT vABSTRAK viTABLE OF CONTENTS viiLIST OF TABLES ixLIST OF FIGURES xLIST OF ABBREVIATIONS xiiiLIST OF SYMBOLS xivLIST OF APPENDICES xv
1 INTRODUCTION 11.1 Problem Background 11.2 Problem Statement 31.3 Objective 31.4 Project Scope 41.5 Report Organization 5
2 LITERATURE REVIEW 62.1 Virtual and FPGA-Based Prototyping 62.2 Co-simulation and Co-emulation 102.3 Hardware/Software Communication Interface 132.4 Clock Domain Crossing 152.5 Related Works on Communication Interface and
Clock Synchronization 162.6 Motivation for Extended Works 23
3 RESEARCH METHODOLOGY 25
viii
3.1 Research Flow 253.2 Research Tools 26
3.2.1 Asim Architect’s Workbench (AWB) 263.2.2 FPGA Platform 28
3.3 Asynchronous FIFO Basic Behavior 293.4 System Under Test 313.5 Research Outcome 323.6 Chapter Summary 32
4 DESIGN IMPLEMENTATIONS 344.1 Asynchronous First-In-First-Out 34
4.1.1 RTL Architecture 344.1.2 Timing Simulation 35
4.2 Hardware/Software Interfacing 364.2.1 Hardware/Software Communication 37
4.3 System under Test 394.3.1 Complete System 39
4.4 Chapter Summary 40
5 RESULT AND DISCUSSION 415.1 Results 415.2 Analysis 435.3 Case Study 44
5.3.1 MPEG-2 Audio Layer III Decoder 445.4 Chapter Summary 48
6 CONCLUSION 496.1 Conclusion 496.2 Direction for Future Work 49
REFERENCES 50Appendices A – C 53 – 68
ix
LIST OF TABLES
TABLE NO. TITLE PAGE
2.1 Functionality of each logic part exists in AS-FIFO design [6].18
2.2 Signal description of three stage FIFO [7]. 192.3 Comparison on previous works been done to synchronize data
between two different clock domains. 212.4 Function of LEAP interfacing components. 233.1 Description for each components in AWB file [2]. 284.1 Details about each layer in RRR protocol stack [18]. 385.1 Resources analysis of RTL implementation as the address
width (bits) increases. The address width signifies the totalentries within FIFO memory. 41
5.2 Latency analysis of two way data passing as the data size(bytes) increases. Analysis is done based on Altera DE2board with device Cyclone II EP2C35F672C6. 43
x
LIST OF FIGURES
FIGURE NO. TITLE PAGE
1.1 SoC basic architecture. 11.2 Hybrid virtual platform which consists of abstraction layer
made up of hardware and software modules with userapplication prototyping [9]. 2
1.3 Project flow and procedures. 41.4 Project report organization. 52.1 Co-design model with existence of hardware and software
components [11]. 72.2 Typical design flow being used which called Rapid System
Prototyping [13]. 72.3 Three steps to transform a system-level specification
systematically into a mixed hardware/software system [12]. 82.4 Heterogeneous modeling [12]. 82.5 SoC block partitioning between FPGA-based and virtual
prototyping [8]. 92.6 Single engine is utilized in homogeneous environment for
simulating both hardware and software components [22]. 102.7 Two different engines are utilized in heterogeneous environ-
ment for simulating both hardware and software components[22]. 11
2.8 Semi-homogeneous co-simulation environment based onInstruction Set Simulator and SystemC [22]. 12
2.9 RTL-level simulation is linked to ASIC emulators or FGPA-based prototypes by using co-emulation verification method.
132.10 Hardware/software interface acts as a connection between
software application and custom hardware module [11]. 142.11 Asynchronous clocks clk1 and clk2, with periods 13 and 10
respectively [16]. 15
xi
2.12 Output response for violating or meeting setup and holdrequirements [16]. 16
2.13 Data transfer circuit implements a method to transfer databetween a driver and a receiver which operated in distinctclock domains [5]. 17
2.14 Functional block diagram of AS-FIFO design [6]. 182.15 Three-stage FIFO with clocked sender and receiver [7]. 192.16 Synchronizer being proposed consists of half-cycle synchro-
nizing stages to minimize latency through the FIFO [7]. 202.17 LEAP structural overview [3, 4]. 223.1 Project flow which is conducted during the research. 263.2 AWB model tree for simple hardware/software hybrid build. 273.3 High level architecture block diagram for DE2 board [17]. 283.4 Schematic diagram of 50Mhz clock circuit for Altera DE2
board [17]. 293.5 Algorithm being used for designing asynchronous FIFO. 303.6 Functional block diagram of clock synchronization module. 313.7 Complete system been used to demonstrate the implemen-
tation of LEAP interface which allow communication beendone between hardware and software tasks. This system alsodemonstrate correct behavior of asynchronous FIFO whichbeen implemented. 32
4.1 RTL implementation for asynchronous FIFO logic. 354.2 Asynchronous FIFO timing simulation waveform one. 354.3 Asynchronous FIFO timing simulation waveform two. 364.4 RRR protocol stack [18]. 384.5 Results of testing been done for complete system. 405.1 Logic elements versus address width graph for asynchronous
FIFO implementation. 425.2 Dedicated logic registers versus address width graph for
asynchronous FIFO implementation. 425.3 Total latency versus data size. 435.4 Average latency versus data size. 445.5 Task graph for MP3 decoder [19]. 455.6 Hardware execution time (Cycles) versus area constraint for
MP3 case study. 465.7 Software execution time (Cycles) versus area constraint for
MP3 case study. 46
xii
5.8 Communication time versus area constraint for MP3 casestudy. 47
5.9 Communication ratio versus area constraint for MP3 casestudy. 48
xiii
LIST OF ABBREVIATIONS
AWB – Asim Architect’s Workbench
CPU – Central Processing Unit
FIFO – First-In-First-Out
FPGA – Field-Programmable Gate Array
GPS – Global Positioning System
IPC – Inter-Process Communication
ISS – Instruction Set Simulator
LEAP – Logic-based Environment for Application Programming
MP3 – MPEG-2 Audio Layer III
RTL – Register Transfer Level
SoC – System-on-Chip
xiv
LIST OF SYMBOLS
nComm – Communication ratio
nHSComm – Number of hardware-software communication
nHHComm – Number of hardware-hardware communication
xv
LIST OF APPENDICES
APPENDIX TITLE PAGE
A Customized asynchronous FIFO designed in BluespecSystem Verilog. 53
B System under test. 62C One of the system tasks partitioning between hardware and
software tasks for MP3 case study. 68
CHAPTER 1
INTRODUCTION
1.1 Problem Background
System-on-Chip (SoC) is a single integrated circuit (IC) that integrates orpackages all components of a computer or electronic circuits into a single microchip.SoC is evolving and causing more system components being integrated on single chipdevice for each generation. Figure 1.1 shows the basic architecture example of a SoC.
Figure 1.1: SoC basic architecture.
Existence of FPGA prototyping has become a method which provides aplatform for SoC development and verification. FPGA prototyping provides a fast andaccurate SoC model for early embedded software development in pre-silicon stage.
2
Virtual platform provides designer with a set of common functionalitiesby abstracting physical devices into an abstraction layer. Hybrid virtual platformsplits module functionality between FPGA and software [8]. Figure 1.2 shows ahybrid virtual platform which consists of abstraction layer made up of hardware andsoftware modules with user application prototyping. Concept of virtual platform andprototyping allows concurrent developed software design to be simulated on a modelof virtual platform [1]. System modeling can be done at early stages before the actualproduct being produced.
Figure 1.2: Hybrid virtual platform which consists of abstraction layer made up ofhardware and software modules with user application prototyping [9].
Hardware/software co-verification technique is needed to test and decide theway to partition software and hardware components in order to produce an optimizedsystem [1, 9, 10]. Co-verification is able to reflect concurrency and synchronizationof sub-systems within the main system by observing output signals correctness whileconcurrently executing hardware and software components with appropriate inputs.This technique co-verifies correct interactions between each of SoC sub-system and isnot limited to only verify the correctness of individual sub-system [1].
Virtual and FPGA-based prototyping requires an interface between hardwareand software. It has been identified by co-design community that interface synthesisbetween hardware and software components of system-on-chip (SoC) is becoming oneof the important research fields [1,9]. Hardware/software co-design and co-verificationmethodologies allow faster time-to-market frame as it makes development, integrationand verification of sub-system become easier than before.
3
1.2 Problem Statement
The benefit of hybrid virtual platform to provide a solution which allowedfunctionality of system-on-chip to be fully modelled through virtual and FPGAprototyping has become widely used in the industry. It is a known issue that highlatency of hardware/software communication has caused system modeling becomesmore challenging. Inaccurate data sampling by prototyping system due to high latencycommunication between hardware/software may result in inaccurate system modeling.The latency of hardware/software communication greatly depends on architecturebeing used while designing the hardware/software interface.
The FPGA prototyping such as system-on-chip design may involve platformcrossing that involves employment of multiple asynchronous clock domains has makethe design verification process becomes more challenging. It is important to implementa clock synchronization module which allows different functional components indifferent clock domains to synchronize with each other. Thus, asynchronous FIFOhas been identified as an approach to achieve sub-system synchronization.
Application can be partitioned into multiple small tasks which run in eitherhardware or software. It is important to understand the effects of differenthardware/software task partitioning and to identify factors that contribute to totalapplication execution time.
1.3 Objective
The objective of this project is to implement a hybrid virtual platform whichenables prototyping of SoC and early software development in pre-silicon stage. Theinterface between hardware and software task should allow verification of correctinteractions between each of the hardware and software components. Thus, LEAP[3, 4] has been identified as interface between hardware and software task throughprimitive fixed-width bi-directional JTAG communication channel.
In this project, a clock synchronization module is proposed to enableverification to be done for system having sub-system components operating in differentclock domains. The implementations of both are verified through actual data passingin real time. The latency and resources usage of the implementations is then analyzed.
4
A study related to execution time characterization is done to understand the effects ofhardware/software tasks partitioning on communication time, hardware execution timeand software execution time. This characterization is done through actual data passingthrough LEAP interface which being implemented for a specific case study.
1.4 Project Scope
In this project, LEAP interface is implemented using Asim Architect’sWorkbench [2] and Bluespec System Verilog. This project also covers theimplementation of clock synchronization module using Bluespec System Verilog.Both implementations are targeted for Altera DE2 platform. The behavior orfunctionality of the implementation is pre-verified though BlueSim and AlteraModelsim. Both implementations are finally verified through actual data passingin real time. Data analysis is done for each implementation. MPEG-2 AudioLayer III Decoder is used as a case study for execution time characterization whichis done for different combination of hardware and software tasks obtained usinghardware/software application mapping algorithm. Figure 3.1 shows the project flowbeen used in this project.
Figure 1.3: Project flow and procedures.
5
1.5 Report Organization
Figure 1.4 shows the project report organization. This report is written in sixchapters. The remaining chapters are organized as follows.
1. Chapter 2 describes previous works related to this project. Backgroundinformation related to this project is also discussed in Chapter 2.
2. The project flow and methodology is presented in Chapter 3. It includes theflow being taken while conducting this research. The tools being used are alsodiscussed in Chapter 3. General approach and method used while designing theclock synchronization module is also presented in Chapter 3.
3. The actual interface implementation that is being implemented in real hardwaredesign is discussed in Chapter 4. Design verification is also presented in Chapter4.
4. The interpretation and analysis of the results from implementation are discussedin Chapter 5. Detail analysis of the design is also presented.
5. Chapter 6 concludes the report. Direction of future work is also proposed.
Figure 1.4: Project report organization.
REFERENCES
1. Teich, J. (2012). Hardware/software codesign: The past, the present,and predicting the future. Proceedings of the IEEE, 100(SpecialCentennial Issue), 1411-1430.
2. Emer, J., Beckmann, C., & Pellauer, M. (2007, June). AWB: TheAsim Architect’s Workbench. In 3rd Annual Workshop on Modeling,Benchmarking, and Simulation (MoBS 2007).
3. Fleming, K., Yang, H. J., Adler, M., & Emer, J. (2014, September).The LEAP FPGA operating system. In Field Programmable Logicand Applications (FPL), 2014 24th International Conference on (pp.1-8). IEEE.
4. Parashar, A., Adler, M., Fleming, K., Pellauer, M., & Emer, J. (2010,December). LEAP: A virtual platform architecture for FPGAs. In TheFirst Workshop on the Intersections of Computer Architecture andReconfigurable Logic (CARL 2010).
5. Oprescu, F. A. (2002). U.S. Patent No. 6,359,479. Washington, DC:U.S. Patent and Trademark Office.
6. Zewei, L., & Mei, X. (2008, May). An effective AS-FIFO design formultiple asynchronous clock data transmission. In Communications,Circuits and Systems, 2008. ICCCAS 2008. International Conferenceon (pp. 960-963). IEEE.
7. Ono, T., & Greenstreet, M. (2009, May). A modular synchronizingFIFO for NoCs. In Proceedings of the 2009 3rd ACM/IEEEInternational Symposium on Networks-on-Chip (pp. 224-233). IEEEComputer Society.
8. Synopsys, Inc "Synopsys Hybrid Prototyping Solution." , 2012.
9. Rao R. Tummala: System on Package: Miniaturization of the EntireSystem. Introduction to System-on-Chip (SOC), Chapter (McGraw-Hill Professional, 2008), AccessEngineering.
10. Ernst, R. (1998). Codesign of embedded systems: Status and
51
trends. Design & Test of Computers, IEEE, 15(2), 45-54.
11. Schaumont, P. (2012). A practical introduction to hardware/softwarecodesign. Springer Science & Business Media.
12. Reis, R., Lubaszewski, M., & Jess, J. (Eds.). (2006). Design ofSystems on a Chip: Design and Test. Springer.
13. Richards, M. A. (1994). The Rapid Prototyping of ApplicationSpecific Signal Processor (RASSP) Program: Overview andAccomplishments (No. 942550). SAE Technical Paper.
14. Mark Balch: Complete Digital Design: A Comprehensive Guide toDigital Electronics and Computer System Architecture. Logic Designand Finite State Machines, Chapter (McGraw-Hill Professional,2003), AccessEngineering
15. Gangadharan, S., & Churiwala, S. (2013). Constraining Designs forSynthesis and Timing Analysis. Springer..
16. Churiwala, S., & Garg, S. (2011). Principles of VLSI RTL Design: APractical Guide. Springer Science & Business Media.
17. Altera “DE2 Development and Education Board, user manual”Version 1.5, 2012.
18. Parashar, A., Adler, M., Pellauer, M., & Emer, J. (2008, June). Hybridcpu/fpga performance models. In 3rd Workshop on ArchitecturalResearch Prototyping (WARP 2008).
19. Le Beux, S., Bois, G., Nicolescu, G., Bouchebaba, Y., Langevin, M.,and Paulin, P. “Combining mapping and partitioning exploration fornocbased embedded systems,” Journal of Systems Architecture, vol.56, no. 7, pp. 223–232, 2010.
20. Bouchhima, A., Gerin, P., & Pétrot, F. (2009, January).Automatic instrumentation of embedded software for high levelhardware/software co-simulation. In Design Automation Conference,2009. ASP-DAC 2009. Asia and South Pacific (pp. 546-551). IEEE.
21. You, M. K., & Song, G. Y. (2009, January). Case study: Co-simulation and co-emulation environments based on SystemC &SystemVerilog. In TENCON 2009-2009 IEEE Region 10 Conference(pp. 1-4). IEEE.
22. Cimatti, A., & Bernardo, M. (2006). Formal Methods for HardwareVerification. 6th International School on Formal Methods for the
52
Design of Computer, Communication, and Software Systems, SFM,22-27.
23. Yang, H., Yi, Y., & Ha, S. (2009, July). A timed HW/SWcoemulation technique for fast yet accurate system verification.In Systems, Architectures, Modeling, and Simulation, 2009.SAMOS’09. International Symposium on (pp. 74-81). IEEE.
24. Kim, Y. I., Aim, K. Y., Shim, H., Yang, W., Kwon, Y. S.,Ki, A., & Kyung, C. M. (2005, April). Automatic generationof software/hardware co-emulation interface for transaction-levelcommunication. In VLSI Design, Automation and Test, 2005.(VLSI-TSA-DAT). 2005 IEEE VLSI-TSA International Symposium on (pp.196-199). IEEE.