hardware/software system-on-chip co …eprints.utm.my/id/eprint/53896/1/teohongyapmfke2015.pdftime...

22
HARDWARE/SOFTWARE SYSTEM-ON-CHIP CO-VERIFICATION PLATFORM BASED ON LOGIC-BASED ENVIRONMENT FOR APPLICATION PROGRAMMING INTERFACING TEO HONG YAP A project report submitted in partial fulfilment of the requirements for the award of the degree of Master of Engineering (Electrical - Computer and Microelectronic System) Faculty of Electrical Engineering Universiti Teknologi Malaysia JUNE 2015

Upload: lynga

Post on 20-Mar-2019

214 views

Category:

Documents


0 download

TRANSCRIPT

HARDWARE/SOFTWARE SYSTEM-ON-CHIP CO-VERIFICATION PLATFORMBASED ON LOGIC-BASED ENVIRONMENT FOR APPLICATION

PROGRAMMING INTERFACING

TEO HONG YAP

A project report submitted in partial fulfilment of therequirements for the award of the degree of

Master of Engineering (Electrical - Computer and Microelectronic System)

Faculty of Electrical EngineeringUniversiti Teknologi Malaysia

JUNE 2015

iii

Dedicated, in thankful appreciation for support, encouragement and understanding to

my beloved mother, father and sister....

iv

ACKNOWLEDGEMENT

First and foremost, thank God for giving me the strength to complete thisproject. My sincere thanks to my supervisor, Assoc. Prof. Dr. Muhammad Nadzirbin Marsono for his generous guidance, advice and motivation throughout this study.His critical advices and comments help me to develop the study project are highlyappreciated. I would like to sincerely thank Universiti Teknologi Malaysia (UTM) forproviding the facilities and equipment for this research. I would also like to take thisgolden opportunity to express my appreciation to my family, especially my parents,for supporting, inspiriting, and encouraging me during the study and completion ofthis thesis. Lastly, I would like to express my gratitude to those who support me to getthrough all the difficulties during the completion of the study.

Teo Hong Yap

v

ABSTRACT

System-on-chip (SoC) is a single-chip that integrates hardware and softwarecomponents. Hardware/software co-design and co-verification are crucial steps toensure functional correctness of SoC design. Hardware/software co-verificationtechnique is needed to test and decide ways to partition software and hardwarecomponents for an optimized system. Recently, field-programmable gate array(FPGA) prototyping has been proposed as a method that provides a rapid prototypingplatform of SoC development and verification. SoC FPGA prototyping involvesmultiple cross-platform asynchronous clock domains that make SoC verificationprocess becomes more challenging. This project implements an asynchronous first-in-first-out (FIFO) based data transfer between two hardware components whichare operating in different clock domains. This implementation operates in actualFPGA and makes use of Logic-based Environment for Application Programming(LEAP) infrastructure such as communication mechanism to allow communicationbetween hardware and software models or components. A study related to executiontime characterization is done to understand the effects of hardware/software taskspartitioning on hardware/software communication, hardware execution and softwareexecution time. Resource analysis is done on asynchronous FIFO implementationand it shows a logarithmic relationship between the logic elements and FIFO entries.An approximately linear relationship between two-way average latency and data sizeare shown by passing data from FPGA to host and return back the data from host toFPGA. MPEG-2 Audio Layer III (MP3) decoder case study shows with an optimumhardware/software partitioning, the co-verification platform is able to achieve acommunication time of approximately 30 million cycles with 99.99 percent of the timespent originated from hardware/software communication. This result clearly showsthat bidirectional communication between hardware and software plays a significantrole in affecting the total communication time spent for particular application whichhas tasks running in both hardware and software.

vi

ABSTRAK

Sistem-atas-Cip (SoC) adalah litar bersepadu (IC) yang mengintegrasikansemua komponen komputer atau litar elektronik ke dalam satu mikrocip.Pembangunan dan pengesahan SoC telah menjadi langkah-langkah yang penting untukmemastikan ketepatan fungsi reka bentuk SoC. Satu teknik pengesahan diperlukanbagi menguji sistem yang merangkumi tugas perkakasan dan perisian. Teknik ini jugamembolehkan partition komponen perisian dan perkakasan dijalankan untuk mendapatsatu sistem yang optimum. Baru-baru ini, tatasusunan get boleh atur cara medan(FPGA) telah dicadangkan sebagai satu kaedah bagi menyediakan satu platform untukpembangunan dan pengesahan SoC. Prototaip FPGA seperti reka bentuk SoC yangmelibatkan beberapa sistem jam tak segerak telah menjadikan proses pengesahanSoC semakin mencabar. Projek ini bertujuan untuk mereka bentuk logik masukdulu keluar dulu (FIFO) tak segerak untuk membolehkan pemindahan data antaradua komponen perkakasan yang beroperasi dengan system jam yang berbeza. Rekabentuk logik ini harus dapat beroperasi dalam FPGA dan menggunakan infrastrukturLogic-based Environment for Application Programming (LEAP) seperti mekanismekomunikasi untuk membolehkan komunikasi antara tugas perkakasan dan perisian.Kajian yang berkaitan dengan masa perlakuan telah dilakukan untuk memahamikesan pembahagian tugas perkakasan/perisian atas masa komunikasi, masa perlakuanperkakasan dan masa perlakuan perisian. Sumber analisis bagi pelaksanaan FIFOtak segerak menunjukkan hubungan logaritma antara unsur logik dan FIFO saizalamat. Kajian dua hala hantaran data menunjukan hubungan linear antara puratadua cara kependaman komunikasi dan saiz data. Kajian kes penyahkod MPEG-2Audio Layer III (MP3) menunjukkan pembahagian perkakasan/perisian yang optimumboleh mencapai masa komunikasi 30 juta kitaran di mana 99.99 peratus daripadamasa komunikasi itu adalah berasal daripada komunikasi bentuk perkakasan/perisian.Keputusan ini jelas menunjukkan bahawa komunikasi dua hala antara perkakasandan perisian memainkan peranan penting yang memberi kesan kepada jumlah masakomunikasi yang digunakan untuk aplikasi tertentu yang mempunyai tugas berjalandalam kedua-dua perkakasan dan perisian.

vii

TABLE OF CONTENTS

CHAPTER TITLE PAGE

DECLARATION iiDEDICATION iiiACKNOWLEDGEMENT ivABSTRACT vABSTRAK viTABLE OF CONTENTS viiLIST OF TABLES ixLIST OF FIGURES xLIST OF ABBREVIATIONS xiiiLIST OF SYMBOLS xivLIST OF APPENDICES xv

1 INTRODUCTION 11.1 Problem Background 11.2 Problem Statement 31.3 Objective 31.4 Project Scope 41.5 Report Organization 5

2 LITERATURE REVIEW 62.1 Virtual and FPGA-Based Prototyping 62.2 Co-simulation and Co-emulation 102.3 Hardware/Software Communication Interface 132.4 Clock Domain Crossing 152.5 Related Works on Communication Interface and

Clock Synchronization 162.6 Motivation for Extended Works 23

3 RESEARCH METHODOLOGY 25

viii

3.1 Research Flow 253.2 Research Tools 26

3.2.1 Asim Architect’s Workbench (AWB) 263.2.2 FPGA Platform 28

3.3 Asynchronous FIFO Basic Behavior 293.4 System Under Test 313.5 Research Outcome 323.6 Chapter Summary 32

4 DESIGN IMPLEMENTATIONS 344.1 Asynchronous First-In-First-Out 34

4.1.1 RTL Architecture 344.1.2 Timing Simulation 35

4.2 Hardware/Software Interfacing 364.2.1 Hardware/Software Communication 37

4.3 System under Test 394.3.1 Complete System 39

4.4 Chapter Summary 40

5 RESULT AND DISCUSSION 415.1 Results 415.2 Analysis 435.3 Case Study 44

5.3.1 MPEG-2 Audio Layer III Decoder 445.4 Chapter Summary 48

6 CONCLUSION 496.1 Conclusion 496.2 Direction for Future Work 49

REFERENCES 50Appendices A – C 53 – 68

ix

LIST OF TABLES

TABLE NO. TITLE PAGE

2.1 Functionality of each logic part exists in AS-FIFO design [6].18

2.2 Signal description of three stage FIFO [7]. 192.3 Comparison on previous works been done to synchronize data

between two different clock domains. 212.4 Function of LEAP interfacing components. 233.1 Description for each components in AWB file [2]. 284.1 Details about each layer in RRR protocol stack [18]. 385.1 Resources analysis of RTL implementation as the address

width (bits) increases. The address width signifies the totalentries within FIFO memory. 41

5.2 Latency analysis of two way data passing as the data size(bytes) increases. Analysis is done based on Altera DE2board with device Cyclone II EP2C35F672C6. 43

x

LIST OF FIGURES

FIGURE NO. TITLE PAGE

1.1 SoC basic architecture. 11.2 Hybrid virtual platform which consists of abstraction layer

made up of hardware and software modules with userapplication prototyping [9]. 2

1.3 Project flow and procedures. 41.4 Project report organization. 52.1 Co-design model with existence of hardware and software

components [11]. 72.2 Typical design flow being used which called Rapid System

Prototyping [13]. 72.3 Three steps to transform a system-level specification

systematically into a mixed hardware/software system [12]. 82.4 Heterogeneous modeling [12]. 82.5 SoC block partitioning between FPGA-based and virtual

prototyping [8]. 92.6 Single engine is utilized in homogeneous environment for

simulating both hardware and software components [22]. 102.7 Two different engines are utilized in heterogeneous environ-

ment for simulating both hardware and software components[22]. 11

2.8 Semi-homogeneous co-simulation environment based onInstruction Set Simulator and SystemC [22]. 12

2.9 RTL-level simulation is linked to ASIC emulators or FGPA-based prototypes by using co-emulation verification method.

132.10 Hardware/software interface acts as a connection between

software application and custom hardware module [11]. 142.11 Asynchronous clocks clk1 and clk2, with periods 13 and 10

respectively [16]. 15

xi

2.12 Output response for violating or meeting setup and holdrequirements [16]. 16

2.13 Data transfer circuit implements a method to transfer databetween a driver and a receiver which operated in distinctclock domains [5]. 17

2.14 Functional block diagram of AS-FIFO design [6]. 182.15 Three-stage FIFO with clocked sender and receiver [7]. 192.16 Synchronizer being proposed consists of half-cycle synchro-

nizing stages to minimize latency through the FIFO [7]. 202.17 LEAP structural overview [3, 4]. 223.1 Project flow which is conducted during the research. 263.2 AWB model tree for simple hardware/software hybrid build. 273.3 High level architecture block diagram for DE2 board [17]. 283.4 Schematic diagram of 50Mhz clock circuit for Altera DE2

board [17]. 293.5 Algorithm being used for designing asynchronous FIFO. 303.6 Functional block diagram of clock synchronization module. 313.7 Complete system been used to demonstrate the implemen-

tation of LEAP interface which allow communication beendone between hardware and software tasks. This system alsodemonstrate correct behavior of asynchronous FIFO whichbeen implemented. 32

4.1 RTL implementation for asynchronous FIFO logic. 354.2 Asynchronous FIFO timing simulation waveform one. 354.3 Asynchronous FIFO timing simulation waveform two. 364.4 RRR protocol stack [18]. 384.5 Results of testing been done for complete system. 405.1 Logic elements versus address width graph for asynchronous

FIFO implementation. 425.2 Dedicated logic registers versus address width graph for

asynchronous FIFO implementation. 425.3 Total latency versus data size. 435.4 Average latency versus data size. 445.5 Task graph for MP3 decoder [19]. 455.6 Hardware execution time (Cycles) versus area constraint for

MP3 case study. 465.7 Software execution time (Cycles) versus area constraint for

MP3 case study. 46

xii

5.8 Communication time versus area constraint for MP3 casestudy. 47

5.9 Communication ratio versus area constraint for MP3 casestudy. 48

xiii

LIST OF ABBREVIATIONS

AWB – Asim Architect’s Workbench

CPU – Central Processing Unit

FIFO – First-In-First-Out

FPGA – Field-Programmable Gate Array

GPS – Global Positioning System

IPC – Inter-Process Communication

ISS – Instruction Set Simulator

LEAP – Logic-based Environment for Application Programming

MP3 – MPEG-2 Audio Layer III

RTL – Register Transfer Level

SoC – System-on-Chip

xiv

LIST OF SYMBOLS

nComm – Communication ratio

nHSComm – Number of hardware-software communication

nHHComm – Number of hardware-hardware communication

xv

LIST OF APPENDICES

APPENDIX TITLE PAGE

A Customized asynchronous FIFO designed in BluespecSystem Verilog. 53

B System under test. 62C One of the system tasks partitioning between hardware and

software tasks for MP3 case study. 68

CHAPTER 1

INTRODUCTION

1.1 Problem Background

System-on-Chip (SoC) is a single integrated circuit (IC) that integrates orpackages all components of a computer or electronic circuits into a single microchip.SoC is evolving and causing more system components being integrated on single chipdevice for each generation. Figure 1.1 shows the basic architecture example of a SoC.

Figure 1.1: SoC basic architecture.

Existence of FPGA prototyping has become a method which provides aplatform for SoC development and verification. FPGA prototyping provides a fast andaccurate SoC model for early embedded software development in pre-silicon stage.

2

Virtual platform provides designer with a set of common functionalitiesby abstracting physical devices into an abstraction layer. Hybrid virtual platformsplits module functionality between FPGA and software [8]. Figure 1.2 shows ahybrid virtual platform which consists of abstraction layer made up of hardware andsoftware modules with user application prototyping. Concept of virtual platform andprototyping allows concurrent developed software design to be simulated on a modelof virtual platform [1]. System modeling can be done at early stages before the actualproduct being produced.

Figure 1.2: Hybrid virtual platform which consists of abstraction layer made up ofhardware and software modules with user application prototyping [9].

Hardware/software co-verification technique is needed to test and decide theway to partition software and hardware components in order to produce an optimizedsystem [1, 9, 10]. Co-verification is able to reflect concurrency and synchronizationof sub-systems within the main system by observing output signals correctness whileconcurrently executing hardware and software components with appropriate inputs.This technique co-verifies correct interactions between each of SoC sub-system and isnot limited to only verify the correctness of individual sub-system [1].

Virtual and FPGA-based prototyping requires an interface between hardwareand software. It has been identified by co-design community that interface synthesisbetween hardware and software components of system-on-chip (SoC) is becoming oneof the important research fields [1,9]. Hardware/software co-design and co-verificationmethodologies allow faster time-to-market frame as it makes development, integrationand verification of sub-system become easier than before.

3

1.2 Problem Statement

The benefit of hybrid virtual platform to provide a solution which allowedfunctionality of system-on-chip to be fully modelled through virtual and FPGAprototyping has become widely used in the industry. It is a known issue that highlatency of hardware/software communication has caused system modeling becomesmore challenging. Inaccurate data sampling by prototyping system due to high latencycommunication between hardware/software may result in inaccurate system modeling.The latency of hardware/software communication greatly depends on architecturebeing used while designing the hardware/software interface.

The FPGA prototyping such as system-on-chip design may involve platformcrossing that involves employment of multiple asynchronous clock domains has makethe design verification process becomes more challenging. It is important to implementa clock synchronization module which allows different functional components indifferent clock domains to synchronize with each other. Thus, asynchronous FIFOhas been identified as an approach to achieve sub-system synchronization.

Application can be partitioned into multiple small tasks which run in eitherhardware or software. It is important to understand the effects of differenthardware/software task partitioning and to identify factors that contribute to totalapplication execution time.

1.3 Objective

The objective of this project is to implement a hybrid virtual platform whichenables prototyping of SoC and early software development in pre-silicon stage. Theinterface between hardware and software task should allow verification of correctinteractions between each of the hardware and software components. Thus, LEAP[3, 4] has been identified as interface between hardware and software task throughprimitive fixed-width bi-directional JTAG communication channel.

In this project, a clock synchronization module is proposed to enableverification to be done for system having sub-system components operating in differentclock domains. The implementations of both are verified through actual data passingin real time. The latency and resources usage of the implementations is then analyzed.

4

A study related to execution time characterization is done to understand the effects ofhardware/software tasks partitioning on communication time, hardware execution timeand software execution time. This characterization is done through actual data passingthrough LEAP interface which being implemented for a specific case study.

1.4 Project Scope

In this project, LEAP interface is implemented using Asim Architect’sWorkbench [2] and Bluespec System Verilog. This project also covers theimplementation of clock synchronization module using Bluespec System Verilog.Both implementations are targeted for Altera DE2 platform. The behavior orfunctionality of the implementation is pre-verified though BlueSim and AlteraModelsim. Both implementations are finally verified through actual data passingin real time. Data analysis is done for each implementation. MPEG-2 AudioLayer III Decoder is used as a case study for execution time characterization whichis done for different combination of hardware and software tasks obtained usinghardware/software application mapping algorithm. Figure 3.1 shows the project flowbeen used in this project.

Figure 1.3: Project flow and procedures.

5

1.5 Report Organization

Figure 1.4 shows the project report organization. This report is written in sixchapters. The remaining chapters are organized as follows.

1. Chapter 2 describes previous works related to this project. Backgroundinformation related to this project is also discussed in Chapter 2.

2. The project flow and methodology is presented in Chapter 3. It includes theflow being taken while conducting this research. The tools being used are alsodiscussed in Chapter 3. General approach and method used while designing theclock synchronization module is also presented in Chapter 3.

3. The actual interface implementation that is being implemented in real hardwaredesign is discussed in Chapter 4. Design verification is also presented in Chapter4.

4. The interpretation and analysis of the results from implementation are discussedin Chapter 5. Detail analysis of the design is also presented.

5. Chapter 6 concludes the report. Direction of future work is also proposed.

Figure 1.4: Project report organization.

REFERENCES

1. Teich, J. (2012). Hardware/software codesign: The past, the present,and predicting the future. Proceedings of the IEEE, 100(SpecialCentennial Issue), 1411-1430.

2. Emer, J., Beckmann, C., & Pellauer, M. (2007, June). AWB: TheAsim Architect’s Workbench. In 3rd Annual Workshop on Modeling,Benchmarking, and Simulation (MoBS 2007).

3. Fleming, K., Yang, H. J., Adler, M., & Emer, J. (2014, September).The LEAP FPGA operating system. In Field Programmable Logicand Applications (FPL), 2014 24th International Conference on (pp.1-8). IEEE.

4. Parashar, A., Adler, M., Fleming, K., Pellauer, M., & Emer, J. (2010,December). LEAP: A virtual platform architecture for FPGAs. In TheFirst Workshop on the Intersections of Computer Architecture andReconfigurable Logic (CARL 2010).

5. Oprescu, F. A. (2002). U.S. Patent No. 6,359,479. Washington, DC:U.S. Patent and Trademark Office.

6. Zewei, L., & Mei, X. (2008, May). An effective AS-FIFO design formultiple asynchronous clock data transmission. In Communications,Circuits and Systems, 2008. ICCCAS 2008. International Conferenceon (pp. 960-963). IEEE.

7. Ono, T., & Greenstreet, M. (2009, May). A modular synchronizingFIFO for NoCs. In Proceedings of the 2009 3rd ACM/IEEEInternational Symposium on Networks-on-Chip (pp. 224-233). IEEEComputer Society.

8. Synopsys, Inc "Synopsys Hybrid Prototyping Solution." , 2012.

9. Rao R. Tummala: System on Package: Miniaturization of the EntireSystem. Introduction to System-on-Chip (SOC), Chapter (McGraw-Hill Professional, 2008), AccessEngineering.

10. Ernst, R. (1998). Codesign of embedded systems: Status and

51

trends. Design & Test of Computers, IEEE, 15(2), 45-54.

11. Schaumont, P. (2012). A practical introduction to hardware/softwarecodesign. Springer Science & Business Media.

12. Reis, R., Lubaszewski, M., & Jess, J. (Eds.). (2006). Design ofSystems on a Chip: Design and Test. Springer.

13. Richards, M. A. (1994). The Rapid Prototyping of ApplicationSpecific Signal Processor (RASSP) Program: Overview andAccomplishments (No. 942550). SAE Technical Paper.

14. Mark Balch: Complete Digital Design: A Comprehensive Guide toDigital Electronics and Computer System Architecture. Logic Designand Finite State Machines, Chapter (McGraw-Hill Professional,2003), AccessEngineering

15. Gangadharan, S., & Churiwala, S. (2013). Constraining Designs forSynthesis and Timing Analysis. Springer..

16. Churiwala, S., & Garg, S. (2011). Principles of VLSI RTL Design: APractical Guide. Springer Science & Business Media.

17. Altera “DE2 Development and Education Board, user manual”Version 1.5, 2012.

18. Parashar, A., Adler, M., Pellauer, M., & Emer, J. (2008, June). Hybridcpu/fpga performance models. In 3rd Workshop on ArchitecturalResearch Prototyping (WARP 2008).

19. Le Beux, S., Bois, G., Nicolescu, G., Bouchebaba, Y., Langevin, M.,and Paulin, P. “Combining mapping and partitioning exploration fornocbased embedded systems,” Journal of Systems Architecture, vol.56, no. 7, pp. 223–232, 2010.

20. Bouchhima, A., Gerin, P., & Pétrot, F. (2009, January).Automatic instrumentation of embedded software for high levelhardware/software co-simulation. In Design Automation Conference,2009. ASP-DAC 2009. Asia and South Pacific (pp. 546-551). IEEE.

21. You, M. K., & Song, G. Y. (2009, January). Case study: Co-simulation and co-emulation environments based on SystemC &SystemVerilog. In TENCON 2009-2009 IEEE Region 10 Conference(pp. 1-4). IEEE.

22. Cimatti, A., & Bernardo, M. (2006). Formal Methods for HardwareVerification. 6th International School on Formal Methods for the

52

Design of Computer, Communication, and Software Systems, SFM,22-27.

23. Yang, H., Yi, Y., & Ha, S. (2009, July). A timed HW/SWcoemulation technique for fast yet accurate system verification.In Systems, Architectures, Modeling, and Simulation, 2009.SAMOS’09. International Symposium on (pp. 74-81). IEEE.

24. Kim, Y. I., Aim, K. Y., Shim, H., Yang, W., Kwon, Y. S.,Ki, A., & Kyung, C. M. (2005, April). Automatic generationof software/hardware co-emulation interface for transaction-levelcommunication. In VLSI Design, Automation and Test, 2005.(VLSI-TSA-DAT). 2005 IEEE VLSI-TSA International Symposium on (pp.196-199). IEEE.