uday walvekar dsp_seminar

Upload: uday-walvekar

Post on 13-Aug-2015

TRANSCRIPT

  1. 1. MULTICORE INFORMATION AND POPULAR TEXAS INSTRUMENTS MULTICORE DSP PROCESSORS. UDAY WALVEKAR, MTECH, NIELIT CALICUT
  2. 2. WHY MULTICORE Gap between processor and memory speeds. Constraints in parallelism on instructions. Increased power consumption by single core processors.
  3. 3. MULTICORE
  4. 4. MULTICORE A multi-core processor is a single computing component with two or more independent processing units (called "cores"). Homogeneous and heterogeneous. Maximum possible gain is governed by Amdahl's law. Developed from instruction-level parallelism and thread-level parallelism.
  5. 5. MULTICORE Caches may be shared or private. Inter-core communication by shared memory or message passing. Partitioning. Communication. Agglomeration. Mapping.
  6. 6. MULTICORE Simultaneous multithreading (SMT) is present in some cores and absent in others.
  7. 7. MULTICORE
  8. 8. MULTICORE Cache coherence problem. Invalidation protocol with snooping
  9. 9. MULTICORE PROGRAMMING Default affinity mask is all 1s. OS scheduler tries to avoid migration as much as possible. Soft and hard affinity.
  10. 10. MULTICORE PROGRAMMING #define _GNU_SOURCE #include <sched.h> int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask); int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask); win@win-Lenovo-Z580:~$ taskset -p 3108 pid 2763's current affinity mask: f
  11. 11. MULTICORE TO DSP TI multicore DSPs: TMS320C6474, TMS320C6674 (fixed and floating point), 66AK2L06 (ARM + DSP, KeyStone II).
  12. 12. TMS320C6474 Three TMS320C64x+™ DSP cores. Instruction cycle time: 0.83 ns (1.2-GHz device); 1 ns (1-GHz device); 1.18 ns (850-MHz device). CPU core structure is the same as on the C6713 DSK. The complex multiply (CMPY) instruction takes four 16-bit inputs and produces a 32-bit real and a 32-bit imaginary output. New instructions include 32-bit multiplication, complex multiplication, packing, sorting, bit manipulation, and 32-bit Galois field multiplication.
  13. 13. TMS320C6474 Boot Sequence The DSP's internal memory is loaded with program and data sections. The DSP's internal registers are programmed with predetermined values. Public ROM boot: Core 0 is released from reset, begins executing from the L3 ROM base address, and brings the other cores out of reset by setting the EVTPULSE4 bit (bit 4) to 1.
  14. 14. TMS320C6474
  15. 15. TMS320C6474
  16. 16. TMS320C6474
  17. 17. TMS320C6474
  18. 18. TMS320C6474 PERIPHERALS The primary purpose of the EDMA3 is to service user-programmed data transfers between two memory-mapped slave endpoints on the device. The interrupt controller allows up to 128 system events to be routed to any of the twelve CPU interrupt inputs. A race condition may exist when certain masters write data to the DDR2 memory controller. The inter-integrated circuit (I2C) module provides an interface between a C64x+ DSP and other devices compliant with the Philips Semiconductors Inter-IC bus (I2C bus) specification.
  19. 19. TMS320C6474 PERIPHERALS The Ethernet Media Access Controller (EMAC) module provides an efficient interface between the C6474 DSP core processor and the networked community. The device contains the Semaphore module for the management of resources shared by the DSP cores. Read-modify-write sequence; direct and indirect accesses. Supports 3 masters and contains 32 semaphores. Frame synchronization handles timing and time alignment on the device by coordinating timing between the DSP cores.
  20. 20. TMS320C6674 Four TMS320C66x™ DSP core subsystems, each at 1.0 GHz or 1.25 GHz. Network Coprocessor. KeyStone architecture: Multicore Navigator, TeraNet, Multicore Shared Memory Controller, and HyperLink. The C66x core incorporates 90 new instructions (compared to the C64x+ core) targeted at floating-point and vector-math-oriented processing.
  21. 21. 66AK2L06 Four TMS320C66x DSP core subsystems, each at 1.0 GHz or 1.2 GHz. Two ARM Cortex-A15 MPCore™ processors at up to 1.2 GHz. Understanding.
  22. 22. CONCLUSION Realize the importance of multicore. It has significant issues but even larger advantages. THANK YOU