
Concepts of Multimedia Processing and Transmission

IT 481, Lecture #1

Dennis McCaughey, Ph.D.

22 January, 2007

01/22/2007, IT 481, Spring

Outline

Course Description
Instructor
Exams, Homework and Project
Grading
General Policies
Lecture Schedule

Course Description

Topics
– The fundamentals of signal and image processing, including algorithms for signal processing that have applications to multimedia
– Techniques for voice coding and recognition, CD and DVD technology, streaming video, WANs and LANs, and videoconferencing technology

Text: Multimedia Communications: Applications, Networks, Protocols and Standards, Fred Halsall, Addison-Wesley; 1st edition (2002), ISBN: 0-201-39818-4.

Instructor

Dennis McCaughey
– Contact Information
  703-263-7425 (Office)
  703-624-6830 (Cell)
  [email protected] (e-mail)
  Office Hours: one hour before class
– Background
  PhD in EE, University of Southern California, 1977
  Thesis: Degrees of Freedom for Projection Imaging

Exams, Homework and Project

Mid-Term: 1 Hour Closed Book
– Covers the key topics covered in class and homework
Final: Format “To Be Determined”
Homework: 1) Reading assignments, 2) Written answers to selected questions based on reading assignments, 3) Some limited math problems
Project: Format (Preliminary): MATLAB implementations of selected multimedia processing applications.

More on the Project

A course project will explore aspects of multimedia signal processing and will be computer based using MATLAB.

Project topics will consist of a set of MATLAB implementations addressing multimedia concepts, assigned on a running basis over the semester.

Each student will be required to submit the project in the format of a final report.

The projects will be graded on the effort applied, not on MATLAB programming skills.

Details regarding topics, content, and format will be provided during the course.

Grading

The final grade will be determined by a weighted average of the homework assignments, a mid-term exam, a final exam and a project

Homework 10%

Mid-Term 20%

Project 30%

Final 40%

General Policies

Collaboration
– Students are permitted and encouraged to collaborate on homework assignments.
– All graded work, however, must be the original effort of the student submitting the paper.

Homework
– Homework will be collected at the beginning of each class period. Note: Late homework will be accepted provided the reason for the delay is coordinated with the instructor within 2 days of its assignment. Homework solutions will be discussed in class.

Make-up Exams
– Make-up exams will not be given unless detailed written clarification accompanied by documentation for the absence is provided. If this information is not provided, an F grade will be given for the exam. The location and time for a make-up exam will be decided by the instructor. Also, students are expected to be in class and on time for every class.

Lecture Schedule (Preliminary)

Week  Date  Chapter  Topic                                                           Reading Assignment  Homework
1     1/22  1        Lecture #1: Introduction to Multimedia Communications           1,2
2     1/29  None     Lecture #2: Signal Processing Fundamentals and Intro to MATLAB
3     2/5   2        Lecture #3: Multimedia Information Representation               3
4     2/12  3        Lecture #4: Text Compression                                    3
5     2/19  3        Lecture #5: Image Compression                                   4
7     2/26  4        Lecture #6: Audio Compression                                   4
8     3/5   1-4      Mid-Term Exam & Project Review
9     3/12  None     Spring Break
10    3/19  4        Lecture #7: Video Compression                                   5
11    3/26  5        Lecture #8: Standards for Multimedia Communications             6
12    4/2   6        Lecture #9: Digital Communication Basics                        11
13    4/9   11       Lecture #10: Entertainment Networks and High Speed Modems       TBD
14    4/16  TBS      Lecture #11: Data Privacy                                       TBD
15    4/23  TBS      Special Topics                                                  1-6,11
16    4/30  1-6,11   Final Exam Review
      5/14           Final Exam 7:30pm

Multimedia Communications

What is Multimedia?

Multimedia is a combination of text, art, sound, animation, and video.

Slide: Courtesy, Hung Nguyen

Multimedia Components Simplified

Multimedia can be viewed as the combination of audio, video and data, and how they interact with the user (more than the sum of the individual components)

[Figure: Multimedia shown as the combination of Audio, Video and Data]

Background

Fast-paced emergence of applications in medicine, education, travel, etc.

Characterized by large documents that must be communicated with short delays

Glamorous applications such as distance learning and video teleconferencing

Applications that are enhanced by video are often seen as a driver for the development of multimedia networks

Forces Driving Communications That Facilitate Multimedia Communications

Evolution of communications and data networks

Increasing availability of almost unlimited bandwidth on demand

Availability of ubiquitous access to the network

Ever increasing amount of memory and computational power

Sophisticated terminals

Digitization of virtually everything

New Information System Paradigm

[Figure: Integration of Multimedia Integrated Communication (Broadband Link) with Multimedia Processing (Workstation, PC)]

Slide: Courtesy, Hung Nguyen

Elements of Multimedia Systems

Two key communication modes
– Person-to-person
– Person-to-machine

[Figure: person-to-person mode (User Interface, Transport, User Interface) and person-to-machine mode (User Interface, Transport, Processing, Storage and Retrieval)]

Slide: Courtesy, Hung Nguyen

Multimedia Networks

The world has been wrapped in copper and glass fiber and can be viewed as a “hair ball” with physical, wireless and satellite entry/exit points.

Physical: LAN-WAN connections
Wireless: Cellular telephony, wireless PC connectivity
Satellite: INMARSAT, THURAYA, ACeS, etc.

Multimedia Communication Model

Partitioning of information objects into distinct types, e.g., text, audio, video

Standardization of service components per information type

Creation of platforms at two levels – network service and multimedia communication

Define general applications for multiple use in various multimedia environments

Define specific applications, e.g. e-commerce, tele-training, … using building blocks from platform and general applications

Requirements

User Requirements
– Fast preparation and presentation
– Dynamic control of multimedia applications
– Intelligent support to users
– Standardization

Network Requirements
– High speed and variable bit rates
– Multiple virtual connections using the same access
– Synchronization of different information types
– Suitable standardized services along with support

Network Requirements

ATM-BISDN and SS7 have enabled the switching-based communications capabilities over the PSTN that support the necessary services

ATM-BISDN-SS7 will evolve to all-optical “switchless” networks based on packet transfer

Packet Transfer Concept

Allows voice, video and data to be dealt with in a common format

More flexible than circuit switching, which it can emulate, while allowing the multiplexing of data streams with varied bit rates

Dynamic allocation of bandwidth

Handles Variable Bit Rate (VBR) traffic directly
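
The packet transfer idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Packet` fields, the 48-byte payload size and the round-robin scheduler are hypothetical choices made for brevity, not the format of any particular network.

```python
from collections import deque
from dataclasses import dataclass

PAYLOAD_SIZE = 48  # assumed payload size in bytes, for illustration only


@dataclass
class Packet:
    stream: str     # source of the payload: "voice", "video" or "data"
    seq: int        # sequence number within that stream
    payload: bytes


def packetize(stream, data, size=PAYLOAD_SIZE):
    """Split one media stream into packets that share a common format."""
    return [Packet(stream, seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def multiplex(*streams):
    """Round-robin multiplexer over the streams that still have packets.

    When a stream runs out of packets, its turns go to the remaining
    streams, so link capacity is allocated dynamically.
    """
    queues = deque(deque(s) for s in streams if s)
    link = []
    while queues:
        q = queues.popleft()
        link.append(q.popleft())
        if q:                      # stream still has packets: rejoin the rotation
            queues.append(q)
    return link


voice = packetize("voice", bytes(96))    # low-rate stream: 2 packets
video = packetize("video", bytes(480))   # high-rate stream: 10 packets
data = packetize("data", b"hello")       # short burst: 1 packet
link = multiplex(voice, video, data)
```

Voice, video and data all travel as the same kind of packet, and the high-rate video stream ends up with most of the slots on the link without any fixed circuit being reserved for it.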

Considerations

Buffering required for constant bit rate data such as audio
Re-sequencing and recovery capabilities must be provided over networks where packets may be dropped or received in an order different from that transmitted
– In an ATM network some packets can be dropped while others may not (e.g. voice vs. bank transfer data packets)
– Optimum packet lengths for voice, video and data differ in an ATM network
– IP packets over the internet may arrive in a different order or be dropped.
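
Re-sequencing and loss detection are commonly built on per-packet sequence numbers. The sketch below assumes that each packet arrives as a hypothetical `(seq, payload)` tuple and that a whole batch is available at once; a real receiver would work on a sliding time window instead.

```python
def resequence(received, first_seq=0):
    """Reorder packets by sequence number and report gaps.

    `received` is a list of (seq, payload) tuples in arrival order.
    Returns (ordered_payloads, lost_seqs). A lost packet is represented
    by None so that a decoder could conceal it.
    """
    if not received:
        return [], []
    by_seq = {}
    for seq, payload in received:
        by_seq.setdefault(seq, payload)   # keep the first copy, ignore duplicates
    ordered, lost = [], []
    for seq in range(first_seq, max(by_seq) + 1):
        if seq in by_seq:
            ordered.append(by_seq[seq])
        else:
            ordered.append(None)
            lost.append(seq)
    return ordered, lost


# Packets 1 and 2 arrive swapped, packet 3 is dropped by the network.
arrivals = [(0, "a"), (2, "c"), (1, "b"), (4, "e")]
ordered, lost = resequence(arrivals)
```

For loss-tolerant media such as voice, the `None` entries would typically be concealed; for data that must arrive intact, the sequence numbers in `lost` would be requested again.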


Digital Video Signal Transport

The following figure will be examined over the course of the semester

[Figure: processing chain from the Video source to the Users, with these blocks:]
Encoder: Transformation, Quantization, Entropy Coding, Bit-Rate Control
Application: Data Structuring
Network Multiplexing/Routing: Overhead (FEC), Re-Trans
Error detection, Loss detection, Error correction, Erasure correction
Application: Re-Synch
Decoder: De-quantization, Entropy decode, Inv Trans, Loss conceal, Post process
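
As a rough illustration of the encoder and decoder stages in the figure, the sketch below runs eight samples through a transform, a quantizer, and their inverses. The choice of an orthonormal DCT for "Transformation" and a uniform quantizer with step size `step` are assumptions made here; entropy coding, bit-rate control and the network stages are left out.

```python
import math


def _scale(k, n):
    """Normalisation that makes the DCT orthonormal."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n))
            for k in range(n)]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                for k in range(n))
            for i in range(n)]


def quantize(coeffs, step):
    """Uniform quantizer: map each coefficient to an integer level."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Map integer levels back to coefficient values."""
    return [q * step for q in levels]


x = [52, 55, 61, 66, 70, 61, 64, 73]     # hypothetical row of pixel values
step = 4.0
levels = quantize(dct(x), step)          # encoder side: integers to be entropy coded
recon = idct(dequantize(levels, step))   # decoder side: approximate samples
```

A larger `step` gives smaller integer levels, which are cheaper to entropy code, at the cost of a larger reconstruction error; this is the trade-off that bit-rate control adjusts.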

Quality of Service (QoS)

The set of parameters that defines the properties of media streams

Can define four QoS layers:
1. User QoS: Perception of the multimedia data at the user interface (“qualitative”)
2. Application QoS: Parameters such as end-to-end delay (“quantitative”)
3. System QoS: Requirements on the communications services derived from the application QoS
4. Network QoS: Parameters such as network load and performance
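
Quantitative parameters of the kind listed under Application QoS can be computed from packet timestamps. The sketch below is one possible formulation: the function name, the dictionary inputs, and the definition of jitter as the mean absolute change in delay between consecutive delivered packets are all assumptions (other jitter definitions exist).

```python
def application_qos(sent, received):
    """Compute simple QoS figures from timestamps in seconds.

    sent:     {seq: send_time}    for every packet transmitted
    received: {seq: arrival_time} for every packet that arrived
    """
    delays = [received[s] - sent[s] for s in sorted(sent) if s in received]
    loss_rate = 1.0 - len(delays) / len(sent)
    mean_delay = sum(delays) / len(delays) if delays else None
    # Jitter here: mean absolute difference between consecutive delays.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {"mean_delay": mean_delay, "jitter": jitter, "loss_rate": loss_rate}


sent = {0: 0.00, 1: 0.02, 2: 0.04, 3: 0.06}
received = {0: 0.05, 1: 0.08, 3: 0.10}      # packet 2 never arrived
qos = application_qos(sent, received)
```

With these hypothetical timestamps the delivered packets see delays of 50, 60 and 40 ms, and one packet in four is lost.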

Applications of Multimedia

Business - Business applications for multimedia include presentations, training, marketing, advertising, product demos, databases, catalogues, instant messaging, and networked communication.

Schools - Educational software can be developed to enrich the learning process.

Slide: Courtesy, Hung Nguyen

Applications of Multimedia

Home - Most multimedia projects reach the homes via television sets or monitors with built-in user inputs.

Public places - Multimedia will become available at stand-alone terminals or kiosks to provide information and help.

Slide: Courtesy, Hung Nguyen

Compact Disc Read-Only Memory (CD-ROM)

CD-ROM is the most cost-effective distribution medium for multimedia projects.

It can contain up to 80 minutes of full-screen video or sound.

CD burners are used for reading discs and for writing audio, video, and data to recordable discs.

Slide: Courtesy, Hung Nguyen

Digital Versatile Disc (DVD)

Multilayered DVD technology increases the capacity of current optical technology to 18 GB.

DVD authoring and integration software is used to create interactive front-end menus for films and games.

DVD burners are used for reading discs and for writing audio, video, and data to recordable discs.

Slide: Courtesy, Hung Nguyen

Multimedia Communications

Multimedia communications is the delivery of multimedia to the user by electronic or digitally manipulated means.

[Figure: Multimedia Communications as the combination of:]
Audio Communications (Telephony, sound, Broadcast)
Video Communications (Video telephony, TV/HDTV)
Data, text, image Communications (Data Transfer, fax…)

Slide: Courtesy, Hung Nguyen

Multimedia Terms

Alternative Types of Media used in Multimedia Applications

Multimedia Communications Networks

Multimedia Networks and Their Services

Multimedia Networks and Their Services

Audio-Visual Integration

Application in Biometrics – Bimodal Person Verification

Existing methods for person verification are mainly based on a single modality, which limits their security and robustness

Audio-visual integration using a camera and microphone makes person verification more reliable

Slide: Courtesy, Hung Nguyen

Joint Audio-Video Coding

Correlation between audio and video can be used to achieve more efficient coding
– Predictive coding of audio and video information used to construct an estimate of the current frame (cross-modal redundancy)
– Difference between original and estimated signal can be transmitted as parameters
– Decision on what and how to send is based on Rate Distortion (R-D) criteria
Reconstruction done at receiver according to agreed-upon decoding rules

Slide: Courtesy, Hung Nguyen

Cross-Modal Predictive Coding

[Figure: Visual Analysis produces Parameter X; A-to-V Mapping produces the estimate X̂; the Decision Module (R-D) chooses to send Parameter X, the difference X − X̂, or Nothing]

Slide: Courtesy, Hung Nguyen
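
The decision step can be sketched as a rate-distortion comparison among the three options: send nothing, send the difference, or send the parameter itself. Everything numeric below is a hypothetical stand-in: the cost `D + lam * R`, the quantizer step, and the bit counts are assumptions chosen only to make the example concrete.

```python
def choose_mode(x, x_hat, lam, step=1.0, full_bits=8):
    """Pick the option with the lowest cost D + lam * R.

    x      visual parameter measured by visual analysis
    x_hat  estimate of x predicted from the audio (A-to-V mapping)
    lam    weight on rate versus distortion
    Returns (mode, rate_bits, reconstruction).
    """
    residual_q = round((x - x_hat) / step)
    candidates = [
        # Send nothing: the receiver uses the audio-based estimate as is.
        ("nothing", 0, x_hat),
        # Send the quantized difference; assumed code length grows with its size.
        ("residual", 1 + 2 * abs(residual_q), x_hat + residual_q * step),
        # Send the quantized parameter itself at a fixed assumed cost.
        ("full", full_bits, round(x / step) * step),
    ]
    return min(candidates, key=lambda c: (c[2] - x) ** 2 + lam * c[1])


good = choose_mode(10.0, 10.0, lam=0.01)   # estimate is exact
close = choose_mode(10.0, 9.0, lam=0.01)   # estimate is slightly off
poor = choose_mode(50.0, 0.0, lam=0.01)    # estimate is far off
```

When the audio predicts the visual parameter well, nothing needs to be sent; a small error is cheapest to correct with the difference; a large error makes sending the parameter directly the better choice.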

Importance of Interaction

Multimedia is more than the combination of text, audio, video and data

Interaction among media is important
Consider a poorly dubbed movie
– Audio not synchronized with video
– Lip movements inconsistent with language
– Audio dynamic range inconsistent with the scene

Slide: Courtesy, Hung Nguyen

Media Interaction

[Figure: Multimedia as the interaction of Audio, Text and Image/Video through "Process and Model"; labels in the figure:]
Lip synch, Face Animation, Joint A/V Coding
Compression, Synthesis, 3D Sound
Sign language, Lip reading
Speech Recognition, Text-to-Speech
Compression, Graphics, Database indexing/retrieval
Translation, Natural language

Slide: Courtesy, Hung Nguyen

Bimodality of Human Speech

Human speech is produced by vibration of the vocal cords and by configuration of the vocal tract, using muscles that also generate facial expressions

Audio  Visual  Perceived
ba     ga      da
pa     ga      ta
ma     ga      na

Slide: Courtesy, Hung Nguyen

Basic Definitions

The basic unit of acoustic speech is called a phoneme

In the visual domain, the basic unit of mouth movement is called a viseme
– A viseme is the smallest visibly distinguishable unit of speech
– Several phonemes can look the same on the lips and thus form one viseme group
– A many-to-one mapping between phonemes and visemes

Slide: Courtesy, Hung Nguyen
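
The many-to-one relationship can be written down as a lookup table. The table below is a small illustrative subset, and the viseme names are hypothetical labels chosen here; published viseme sets differ in how they group phonemes.

```python
from collections import defaultdict

# Illustrative, partial mapping; viseme names are hypothetical labels.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
}


def viseme_groups(mapping):
    """Invert the phoneme-to-viseme table into viseme groups."""
    groups = defaultdict(list)
    for phoneme, viseme in mapping.items():
        groups[viseme].append(phoneme)
    return dict(groups)


groups = viseme_groups(PHONEME_TO_VISEME)
```

Because the mapping is many-to-one, it cannot be inverted uniquely: seeing a "bilabial" mouth shape alone does not tell a lip reader whether /p/, /b/ or /m/ was spoken.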

Lip Reading System

Application to support hearing-impaired persons

People learn to understand spoken language by combining visual content with lexical, syntactic, semantic and pragmatic information

Automated lip reading systems
– Speech recognition possible using only visual information
– Integrated with speech recognition systems to improve accuracy

Slide: Courtesy, Hung Nguyen

Lip Synchronization

Applications
– In VTC (video teleconferencing), where video frames are dropped (low bandwidth requirement) but audio must still be continuous
– In non-real-time use such as dubbing in a studio, where the recorded voice is full of background noise
Time-warping commonly used in both audio and video modes
– Time-frequency analysis
– Video time-warping could be used for VTC
– Audio time-warping could be used for dubbing

Slide: Courtesy, Hung Nguyen
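
One widely used form of time-warping is dynamic time warping (DTW), which stretches or compresses one sequence in time so that it lines up with another. The slide does not say which algorithm is meant, so treating it as DTW is an assumption; the sequences below are hypothetical one-dimensional features such as mouth opening per frame.

```python
def dtw(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] is the cheapest alignment of a[:i] with b[:j]; each step
    may advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


audio_track = [1, 2, 3]
video_track = [1, 2, 2, 3]      # same shape, one sample held longer
distance = dtw(audio_track, video_track)
```

A distance of zero here means the two tracks match once the held sample is absorbed by the warping, which is the kind of alignment needed to keep lips and sound together.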

Lip Tracking

To prevent too much jerkiness in the motion rendering and too much loss in lip synchronization

Involves real-time analysis of the video signal in three dimensions plus one temporal dimension

Produce meaningful parameters
– Classification of mouth images into visemes
– Measures of dimension, e.g. mouth widths and heights
Analysis tools – Fourier Transform, Karhunen-Loève Transform (KLT), Probability Density Function (pdf) Estimation

Slide: Courtesy, Hung Nguyen

Audio-to-Visual Mapping for Lip Tracking

Conversion of acoustic speech to mouth shape parameters
A mapping of phonemes to visemes
Could be most precisely implemented with a complete speech recognizer followed by a look-up table
– High computational overhead plus table look-up complexity
– Do not need to recognize the spoken word to achieve audio-to-visual mapping
Physical relationships exist between vocal tract shape and sound produced, so functional relationships exist between speech and visual parameters

Slide: Courtesy, Hung Nguyen

Classification-Based Conversion Approaches for Lip Tracking

Two-step process
– Classification of acoustic signal using VQ (vector quantization), HMM (hidden Markov model) and NN (neural network)
– Mapping of the acoustic classes into corresponding visual outputs, then averaged to get centroid
Shortcomings
– Error resulting from averaging visual vectors to get visual centroid
– Not a continuous mapping – finite output levels

Slide: Courtesy, Hung Nguyen
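
The two-step process can be sketched with vector quantization as the classifier. The codebook, the training pairs and the function names below are hypothetical; an HMM or neural network classifier would replace `nearest` in the other variants named on the slide.

```python
def nearest(codebook, vec):
    """Step 1: index of the codeword closest to vec (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], vec)))


def train_centroids(codebook, pairs):
    """Step 2: average the visual vectors that fall into each acoustic class.

    pairs is a list of (acoustic_vec, visual_vec). Returns one visual
    centroid per codeword, or None for a class that saw no data.
    """
    sums = [None] * len(codebook)
    counts = [0] * len(codebook)
    for acoustic, visual in pairs:
        k = nearest(codebook, acoustic)
        sums[k] = list(visual) if sums[k] is None else [s + v for s, v in zip(sums[k], visual)]
        counts[k] += 1
    return [None if s is None else [x / n for x in s] for s, n in zip(sums, counts)]


def convert(codebook, centroids, acoustic):
    """Map an acoustic vector to the visual centroid of its class."""
    return centroids[nearest(codebook, acoustic)]


codebook = [[0.0, 0.0], [10.0, 10.0]]       # hypothetical acoustic codewords
pairs = [([0.5, 0.1], [2.0, 1.0]),          # visual vector: (mouth width, height)
         ([-0.2, 0.3], [4.0, 3.0]),
         ([9.0, 11.0], [8.0, 6.0])]
centroids = train_centroids(codebook, pairs)
mouth = convert(codebook, centroids, [1.0, 1.0])
```

Both shortcomings are visible in this toy data: the centroid `[3.0, 2.0]` matches neither training mouth shape in its class, and every acoustic input in that class produces the same output.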

Classification-Based Conversion

[Figure: regions of the Phoneme Space mapped to Centroids in the Viseme Space]

Slide: Courtesy, Hung Nguyen

Audio and Visual Integration for Lip Reading Applications

Three major steps
– Audio-visual pre-processing – Principal Component Analysis (PCA) has been used for feature extraction
– Pattern recognition strategy (HMM, NN, time-warping…)
– Integration strategy (decision making)
  Heuristic rules to incorporate knowledge of phonemes about the two modalities
  Combination of independent evaluation scores for each modality

Slide: Courtesy, Hung Nguyen
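
Combining independent scores from each modality is often done with a weighted sum. The sketch below assumes that each recognizer returns a hypothetical dictionary of per-label scores on a common scale and that a single weight `audio_weight` balances the two; the numbers are made up for illustration.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight=0.7):
    """Weighted-sum fusion of per-label scores from two recognizers.

    Only labels scored by both modalities are considered.
    Returns (best_label, fused_scores).
    """
    fused = {}
    for label in audio_scores.keys() & visual_scores.keys():
        fused[label] = (audio_weight * audio_scores[label]
                        + (1.0 - audio_weight) * visual_scores[label])
    best = max(fused, key=fused.get)
    return best, fused


audio = {"ba": 0.50, "da": 0.45, "ga": 0.05}    # hypothetical audio scores
visual = {"ba": 0.05, "da": 0.35, "ga": 0.60}   # hypothetical visual scores
best, fused = fuse_scores(audio, visual, audio_weight=0.5)
audio_only, _ = fuse_scores(audio, visual, audio_weight=1.0)
```

With equal weights the fused decision is a label that neither recognizer ranked first on its own, while `audio_weight=1.0` falls back to the audio-only result; in practice the weight would be tuned, for example lowered when the audio is noisy.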