
Deliverable

Project Acronym: TV-RING

Grant Agreement number: 325209

Project Title: Television Ring - Testbeds for Connected TV services using HbbTV

D4.3 Evaluation results

Revision: 1.0

Authors:

Jeroen Vanattenhoven (KU Leuven)

Aylin Vogl (IRT)

Nico Patz (RBB)

Marc Aguilar (i2CAT)

Project co-funded by the European Commission within the ICT Policy Support Program

Dissemination Level

P Public x

C Confidential, only for members of the consortium and the Commission Services

Abstract: This deliverable presents and describes the evaluation results, using both quantitative and qualitative data, of all pilots conducted in TV-RING. To make it easier for the reader, we start with the high-level overview regarding the objectives of the pilots in the project, followed by a high-level overview per pilot. Then, the main section contains the more detailed results for each pilot application. We conclude by presenting insights that were valid across pilots.


Revision History

Revision | Date | Author | Organization | Description
0.1 | 21/09/2015 | Jeroen Vanattenhoven | KU Leuven | Initial table of contents and proposed structure of the deliverable
0.2 | 25/01/2016 | Jeroen Vanattenhoven | KU Leuven | Integrate input from all partners
0.3 | 05/02/2016 | Jeroen Vanattenhoven | KU Leuven | Provide feedback to IRT on the input
0.4 | 05/02/2016 | Jeroen Vanattenhoven | KU Leuven | Provide feedback to RBB on the input
0.5 | 05/02/2016 | Jeroen Vanattenhoven | KU Leuven | Provide feedback to i2CAT on the input
0.6 | 18/02/2016 | Jeroen Vanattenhoven | KU Leuven | Compile all input and prepare version for internal review
0.7 | 28/02/2016 | David Pujals | RTV (Cellnex Telecom) | Internal review of the deliverable
0.8 | 29/02/2016 | Jeroen Vanattenhoven | KU Leuven | Address the internal review comments and finalize the deliverable for submission
0.9 | 23/03/2016 | Sergi Fernandez | i2CAT | Review: consistency D4.1-D4.2-D4.3
1.0 | 01/04/2016 | Jeroen Vanattenhoven | KU Leuven | Address new review comments

Statement of originality:

This document contains original unpublished work except where clearly indicated otherwise. Acknowledgement of previously published material and of the work of others has been made through appropriate citation, quotation or both.

Disclaimer

The information, documentation and figures available in this deliverable are written by the TV-RING (Testbeds for Connected TV services using HbbTV) project consortium under EC grant agreement ICT PSP-325209 and do not necessarily reflect the views of the European Commission. The European Commission is not liable for any use that may be made of the information contained herein.


1. Executive Summary

This deliverable presents and describes the evaluation results of the pilots conducted in TV-RING. It is therefore an essential document: it contains the outcome of “Task 4.5 Final evaluation of pilots”, which aimed at aggregating and evaluating the outcomes of the three pilots in both quantitative and qualitative terms.

The data that forms the source for this deliverable is quite extensive as pilots were conducted in three countries and in each country several applications were tested (sometimes in two or even three iterations). Therefore, we will now present and explain the structure of this deliverable.

The chapter “Pilot Evaluation Results” contains the main content of this deliverable. It presents and discusses the evaluation results for each pilot application in depth. There is one section for each country, and within each country the information is split up according to the different pilots. For each pilot we first briefly present key information in a table and refer to the respective actions in D4.2, which allows the reader to cross-reference information. After the results of each pilot, we also provide a conclusion for the results obtained in each country.

During an internal TV-RING workshop, we formulated some cross-pilot conclusions based on each country's findings. These results are found in “Results across the different pilots”.

The chapter “General Conclusions” draws conclusions for the whole project. A first section provides an overview of the objectives that were achieved for each pilot during the project. The idea of starting with such a condensed overview is to enable the reader to form a view of the project's results before diving deeper into the concrete results. The first overview table shows the status of the pilots per country for the whole project (see Table 22). The three tables that follow present the overview of the specific objectives defined for each pilot application; there is one table per country. For each specific objective we indicate whether or not the objective was reached (Yes, No, or Partial). We also added a clarification below the tables to explain why an objective was not fully reached. A similar approach was adopted for the metrics we set out at the beginning of the project: in that section we show an overview of all metrics used and not used. Finally, we use the overview of the objectives to formulate the main conclusions of the pilot evaluations.


2. Contributors

First Name Last Name Company e-Mail

Jeroen Vanattenhoven KU Leuven [email protected]

Marc Aguilar i2CAT [email protected]

David Pujals RTV (Cellnex Telecom) [email protected]

Jordi Mata TVC [email protected]

Jordi Miquel Payo TVC [email protected]

Jordi Arraez TVC [email protected]

Aylin Vogl IRT [email protected]

Sven Gläser RBB [email protected]

Nico Patz RBB [email protected]

Susanne Heijstraten NPO [email protected]

Sergi Fernandez i2CAT [email protected]


Content

Revision History ......................................................................................................................... 1

1. Executive Summary ............................................................................................................... 2

2. Contributors .......................................................................................................................... 4

3. Introduction ........................................................................................................................ 10

4. Action Log ............................................................................................................................ 12

5. Pilot Evaluation Results ....................................................................................................... 13

5.1. Dutch Pilot ................................................................................................................... 13

5.1.1. Quality differentiation by using Digital Rights Management .............................. 13

5.1.2. In-house recommendations for HbbTV and Cable TV apps ................................ 18

5.1.3. HbbTV as a central interface for second screen competition ............................. 18

5.1.4. Conclusions for the Dutch pilot ........................................................................... 36

5.2. German Pilot ............................................................................................................... 37

5.2.1. verknallt & abgedreht ......................................................................................... 37

5.2.2. Unser Sandmännchen (Sandman) ....................................................................... 47

5.2.3. TV App Gallery ..................................................................................................... 50

5.2.4 Conclusions for the German Pilot ....................................................................... 61

5.3. Spanish Pilot ................................................................................................................ 63

5.3.1. TV3 a la carta multicamera ................................................................................. 63

5.3.2. MPEG DASH Encoder ........................................................................................... 79

5.3.3. Conclusions for the Spanish Pilot ........................................................................ 99

6. Results across the different pilots ..................................................................................... 101

7. General Conclusions .......................................................................................................... 106

7.1. Overview of the Pilot Objectives ............................................................................... 106

7.2. Overview of the Pilot Metrics ................................................................................... 114

7.3. Conclusions ............................................................................................................... 117

8. References ......................................................................................................................... 118

9. ANNEX 1: Dutch Pilot Questionnaire – DRM ..................................................................... 119

10. ANNEX 2: Dutch Pilot Questionnaire – IQ Test ............................................................. 121

11. ANNEX 3: Dutch Pilot Questionnaire – De Rijdende Rechter ........................................ 122

12. ANNEX 4: Dutch Pilot Questionnaire – Eurosong .......................................................... 123

13. ANNEX 5: Dutch Pilot Questionnaire – Een tegen 100 .................................................. 124

14. ANNEX 6: German Pilot Questionnaire – verknallt & abgedreht .................................. 126

15. ANNEX 7: German Pilot Questionnaire – verknallt & abgedreht phase 1 .................... 127


16. ANNEX 8. German Pilot Questionnaire – verknallt & abgedreht phase 2..................... 135

17. ANNEX 9. German Pilot Questionnaire – TV App Gallery.............................................. 140

18. ANNEX 10: Spanish Pilot Questionnaire – Oh Happy Day ............................................. 141

19. ANNEX 11: Spanish Pilot Questionnaire – FC Barcelona vs PSG Champions League .... 147

20. ANNEX 12: Spanish Pilot Questionnaire – Mayoral Elections Technical assessment ... 149

21. ANNEX 13: Spanish Pilot Questionnaire – Mayoral Elections UX ................................. 150

22. ANNEX 14: Spanish Pilot Questionnaire – FC Barcelona vs Juventus Champions League 152

23. ANNEX 15: Spanish Pilot Questionnaire – FC Barcelona vs Bayer Leverkusen Champions League 155

24. ANNEX 16: Spanish Pilot Questionnaire – FC Barcelona vs AS Roma Champions League 157


Table of Figures

Image 1: An overview of the number of participants and the number of SD and HD items they watched during the pilot. ............................................................................................................ 14

Image 2: Overview of the usability and user experience aspects of the DRM application ........ 15

Image 3: Participants opinion about the suggested price for the HD items, an indication of the value of the application ............................................................................................................... 16

Image 4: Participants preferences for different types of service, showing that most prefer a personalized service .................................................................................................................... 17

Image 5: Results relating to Psychological Involvement – Empathy ........................................... 22

Image 6: results relating to Behavioural Involvement ................................................................ 23

Image 7: Results relating to Psychological Involvement - Negative Feelings ............................. 23

Image 8: Results relating to Individual - Negative Feelings ........................................................ 24

Image 9: Results relating to Individual - Positive Feelings .......................................................... 24

Image 10: Results relating to Usefulness .................................................................................... 24

Image 11: Results relating to Distraction .................................................................................... 25

Image 12: Results related to the ease-of-use ............................................................................. 28

Image 13: Results related to the user experience ...................................................................... 29

Image 14: Results related to the social experience .................................................................... 29

Image 15: Results related to social interaction ........................................................................... 30

Image 16: Results relating to the sense of competition ............................................................. 30

Image 17: The second screen application did not appear to distract too much from the show itself ............................................................................................................................................. 31

Image 18: User-related aspects for the evaluation of 'Een tegen 100' ....................................... 33

Image 19: Development of user figures over the three phases of German Pilot 1 .................... 39

Image 20: Visits in relation to broadcast times (Phase 1) ........................................................... 39

Image 21: Visits in relation to broadcast times (Phase 2) ........................................................... 40

Image 22: Visits in relation to broadcast times (Phase 3) ........................................................... 40

Image 23: HbbTV Teaser (Phase 2) ............................................................................................. 41

Image 24: Visits in relation to Teaser times (Phase 1) ................................................................ 41

Image 25: Visits in relation to Teaser times (Phase 2) ................................................................ 42

Image 26: Visits in relation to Teaser times (Phase 3) ................................................................ 42

Image 27: Landing Page of verknallt & abgedreht (Phase 2) ...................................................... 43

Image 28: Videos played in verknallt & abgedreht (Phase 1) ..................................................... 44

Image 29: Videos played in verknallt & abgedreht (Phase 2) ..................................................... 44

Image 30: Reactions to the Social Media Feed ........................................................................... 46

Image 31: Visitors of the TV-based apps 17 Jun and 30 September 2015 .................................. 48


Image 32: Use of all available Sandman Apps ............................................................................. 48

Image 33: Use of the TV-based Sandman Apps August 2015 ..................................................... 49

Image 34: Use of the TV-based Sandman Apps October 2015 ................................................... 49

Image 35: Evaluation Environment ............................................................................................. 51

Image 36: End user evaluation .................................................................................................... 52

Image 37: TVAG end user data analysis ...................................................................................... 52

Image 38: TVAG professional user data analysis ........................................................................ 59

Image 39: Reported interest in HbbTV multicam services across user panel tests .................... 67

Image 40: User satisfaction with several aspects of HbbTV multicam app across user panel tests ..................................................................................................................................................... 68

Image 42: Typology of technical issues across user panel tests ................................................. 69

Image 43: Users and data streamed along time during live pilot, total ...................................... 73

Image 44: Users and data streamed along time during live pilot, breakdown by stream .......... 74

Image 45: Number of users by total data streamed and total minutes engaged ....................... 75

Image 46: Perceived value of multi-camera services in live football. ......................................... 76

Image 47: Stated preference of program content for multi-camera application ....................... 76

Image 48: Test application for performance test........................................................................ 81

Image 50: Test application for performance tests, unique manifest .......................................... 83

Image 51 LMS Scheme ................................................................................................................ 93

Image 52: LMS’s Dasher total amount of lost data blocks as a function of the number of simultaneous Dasher instances ................................................................................................... 95

Image 53: LMS’s Dasher CPU usage as a function of the number of simultaneous Dasher instances ...................................................................................................................................... 95

Image 54: LMS’s Dasher Input bitrate consumption as a function of the number of simultaneous Dasher instances ................................................................................................... 96

Image 55: LMS’s Dasher processing delay as a function of the number of simultaneous Dasher instances ...................................................................................................................................... 96

Image 56: LMS’s Dasher total amount of lost data blocks as a function of the number of simultaneous Dasher instances ................................................................................................... 97

Image 57: LMS’s Dasher CPU usage as a function of the number of simultaneous Dasher instances ...................................................................................................................................... 97

Image 58: LMS’s Dasher Input bitrate consumption as a function of the number of simultaneous Dasher instances ................................................................................................... 98

Image 59: LMS’s Dasher processing delay as a function of the number of simultaneous Dasher instances ...................................................................................................................................... 98

Image 60: Starting the clustering of the results. Each pilot leader presents one key evaluation result or insight. It is then immediately written down on a post-it, and placed under either a new theme, or an existing theme that fits. ............................................................................... 101

Image 61: More clusters start to take shape towards the end of the exercise. ....................... 102


Image 62: The workshop in Leuven, where we processed the results for each pilot ............... 107

Table 1: Participants of DRM test................................................................................................ 14

Table 2: Participants Nationale Wetenschaps Quiz .................................................................... 19

Table 3: Overview Quantitative Results ...................................................................................... 19

Table 4: Participants Rijdende Rechter ....................................................................................... 20

Table 5: Participants Eurosong .................................................................................................... 27

Table 6: Participants Een tegen 100 ............................................................................................ 32

Table 7: Nationale BN'er Quiz technical results .......................................................................... 34

Table 8: The 90's Test technical results ....................................................................................... 34

Table 9: De Verknipte Dierentest technical results ..................................................................... 35

Table 10. Lab Survey Phase 1 ...................................................................................................... 38

Table 11: Lab Survey Phase 2 ...................................................................................................... 38

Table 12: Participant information ............................................................................................... 50

Table 13: Time table and execution ............................................................................................ 58

Table 14: User panel sociodemographic parameters ................................................................. 67

Table 15: Unique visitors and page views per TV manufacturer ................................................ 72

Table 16: Experimentation documents for experimental latency test ....................................... 78

Table 17: Test user satisfaction with the level of delay of performed tasks across the three different TV devices .................................................................................................................... 79

Table 18: Results for MPEG-DASH Live ....................................................................................... 85

Table 19: Results for MPEG-DASH VoD ....................................................................................... 87

Table 20: Results for MPEG-DASH 1MPD Live............................................................................. 89

Table 21: Results for MPEG-DASH 1MPD VoD ............................................................................ 91

Table 22: General evaluation overview showing the reached status in each pilot .................. 109

Table 23: Overview of the Dutch pilot objectives and results .................................................. 110

Table 24: Overview of the German pilot objectives and results ............................................... 112

Table 25: Overview of the Spanish pilot 1 objectives and results. ........................................... 112

Table 26: Overview of Spanish pilot 2 objectives and results. .................................................. 113

Table 27: Overview of Dutch Pilot Metrics ............................................................................... 114

Table 28: Overview of German Pilot Metrics ............................................................................ 116

Table 29: Overview of Spanish Pilot Metrics ............................................................................. 117


3. Introduction

This deliverable presents the evaluation results of the TV-RING project. It concerns the evaluations of all applications deployed in the German, Dutch and Spanish pilots.

In “Pilot Evaluation Results”, the more detailed results of all pilots are described, presented and discussed. Following this main part of the deliverable is a section on cross-pilot insights.

Finally, we present the high-level results across the entire project, to quickly assess the status of each pilot. Then, we present the high-level but pilot-specific results, providing a quick overview of the more specific findings in each pilot.


4. Action Log

[18/09/2015] – Leuven, Belgium – KU Leuven – Communicate approach and templates for T4.5

[21/09/2015] – Leuven, Belgium – KU Leuven – Provide the first version of the table of content

[09/10/2015] – All partners – Collect all pilot evaluation results in templates for workshop in Leuven

[09/10/2015] – Leuven, Belgium – All partners – Organize workshop about the evaluation results

[26/11/2015] – Leuven, Belgium – KU Leuven – Finalize table of content based on workshop outcome

[15/01/2015] – All partners – Provide all input into the deliverable

[25/02/2016] – Leuven, Belgium – KU Leuven – Integrate all input

[28/02/2016] – Spain – RTV (Cellnex Telecom) – Internal review of the deliverable

[29/02/2016] – Leuven, Belgium – KU Leuven – Integrate review feedback and finalize deliverable

[15/03/2016] – Barcelona, Spain – i2CAT – Final review deliverable

[01/04/2016] – Leuven, Belgium – KU Leuven – Final Update deliverable


5. Pilot Evaluation Results

This chapter provides more detail on the results of all the pilots executed in TV-RING. To gather all evaluation results, presentation templates were created. These templates helped us to normalize and structure the evaluation results coming from the different pilots. During a workshop on 9 October 2015 in Leuven, each pilot leader presented their results to all partners, after which a discussion or Q&A took place to clarify specific points. The structure of these presentations was later also used to create the table of contents of the deliverable and the structure of this section. Each subsection starts with a table that references the respective pilot action in D4.2, the origin of the data, and the moment of measurement. Then, the results are explained in detail. Finally, each subsection ends with a table presenting the main conclusions.

It is important to mention that there were three different phases in the project. We will also use these phases in our reporting:

Phase 1 (M13-M15): period before the first review

Phase 2 (M16-M24): period until the end of the foreseen pilot plan

Phase 3 (M25-M30): period in which more pilots were conducted

5.1. Dutch Pilot

The Dutch pilot focused on three different pilot applications. Firstly, we investigated how to offer online content using quality differentiation via DRM. After gathering the required data, and building the test application, an evaluation was carried out in December 2014. The application offered on-demand content in different qualities for several genres. The results of the evaluation provided us with insight into what people expect from online video catalogs.

Secondly, we studied how to offer in-house recommendations for HbbTV. Instead of following the approach of traditional linear TV, or the more common recommender approach that focuses mainly on the relation between people's preferences and the content items, we built an application that incorporated mood, group composition, genre, and time-related factors. Actions for evaluating these started in November 2014 and ended in May 2015.
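As an illustration of how such contextual factors could be combined, the sketch below shows a simple weighted scoring function. It is a minimal, hypothetical example: the factor model, names and weights are our own illustration, not the pilot's actual recommender implementation.

```typescript
// Hypothetical sketch: combining contextual factors into a recommendation score.
// Factor names, weights and the scoring model are illustrative only.

interface ViewingContext {
  mood: "relaxed" | "energetic" | "focused";
  groupSize: number;             // how many people are watching together
  hourOfDay: number;             // 0-23
}

interface ContentItem {
  title: string;
  genre: string;
  suitableMoods: string[];       // moods the item is considered a good fit for
  suitableForGroups: boolean;    // e.g. quiz shows work well with several viewers
  typicalViewingHours: number[]; // hours at which this genre is usually watched
}

function contextualScore(item: ContentItem, ctx: ViewingContext, genrePreference: number): number {
  const moodMatch = item.suitableMoods.includes(ctx.mood) ? 1 : 0;
  const groupMatch = (ctx.groupSize > 1) === item.suitableForGroups ? 1 : 0;
  const timeMatch = item.typicalViewingHours.includes(ctx.hourOfDay) ? 1 : 0;
  // genrePreference (0..1) would come from the household's viewing history.
  return 0.3 * moodMatch + 0.2 * groupMatch + 0.2 * timeMatch + 0.3 * genrePreference;
}

// Example: rank a small catalogue for a relaxed evening with two viewers.
const ctx: ViewingContext = { mood: "relaxed", groupSize: 2, hourOfDay: 21 };
const catalogue: ContentItem[] = [
  { title: "Nature documentary", genre: "documentary", suitableMoods: ["relaxed"], suitableForGroups: false, typicalViewingHours: [20, 21, 22] },
  { title: "Quiz show", genre: "quiz", suitableMoods: ["energetic"], suitableForGroups: true, typicalViewingHours: [20, 21] },
];
const ranked = catalogue
  .map(item => ({ item, score: contextualScore(item, ctx, 0.5) }))
  .sort((a, b) => b.score - a.score);
console.log(ranked.map(r => `${r.item.title}: ${r.score.toFixed(2)}`));
```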

Finally, seven second-screen applications were developed for specific TV shows. The first four were evaluated with users; the final three were completed too late in the project to be evaluated. Efforts were undertaken from December 2014 until February 2016. In these evaluations we focused on how to provide a more compelling and social experience for people collocated in the same room, making use of their second-screen devices.

5.1.1. Quality differentiation by using Digital Rights Management

Pilot Action 1.5

Data Source End-users, Technical Data

Measure Moment During, After

Pilot runtime 01/12/2014 – 24/12/2014


Participants 17

1.- Age Min 4 years old, max 65 years old, average between 35-55

2.- Sex 16 male, 1 female

3.- Home profile Mainly technical profiles such as engineers, IT, software.

Table 1: Participants of DRM test

We will present the results according to the questions used in the questionnaire (See ANNEX 1):

How many HD (high quality) programs did you watch on the test pilot application?

How many SD (standard quality) programs did you watch on the test pilot application?

Image 1: An overview of the number of participants and the number of SD and HD items they watched during the pilot.

The above results show that the panel was active and tried out the pilot application, but that not much activity was generated. We will find out why in the results that follow. The next part asked how participants evaluated the different aspects of this application. We inquired about the visual appeal, engagement, and the user experience. The last three questions were adapted from the IBM After Scenario Questionnaire (ASQ, a validated usability questionnaire1) and inquired about the core aspects of usability.

1 Lewis JR (1995) IBM computer usability satisfaction questionnaires: psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7:57–78.


Participants were asked to judge these aspects on a 5-point Likert scale: disagree, somewhat disagree, neutral, somewhat agree, agree:

Judge the following elements of the test panel application.

a) The application looks visually attractive.
b) I want to use this application again.
c) Overall, it was a nice experience.
d) Overall, this application was easy to use.
e) Overall, I’m satisfied with the time I needed to use the application.
f) Overall, I’m satisfied with the help information for the application.

Image 2: Overview of the usability and user experience aspects of the DRM application

We also wanted to know more about how participants valued such an application:

What did you think of the price per HD (high quality) item in the test application?


Image 3: Participants' opinion about the suggested price for the HD items, an indication of the value of the application

The above results are rather negative as a majority clearly found the price too high. In order to better understand why this is the case, the next question was asked:

Describe your experiences with the test pilot application? What did you like? What was not good? Why did you use the application a lot, or not that much?

Due to a technical issue the HD quality was reduced at participants' homes. Most participants therefore did not see much difference between the SD and HD items. A second issue for participants was that the content offering was too limited, certainly compared to what they already had access to in real life. Finally, a number of usability issues with regard to playback were mentioned. The interface in general responded quite slowly. The fast-forward and rewind functionality was quite cumbersome and did not support users in quickly navigating the timeline of a content item. In addition, a basic feature of current VoD systems was missing: the pilot application did not remember where the viewer had stopped when closing the application. This meant that people had to fast-forward again to that point the next time they wanted to resume watching. We illustrate these issues with participants' reactions:

“I did not find the offering very interesting and I already saw this episode about nature. I compared the SD and HD items but I couldn’t really tell that HD was better.”

“I found it really inconvenient that when you stopped the program (for example, to continue watching it tomorrow), it would start again at the beginning.”

“What is a pity, is that you can’t navigate further into the program more easily. You can make steps of 30 seconds, but when you then have to move forward in a movie you just have to keep pressing.”
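One of the missing features reported above, remembering the playback position, is straightforward to add in a browser-based HbbTV player. The sketch below is a minimal illustration (not the pilot's code) of persisting and restoring a resume point, assuming the terminal exposes the standard Web Storage API, which is not guaranteed on every HbbTV device:

```typescript
// Illustrative sketch only: persist and restore a resume point for a VoD item.
// Assumes window.localStorage is available on the device.

const RESUME_PREFIX = "resume:";

function saveResumePoint(contentId: string, positionSeconds: number): void {
  try {
    window.localStorage.setItem(RESUME_PREFIX + contentId, String(Math.floor(positionSeconds)));
  } catch {
    // Storage may be unavailable or full; resuming is a convenience, so fail silently.
  }
}

function loadResumePoint(contentId: string): number {
  const value = window.localStorage.getItem(RESUME_PREFIX + contentId);
  const seconds = value === null ? 0 : Number(value);
  return Number.isFinite(seconds) ? seconds : 0;
}

// Usage with an HTML5 video element: store progress while playing and seek
// back to the stored position when the item is reopened.
function attachResumeHandling(video: HTMLVideoElement, contentId: string): void {
  video.addEventListener("loadedmetadata", () => {
    video.currentTime = loadResumePoint(contentId);
  });
  video.addEventListener("timeupdate", () => {
    saveResumePoint(contentId, video.currentTime);
  });
}
```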

In order to investigate whether people would appreciate a service offering a VoD subscription based on the genres they typically watch, compared to the more standard options of paying per item or subscribing to a complete VoD package, we asked participants about their preference regarding these options after the DRM pilot:

Which formula for paying do you prefer?

Image 4: Participants' preferences for different types of service, showing that most prefer a personalized service

A follow-up question asked participants to explain in more detail why they preferred the option they selected. From Image 4 we can see that most participants prefer a subscription over paying per item. Participants' reasons are that, if they had to pay per item, they would probably not make the effort to do so, and that they do not want the hassle of making a separate payment every month. Another reason for choosing a subscription is that people believe it will end up being cheaper than paying per item. A few responses from participants:

“I don’t want to be continuously reminded that I have to pay”.

“A fixed amount per month; a clear view on the costs.”

“Paying for each item will probably mean that I won’t watch it.”

Although personalization was preferred by most in the multiple-choice question about the three formulas, participants did not provide any motivation for this in the qualitative feedback.

Conclusion(s)

- All participants used the DRM portal, but usage was quite low. We identified the reasons why participants did not value the DRM portal:
  o The price for the VoD items was too high.
  o Due to a technical issue, the video quality at participants' homes was not as good as at the offices of NPO and PPG.
  o The amount of offered content was too limited.
- The DRM app was easy to use; other aspects such as visual attractiveness and available help information scored neutral. They were not bad, but they certainly did not stand out.
- There is a case for offering VoD packages based on personal preferences. Of the three formulas, participants preferred a subscription based on the genres they usually watch.

5.1.2. In-house recommendations for HbbTV and Cable TV apps

5.1.2.1. Recommender evaluation phase 1

Pilot Action 1.9

Data Source End-users

Measure Moment Right after

Pilot runtime 20/01/2015 – 17/02/2015

Participants 51

Although 51 participants were active and rated the recommended items on the test application, the gathered data was too sparse to draw any conclusions.

5.1.2.2. Recommender evaluation phase 2

Pilot Action 1.10

Data Source End-users

Measure Moment Right after

Pilot runtime 26/06/2015 – 24/07/2015

Participants 7

Similar to phase 1, too few participants were active, and we were unable to gather sufficient data. Seven participants indicated that they were willing to participate, and diaries (with envelopes for returning them) were sent to them. We received the data from only one participant, even after offering additional, substantial incentives.

5.1.3. HbbTV as a central interface for second screen competition


5.1.3.1. Nationale WetenschapsQuiz

Pilot Action 1.11

Data Source End-users

Measure Moment Right after

Pilot runtime 28/12/2014

Participants 9

1.- Age Min 4 years old, max 65 years old, average between 35-55

2.- Sex 90% male; 10% female

3.- Home profile Student, retired, manager, technical developer, consultant, teacher, engineer, journalist, unemployed

Table 2: Participants Nationale Wetenschaps Quiz

For the evaluation of the Nationale WetenschapsQuiz, a comparison was made between the regular show, the version with Philips Hue lamps, and the version with the HbbTV app. We used the Social Presence in Gaming Questionnaire (SPGQ) (de Kort et al., 2007) (see ANNEX 2). In this section we have left out the data for the Philips Hue lamps for clarity. In the SPGQ we have the following categories: Psychological Involvement – Empathy, Behavioral Engagement, and Psychological Involvement – Negative Feelings. For each item, participants were asked to indicate the extent to which they agreed with the presented statements on a 5-point Likert scale: strongly disagree; disagree; neither agree nor disagree; agree; strongly agree. Furthermore, they were asked to do this soon after having used the second screen app during the show. Three of the nine participants using the HbbTV version experienced technical difficulties.

The basic quantitative, statistical results can be found in the following table. These results only include the households in which at least two people participated (135). The numbers in the table represent the participants' answers as follows: Not at all (0) – slightly (1) – moderately (2) – fairly (3) – extremely (4). As stated earlier, with 5 participants in one condition, it is very difficult to draw hard conclusions from these results.

Viewing experience | Mean Empathy | Mean Negative Feelings | Mean Behavioural Involvement | N
HbbTV | 2,0 | 1,3 | 1,6 | 5
Normal | 2,2 | 1,1 | 1,3 | 45

Table 3: Overview Quantitative Results
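For reference, per-category means like those in Table 3 can be reproduced with a simple aggregation over the coded responses (0 = not at all … 4 = extremely). The snippet below is an illustrative sketch of such an aggregation, with made-up example values; it is not the actual analysis tooling used in the project.

```typescript
// Illustrative sketch: compute per-category mean scores from coded SPGQ responses.
// Response values are assumed to be coded 0 (not at all) to 4 (extremely).

type Category = "empathy" | "negativeFeelings" | "behaviouralInvolvement";

interface Response {
  participant: string;
  condition: "HbbTV" | "Normal";
  category: Category;
  value: number; // 0..4
}

function meanPerConditionAndCategory(responses: Response[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const r of responses) {
    const key = `${r.condition}/${r.category}`;
    const entry = sums.get(key) ?? { total: 0, count: 0 };
    entry.total += r.value;
    entry.count += 1;
    sums.set(key, entry);
  }
  const means = new Map<string, number>();
  for (const [key, { total, count }] of sums) {
    means.set(key, total / count);
  }
  return means;
}

// Example with two made-up responses (for illustration only):
const example: Response[] = [
  { participant: "p1", condition: "HbbTV", category: "empathy", value: 2 },
  { participant: "p2", condition: "HbbTV", category: "empathy", value: 3 },
];
console.log(meanPerConditionAndCategory(example)); // Map { "HbbTV/empathy" => 2.5 }
```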

Some of the qualitative results indicate that, when it works, participants enjoy the experience:


“Works really cool”, “Great initiative!”, “Nice episode”

“I found it funny, but the added value is limited. You are in front of the TV and you discuss the answers and your thoughts with each other anyway. The most important is that the score is kept.”
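Both the central-interface concept and this feedback rest on the TV application acting as the shared scoreboard for everyone playing along on a second screen. As a purely illustrative sketch (not the pilot's actual implementation), a minimal score relay between companion devices and the TV app could look as follows, assuming a Node.js server with the 'ws' WebSocket package:

```typescript
// Illustrative sketch only: a central relay that keeps the per-player score and
// pushes the scoreboard to every connected client (the HbbTV app and the
// second-screen devices). Assumes the 'ws' npm package.
import { WebSocketServer, WebSocket } from "ws";

interface AnswerMessage {
  player: string;
  questionId: string;
  correct: boolean;
}

const scores = new Map<string, number>();
const wss = new WebSocketServer({ port: 8080 });

function broadcastScoreboard(): void {
  const scoreboard = JSON.stringify({ type: "scoreboard", scores: Object.fromEntries(scores) });
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(scoreboard);
    }
  }
}

wss.on("connection", (socket) => {
  socket.on("message", (data) => {
    // Each second-screen device reports whether the player answered correctly.
    const msg = JSON.parse(data.toString()) as AnswerMessage;
    if (msg.correct) {
      scores.set(msg.player, (scores.get(msg.player) ?? 0) + 1);
    }
    broadcastScoreboard();
  });
  // Send the current scoreboard to newly connected clients (e.g. the TV app).
  broadcastScoreboard();
});
```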

5.1.3.2. De Rijdende Rechter

Pilot Action 1.12

Data Source End-users

Measure Moment Right after

Pilot runtime 23/03/2015 + 30/03/2015

Participants 6 (online questionnaire) – 2 (online interviews)

1.- Age Min 4 years old, max 65 years old, average between 35-55

2.- Sex 6 male; 2 female

3.- Home profile Mainly technical, some retired.

Table 4: Participants Rijdende Rechter

For the evaluation of ‘De Rijdende Rechter’ we based our questionnaire largely on the items of the Social Presence in Gaming Questionnaire (SPGQ) (de Kort et al., 2007). We augmented this questionnaire with a number of additional questions. All items cover 7 categories. From the SPGQ we have the categories: Psychological Involvement – Empathy, Behavioral Engagement, and Psychological Involvement – Negative Feelings. For each item, participants were asked to indicate the extent to which they agreed with the presented statements on a 5-point Likert scale: strongly disagree; disagree; neither agree nor disagree; agree; strongly agree. Furthermore, they were asked to do this soon after having used the second screen app during the show.

Psychological Involvement – Empathy

1. I empathized with the other(s).
2. I felt connected to the other(s).
3. I found it enjoyable to be with the other(s).
4. When I was happy, the others were happy.
5. When the others were happy, I was happy.
6. I influenced the other’s mood.
7. I was influenced by the other’s mood.
8. I admired the other(s).


This first category of items mainly inquired about the positive feelings that might arise during the use of the application. Behavioral Engagement focuses not on feelings, but rather on the actions between the people in the room during the use of the application.

Behavioral Engagement

1. My actions depended on the other’s actions.
2. The other’s actions were dependent on my actions.
3. The other paid close attention to me.
4. I paid close attention to the other.
5. What the others did affected what I did.
6. What I did affected what the others did.

The final category related to the SPGQ is Psychological Involvement – Negative Feelings. In contrast to Psychological Involvement – Empathy, these questions focus on the negative feelings related to the social aspects during use.

Psychological Involvement – Negative Feelings.

1. I felt jealous of the other.
2. I felt revengeful.
3. I felt schadenfreude (malicious delight).

Because the SPGQ contained items for negative feelings related to the social aspect during use, we added a number of items to inquire about the negative individual feelings and positive individual feelings.

Individual – Negative Feelings

1. I was sorry.
2. I was ashamed.
3. I felt bad.
4. I felt guilty.

Individual – Positive Feelings

1. I felt invigorated.
2. I saw it as a victory.
3. I felt activated.
4. I felt satisfied.
5. I felt powerful.
6. I felt proud.

Next, we included items to assess the perceived usefulness.


Usefulness

1. I found it a waste of time.
2. I found that I could have done more useful things.

Two more items were added to investigate whether participants found the second screen app to be distracting from the show.

Distraction

1. By playing the game I had the feeling that I was less focused on what happened in the program.

2. In order to follow the program, I sometimes could not pay sufficient attention to the game in order to get a good result.

Image 5: Results relating to Psychological Involvement – Empathy

In the following sections we will present and discuss the results, starting with Psychological Involvement – Empathy (see Image 5). The bars in the graphics represent the mean of all participants' reactions to each statement. Overall, we noticed that the results are neither strongly negative nor strongly positive. The item that scored the highest (3.67) indicates that participants really enjoyed being together with the other people in the room during the program. Somewhat related to this is the item “I felt connected to the other(s)”, showing that the social aspect is really valuable in such formats. The other items mainly pertained to the fact that people can influence each other’s moods. This did not seem to be the case during this evaluation.

To recap, the Behavioral Engagement items focus on the extent to which participants’ actions are interdependent. The results in Image 6 show that this was not the case. However, it seems that participants were paying attention to each other while playing along with the show.


Image 6: results relating to Behavioural Involvement

The next aspect we evaluated is Psychological Involvement – Negative Feelings (see Image 7). The results for this construct are quite low, which is positive for the evaluation. Schadenfreude scores a little higher, but this aspect might not be that negative: some comments made by participants about the Eurosong application illustrated that it can be considered a positive element in social games, with people teasing each other in a playful way. We continue with the results for the negative individual feelings (see Image 8). These are even lower than the negative, more socially oriented feelings, which is again a positive result. However, the positive individual feelings do not score very high either (see Image 9). This might be due to the somewhat strange formulation of the items.

Image 7: Results relating to Psychological Involvement - Negative Feelings


Image 8: Results relating to Individual - Negative Feelings

Image 9: Results relating to Individual - Positive Feelings

Finally, we come to the more usage-related constructs. The results in Image 10 show that playing along with the second-screen format is found to be quite useful. The results in Image 11 indicate that there are some issues of distraction; this is a sign that the design of the application should address possible distraction more actively.

Image 10: Results relating to Usefulness


Image 11: Results relating to Distraction

When we review all the quantitative results, we conclude that there is indeed a positive, social experience when playing along with the second-screen concept of ‘De Rijdende Rechter’. Participants also found that using such a second-screen application for these kinds of programs is very useful. The main issue that needs further investigation is how to design such applications so they are fully complementary to the program on TV and do not distract the viewer/user too much.

In order to gain a better understanding of participants' experiences we invited the user panel to an online interview. Although only two participants replied, we learned a little more about the use of the second-screen application for ‘De Rijdende Rechter’.

A first participant was a 43-year-old man, living with his girlfriend and son. He occasionally watched ‘De Rijdende Rechter’ to relax, and for the test used a laptop as second screen. One technical issue he experienced was that the avatars did not appear on his television. He would really have liked that:

“…that would be very interesting”

Another feature he would like to see is what his fellow countrymen think about the statements presented in the second-screen application during the show. When asked about the possible distraction from the show due to the use of a second-screen application, he said that he did indeed focus less on the program. But when asked more about this, he said:

“On a laptop you easily get distracted, to read your email for example, so your attention toward the television can easily go away. Maybe that’s not the same when using a tablet or smartphone.”


So it is not the second-screen application itself that is distracting; rather, the fact that on laptops people are more likely to switch to other applications than on tablets or smartphones causes the distraction.

A second participant, also a man living with his girlfriend, had used second screens since the consumer program ‘Kassa’ on the channel NPO3 offered this possibility. They occasionally viewed ‘De Rijdende Rechter’. They watched most episodes of ‘Kassa’, one episode per week, and they had also participated in the ‘IQ Test’. For both ‘De Rijdende Rechter’ and ‘Kassa’ they liked the feature that allows you to give your opinion and, related to this, to see what other people think. They would really like to see the latter implemented. Moreover, they would also like to go deeper into these opinions, perhaps with an option to give more context around your opinion using keywords. In addition to being interested in their countrymen’s opinions, the statements also caused them to discuss their own opinions between the two of them. Because they had used the app for ‘Kassa’, they expected this one to be built in the same way. They experienced some user interface issues: the time limit for providing an answer to the statements was not clear. With regard to the possible distraction due to the use of a second screen, this participant responded:

“While you have to think about the statements, your mind is occupied of course, so you are a bit distracted from the show, but it’s not much.”

Also for this participant, the use of a second screen as such does not cause distraction from the show. Rather, the activity of having to think about the statements may take the viewer’s attention away from the show.

Conclusion(s)

- The quantitative results show that participants enjoy the social experience offered by the second-screen application for ‘De Rijdende Rechter’. In addition, they also found the app very useful. A possible point of improvement is the fact that participants were somewhat distracted at times. The qualitative results provided some more insight into this matter.

- Expectations towards the ‘De Rijdende Rechter’ second-screen application were influenced by other second-screen applications the participants had used before. Similar functionality and UI design therefore support ease of use.

- Tablets are better suited for second-screen applications: when people use laptops, they report that they more easily open other applications and get distracted from the show.

- A key feature for certain types of programs is to show comments from other households. In this case, this means allowing people to view why others were for or against a certain statement in the show. People are very curious to see where they stand in relation to their countrymen on the statements, and what their motivations are.

- Effective user interfaces are crucial, even more so since people often have little time to provide answers. Timely notification that input will be required from participants is also important.


5.1.3.3. Eurovision Song Contest

Pilot Action 1.13 & 1.14

Data Source End-users

Measure Moment Right After

Pilot runtime 23/05/2015

Participants 13

1.- Age Average 34,6y; between 25-44y

2.- Sex 8 male; 5 female

3.- Home profile Mainly technical profiles. 5 people from Spain, 8 from The Netherlands.

Table 5: Participants Eurosong

For the evaluation of the Eurosong Contest application we formulated 18 questions, relating to demographics, Eurosong familiarity, usability, user experience and social experience. 10 participants completed the questionnaires. A 5-point Likert scale (from strongly disagree, disagree, neither agree nor disagree, agree, to strongly agree) was used for the responses to the following statements:

Overall, I am satisfied with the ease-of-use of this application.

Overall, I am satisfied with the user experience of this application.

The application provided a strong social experience.

The application provided a strong sense of competition. I really wanted to win.

The application resulted in many discussions and conversations.

The application distracted me too much from the show.
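To illustrate how answers on this scale translate into the per-statement participant counts shown in the result charts further below, a minimal tabulation sketch in Python follows; the response records, statement keys and helper function are illustrative assumptions and not part of the pilot software.

# Minimal sketch: tabulating 5-point Likert answers into counts per statement.
from collections import Counter

SCALE = ["strongly disagree", "disagree",
         "neither agree nor disagree", "agree", "strongly agree"]

# Hypothetical responses: one dict per participant, mapping statement -> answer.
responses = [
    {"ease_of_use": "agree", "social_experience": "strongly agree"},
    {"ease_of_use": "strongly agree", "social_experience": "agree"},
    {"ease_of_use": "neither agree nor disagree", "social_experience": "agree"},
]

def tabulate(responses, statement):
    # Number of participants per scale point for one statement.
    counts = Counter(r[statement] for r in responses if statement in r)
    return {point: counts.get(point, 0) for point in SCALE}

for statement in ("ease_of_use", "social_experience"):
    print(statement, tabulate(responses, statement))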

In order to gain more insight into the above quantitative questions, we also included a number of open questions relating to a general evaluation of the application, the user experience and the usability:

What can improve the application? Which elements can add something extra to the experience?

For which kinds of programs would you like to see more such applications?

Which elements of the application did you find easy to use? Briefly explain why.

Which elements of the application did you find hard to use? Briefly explain why.

Which elements of the application improved the experience? Briefly explain why.

Which elements of the application worsened the experience? Briefly explain why.


We will discuss the results of each aspect separately, starting with the ease of use. The quantitative results are shown in Image 12. They show that a large majority is satisfied with the usability of the application. Looking at the qualitative results, participants reported that voting, logging in and playing were easy to do. The difficulties concerning the interaction lay in the uploading of the profile picture and the speed at which the questions were posted on the second screen; some participants were surprised by that. Some comments:

“The start up was easy, although I couldn’t upload a photo. It was overall easy to use.”

“Playing (answering question) was quite easy. Elements responded quickly to interaction and the login process was also straightforward.”

“Maybe starting with better instructions. It was hard to understand them and connect all devices.”

“The questions came too sudden. Therefore, I missed a few.”

Image 12: Results related to the ease-of-use

The user experience was also very positive for most participants (see Image 13): 7 out of 10 users were positive or very positive about it. The other participants experienced some technical issues with the application and could therefore not fully enjoy or use the new format. Related to the user experience were the social experience and the sense of competition, for which the results are shown in Image 14, Image 15 and Image 16.


Image 13: Results related to the user experience

Image 14: Results related to the social experience


Image 15: Results related to social interaction

Image 16: Results relating to the sense of competition

When we analyze the above results we can partly explain why participants appreciated the Eurosong Contest application. The experience was very social, and the competitive element made it more compelling. People also appreciated seeing what the other household had voted for; it opened a lot of opportunities for discussion and laughter. A negative point was that the whole experience was quite long, but this is due to the format of the Eurosong finals. Furthermore, the technical difficulties some users experienced, and the questions that, for some, arrived too quickly to answer, detracted from the experience.

Given the above results, we also looked at which genres participants believe have the most potential for such second-screen experiences. They suggested sports, quizzes and game shows, and reality shows such as ‘So You Think You Can Dance’ and ‘The Voice’.


Finally, a research question that we would like to see answered, relevant to most second-screen formats such as this one, is whether the second screen attracts so much attention that it distracts people from the show itself. The results in Image 17 show that this does not seem to be the case. Nevertheless, we believe this will remain a point of attention for the design of any second-screen format.

Image 17: The second screen application did not appear to distract too much from the show itself

Conclusion(s) - The majority of the participants was positive about the ease-of-use of the application.

- People greatly enjoyed the social gaming experience.

- The inclusion of a competitive element caused a compelling experience.

- Participants also had the feeling there were more discussions and conversations using this format.

- People did not feel that the second screen app distracted from the show.

5.1.3.4. Een tegen 100

Pilot Action 1.15

Data Source End-users

Measure Moment Right after

Pilot runtime 14/06/2015 + 21/06/2015

Participants 13 (3 valid responses)


1.- Age Average 52,6y; between 41-63y

2.- Sex 3 male

3.- Home profile Mainly technical, some retired.

Table 6: Participants Een tegen 100

For the evaluation of the quiz show ‘Een tegen 100’ and its second-screen application, we constructed an online questionnaire that inquired about participants’ views on the usability, user experience, social aspects, competition, distraction from the main program, and their expectations. Unfortunately, only three male participants between 41 and 63 years old completed the questionnaire afterwards. Therefore, we cannot rely very heavily on these results.

Here is an overview of the questionnaire items for the first set of user related aspects:

a) I found this app easy to use.
b) I enjoyed using this app.
c) Using the app caused more conversation and discussion in the living room.
d) Playing the app increased the sense of competition.
e) I was completely absorbed by playing the app.
f) It was hard to play the game and simultaneously follow the program.
g) I have the sense that I have a better understanding of the program.
h) I don’t need such an app. I can enjoy the program on its own.
i) The second screen app exceeded my expectations.

The answers are shown in Image 18. What is clear is that the application is easy to use: all three participants gave ease of use the highest possible score. Two of the three participants enjoyed using the app, while the third was neutral. For the other aspects the results diverge too much to draw any conclusions. We asked a number of follow-up questions:

Was it clear that you should play against your roommates?

Did it provide enough added value for you?

What did you enjoy in the app?

What could be done better?

All participants answered ‘Yes’ to the first question, so the app clearly communicates its purpose and concept to the users. The answers regarding the app’s added value were moderately positive: 2x ‘a little bit’ and 1x ‘Yes’. Participants appreciated the timing and the feedback about the other players’ results. Room for improvement: adding a ranking so players can compare themselves to the others, and providing more time for people to start up the application.


Image 18: User-related aspects for the evaluation of 'Een tegen 100'

Conclusion(s) - The application was easy to use, and participants clearly understood its concept, namely that they were supposed to play against the other members of the household.

- For the other aspects it is unclear what the outcome is, due to the low number of participants.

- From the evaluation comments we note that a ranking is important, that providing enough time for participants to answer the statements is necessary, and that the timing information provided for the quiz was good.

5.1.3.5. Nationale BN-er Quiz

Pilot Action 1.16

Data Source Technical Data

Measure Moment During

Pilot runtime 05/02/2016

Participants 509

The Nationale BN’er Quiz is a quiz in which players guess famous Dutch people (BN’ers). As this pilot application was run toward the end of the project, we do not have much data (see Table 7).


Total viewers TV program 166.000

Notifications shown 7.844

Regular second screen players 6.000

Red button clicks 509

Players of the app 8

Table 7: Nationale BN'er Quiz technical results
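The discrepancy discussed in the conclusion below can be expressed as conversion rates along the funnel from notifications to red button clicks to actual players. A minimal Python sketch using the figures from Table 7 (the variable and function names are ours, not part of the pilot software):

# Minimal sketch: conversion rates along the funnel of Table 7.
funnel = {
    "notifications_shown": 7844,
    "red_button_clicks": 509,
    "players_of_the_app": 8,
}

def rate(numerator, denominator):
    # Conversion rate as a percentage, guarded against division by zero.
    return 100.0 * numerator / denominator if denominator else 0.0

print("red button clicks per notification: %.1f%%"
      % rate(funnel["red_button_clicks"], funnel["notifications_shown"]))  # about 6.5%
print("app players per red button click:   %.1f%%"
      % rate(funnel["players_of_the_app"], funnel["red_button_clicks"]))   # about 1.6%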

Conclusion(s) - The discrepancy between the number of red button clicks and actual participants is still large. Because we did not plan a user evaluation in the final month of the project, it is impossible to know the reason behind this discrepancy.

5.1.3.6. The 90’s Test

Pilot Action 1.17

Data Source Technical Data

Measure Moment During

Pilot runtime 12/02/2016

Participants 552 (red button clicks) – 3 participants

The 90’s Test is another second-screen quiz format for HbbTV around the theme of the nineties. As this pilot application was run toward the end of the project we do not have that much data (see Table 8).

Potential reach (devices) 150.000

Notifications shown 10.000

Red button clicks 552

Players of the app 3

Table 8: The 90's Test technical results


Conclusion(s) - The discrepancy between the number of red button clicks and actual participants is still large. Because we did not plan a user evaluation in the final month of the project, it is impossible to know the reason behind this discrepancy.

5.1.3.7. De Verknipte Dierentest

Pilot Action 1.18

Data Source Technical Data

Measure Moment During

Pilot runtime 19/02/2016

Participants 617 (red button clicks) – 9 participants

Yet another quiz format for HbbTV. As this pilot application was run toward the end of the project we do not have that much data (see Table 9).

Potential reach (devices) 150.000

Notifications shown 10.540

Red button clicks 617

Promo banner presses 39

Players of the app 9

Table 9: De Verknipte Dierentest technical results


Conclusion(s) - The discrepancy between the number of red button clicks and actual participants is still large. Because we did not plan a user evaluation in the final month of the project, it is impossible to know the reason behind this discrepancy.

- The extra promotional banner does generate more clicks to the application. Unfortunately, it does not yet result in more actual players.

5.1.4. Conclusions for the Dutch pilot

The Dutch pilot focused on three areas: Quality differentiation using Digital Rights Management (DRM), In-house recommendations for HbbTV and Cable TV apps (Recommender), and HbbTV as a central interface for second screen competition (2nd screen).

- For the online DRM application, we found out more about how content can be differentiated successfully in video-on-demand platforms. Our results clearly indicate the opportunities for on-demand products that are differentiated based on quality, and that take into account the viewer’s preferences and viewing patterns. One criterion that is difficult to realize in practice is a very complete library of content that offers every possible TV show as soon as it is released anywhere; this is not in the hands of one stakeholder but of many stakeholders in the industry. Our app was easy to use but did not stand out in terms of visual attractiveness.

- In the second area we constructed a novel way of looking at recommendations. We incorporated the historical household usage, and factored in elements such as mood, group composition, genre and time-related aspects. The main input for the constructed algorithm came from diary studies and workshops with users in the first part of the project. Due to the low participation we could not further validate this approach. Nevertheless, the insights obtained around the role of mood, group composition, genre and time did resonate with the participants in the study. This represents a novel approach toward recommender systems, as current approaches focus largely on the relation between personal taste/preference and content items. Our approach was also published (Vanattenhoven & Geerts, 2015). This new approach is essential because the current broadcast model is mainly based on linear viewing schedules. Therefore, efforts on this topic were merged inside NPO with an ongoing research project, “Linear broadcast programming schemes provide valuable data for video recommendations”.

- In the third and final area, we explored and evaluated the design of second-screen applications that are to be used in concert with a specific TV program. The goal here was to stimulate social interaction in the home. Via the evaluation of the first 4 applications, we have indeed established that most participants enjoyed these new formats, especially the social experience they provide. An important element that ensures a more active experience is competition: people get more engaged when they are playing against each other. Keeping scores was therefore also an important feature for such applications, as mentioned by the participants themselves. We also found that the app did not seem to distract participants too much from the show. As household members currently (often) own personal devices, and might be distracted from the TV because of this, such second-screen applications that are dedicated to the TV show can bring people’s attention back to the show.

The above results were obtained in sometimes difficult circumstances; for some evaluations user participation was quite low, despite several updates to the user panel and separate, substantial incentives. Especially for the recommender part this was problematic. Possible reasons for this are bad luck (in our experience, response to recruitment is often somewhat difficult, very rarely overwhelming), and perhaps trying to evaluate too many applications (11 different applications or versions of applications, some of them across several subsequent episodes of the TV show).

5.2. German Pilot

In the German pilot, three applications were developed and evaluated. The first one is verknallt & abgedreht and this pilot ran in three phases between November 2014 and August 2015. This concept focused on a younger audience and investigated how such communities can be served during the broadcasts, but also in between broadcasts, via several HbbTV features.

The second application in this project was the Sandman program for children, a hybrid service for very young viewers between 3 and 7 years of age. For this application, too, three phases determined the runtime of the pilot. The evaluation ran between November 2014 and August 2015.

Finally, a different kind of application, namely an app store called “TV App Gallery”, offered a central place for HbbTV apps. Its evaluation was carried out in October 2015.

5.2.1. verknallt & abgedreht

Pilot Actions 2.9, 2.14, 2.28

Data Source End-users

Measure Moment During and after

Pilot runtime Phase 1: 17/11/2014 – 09/01/2015

Phase 2: 02/02/2015 – 27/02/2015

Phase 3: 12/08/2015 – 31/08/2015

Participants Phase 1: 5 + quantitative evaluation

Phase 2: 7 + quantitative evaluation

Phase 3: no qualitative evaluation

The evaluation of verknallt & abgedreht (“Abenteuer Liebe” was the working title and was changed for publication) uses quantitative as well as qualitative data. During the first two phases, which included a “plain” TV companion and a social media use case, qualitative feedback was gathered in group discussions with young people from the target group (for details cf. D4.2, chapter 6.1.1.2). Qualitative data were gathered in two lab surveys: 06 December 2014 in phase 1 and 13 February 2015 for phase 2. The Lab Survey 1 interview guide and the originally developed online questionnaire can be found in Annex 8. The online questionnaire for phase 2 was re-used as the Lab Survey 2 interview guide and can be found in Annex 9.

1.- Age Participant ages ranging between 9 and 13 years old.

2.- Sex Participant gender was 3 female and 2 male.

3.- Home profile Not available

Table 10: Lab Survey Phase 1

1.- Age Participant ages ranging between 11 and 15 years old.

2.- Sex Participant gender was 4 female and 3 male.

3.- Home profile Not available

Table 11: Lab Survey Phase 2

Quantitative data for all three phases were collected with the help of IRT, using the open source analytics tool PIWIK (http://piwik.org/) for measuring the usage of the HbbTV application and AKAMAI Analytics for measuring the use of video files.
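As an illustration of how such usage figures can be retrieved, the sketch below queries a Piwik instance for daily visit counts through its standard HTTP reporting API; the instance URL, site ID and authentication token are placeholders and not the actual project configuration.

# Minimal sketch: pulling daily HbbTV visit counts from a Piwik instance
# via the Piwik HTTP reporting API.
import requests

PIWIK_URL = "https://piwik.example.org/index.php"  # placeholder instance URL

params = {
    "module": "API",
    "method": "VisitsSummary.getVisits",  # daily number of visits
    "idSite": 1,                          # placeholder site ID
    "period": "day",
    "date": "2014-11-17,2015-01-09",      # phase 1 runtime (see table above)
    "format": "JSON",
    "token_auth": "anonymous",            # placeholder token
}

response = requests.get(PIWIK_URL, params=params, timeout=10)
response.raise_for_status()
daily_visits = response.json()  # e.g. {"2014-11-17": 123, "2014-11-18": ...}
print(daily_visits)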

Over all three phases verknallt & abgedreht had almost 24.000 visits on HbbTV alone (PIWIK figures), with a total of 215 GB of video traffic (AKAMAI analytics). The first phase was broadcast on Germany’s nation-wide children’s TV channel KIKA (Kinder-Kanal) in November and December 2014 (4 weeks), the second phase saw a re-broadcast on the regional level (Berlin and Brandenburg) during the winter holidays in February 2015 (2 weeks), and the third phase brought the 20 episodes back to the TV screen for another two weeks on the small national channel Einsfestival in August 2015.

Finding #1: Most visits when the program is not available on broadcast

In line with the actual reach of the individual TV channels, the first phase saw the most extensive use of the HbbTV application. The second phase had fewer viewers/users, but its peaks reached those of the first phase. The third phase had a smaller number of users, which is mainly due to a) the summer vacations and b) the fact that there are no shows for the younger audience in the usual schedule of Einsfestival, so that it is no great surprise that young viewers hardly watch this channel. The following diagrams give an overview of relative usage figures, as RBB’s internal rules do not permit publication of absolute figures and statistics.


Image 19: Development of user figures over the three phases of German Pilot 1

One main result of the closer evaluation of the figures was that the HbbTV application was used most on those days when the TV show was not on air, so that it can be assumed that it filled a gap and kept users connected with the show while it was not available on broadcast (see Image 20).

Image 20: Visits in relation to broadcast times (Phase 1)

During phase 1 the TV show was on air from Mondays to Thursdays, one episode a day. Image 20 above illustrates how visit figures peaked between Fridays and Sundays, when verknallt & abgedreht could only be found in the HbbTV application. The development in phase 2 supports this: The graph below shows that again usage figures peaked when the program was not on air.
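A sketch of the comparison behind this finding is given below: daily visit counts are split by whether an episode was on air that day, and the averages are compared. The visit numbers and dates used here are hypothetical placeholders, not the actual PIWIK figures.

# Minimal sketch: comparing mean daily visits on broadcast vs. non-broadcast days.
from statistics import mean

daily_visits = {  # hypothetical figures for illustration only
    "2014-11-17": 80, "2014-11-18": 75, "2014-11-21": 140, "2014-11-22": 160,
}
broadcast_days = {"2014-11-17", "2014-11-18"}  # days with an episode on air

on_air = [v for d, v in daily_visits.items() if d in broadcast_days]
off_air = [v for d, v in daily_visits.items() if d not in broadcast_days]

print("mean visits on broadcast days:     %.1f" % mean(on_air))
print("mean visits on non-broadcast days: %.1f" % mean(off_air))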


Image 21: Visits in relation to broadcast times (Phase 2)

The situation in phase 3 was slightly different in the sense that user figures did not vary as much as in the first two phases. The peak on weekends, however, was visible again.

Image 22: Visits in relation to broadcast times (Phase 3)

Finding #2: HbbTV Teasers raise awareness

Looking at the same dataset again from a different perspective we found a direct correlation between the air times of the HbbTV Teaser and the number of visits in the app.


Image 23: HbbTV Teaser (Phase 2)

The Teaser was visible for the complete period (in the case of Phase 2: 02.02., 0:00h - 13.02., 23:59h), i.e. also on days and at times when the program was not broadcast.

The following figures show how visit numbers started to rise as soon as the Teaser was put on air and fell remarkably soon after it was taken off.

Image 24: Visits in relation to Teaser times (Phase 1)

Due to a technical mistake the Teaser was only published from day 2 onwards; the graph shows that the number of visits to the HbbTV app then increased remarkably.


Image 25: Visits in relation to Teaser times (Phase 2)

In Phase 2 (see Image 25) the user figures dropped even more when the Teaser was switched off, whereas the drop was less obvious but still visible in Phase 3 (see Image 26 below).

Image 26: Visits in relation to Teaser times (Phase 3)

In all phases the HbbTV application was online beyond the broadcasting period and in all phases the Teaser was taken from the start panel right after the broadcast of the final episode (Episode 20), which explains the significant decrease of visit figures towards the end of the online phase.

Finding #3: Most featured, most watched

Video consumption figures unveil another remarkable success of teasers. Starting the verknallt & abgedreht widget from the ARD start panel, users would always be led to the Landing Page (see Image 27 below), which highlighted two videos at the left of the screen – top left: the most recent episode (in the screenshot Episode 20) and bottom left: the trailer video (here called “Best of”).


Image 27: Landing Page of verknallt & abgedreht (Phase 2)

As a result, the most recent episode was featured at this prominent place for 24 hours until the next video was made available. In Phase 1, shows were broadcast from Mondays to Thursdays (1/day, 4/week, thus 20 in 5 weeks) so that Thursday’s episode would be featured here for four full days – from Thursday after broadcast until Monday’s broadcast was finished and made available as on demand video. Looking at the video usage statistics, we found out two different things: (1) the on demand episodes were played more than the related videos (although they had also been available in the VOD archive while the related videos were not), and (2) those that had been featured on the Landing Page for the longest time (episodes 4, 8, 12, and 16) were also played most often (see Image 28 below).

The only exception to the rule was episode 1, which had been online for the longest time altogether, i.e. from day 1 (after broadcast) until the application was taken offline.


Image 28: Videos played in verknallt & abgedreht (Phase 1)

In Phase 2 the broadcast scheme changed. Two episodes were broadcast each day (Mon – Fri, for two weeks), so that (1) two episodes per day were made available on demand, and (2) it was always only the second episode of the day that was featured as ‘most recent’ on the Landing Page.

Consequently, episodes with even numbers were played more often than those with odd numbers and episodes 10 and 20 were played most, because they had been featured for the longest time; again with the exception of episode 1, for the reasons given above.

Image 29: Videos played in verknallt & abgedreht (Phase 2)


Only a few files were actually played in UHD; UHD-capable devices have only been trickling into the market slowly since December 2014.

Social Media Feeds are preferably followed on smart devices

The Social Media Feed feature was evaluated through AttrakDiff [2] questionnaires and qualitative group discussions after the young test users had watched two episodes on TV while using the Social Media Feed in parallel for some time. All in all, this feature was marked as neither good nor bad, but with a tendency towards “not interesting”. The graphic evaluation of the AttrakDiff questionnaire (see Image 30) underlines what was said in the group discussions: it is okay to see other people’s tweets or posts on the TV screen, but users who are really interested will use their smartphone for posting, so it would also be more natural to see this feed – which gathered posts from Facebook, Twitter and Instagram – on the smart device. In that case it would be nice if they could also post in the same app and did not have to switch between apps to either post or read.

Although several studies [3] showed that (especially young) users tend to use their smart devices in parallel to watching TV, the invited test users, by contrast, stated that it was relatively difficult to follow the feed and the program in parallel.

Finally, their disinterest also matched another, more general remark: they did not like reading on the TV screen (see 6.2.1.4).

[2] According to the website of the creators of the AttrakDiff evaluation method, “AttrakDiff facilitates the anonymous evaluation of a chosen product by customers, users, etc. The evaluation data enables us to gauge how the attractiveness of the product is experienced, in terms of usability and appearance and whether optimization is necessary.” For further information see: http://www.uid.com/en/services/portfolio/uid/attrakdiff-1.html

[3] Some studies about Second Screen usage during TV consumption: Nielsen Connected Devices Report Q3:13; the Connected Life Study of TNS Global in 2014 (summary online at http://connectedlife.tnsglobal.com/); Accenture Report 2015: Digital video and the connected consumer, summary online at: https://www.accenture.com/us-en/insight-digital-video-connected-consumer


Image 30: Reactions to the Social Media Feed

As the use of the Social Media Feed was very successful in the course of the dedicated program for the 25th anniversary of the Fall of the Wall in 2014, RBB will not discard it as generally uninteresting. The evaluation result is rather that this feature suits neither every kind of program nor every target group.

General Feedback from Face-to-Face Discussions with Users

We want to conclude the evaluation of the verknallt & abgedreht HbbTV app with a number of comments recorded during the group discussions with users from the target group. During Phase 2 – on 13 February 2015 – we had two parallel group discussions with seven boys and girls aged 10 to 15. After watching two episodes of verknallt & abgedreht with them - using the Social Media Feed feature only during one episode so as to check which option they would prefer – we handed out a short questionnaire and went through the questions together to make sure that the questions were understood correctly but filled in individually. The questions can be found in Annex 8.

Before and after the broadcast of the two episodes we had short discussions; in discussion #1 we asked them to tell a little about the ways they use media, especially TV and Internet. In discussion #2 we asked more questions about the program and the HbbTV app they had now seen and tested. Both discussions were guided along a simple questionnaire in order to record the answers but at the same time motivate the testers and enable them to give us information we may not have asked for explicitly. The main results of these discussions are summarized below:

Most of our guests had not heard of, let alone used, HbbTV before.

(Image 30 rates statements about the Live-Blog, such as “I like commenting on other people’s posts in the Live-Blog”, “Because of the Live-Blog I feel more connected with the program”, “The Live-Blog is a brilliant means of social interaction” and “Through the Live-Blog I can easily communicate with others”, on a scale from “Not at all true” to “Absolutely true”.)

After seeing the HbbTV app and playing around with it, their main and unanimous remark was that TV and interaction do not belong together: “When I come home from school or from training I just want to sit down and let the program run. I don’t want to play on the Remote Control!” This may be a question of acquired taste, but their main emphasis was on the lean-back attitude of watching TV.

After the few minutes they needed to get acquainted with the idea of navigating “a website” with a remote control, they had no difficulties finding related content or understanding the available features.

When they went through the available related content it soon became clear that a) video is the most attractive kind of extra content, and b) they did not want to read texts on the TV screen (“I have to read all day at school, on the smartphone and while doing my homework! TV time should be text-free time.”)

5.2.2. Unser Sandmännchen (Sandman)

Pilot Action 2.24

Data Source End-users

Measure Moment During

Pilot runtime 17/06/2015 – 31/10/2015

Participants quantitative evaluation

We used the prepared infrastructure of the TV-Ring pilot to evaluate another video-centered HbbTV app, although it had not been built with TV-Ring resources. The idea was mainly to digest the first evaluation findings, feed them back into the development process of this new app, evaluate it in the course of the TV-Ring pilot phase, and compare the results to see whether we had learned and reacted correctly. The main finding, namely that related content should not be text but video, perfectly suited the target group of the new app: Unser Sandmännchen – from now on called The Sandman in this document – is a hybrid service for very young viewers between 3 and 7 years of age. The complex structure and the abundance of related content of verknallt & abgedreht were reduced to a minimum: every day the most recent bedtime story is added to the app, and beyond that there are only three more videos, which are changed weekly. This simple structure makes understanding and navigating the app very easy.

There has not been a qualitative evaluation of the app or of the general interests of the target group (which clearly would include the parents of these young viewers); all of the following results were drawn from usage statistics. The first apps (Android, iOS, FireTV and HbbTV) were launched on 17 June 2015, and the device portal apps were launched in a second wave in August. The following statistics cover the whole period from 17 June to 31 October 2015.

Teasers are not the only way to push a brand


Image 31: Visitors of the TV-based apps between 17 June and 30 September 2015

Visitor statistics for the Sandman TV apps on the one hand underline the findings from 6.2.1.2, in that intensive use of teasers helped to spur a quick uptake of the apps during the first days (17-21 June). On the other hand, they seem to confirm the old saying that the quality of a product is its best advertisement. During the summer break visitor figures were good but showed little development, while after the end of the summer vacations the number of users increased without further marketing – the powerful campaign had focused on the first weeks after the launch. In essence, on-air teasers are powerful, but word of mouth seems to be even more effective.

Amazon FireTV has quickly become a player in the TV app market

The HbbTV app was launched in parallel with a tablet app (available for Android and iOS devices), apps for TV portals (Sony, LG and Samsung), and an app for Amazon Fire TV. The following evaluation will focus on the TV-oriented apps.

Image 32: Use of all available Sandman Apps

Image 32 shows that during October 2015 most users of the Sandman app family used the tablet apps (here simply called “Apps”, 59%), followed by the website (24%), while HbbTV had a good third rank (11%). Amazon Fire TV, which had only recently entered the German market, already had a remarkable 5% share, while only one in a hundred visitors (1%) came via a device portal app.

The success of the FireTV app is even more remarkable when comparing only the TV-based apps (see Image 33 and Image 34 below):

Image 33: Use of the TV-based Sandman Apps August 2015

Image 34: Use of the TV-based Sandman Apps October 2015

While in August the portal apps had only just been launched and the FireTV app had already gained a share of 22% (compared with 75.2% of visitors using the HbbTV app), in October the Amazon FireTV app had climbed to 30%, compared with 6% for the portal apps. HbbTV is still the strongest in this comparison, being the preferred access point of 64% of the TV viewers (here divided into the three broadcast channels through which the app was started: KIKA 35%, RBB 23% and MDR 6%; see Image 34).

Conclusions for the Sandman app


All in all, the Sandman apps have been a great success, and it can be assumed that they will remain available for many years, even if the preferred technologies might change and new ways of accessing them may be added over time.

5.2.3. TV App Gallery

5.2.3.1. Usability test in the lab

Pilot Action 2.18

Data Source Questionnaires / Usability Test

Measure Moment After

Pilot runtime September 2013 – February 2016

Participants 13

In order to gather feedback about the usability and accessibility of the portal, a small lab test with a questionnaire was performed with end users. The questionnaire was in German and can be found in Annex 9. There were no special selection criteria for end users.

1.- Age Participant ages ranged between 25 and 42 years old

2.- Sex 70% female, 30% male

3.- Home profile Mainly technical profiles such as engineers, IT, software. Also an accountant, construction engineer and clerk. All participants were from Germany.

Table 12: Participant information

The evaluation was carried out individually with each participant. For this purpose, a workplace was set up, equipped with an HbbTV-capable TV that had access to the TVAppGallery. Each participant was given a general briefing, depending on their level of knowledge about HbbTV and the traditional distribution channels of HbbTV apps. The interviews were conducted over three days (11 to 13 August 2015).


Image 35: Evaluation Environment

The evaluation started with the following task:

Open the TVAppGallery and add the HbbTV App “Pong” (App-ID: 372655) to “My Apps”.

After participants had completed the task, we posed the following questions:

Did you have any problems? If so, please describe them.
What could be improved?
What do you think about the concept of having access to any HbbTV application at a central location?
What do you find positive about the navigation, and what do you find negative?
What is your general impression of the TV App Gallery?
How useful, in general, do you think an open app portal for HbbTV is?
Do you use HbbTV at home as well? If so, what do you use?

(The questionnaire was in German, as only German-speaking participants were available.)


Image 36: End user evaluation

After all the interviews had been analyzed, the documented answers and comments of the interviewees were transferred into digital form (MS Word) and translated into English. The answers were printed out and grouped according to the related objectives, which were indicated in D4.1.1.

Image 37: TVAG end user data analysis


Objective #1: Importance

Do users feel the need for such an application portal?

Most of the interviewees reported that an open HbbTV portal is a good idea and important to have. They also gave different arguments for why they think it is necessary:

“…because you are independent from manufacturers and you have a big amount of different apps to choose.”

“…because it is the same structure (design…) at every TV set, isn't it?”

“…possible for developers to add apps here.”

Others wrote that the idea is good, but only if the TV program or security does not suffer:

“…as long as the aspects of (privacy and data-) protection and the quality of the broadcast program do not suffer.”

“…be aware of commercial influences!”

“…how do you control which apps could be offered through the portal? Maybe this is also a question of security?”

“…Is the rights management of the HbbTV Apps on TVs so far limited, that they are unproblematic in a safety-related way?”

Only one participant stated that the TVAppGallery is completely senseless.

Objective #2: Concept idea

What do users think about the portal idea?

All interviewees stated that the idea of having access to all HbbTV applications at a central location is very good. Two of them also mentioned that having to deal with all the different portals is impractical.

“Very good, because all the different portals need every time a short practicing.”

“Good! Better overview of all the applications (program-wide).”

“In principle a good idea, everything is at one place.”

Nobody thought the idea was poorly thought through or that anything was missing.

Objective #3: Structure/Design

Do users understand the portal structure?


All in all, users reported that the structure of the portal, including the main menu (categories at the top), is actually clear and simple at first sight. The problem is that the current state of the TVAppGallery is a prototype, and it therefore does not work reliably, so there were many complaints and issues.

“In principle it is clearly arranged. If you do some navigation or actions a precise menu navigation is missing. What key does effect what event? Here a help or hints would be nice.”

“Basically intuitive”

“Positive is the quick navigation via remote control.”

“I didn't find any remarks. At first sight it looks pretty clear and easy to handle to me.”

Some interviewees explained that they had difficulties with the menu navigation. It is sometimes confusing because there are too many different menus; it seems as if there is no real navigation concept.

“If you walk through the navigation, the content of the different items does not update itself automatically. The content does not change until you press the ‘ok’ button.”

“It should be possible to switch between categories faster; it is not very handy to press each time ‘ok’.”

“At least it is positive that you can choose some actions by using keys. But: there are remote controls without keys, how do you add Apps then?”

A further, often-mentioned gap is the fact that there is no return functionality. People therefore had to start from the beginning if they made an error during navigation.

“There is no return functionality. More information about the buttons would be helpful.”

“If you start an application from the TVAppGallery it should be possible to get back to the Gallery. It is not good to just get back to the broadcast program.”

The labelling, especially of the ‘key menu’, is also not meaningful and leads to confusion.

“Would be nice if labels have additionally icons. Labels could be more present.”

“The description ‘3. My Apps’ is not very meaningful. Also ‘5. Select’ is not good. Maybe it's better to have ‘3. add to My Apps’ and ‘5. Start’.”


“More precise labels.”

Objective #4: Usability

Do users feel comfortable with the portal?

A lot of participants reported that they would like to have a manual for the TVAppGallery, which suggests that using the portal was not easy for them. Problems were also detected during the task; here, some of the points explained in Objective #3 reappeared. Additionally, participants mentioned that there is no feedback, which leads to confused users.

“Response about successful/failed actions is missing.”

“Short success-message that the app was added to My Apps would be nice.”

“User Feedback: If you add the App, there is no positive response (like ‘added’).”

Another stumbling block was the search function. Several problems arose here: there are two different search methods available; sometimes people did not find any search method at all; the application crashed most of the time while searching; and sometimes people used a search method that did not suit their needs, because the found application started immediately.

“App-ID: If I found the app via ID, it would be good if there would be a context menu instead of starting the app directly.”

“App crashes a lot if you use the search function. It may be necessary to re-engineer the App.”

“Cross-category search function.”

“Search function is missing. Amendment: Found search function, but by accident. It should be positioned better.”

“If you search the app over ID you will find the app but it will just be opened. It is not possible to add the app to My Apps”

At least four participants had no problem with the task and wrote very positive comments.

“Yes, very easy. No problems.”

“It is intuitive. A response after pressing ‘3 for My Apps’ would be good.”

“Easy, if you entered the AppGallery and you found ‘Pong’ over the menu item ‘Games’.”

“It was easy to find the App but to add it was more difficult at the beginning. After a few tries I made it.”

Conclusion(s) Objective #1: Importance

A central location that organizes all available HbbTV applications is very important to generally promote access to the apps. Proprietary and broadcast-related portals have different structures, which always require a short learning period; this can be avoided by using a central portal. In order to design a successful portal, a simple and sophisticated management concept that also ensures safety is required - security in terms of privacy and data protection.

Objective #2: Concept idea

The concept and the idea of the TVAppGallery are very well received. People love the idea of having a wide range of HbbTV applications in one location. This is possible since it is an open portal and everyone can provide their applications, which is one of the biggest advantages but also one of the biggest challenges of the concept. The good thing is that a variety of apps could be offered. However, in terms of safety, the concept is not yet mature: it lacks verification mechanisms and ideas to control malicious applications. All in all, the idea is simple and obvious; in other areas, such as the mobile sector, apps are also managed effectively via centralized app stores.

Objective #3: Structure/Design

The structure and design of the TVAppGallery are clear and simple, but in need of improvement in many parts. The navigation concept is not well conceived: there are too many menus, labels are unclear, and this complexity leads to confused and unsatisfied users. In particular, the labelling of the ‘key menu’ is not meaningful and confuses users.

In many cases, users get stuck in an impasse and the only way out is the exit button. That means they leave the HbbTV area completely and have to start all over again.

Objective #4: Usability

Participants frequently reported that they would like to have a manual for the TVAppGallery, which suggests that using the portal was not easy for them. Problems were also detected during the task; here, some of the points explained in Objective #3 reappeared. In addition, they mentioned that the application gives no feedback, which caused confusion among the users. Another stumbling block was the search function, where several problems arose. First of all, there are two different search methods available, yet sometimes people did not find any search method at all. If they used the function they were looking for, the application crashed most of the time during the search. If they used the other search method, the found application started immediately and there was no way to get back again. At least some participants had no problem with the task, probably because they happened to pick the search method that suited it.

5.2.3.2. Discussion with professional stakeholders at important events

Pilot Action 2.19

Data Source Discussions with professional users

Measure Moment After

Pilot runtime September 2013 – February 2016

Participants Concrete information not available (difficult to keep track of across the many events that we participated in).

One of the goals of the TV-RING project was to promote the concept of the TVAppGallery and to find an appropriate partner for publishing it. To this end, we participated in several events, presented the open portal to a wide audience, tried to convince people of the concept, and looked for a way to publish it.

The method used for this task involved many face-to-face discussions, explaining the concept and the idea to professionals in the sector, such as representatives of TV manufacturers, national media regulators, experts from the broadcast and TV area, and visitors of trade fairs. This happened at trade fairs, appropriate events, meetings, etc. For each conversation the time, event, people involved and sometimes a short summary were recorded. Some of the work was carried out shortly before the project kick-off; nevertheless, the results are important for the project and are therefore summarized briefly in the table below.

Date Event/Meeting

Results before the start of the TV-RING project: The TVAppGallery was presented at several trade fairs such as the NEM-Summit, the “Medientage” in Munich, the C3-Conference at CeBIT, IFA and IBC. It was also presented at several events and meetings, such as the AG SmartTV of the “Deutsche TV-Plattform”; a great part of the visitors were broadcasters and manufacturers. It was also shown to attendees of the general meeting of the “Deutsche TV-Plattform” and in a meeting with the Bavarian State Chancellery and the Media Network Bavaria. Three well-known manufacturers expressed a more profound interest in the TVAppGallery, and IRT provided them with more detailed information. The Gallery was implemented as a prototype on a Kathrein box. Preparations were made for the creation of a consortium to put the TVAppGallery into operation (project description: technical, operative and financial aspects), brand protection for the Gallery logo was examined, and a technical specification for implementing the TVAppGallery on end devices was created. App developer companies like M.E.N. Media Entertainment Networks GmbH, ITSMYTV and Cellular provided apps for the TVAppGallery, as did broadcasters such as ARD, ZDF and arte.

24 September 2013: Mail discussion and information exchange with a large A-brand TV manufacturer. Their feedback was that they appreciated the additional information on the service.

October 2013: Presentation at the Münchner Medientage.

27 November 2013: Meeting with the previously mentioned A-brand TV manufacturer at IRT, about the TVAppGallery among other things.

Q4 2013 / Q1 2014: Definition of features for HbbTV version 2.0 via telephone conferences and e-mail. IRT was crucially involved in this process. The requirement to launch independent, not broadcast-related HbbTV applications from mobile devices was added.

18 December 2013: Presentation at the TV-RING meeting to project partners.

November 2014: Study by “Futuresource Consulting”: building up and maintaining a proprietary SmartTV platform is not a rewarding business model for TV manufacturers.

November 2015: HbbTV Requirements meeting about the standardization and further development of HbbTV.

Table 13: Time table and execution

The details of the conversations (time, event, persons involved and a short summary) were written up and grouped by stakeholder group.


Image 38: TVAG professional user data analysis

Manufacturers

Mainly the smaller TV set manufacturers were interested in the TVAppGallery; they had no desire to build their own portal systems. Based on their interest, IRT designed a technical directive for implementing the TVAppGallery in TV devices. Together with a set-top-box manufacturer, the TVAppGallery was implemented on an end device for testing purposes in June 2013. This prototype was presented in the subsequent months in several meetings and at trade fairs.

The larger manufacturers saw huge potential in portals for new business models and preferred to rely on their own proprietary solutions instead of HbbTV. Meanwhile, there could be a new opportunity for the TVAG: manufacturers are recognising that operating a portal is not as easy and profitable as they thought in the beginning, and they fear the upcoming portals and technologies from Google and Amazon. To counter such developments, they may consider merging portals. The 'SmartTV Alliance', which tries to harmonise browser profiles for manufacturers' app portals, is evolving in that direction, but this approach is not as open as that of the TVAG and it is not based on the HbbTV standard.

In the end, none of the major manufacturers showed sufficient interest to move the concept of the TVAG further into the market.

Broadcasters

Broadcasters were generally sympathetic toward the concept of the TVAG, although they are not the intended target audience on the service side. All of them can directly use the auto-start function to launch HbbTV applications from their broadcast services. While they are thus not dependent on something like the TVAG, they still like the idea of placing their services there. The possibility of broadening the usage of HbbTV by this means is also a positive aspect for them, but they would not be the main drivers for such a concept.

App Developer

Two app developer companies provided some HbbTV apps for the TVAppGallery, which was very useful, especially for demonstration purposes. During the discussions it turned out that all developers saw a huge advantage in the TVAppGallery for enlarging their business. At the moment they are dependent on broadcasters' needs: if they want to earn money, they have to implement the ideas and concepts of the broadcasters in the first place, and there is no opportunity to offer their own app ideas on the market. Therefore, they would very much welcome the opportunities provided by the TVAppGallery.

Standardization bodies

IRT worked actively on the new HbbTV standard version 2.0. In this process, a variety of features were compiled and discussed in many conference calls, mainly in Q4/2013 and Q1/2014. At the end of this phase, a set of key features was determined; these features became part of HbbTV version 2.0. IRT proposed and supported the idea of opening up the standard a little more, creating a possibility to make HbbTV more attractive for parties outside the broadcasting sector. With the help of the so-called 'companion screen' feature, developers are able to offer their HbbTV applications for TV sets without any access barriers. The feature was accepted and is now part of HbbTV 2.0 and ETSI TS 102 796 v1.3.1 (2015-10).

Conclusion(s) Most of the promotional work was carried out from late 2012 until the end of 2013. Within the TV-RING project we had planned to present and promote the TVAppGallery in 2013. Unfortunately, the project start was postponed to September 2013, but the work had already started. The timing of the promotional work therefore hardly fits the schedule of the project (evaluation phase), but this does not influence the results of the discussions we had with the various stakeholders.

Objective #1: Promotion
A lot of effort was put into promoting the HbbTV portal. The portal was demonstrated at trade fairs, events and meetings, and mentioned in several HbbTV presentations. The portal concept was discussed with many people, and the feedback was positive overall.

Objective #2: Publishing
Several stakeholders are excited about the opportunities the portal could offer, and almost everybody confirmed that they would love to use the portal for their HbbTV applications. The single problem of this concept is the responsibility for the portal: nobody wants to operate the TVAG and thereby take on that responsibility, because the fear of being ruined by lawsuits is too great.


5.2.4 Conclusions for the German Pilot

The German pilot was conducted at a time when HbbTV is already a mature technology and apps can enable many attractive features and new user experiences. But although basic features like HbbTV teletext, the interactive EPG or the VoD archive (Mediathek) are used by many German users every day, HbbTV is still not part of the daily routine of TV viewers, especially not of young viewers.

The described key results for "verknallt & abgedreht" and "Unser Sandmännchen" can be condensed into the following conclusions:

- Most visits happen before or after the actual broadcast of the related TV programme. In line with the actual reach of the individual TV channels, the most extensive usage slots of the HbbTV application in the first pilot phase match exactly the usage peaks of the second phase, be it on a daily or hourly basis. As the TV screen must be shared between the main broadcast video and the companion service, it is only logical for users to decide not to view the on-screen content at the same time as the programme.

- HbbTV notification teasers raise awareness. As soon as the application was advertised as an overlay, usage numbers increased remarkably in phases 1 and 2; when the teasers were switched off, usage numbers decreased again. One can conclude that viewers are mainly focused on the omnipresent broadcast and need to be made aware that there is additional content to use.

- The most featured content is also the most watched. As the video consumption figures show, video clips advertised directly on the landing page of the application achieve higher usage numbers than non-advertised ones. Users do not have to navigate deep into the dedicated content sections of the application, but can access content more easily when it is placed up front.

- Social media on the first screen is an area still to be explored. People rate social feeds on the main screen as not interesting enough; they still tend to participate in and consume social content on their individual devices. Also, following the programme by watching while reading the social feed in parallel is quite challenging.

- Young people are not well acquainted with HbbTV as a concept. They tend to be passive with the main screen and are not going to navigate through categories or other website-like interface structures with a remote control.

- In contrast, simple applications focused on video only can be a success, even for the youngest users. Easy ways to access and consume the desired video content, without the need to read navigation elements or other information, are key to success.

The high technological standard, which enables numerous features, gives broadcasters the chance to explore new territory and at the same time find out what users want or like by experimenting with all sorts of new features. All of the above findings will help us a lot in creating innovative and attractive apps and features for HbbTV.

Continuing with the TVAppGallery conclusions: this application is a system to provide an open marketplace for HbbTV applications. In general, HbbTV applications are tied to broadcast programmes and accessed through the 'red button' concept; the HbbTV versions up to 1.5 do not provide any technology to give access to third-party applications. The goal of the TVAppGallery is to open the HbbTV application market to non-broadcaster companies, whose only option otherwise is to buy into the proprietary app portals offered by device manufacturers. The TVAppGallery provides all parties with an efficient way to access SmartTV devices. The key results of the TVAppGallery pilot can be summarized as follows:

- The concept and the idea of the TVAppGallery are very well received in the evaluation. The TVAppGallery is an open application portal aiming at offering consumers a wider range of TV applications based on the HbbTV specification. The directory of applications can also be integrated into vendor-specific TV application portals, enabling a single UI for applications on a SmartTV.

- With the help of the TVAppGallery, any registered user can make new applications available. This openness of the approach is one of its biggest advantages: application providers can easily publish and test applications while potentially reaching a very large number of customers on a wide range of devices, and no re-adaptation of applications to specific vendor platforms is required. In the evaluation this was appreciated by both sides. At the same time, this openness brings substantial challenges for real-world deployments: a central instance is required to verify the quality, content and legitimacy of newly registered and updated applications. This involves substantial costs for operation, maintenance and supervision, and also poses a variety of legal and liability risks for the directory operator. A suitable provider would also need to be seen as neutral by application providers as well as manufacturers. Until now, no viable market solution has been found to solve this.

- Nevertheless, our evaluation results clearly confirm that it is perceived as being highly relevant to have such a publication and test portal independent of broadcast or manufacturer-dependent portals. Thus, the concept of the openly usable portal was well received amongst many different stakeholders, with clear positive feedback. However, potential operators of such a directory and portal service see the legal risks that may arise from the potential responsibility and ownership as being too high.

- Regarding the evaluation of the user interface for consumers as well as for application developers, it was noted that for a commercial operation, the navigation would need to be further simplified. In some instances, clearer feedback was expected and it was perceived that the search functionalities need improvements. The user experience is clearly key for the acceptance of such an application portal. However, the priority remains to solve the legal and commercial issues first.


5.3. Spanish Pilot

A sequence of tests has been carried out. First, four small-sample controlled pilots were executed from December 2014 to June 2015. The goal was to fine-tune the usability of the test application, sort out any outstanding technical issues, and obtain market data on the most attractive content for users, to be used in larger experiments. The content for these tests consisted of a singing contest show, a special news report on the Spanish local elections, and two football matches.

This first phase of controlled-sample pilot tests set the stage for the two large-scale open pilots, which involved thousands of users. A preparatory pilot was carried out in September, with news coverage of the Catalan national day rally, to ensure the successful deployment of the envisioned live pilots. The selected content for the live pilots were two FC Barcelona football matches in September and November 2015. Detailed data was obtained on the patterns of usage of the application, technical performance parameters and several metrics of user satisfaction. An in-depth analysis of these data has yielded rich information on HbbTV market penetration potential, models of user segmentation and clustering, and analytics on the user’s behavior in real-life settings.

Finally, three batteries of in-lab technical tests were carried out in November and December 2015. These tests were designed to test the performance of the TVC application across a range of TV devices, to evaluate the performance of the MPEG-DASH encoder, and to determine experimentally the impact of hardware performance on latency and user satisfaction.

5.3.1. TV3 a la carta multicamera

5.3.1.1. Oh Happy Day! Program

Pilot Action 3.8, 3.9 & 3.18

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

Several research methods were combined in this first pilot action. Data was obtained via a mix of technical logging, direct email contact for technical issues, an online questionnaire (see ANNEX 10), contextual interviews, and automatic video analysis to verify the user panel’s active participation.

Data analysis and discussion of results

For optimal data visualization, the analysis of the data is presented in the aggregate at the end of this chapter. The insights generated specifically at this research action, which were used to enhance the HbbTV application in this iteration, are presented in the conclusions section below.

Conclusion(s) - A total of 20 test TV devices were successfully installed in the user panel households. The user panel members were able to participate in the piloting activities with the TVs.

- Research data was obtained on the initial impression of the users with the application. This impression was judged to be positive, with a high level of interest in the proposed services.

- A number of technical issues were detected with the application, mostly with the performance of the multi-camera service.

- The System Usability Scale (SUS) score was calculated for the test application (see footnote 4). The SUS obtained was 69 on the 0-100 scale, which can be considered a good initial score and a good basis for the iterative improvement of the app during the course of the pilot.

- On a less positive note, the content offered through the application was not considered to be of much interest. More precisely, the multi-camera functionality was not deemed to add much value to the program 'as it is'. Other program choices will have to be explored in further tests to determine the most suitable programs.

5.3.1.2. Champions League Quarter-Final Match: FC Barcelona – Paris Saint-Germain

Pilot Action 3.10

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

The data for this pilot action was collected via technical logging of page requests and an online questionnaire (see ANNEX 11).

Data analysis and discussion of results

For optimal data visualization, the analysis of the data is presented in the aggregate at the end of this chapter. The insights generated specifically at this research action, which were used to enhance the HbbTV application in this iteration, are presented in the conclusions section below.

4 SUS yields a single number representing a composite measure of the overall usability of the system being studied. Note that scores for individual items are not meaningful on their own. SUS scores range from 0 to 100. To calculate the SUS score, first sum the score contributions from each item. Each item's score contribution ranges from 0 to 4. For items 1, 3, 5, 7 and 9 the score contribution is the scale position minus 1. For items 2, 4, 6, 8 and 10, the contribution is 5 minus the scale position. Multiply the sum of the scores by 2.5 to obtain the overall value.
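As a minimal illustration of the scoring rule described in this footnote, the Python sketch below computes the SUS value from one respondent's ten answers given as scale positions 1-5; the function name and input format are ours and not part of the pilot tooling.

    def sus_score(responses):
        """Compute the SUS score from one respondent's ten answers (scale positions 1-5)."""
        if len(responses) != 10:
            raise ValueError("SUS expects exactly ten item responses")
        total = 0
        for i, position in enumerate(responses, start=1):
            if i % 2 == 1:          # items 1, 3, 5, 7, 9
                total += position - 1
            else:                   # items 2, 4, 6, 8, 10
                total += 5 - position
        return total * 2.5          # overall value on the 0-100 scale

    # Example: a fairly positive respondent
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0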


Conclusion(s) - The increase in the number of cameras (from 2 to 4) was beneficial for the user satisfaction with the multi-camera service.

- A significant number of user panel members felt that the quality of the offered additional camera views was inferior to the broadcast. Ensuring uniform qualities across multi-camera content is important, as otherwise an unfavorable comparison is likely (e.g. users complaining about SD additional broadband views because they compare them to the HD broadcast).

- The choice of content was liked by most users, which considered that sports events can benefit from such multi-camera services.

- A number of suggestions as regards app usability, navigation and page layout were made. In particular, specific complaints have been voiced about the very poor usability of the traditional remote control to navigate the HbbTV application. These should be studied for implementation in further tests to enhance the user experience.

5.3.1.3. Mayoral Elections News Special

Pilot Action 3.11

Data Source End-users and Technical Data

Measure Moment Before, during and right after

Data collection methodology

The data for this pilot action was collected via technical logging of page requests and two online questionnaires: a technical-issues-oriented form administered before the pilot, and a post-event feedback form (see ANNEX 12 and ANNEX 13). This was the first deployment of a Second Screen functionality in the Catalan pilot, in response to usability concerns with the traditional remote controls used in previous test iterations.

Data analysis and discussion of results

For optimal data visualization, the analysis of the data is presented in the aggregate at the end of this chapter. The insights generated specifically at this research action, which were used to enhance the HbbTV application in this iteration, are presented in the conclusions section below.

Conclusion(s) - A Second Screen remote control functionality was successfully implemented and deployed in the pilot application. The TV-handheld linking process worked as intended, and was proven to provide a better user experience in controlling the HbbTV application. However, in a number of cases, several issues prevented user panel participants from using the Second Screen functionality. These will have to be addressed in future tests.


- The pre-test technical validation and assistance was useful in helping those users who were willing to try out the Second Screen remote control solution that was fielded for the first time. The insights generated with this technical validation should be included in the “help” functions of the HbbTV application.

- The choice of content was considered adequate by most users.

5.3.1.4. Champions League Final Match: FC Barcelona – Juventus FC

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

The data for this pilot action was collected via technical logging of page requests and an online questionnaire (see ANNEX 14).

Data analysis and discussion of results

For optimal data visualization, the analysis of the data is presented in the aggregate at the end of this chapter. The insights generated specifically at this research action, which were used to enhance the HbbTV application in this iteration, are presented in the conclusions section below.

Conclusion(s) - The research data generated confirmed that sports broadcasts are some of the most attractive content for users of multi-camera services.

- The changes implemented in the Second Screen functionality since the previous version (easier TV-device linkage, improved GUI in the Second Screen app) were successful in enticing more users to test the solution and in improving their user experience.

5.3.1.5. User Panel Aggregate Data Analysis

The user feedback data obtained at the four user panel tests that were carried out was processed, aggregated and analyzed, to obtain a big picture of the trajectory of the actions carried out so far with the user panel.

The sociodemographic composition of the respondents varied across the four pilot tests, as not all members of the user panel provided feedback for all tests (i.e. some of them were on vacation or away from home, had impeding technical problems, or simply failed to participate).


1. Age: The average age ranged between 40 and 46 years, with participant ages ranging from 15 to 59 years.

2. Sex: The sample composition fluctuated between 60-90% male and, correspondingly, 40-10% female.

3. Home profile: Families made up 60-85% of the sample households, with most of the rest being couples (40-15%).

Table 14: User panel sociodemographic parameters

As can be seen in the figure below, most user panel respondents found the application to be interesting and adding value to the content. A readily noticeable improvement in scores can be seen from the first test to the second, when program selection shifted from a singing contest to live sports and special information programs about high-impact events.

Image 39: Reported interest in HbbTV multicam services across user panel tests (response categories: very interesting / not much value / no interest)

As regards the data on the user evaluation of several aspects of the multi-camera application, contained in the figure below, almost all aspects of the application obtained acceptable marks across the trajectory of user tests. Some of these aspects, such as the availability of additional cameras and program selection, had a marked increase throughout the test phase. Taking the quantitative data in conjunction with free-text user comments, we can conclude that the quality of videos, the ease of accessing content and using the application and the clarity of the information in the app were the most praised aspects.


Image 40: User satisfaction with several aspects of the HbbTV multicam app across user panel tests (aspects evaluated: video quality; speed of view change; ease of accessing contents; ease of use of the application; program selection; availability of multiple cameras; clarity of information)

The severity of the issues experienced by users lessened as the piloting trajectory advanced. The increase of critical issues in the Election Day test can be attributed to the implementation of the Second Screen functionality. Critical, impeding issues which rendered the application unusable were below the 10% mark and, with the exception of the Election Day pilot, remained low throughout this first phase of the pilot validation.

Image 41: Incidence of technical issues across user panel tests

All kinds of technical issues experienced by user panel members generally diminished over time, with most issues being experienced by a small minority of users. The only exception to this general tendency are issues related to videos not loading in an adequate amount of time, a hardware-performance-related problem which was incorrectly attributed by many users to a software bug. An upsurge of crashing incidents in the Election Day pilot can be ascribed to the implementation of the Second Screen functionalities, which at first caused some devices to work in unintended ways.

Image 42: Typology of technical issues across user panel tests (categories: no problems reported; video contents not loading properly; visualization issues (layout); perceived low video quality; application crashed/rebooted; poor app performance)

5.3.1.6. Catalan National Day Rally (News Coverage)

Pilot Action 3.13 & 3.14

Data Source Technical Data

Measure Moment During

Data collection methodology

These technically-oriented pilots were devised to test the experimental scenario to clear the way for the launching of large-scale open pilots. The data for these transitional pilot actions were collected via technical logging of page requests, to monitor the correct deployment of the tested scenario and ensure success in the next pilot phase.

Data analysis and discussion of results

Approximately 11 user panel households took part in these tests, with an estimated audience of 11-33 people. All impeding technical issues with the multi-camera application that had been detected in previous tests were proven to be fixed. The deployment of the Second Screen application was also successful, with few issues detected. The only remaining technical problems were known issues with hardware performance and problems with the linking of second screen devices.

Conclusion(s) - The HbbTV application provides a sufficient level of user experience to be piloted in a large-scale, non-controlled group setting.

- The CDN technical setup can successfully support the deployment of a large-scale live open pilot.

5.3.1.7. Champions League Group Match: FC Barcelona – Bayer Leverkusen

Pilot Action 3.15

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

The data for this pilot action was collected via technical logging of page requests in both HbbTV application and Second Screen application, and an online questionnaire devised to provide a summative evaluation of the whole piloting trajectory, thus addressing the research questions set out in the evaluation framework (see ANNEX 15).

Data analysis and discussion of results

The data collection effort in this pilot action was entirely focused on answering the research questions set out in the evaluation framework of the Spanish pilot. The answers to these questions can be condensed into the following statements:

Conclusion 1: Multiple-view on-demand content partly leads to more content being watched.

• 57% of test users declared having spent more time watching a given multiple view program compared to a similar single-view program.

• However, 14% did not, and 29% of respondents were unsure of their answer.

Users enjoy content that has multiple views.

• This seems to hold true for both live and on-demand content.

• 87% of test users declared having enjoyed more watching a given multiple view program compared to a similar single-view program.

• In addition, the TVC HbbTV app achieved a satisfaction score of 7.4/10.

Most users will repeat consumption of content previously watched live if this content is made available with multiple views.

• However, a majority of these users are likely to be “dabblers”, just trying it out.


• 78% of test users declared having gone back, at some point, to re-watch multi-camera fragments of a previously offered pilot show.

• This figure includes 47% who re-watched once or twice, and 30% who re-watched more than twice.

There is more user engagement with the program when live content has multiple views.

• 78% of test users declared having zapped less between channels while watching a multiple view program compared to a similar single-view program.

Conclusion(s) - The pilot marked the first successful large-scale open deployment of the multi-camera application in a live sports event.

- The research effort allowed the collection of data which validated the research questions of the evaluation framework.

- The large audience of users of the application attests to the very significant level of interest raised by the test application. A further final live pilot test is recommended, to go beyond the initial research questions delineated at the beginning of the pilot validation and explore other relevant issues such as audience behavior and most attractive content for multi-camera.

5.3.1.8. Champions League Group Match (Open): FC Barcelona – AS Roma

Pilot Action 3.16

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

The data for this pilot action was collected via technical logging of page requests in both HbbTV application and Second Screen application. This data was processed and subjected to a battery of exploratory statistical analyses to determine the underlying structure of the data, and derive information on the distribution of the variables. Some multivariate analyses were carried out as well, to ascertain whether there were any significant relations between the variables.

Contextual qualitative data was obtained with the post-event administration of an online feedback questionnaire (see ANNEX 16), and the collection of the dedicated Twitter feeds set up for users to comment on the program.

The datasets with the full logs of the HbbTV application, Second Screen application and CDN are available for research purposes on request.


Data analysis and discussion of results

TV Manufacturer | Unique Visitors | % | Page Views | %
LG | 2,341 | 37.7 | 10,899 | 36.0
Sony | 1,428 | 23.0 | 7,289 | 24.1
Panasonic | 1,298 | 20.9 | 6,898 | 22.8
Samsung | 922 | 14.9 | 3,848 | 12.7
Philips | 69 | 1.1 | 486 | 1.6
Toshiba | 55 | 0.9 | 310 | 1.0
Other | 37 | 0.6 | 286 | 0.9
Total | 6,203 | | 30,265 |

Table 15: Unique visitors and page views per TV manufacturer

The table above presents the data on unique visitors and page views on the application, broken down by the manufacturer of the users' TV device. This manufacturer data may be useful to infer the market penetration of each manufacturer in the Catalan territory, in the segment of HbbTV 1.5-compliant devices. Analysing this data, we observe a very strong correlation, in the order of R² = 0.9898, between the number of unique visitors and the number of page views requested. To explore this relationship in more depth, a non-parametric analysis of variance was carried out (the Kruskal-Wallis H test, the non-parametric equivalent of a one-way ANOVA). The TV manufacturer of each device was used as the independent variable, while the total seconds spent on each page (a continuous and more accurate variable than page views alone) was used as the dependent variable. The test outcome revealed that there is indeed a statistically significant difference across groups, with a chi-square statistic of 56.256. Because of the lack of comparable studies in the literature, it is not possible to compare the research significance of this chi-square value with those of other experiments. The associated median test carried out in the same analysis confirmed that audiences stayed on the application slightly longer depending on their TV model, but with such small observed differences that these can be regarded as negligible for theory-building purposes.
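For reference, the sketch below reproduces the shape of the two analyses described above. The visitor and page-view totals are taken from Table 15; the per-page time data is not reproduced in this deliverable, so the grouping passed to the Kruskal-Wallis test uses illustrative placeholder values only.

    import numpy as np
    from scipy import stats

    # Per-manufacturer totals from Table 15
    visitors   = np.array([2341, 1428, 1298, 922, 69, 55, 37])
    page_views = np.array([10899, 7289, 6898, 3848, 486, 310, 286])

    # Pearson correlation between unique visitors and page views
    r = np.corrcoef(visitors, page_views)[0, 1]
    print(f"R^2 = {r ** 2:.4f}")          # ~0.9898, as reported in the text

    # Kruskal-Wallis H test (non-parametric one-way ANOVA) on the seconds
    # spent per page, grouped by TV manufacturer. The values below are
    # placeholders standing in for the real per-request log data.
    seconds_by_vendor = {
        "LG": [12, 45, 30], "Sony": [8, 22, 60], "Panasonic": [15, 18, 40],
    }
    h_stat, p_value = stats.kruskal(*seconds_by_vendor.values())
    print(f"H = {h_stat:.3f}, p = {p_value:.4f}")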


Image 43: Users and data streamed along time during live pilot, total

In the figure above, we can see how the total number of users and the total data usage strongly correlate with the beginning of the match and with half-time. Users did not watch so much in the midst of the action, but rather in the starting minutes of each half. The highest usage peak occurred during the ten minutes before the match and the first ten minutes after kick-off. A second, gentler peak can be observed in the minutes around the start of the second half. These usage peaks are approximately five times and three times larger, respectively, than the usage during the rest of the match.


Image 44: Users and data streamed along time during live pilot, breakdown by stream

The figure above displays the volume of data transmitted during the pilot test for each of the five audio-visual data streams that could be requested by the multi-camera application. It can readily be noticed that the mosaic was the most watched stream, accounting for over 80% of the total data traffic. The broadcast (indicated as Stream 1 in the graph) came next. The rest of the streams (Stream 2 with the three stars, Stream 3 with the coach, and Stream 4 with the overview of the attack) were only watched sporadically.


Image 45: Number of users by total data streamed and total minutes engaged

The graphs in the figure above bin the 6,203 pilot test users by their total usage of the application, in terms of total data streamed during the match and total minutes engaged on the application. As we can readily see, both variables have very non-normal, strongly right-skewed distributions: these are long-right-tail variables, where an overwhelming majority of the cases are concentrated on the left side of the graph. In connection with our analysis of user behavior in the pilot test, the implications are that 95.02% of users were engaged with the application for less than ten minutes, and 74.65% of users were engaged for less than three minutes.
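A minimal sketch of this binning, assuming the per-user totals have been aggregated from the application and CDN logs into a table with illustrative column names (user_id, mb_streamed, minutes_engaged):

    import pandas as pd

    # Illustrative per-user totals; the real values come from the pilot logs.
    users = pd.DataFrame({
        "user_id": range(1, 9),
        "mb_streamed": [5, 12, 30, 3, 700, 45, 8, 2100],
        "minutes_engaged": [1, 2, 4, 0.5, 35, 9, 2, 120],
    })

    # Histogram-style binning by total minutes engaged
    bins = [0, 3, 10, 30, 60, 180]
    print(pd.cut(users["minutes_engaged"], bins=bins, right=False)
            .value_counts().sort_index())

    # Share of the audience under the engagement thresholds discussed above
    print(f"< 10 min: {(users['minutes_engaged'] < 10).mean():.1%}")
    print(f"<  3 min: {(users['minutes_engaged'] <  3).mean():.1%}")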

Contextual qualitative analysis

It must be acknowledged that the robustness of our qualitative results is negatively affected by the large error margin of the computed statistics. This is a direct consequence of the very small sample obtained, of only 37 respondents. Nevertheless, even allowing for the likelihood of a 16-percentage-point mismatch between the population parameters and the sample statistics reported below, these results can give general guidance on certain relevant issues. Most importantly, the user-generated statements and comments obtained via the questionnaire and the program Twitter feed can provide vital contextual information to complement the analysis of the CDN log data.
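The 16-point figure quoted above corresponds to the worst-case 95% margin of error for a simple random sample of 37 respondents, as the short check below illustrates (normal approximation with p = 0.5, which maximises the variance):

    from math import sqrt

    # Worst-case 95% margin of error for n respondents
    n = 37
    margin = 1.96 * sqrt(0.5 * 0.5 / n)
    print(f"{margin * 100:.1f} percentage points")   # ~16.1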

Image 46: Perceived value of multi-camera services in live football.

Image 47: Stated preference of program content for multi-camera application

These figures provide an indication of the perceived value of the piloted multi-camera services, and of their suitability for a range of program content. Error margin notwithstanding, the large differences between the results of the different response categories allow for the identification of certain facts. First, multi-camera services can indeed be perceived as value-adding by a majority of users. And second, the nature of the program content is a significant factor in the audience's decision to use such services.

[Image 46 data: "Yes, I find it very interesting" 78%; "Yes, but it doesn't add much value" 16%; "No, I'm not interested" 3%. Image 47 data: the shares of respondents selecting each content type range from 81%, 73%, 65%, 38%, 27% and 11% down to 8%, 5% and 5%.]

Conclusion(s) - Users did not watch during the sporting action, but mostly used the multi-camera service when there was a moment of lesser activity in the course of the match. Thus, the current level of excitement of the match seems to be the main factor shaping the level of engagement of users with the service: the more exciting the moment of the match, the less likely it is that users will be using the multi-camera application.

- The selection of the streams is also very important. Not all multi-camera views are born equal: some will experience much heavier usage than others. To a very large extent, the choice of available views determines the dynamics of camera switching of the users.

- We can propose a clustering of sports audiences on the basis of their multi-camera application usage pattern. A large majority of users can be expected to be "dabblers", just trying out the application out of curiosity, but not interacting with the system in a sustained and meaningful way. At the other end of the distribution, there will also be a small group of "very engaged" users, using the application on a regular (but not necessarily predictable) basis. If we set the threshold for engagement at 10 minutes of active usage, for our pilot test the relative sizes of the two groups can be estimated at about 95% and 5% of the total audience of app users.

- An adequate selection of the program content has been found to be critical for the success of a multi-camera HbbTV application. Program genres in which the relevant action may happen simultaneously in several locations provide the best fit for a multi-camera service. Sports such as football, basketball or tennis, and racing events like the Formula One or MotoGP competitions, have been identified as particularly suitable content for multi-camera.

5.3.1.9. Experimental Assessment of the Impact of Hardware Variance on Latency and User Satisfaction with HbbTV Applications

Pilot Action 3.19

Data Source End-users and Technical Data

Measure Moment During and right after

Data collection methodology

The test consisted of an experimental design in which each participant was asked to carry out a sequence of six typical tasks on three devices, with three different content programs. The tasks, always carried out in the same order as displayed below, were the following:


1. Access the test HbbTV application
2. Choose a specified content from the list of available multi-camera contents
3. Go to the multi-camera mosaic
4. Choose a view from the multi-camera mosaic
5. Return to the mosaic
6. Choose a different view from the mosaic

Programs were rotated between the TVs after each test. Task completion time was recorded, as well as how satisfied the users were with the amount of time it took for the TV device to complete the task (not acceptable – intermediate – acceptable).

A total of 13 users volunteered to participate in the test. On a number of occasions, tasks could not be completed at the first try because of technical issues with the TV devices. The test was carried out with three devices. Two other devices had been prepared for the test; however, in the pre-test phase both were found to be unable to run the HbbTV 1.5 application and were consequently excluded from the user test.

Table 16: Experimentation documents for experimental latency test

Data analysis and discussion of results

A total of 78 data measurement points were obtained, this being the outcome of 13 users performing 6 tasks each. The tasks were randomized across the three TV devices, which have different known latencies. The latency of each task was measured five times and averaged; since all six tasks showed similar delays, the measure taken to characterize each device was the average over all six tasks.

[Raw experiment data: per-user content and evaluation ratings for Tasks 1-6 on TV models A, B and C.]

The results of the data analysis are presented in Table 17 below, which summarizes the number of users who found the level of delay they experienced in performing tasks on the differently-delayed TVs unacceptable, intermediate or acceptable.
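A minimal sketch of the averaging step described above, using hypothetical latency measurements (five repetitions per task, six tasks per device) chosen to land near the average delays reported for the three models:

    import statistics

    # Hypothetical measurements in seconds: device -> task -> five repetitions
    measurements = {
        "Model A": {t: [12.1, 12.4, 12.3, 12.2, 12.5] for t in range(1, 7)},
        "Model B": {t: [5.4, 5.6, 5.5, 5.5, 5.5] for t in range(1, 7)},
        "Model C": {t: [7.0, 7.1, 6.9, 7.0, 7.0] for t in range(1, 7)},
    }
    for device, tasks in measurements.items():
        per_task = [statistics.mean(reps) for reps in tasks.values()]
        print(device, round(statistics.mean(per_task), 1), "s average")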

Table 17: Test user satisfaction with the level of delay of performed tasks across the three different TV devices (average task completion time: Model A 12.3 s, Model B 5.5 s, Model C 7 s)

In Table 17 above, we can clearly observe that the level of satisfaction is strongly correlated with each TV device's average task completion time.

Conclusion(s) - There was no significant observed difference across devices between the different tasks. That is, the choice of task did not change the users’ perception of the acceptable level of delay.

- Hardware delays above the 5-6 second threshold were found to produce frustration in most test users. This frustration increases progressively as delays become longer, quickly deteriorating the user experience.

- Delays larger than 8-10 seconds were considered not acceptable by all test users.

- However, users reported that the subjective feeling of "slowness" could be ameliorated with the inclusion of a "waiting indication", that is, a design element in the application (e.g. a clock or a completion bar) which reassures the user that the task is being completed, even if at a slower rate than the user would prefer.

5.3.2. MPEG DASH Encoder

5.3.2.1. Cross-device Application Performance Test – Madrid workshop



Pilot Action 3.20

Data Source Professional-users and Technical Data

Measure Moment during

Data collection methodology

The Interoperability of DVB Hybrid III Conference took place at the premises of Cellnex Telecom in Tres Cantos, Madrid. The tests involved eight different SmartTV manufacturers.

For this workshop a multiplex with three TV-RING services was prepared:

• TVC TV-RING MQ App URL: http://193.104.51.238:8080/hmc/workshop_tvring_01_real/index.html

• TVC TV-RING SQ App URL: http://193.104.51.238:8080/hmc/workshop_tvring_02_singlequality/index.html

• TVC TV-RING 1MPD App URL: http://193.104.51.238:8080/hmc/workshop_tvring_03_oneMPD/index.html

Two separate tests were carried out:

TV-RING Multi-camera tests (MPEG-DASH LIVE and VoD)

The purpose of the test is to verify the behavior of HbbTV 1.5 TVs with the TV-RING pilot content that was deployed both in Gurb and in the pilot tests open to the public. These pilots consist of the delivery of multiple live cameras via broadband in conjunction with the broadcast of specific live events. They use a separate MPEG-DASH Live stream per camera, each with its own manifest (MPD), although each camera can have multiple qualities to adapt to the bandwidth available to the user. Subsequently, these events are made available to users as video on demand (VoD), also in MPEG-DASH.

To check the behavior of HbbTV v1.5 TVs with our MPEG-DASH streams, for VoD as well as for live content, the HbbTV application used in Gurb was configured within the service called "TVC TV-RING MQ":


Image 48: Test application for performance test

To feed the Live content, a Wowza server was configured with two content streams and a mosaic:

Mosaic: mosaic selection of the two cameras. Multi-quality output format: SD 1024x576p at 3 and 5 Mbps.

Content 1: TV3 program signal. Multi-quality output format: HD 1280x720 at 7 Mbps + SD 1024x576p at 3 and 5 Mbps.

Content 2: Esport3 program signal. Multi-quality output format: SD 1024x576p at 3 and 5 Mbps.

The generated MPEG-DASH streams contained an indication burned into the image to show which quality is being played at any given time.

Image 49: Range of qualities of adaptive streaming

The regular Gurb pilot VoD content is used for the VoD tests.

NOTE: The service "TVC TV-RING SQ", which contains the same application, was also configured, giving Live manifests with a single quality (3Mbps). The aim is to use this service in those cases in which the multi-quality streaming fails, to check whether quality changes have an impact on the problems found.

The tests with the TV sets consisted of the following checkpoints on the Video play of the MPEG-DASH content:

Page 83: Deliverable - TV-RING...D4.3 Evaluation results 2 version 1.0, 01/04/2016 1. Executive Summary This deliverable presents and describes the evaluation results of the pilots conducted

82 version 1.0, 01/04/2016 D4.3 Evaluation results

- Multi-quality Real Live:

Correct Video play of streams

Immediacy of Video play

Fluency and error-free Video play.

Check of quality changes (where quality is burned onto image).

Subtlety of quality changes.

Sequence of qualities reproduced by the device.

Time of mosaic-camera switching.

If the Video play of multi-quality streams causes any problems, the test is repeated with the single-quality streaming application.

- VoD

Correct Video play of streams

Immediacy of Video play

Fluency and error-free Video play.

Check of quality changes (note: in this test there is no indication of which quality is being played, so only error-free Video play can be assessed).

Subtlety of quality changes.

Sequence of qualities reproduced by the device.

Time of mosaic-camera switching.

TV-RING Multi-camera tests with Unique Manifest (MPEG-DASH 1MPD LIVE and VoD)

The purpose of the test is to verify the behavior of HbbTV 1.5 television sets with an alternative way to generate content for the TV-RING multi-camera pilot: using a single manifest (MPD) with multiple independent video components (in DASH jargon, multiple 'Adaptation Sets'). A test lab was prepared to perform the test with both MPEG-DASH Live and video on demand (VoD), although the streams were generated with different tools. To simplify the test, the content was generated in only one quality.

To check the behavior of HbbTV v1.5 TVs with our MPEG-DASH streams, for VoD as well as for live content, the HbbTV application used in Gurb was configured within the service called "TVC TV-RING 1MPD":


Image 50: Test application for performance tests, unique manifest

When the streaming content is played and a camera switch occurs, the application displays a screen with logs indicating how many components of type 'video' have been retrieved by the 'getComponents' call, and which component index is selected for playback.

To generate the Live content, a broadcast of the "Oh Happy Day" program was used, which has two content streams and a mosaic. The Live streaming was generated with the "Unified Streaming" server.

To generate the VoD content, the same broadcast of the "Oh Happy Day" program was used, again with two content streams and a mosaic. The DASH content was generated with GPAC tools, served by an Apache server.
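As an illustration of the kind of packaging step this implies, the sketch below builds a single-quality DASH VoD set with GPAC's MP4Box, where each camera recording becomes its own Adaptation Set in one shared MPD. File names are placeholders, and the exact flags should be checked against the GPAC version used in the pilot.

    import subprocess

    # Placeholder input files: one single-quality MP4 per camera plus the mosaic.
    inputs = ["mosaic.mp4", "camera1.mp4", "camera2.mp4"]

    cmd = [
        "MP4Box",
        "-dash", "4000",            # 4-second segments
        "-rap",                     # cut segments at random access points
        "-profile", "live",
        "-out", "ohd_multicam.mpd", # one manifest, multiple video Adaptation Sets
    ] + inputs

    print(" ".join(cmd))                # inspect the command
    # subprocess.run(cmd, check=True)   # run it once the input files are in place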

The tests with the TV sets consisted of the following checkpoints on the Video play of the MPEG-DASH content:

Correct Video play of streams

Immediacy of Video play

Fluency and error-free Video play.

Video commutation

Check of correct video commutation

Errors observed during commutation (freeze, fading, etc.)

Commutation time

Data analysis and discussion of results

In the following subsections, the analysis and results can be found for all SmartTV models tested. The particular manufacturers and models have been anonymized, as information on the performance of each TV set is deemed confidential by the manufacturers; this was a condition for being allowed to participate in the workshop.

TV-RING Multi-camera tests (MPEG-DASH LIVE and VoD)

The test results for each anonymized model are summarized in the following table:

MPEG-DASH LIVE

Vendor A, Model A1 – Stream Video play: OK. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 4-5 s.

Vendor A, Model A2 – Stream Video play: OK. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 5-6 s.

Vendor B, Model B1 – Stream Video play: the stream is not reproduced; the screen remains black. Cam switch time: N/A.

Vendor C, Model C1 – Stream Video play: OK. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 5 s.

Vendor C, Model C2 – Stream Video play: OK. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 4-5 s.

Vendor C, Model C3 – Stream Video play: OK (note: there is buffering with a black screen at the start). Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 6 s.

Vendor D, Model D1 – Stream Video play: OK. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 7-8 s.

Vendor E, Model E1 – Stream Video play: buffering with a frozen image at the start, Video play of the 3 Mbps quality for about 10 seconds, then black screen (note: it appears that only the first playlist read is reproduced). Quality changes: no quality change observed. Cam changes: no Video play after cam commutation. Cam switch time: N/A.

Vendor E, Model E2 – Stream Video play: black screen for a few seconds, then the application crashes to broadcast.

Vendor F, Model F1 – Stream Video play: only the first 8 seconds are played (note: it appears that only the first manifest is reproduced). Quality changes: no quality change observed. Cam changes: correct commutation, but Video play stops after 8 seconds. Cam switch time: 5 s.

Vendor F, Model F2 – Stream Video play: only the first 8 seconds are played (note: it appears that only the first manifest is reproduced). Quality changes: no quality change observed. Cam changes: correct commutation, but Video play stops after 8 seconds. Cam switch time: 5 s.

Vendor G, Model G1 – Stream Video play: correct at the start, but stops after a Video play time longer than one playlist (>8 s) and does not continue. Quality changes: correct and with no observable alterations (note: the first quality displayed is the lowest). Cam changes: correct commutation, but stops after a few seconds of Video play. Cam switch time: 9-10 s.

Vendor H, Model H1 – Stream Video play: the stream is not reproduced; the screen remains black.

Table 18: Results for MPEG-DASH Live

MPEG-DASH VoD

Vendor A, Model A1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: when the cam changes, the content starts playing from the start (note: nevertheless, jumping with 'seek' works correctly). Cam switch time: 5-6 s.

Vendor A, Model A2 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: when the cam changes, the content starts playing from the start (note: nevertheless, jumping with 'seek' works correctly). Cam switch time: 6-7 s.

Vendor B, Model B1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 4-7 s.

Vendor C, Model C1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 3-5 s.

Vendor C, Model C2 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 7-9 s.

Vendor C, Model C3 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 9-10 s.

Vendor D, Model D1 – Stream Video play: the application crashes to broadcast. Cam switch time: N/A.

Vendor E, Model E1 – Stream Video play: buffers with a frozen image for 2 seconds, then plays the video correctly. Quality changes: correct and with no observable alterations. Cam changes: when the cam changes, the content starts playing from the start (note: nevertheless, jumping with 'seek' works correctly). Cam switch time: 4-5 s.

Vendor E, Model E2 – Stream Video play: black screen for a few seconds, then the application crashes to broadcast.

Vendor F, Model F1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: when the cam changes, the content starts playing from the start (note: nevertheless, jumping with 'seek' works correctly). Cam switch time: 5 s.

Vendor F, Model F2 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: when the cam changes, the video freezes for 20 s and then proceeds to play as intended; sometimes Video play fails and a black screen is displayed. Cam switch time: >15 s.

Vendor G, Model G1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: the video freezes during the change). Cam switch time: 9-10 s.

Vendor H, Model H1 – Stream Video play: OK. Quality changes: correct and with no observable alterations. Cam changes: correct (note: there is a black screen in the transition). Cam switch time: 5-8 s.

Table 19: Results for MPEG-DASH VoD

TV-RING Multi-camera tests with Unique Manifest (MPEG-DASH 1MPD LIVE and VoD)

The test results for each anonymized model are summarized in the following table:

MPEG-DASH 1MPD LIVE

| Vendor | Model | Stream video play | Cam changes | Cam switch time |
|--------|-------|-------------------|-------------|-----------------|
| A | A1 | Correct video play (of the default video). | Does not work properly: it seems to change cam (black screen for 1-2 s), but afterwards the same video is still played. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| A | A2 | Correct video play (of the default video). | Does not work properly: it seems to change cam (black screen for 1-2 s), but afterwards the same video is still played. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| B | B1 | Stream is played but video and audio are de-synchronised: video starts and stops, audio plays continuously. | No effect. App logs show that getComponents(0) cannot obtain any video component. | N/A |
| C | C1 | Stream is not played. Screen goes black. | No effect. App logs show that getComponents(0) cannot obtain any video component. | N/A |
| C | C2 | Stream is not played. App crashes to broadcast. | App logs not displayed. | N/A |
| C | C3 | Correct video play (of the default video). | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| D | D1 | Stream is not played. App crashes to broadcast. | App logs not displayed. | N/A |
| E | E1 | Stream is not played. Screen goes black. | App logs not displayed. | N/A |
| E | E2 | [not tested] | [not tested] | N/A |
| F | F1 | Stream is not played. Screen goes black. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| F | F2 | Stream is not played. Screen goes black. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| G | G1 | Stream is not played. Screen goes black. | App logs not displayed. | N/A |
| H | H1 | Stream is not played. Screen goes black. | App logs not displayed. | N/A |

Table 20: Results for MPEG-DASH 1MPD Live

MPEG-DASH 1MPD VoD

| Vendor | Model | Stream video play | Cam changes | Cam switch time |
|--------|-------|-------------------|-------------|-----------------|
| A | A1 | Video play OK. | Correct. Note: there is a black screen in the transition. App logs show that getComponents(0) can obtain the 3 components of the video. | 1 s |
| A | A2 | Video play OK. | Correct. Note: there is a black screen in the transition. App logs show that getComponents(0) can obtain the 3 components of the video. | 1.5-4 s |
| B | B1 | Default video is played, but starts and stops. Audio is synchronized. | Correct. Note: screen goes black for <1 s, video is displayed without audio, then after 7 s video play is slightly accelerated, and audio starts playing afterwards, synchronized. 3 components are detected by getComponents. | 1 s (7 s for audio) |
| C | C1 | Video is played for 2 s (a camera chunk), then the mosaic is played continuously. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| C | C2 | Video is played for 2 s (a camera chunk), then the mosaic is played continuously. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| C | C3 | Video is played for 2 s (a camera chunk), then the mosaic is played continuously. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| D | D1 | Video play OK. | Cam change causes the app to crash to broadcast. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| E | E1 | Stream is not played. Screen goes black. | Combo does not work properly; cam change cannot be selected. | N/A |
| E | E2 | Stream is not played. Screen goes black, TV freezes. | | N/A |
| F | F1 | Video play OK. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| F | F2 | Video play OK. | Cam change fails. App logs show that getComponents(0) can obtain the 3 components of the video. | N/A |
| G | G1 | After a 12 s delay, video play starts and runs properly. | Toolbar not displayed, commutation not possible. App logs not displayed. | N/A |
| H | H1 | For 3 s cam 1 is played, then it crashes back to the mosaic without any external input. | Cam changes after a long period (buffering issues suspected). Cam rank is not as expected (0 -> mosaic, 1 and 2 -> cams). | >15 s |

Table 21: Results for MPEG-DASH 1MPD VoD

Conclusion(s)

A total of 13 devices were tested with the piloted HbbTV multi-camera application. The results of the performance test can be summarized in the following figures, which give an appraisal of how successfully the application can be used in real-life conditions with market-available TV devices.

MPEG-DASH LIVE

- Stream Video play: 6 devices played video correctly, 7 failed to do so. None had other issues.

- Quality change: 7 changed the quality of streamed video correctly, 3 played video but with no observable quality differences and 3 were unable to play any video.

- Cam change: 6 switched cam correctly, 3 switched but stopped playing after a few seconds, and 4 failed to display any video.

- Cam switch time: the range of time elapsed in switching cameras with all devices was between 4 and 10 seconds.

MPEG-DASH VoD


- Stream Video play: a majority of devices (11) had no issues playing stream video; only two devices crashed to broadcast while attempting to complete the task.

- Quality change: All 11 devices which could play video could change video quality without observable alterations.

- Cam change: Of the 11 eligible devices, 6 TVs changed camera correctly, 4 restarted playback from the very beginning (a serious impediment), and one froze for more than 20 seconds before playing correctly.

- Cam switch time: widely differing depending on the model, ranging from 3 to 20 seconds.

MPEG-DASH 1MPD LIVE

- Stream Video play: only 3 TVs can display the live streams, the rest (9) fail to do so.

- Cam change: None of the 13 test devices is able to switch camera.

- Cam switch time: Not available (all devices fail to switch).

Note: The fact that the test failed in all models, even in those in which VoD worked, strongly suggests that the manifest generation should be revised.

MPEG-DASH 1MPD VoD

- Stream Video play: 5 devices play streams correctly, and 2 crash while attempting to complete the task. The 6 remaining devices experience issues which significantly impact performance.

- Cam change: just 2 devices can change cam properly. 2 more devices change cam with significant technical issues, while the other 9 devices cannot complete cam switch.

- Cam switch time: between 1 and 4 seconds in well-functioning TVs. Up to 15 seconds in devices with issues.

Note: Some issues were observed with navigation in some elements of the app on some models; design updates are recommended to avoid interference in future tests.

5.3.2.2. MPEG-DASH Encoder Performance Test

Pilot Action: 3.21

Data Source: Technical data

Measure Moment: during

Data collection methodology

The Pilot required encoding live content in a format suitable for adaptive streaming. The DASH protocol was selected and the performance of the DASH encoder of the LiveMediaStreamer open source project (developed by i2CAT) was evaluated: 2 Full HD input streams could be encoded simultaneously on a 4-core machine, and 5 on a 24-core machine.


The MPEG-DASH standard has been designed with adaptive streaming in mind, and so it was selected as the protocol to download content from the streaming server to the end users. Basically, this protocol offers the player software different quality options for the same content, each with a different bandwidth cost. The player can then choose the highest quality that the observed network bandwidth permits, and download it using conventional HTTP.
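As an illustration of the adaptation logic described above, the sketch below (our own, not part of the pilot software) shows how a player might pick the best representation for an observed bandwidth; the bitrate values are arbitrary placeholders.

```python
# Minimal sketch of rate-based DASH representation selection (illustrative only).
def pick_representation(representations, measured_bandwidth_bps, safety_margin=0.8):
    """Return the highest-bandwidth representation that fits within the measured
    throughput scaled by a safety margin; fall back to the lowest quality."""
    affordable = [r for r in representations
                  if r["bandwidth_bps"] <= measured_bandwidth_bps * safety_margin]
    if not affordable:
        return min(representations, key=lambda r: r["bandwidth_bps"])
    return max(affordable, key=lambda r: r["bandwidth_bps"])

# Hypothetical three-quality ladder (placeholder values, not the pilot's actual ladder):
ladder = [
    {"id": "1080p", "bandwidth_bps": 7_000_000},
    {"id": "720p", "bandwidth_bps": 5_000_000},
    {"id": "576p", "bandwidth_bps": 3_000_000},
]
print(pick_representation(ladder, measured_bandwidth_bps=9_000_000)["id"])  # -> 1080p
print(pick_representation(ladder, measured_bandwidth_bps=4_000_000)["id"])  # -> 576p
```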

Offering these quality options requires the streaming server to store all the possible qualities (“representations” in DASH terminology) for all the served content. For Video-On-Demand content (VOD), this only entails an off-line pre-processing of all the content and a lot of storage space. For Live scenarios, though, all representations must be generated on the fly, and this is the task that was evaluated.

Next, the performed experiments are described in detail, the outcome of the experiments is presented as a series of plots, and finally some conclusions are drawn.

Test description

i2CAT’s LiveMediaStreamer (LMS) framework contains a Dasher module, which is capable of ingesting a network stream and generating all the required representations. The representations are stored to disk and distributed through a conventional web server.

For this particular scenario, LMS was configured in the shape depicted in Image 51: The input stream is received, decoded and rescaled to as many target sizes as configured. The rescaled streams are then encoded to the configured bitrate and fed to the Dasher element, which generates the representations and stores them to disk, ready to be served over HTTP.

Image 51: LMS scheme

The performance of the above LMS configuration has been assessed with the following experiment: The above scheme has been instantiated multiple times on the same machine until it started dropping packets due to the machine being overloaded. This indicates the number of simultaneous DASH streams that a given machine can generate using LMS with this configuration.

Also, the CPU usage and input bandwidth have been monitored to ensure that packet losses are due to system congestion (as expected) and not due to network limitations.

The input stream is a full HD (1920x1080 @ 24 fps) encoded in H.264 @ 3Mbps and delivered through RTP over UDP. To avoid external network traffic interference, all tests have used the loopback interface.
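For illustration, the following sketch (our own, using the psutil library; not necessarily the tooling used in these tests) shows one way to sample the encoder process's CPU usage and the loopback input bandwidth once per second over the 60-second window; the interface name "lo" and the process id are assumptions.

```python
import time
import psutil  # third-party process/system monitoring library

def monitor(pid: int, seconds: int = 60, nic: str = "lo"):
    """Sample per-process CPU usage and received bytes on one interface each second."""
    proc = psutil.Process(pid)
    proc.cpu_percent(None)  # prime the counter; the first reading is meaningless
    last_rx = psutil.net_io_counters(pernic=True)[nic].bytes_recv
    samples = []
    for _ in range(seconds):
        time.sleep(1)
        rx = psutil.net_io_counters(pernic=True)[nic].bytes_recv
        samples.append({
            "cpu_percent": proc.cpu_percent(None),     # % of one core; may exceed 100
            "input_mbit_s": (rx - last_rx) * 8 / 1e6,  # received Mbit/s on the interface
        })
        last_rx = rx
    return samples
```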

All representations have been encoded using H.264 in a number of frame sizes and bitrates, as dictated by the configuration file. Two different configurations have been tested:

tv3.cfg: contains 3 representations (qualities):

1920x1080@7Mbps

1280x720@5Mbps


1024x576@3Mbps

google.cfg: contains 6 representations (qualities):

1920x1080@4Mbps

1280x720@2Mbps

854x480@869Kbps

640x360@686Kbps

426x240@264Kbps

256x144@100Kbps
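As a rough, back-of-the-envelope comparison of the two ladders (our own arithmetic, using encoded pixels per frame as a crude proxy for encoder load), the six-representation google.cfg processes only about 5% more pixels per frame than the three-representation tv3.cfg, because its extra representations are very small:

```python
# Total encoded pixels per output frame for each ladder (crude proxy for encoder load).
TV3_CFG = [(1920, 1080), (1280, 720), (1024, 576)]
GOOGLE_CFG = [(1920, 1080), (1280, 720), (854, 480), (640, 360), (426, 240), (256, 144)]

def pixels_per_frame(ladder):
    return sum(width * height for width, height in ladder)

print(pixels_per_frame(TV3_CFG))     # 3585024
print(pixels_per_frame(GOOGLE_CFG))  # 3774624, roughly 5% more despite twice the representations
```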

Two different machines have been tested, with different characteristics:

Server A: 4 cores (Intel(R) Core(TM) i5-4430 CPU @ 3.00GHz) and 8GB RAM. This is a single-chip, 4-core, non-hyper-threaded server.

Server B: 24 cores (Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz) and 32GB RAM. This is a dual-chip server with 6 hyper-threaded cores per chip.

Four variables have been analyzed for a period of 60 seconds:

Lost Data Blocks: Amount of data blocks (either uncompressed frames or segments of compressed stream) which could not be processed in time due to the system being busy. This is the total amount of lost blocks during the length of the simulation. Ideally it should be 0; the moment it goes above 0 indicates that the machine cannot generate this amount of DASH streams without introducing artefacts.

CPU (%): Percentage of the CPU time used by LMS. The maximum value is the number of cores times 100% (i.e., 400% for Server A and 2400% for Server B). When this quantity approaches the maximum value, the machine is overloaded and losses (and artefacts) will probably occur.

Input bitrate (Mbps): Input bandwidth used, in megabits per second. If this line saturates before the machine is CPU-overloaded, it means that there is a network interface bottleneck, unrelated to LMS.

Processing delay (ms): Average amount of time it takes a frame to traverse the whole system, from the input receiver until it is written to disk. It should have a constant value, unless the machine is overloaded.
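The decision rule implied by the Lost Data Blocks criterion can be written down directly; the sketch below (our own formulation, with hypothetical loss counts shaped like the Server A curve) reports the largest instance count whose run lost no data blocks:

```python
# Capacity limit: the largest number of simultaneous Dasher instances with zero lost blocks.
def max_supported_instances(lost_blocks_per_run):
    """lost_blocks_per_run maps an instance count to the total data blocks lost in its run."""
    lossless = [n for n, lost in lost_blocks_per_run.items() if lost == 0]
    return max(lossless) if lossless else 0

# Hypothetical figures for illustration only (the measured curves are in the plots below):
print(max_supported_instances({1: 0, 2: 0, 3: 850, 4: 2600, 5: 4700}))  # -> 2
```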

Data analysis and discussion of results

The outcome of the experiments is shown in the plots below. Some conclusions are then drawn in the following section. The following subsections contain the plots for Server A and Server B.


Server A

Image 52: LMS’s Dasher total amount of lost data blocks as a function of the number of simultaneous Dasher instances

Image 53: LMS’s Dasher CPU usage as a function of the number of simultaneous Dasher instances


Image 54: LMS’s Dasher Input bitrate consumption as a function of the number of simultaneous Dasher instances

Image 55: LMS’s Dasher processing delay as a function of the number of simultaneous Dasher instances


Server B

Image 56: LMS’s Dasher total amount of lost data blocks as a function of the number of simultaneous Dasher instances

Image 57: LMS’s Dasher CPU usage as a function of the number of simultaneous Dasher instances


Image 58: LMS’s Dasher Input bitrate consumption as a function of the number of simultaneous Dasher instances

Image 59: LMS’s Dasher processing delay as a function of the number of simultaneous Dasher instances

Conclusion(s)

First and foremost, the number of simultaneous Dasher instances that LMS supports on each of these machines is readily apparent from the Lost Data Blocks plots (Image 52 and Image 56): it is the highest number of instances that does not introduce losses:

| | Maximum supported instances |
|---|---|
| Server A | 2 |
| Server B | 5 |

Second, the configuration does not seem to have much impact on the performance of the system. The reason might be that the additional 3 representations in google.cfg not present in tv3.cfg are very small in size and therefore do not stress the system so much.


Also, it can be seen that the different curves behave as expected:

- The lost data blocks (Image 52 and Image 56) begin at 0 (no data is lost) and stick to 0 as the number of instances increases, until the machine becomes overloaded. From that point onwards, the number of lost data blocks increases steadily.

- The CPU usage curves (Image 53 and Image 57) increase linearly with the number of instances, up to the point where the machine overloads (roughly when the maximum value for the CPU usage is reached).

- The maximum CPU usage for both servers is as expected, that is, 400% for Server A and 2400% for Server B.

- The Input bitrate plots (Image 54 and Image 58) increase linearly at roughly 3 Mbps per Dasher instance and no saturation is observed. This means that the input stream is always consumed and the data is lost elsewhere in the system when the machine becomes overloaded.

- The Processing delay curves (Image 55 and Image 59) remain constant up to the overload point, meaning that, as long as the system has enough resources, adding more Dasher instances does not increase the processing delay. Past the overload point, the processing delay increases.

- It is worth noting that in these experiments there was no effort to reduce the constant delay, which is mainly due to the different encoders: a judicious adjustment of the encoder parameters can probably reduce the processing delay should the need arise.
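As a purely illustrative closing note (our own arithmetic, not part of the deliverable's analysis), the measured per-machine limits translate directly into provisioning estimates for a given number of simultaneous DASH channels:

```python
import math

MAX_INSTANCES = {"Server A": 2, "Server B": 5}  # measured limits from the table above

def machines_needed(simultaneous_streams: int, server: str) -> int:
    """Machines of the given type needed to stay within the measured instance limit."""
    return math.ceil(simultaneous_streams / MAX_INSTANCES[server])

# Hypothetical event with 8 simultaneous camera feeds:
print(machines_needed(8, "Server A"))  # -> 4
print(machines_needed(8, "Server B"))  # -> 2
```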

5.3.2.3. Local vs Global CDN

As already indicated during the mid-term review, the initial evaluation idea was discarded. The local CDN was deployed as the main distribution infrastructure for the managed pilot actions. Initially, a comparison between the local CDN and the global one used in the three open pilots was planned, in order to ascertain whether local CDNs could be a cost-effective alternative to global CDNs. During the project lifetime, after being in contact with partners and stakeholders, this study was considered to be of no relevance: the pricing of content distribution has been reduced and is not homogeneous, as it depends on service providers and customers. In terms of performance, no relevant differences were encountered between the two configurations.

5.3.3. Conclusions for the Spanish Pilot

The key results of the Spanish multi-camera application pilot can be condensed into the following conclusions:

- Multi-camera content can be attractive for audiences under certain usability conditions and for certain programs. Nevertheless, there are constraints from a user-centric point of view that may limit its generalization if not properly addressed by HbbTV application designers.


- Early on during the user evaluation activities, it was found that traditional remote controls offer very poor usability for users accustomed to more agile handheld-device navigation. Several approaches to replace or complement the traditional remote control have been successfully implemented, such as speech and gesture recognition controls [5]. In the TV-RING project multi-camera pilot, a Second Screen solution was tested in field trials. The results obtained give weight to the hypothesis that a Second Screen solution can overcome the app navigation limitations posed by conventional remote controls. However, there are some challenges to the uptake of such solutions, as less technically-savvy users (who usually coincide with older cohorts) are held back by the lack of compatible handhelds in some households and by the need to link the devices. Easing the Second Screen-to-TV linking process, for example with QR codes and visual step-by-step instructions, is paramount to accelerate uptake.

- A number of general usability recommendations for the design of HbbTV applications emerge from the prototype refinement and live piloting phases. These include offering very agile navigation, ensuring consistency in commands by using color codes (e.g. red means go back, green means go forward), making the function of every button explicit to the user (forward, back to main screen, exit app), limiting clutter on the screen with minimalist designs so that content is always the center of attention, and displaying a machine reaction for every user action.

- A significant finding is that hardware performance problems have a serious impact on the user experience. Users may display some patience with waiting while content loads and with slight degradations in video quality, but they are not so understanding of instances in which they feel their TVs take an excessive amount of time to process their requests. More specifically, hardware delays above the 5-6 second threshold were found to produce frustration in most test users. This frustration increases progressively as delays become longer, quickly deteriorating the user experience. Delays above the 8-10 second mark were considered unacceptable by all test users. Performance problems attributable to hardware are very difficult for app developers to address. Nevertheless, it has been found that their negative impact on the user experience can be minimized by the simple expedient of adding an indication of “task in progress” for the user (e.g. a completion bar or a “wait…” sign); this reassures the user and may compel her to “stay tuned”.

- Content selection is critical for the success of a multi-camera HbbTV application. Programs in which the relevant action may happen simultaneously in several locations are the best picks for multi-camera content. Sports such as football, basketball or tennis, and racing events like the Formula One or MotoGP competitions, have been identified as particularly suitable content for multi-camera. Other kinds of content, such as special informative events (e.g. demonstrations, election days) and song contests, were piloted during the course of the TV-RING project. The audience’s reaction to these programs was fairly positive as well. Nevertheless, a lesser level of interest was detected, as many users did not see the value of multi-camera services for those kinds of programs vis-à-vis the broadcast content produced by an experienced audiovisual producer.


6. Results across the different pilots

The previous chapter described the results in each pilot separately. Here, we will look for insights across pilots. These results were obtained after a workshop held in Leuven, using the results from each pilot. We clustered the data and derived common themes (see Image 60 and Image 61). The section below will present these results for each identified, common theme.

Image 60: Starting the clustering of the results. Each pilot leader presents one key evaluation result or insight. It is then immediately written down on a post-it, and placed under either a new theme, or an existing theme that fits.


Image 61: More clusters start to take shape towards the end of the exercise.

The application concepts are clear, and valued by participants

The first category relates to the concept/format of the program/application. The results come from ‘Verknallt & Abgedreht’, the ‘TV App Gallery’, and ‘Een tegen 100’. Participants in these pilots said that the concepts behind the applications are very clear, that they appreciate more interaction in the television domain (which is currently very static), and that they enjoyed themselves in these new forms of watching TV. This is an important positive result because it was a core objective in the TV-RING project.

A lack of ownership

This point mainly refers to the ‘TV App Gallery’ and its valorization. The concept is appreciated and would certainly fulfil the needs of many viewers. The problem currently is that there are few organizations interested in driving the technical and business development further. There is a lack of ownership at this time.

Successful types of content: video (not text), multi-camera, personalized content

Both for traditional and new HBB television, the type of content plays a critical role. In the community focused concept for ‘Verknallt & Abgedreht’, the more web-oriented blog content was not interesting for users. They preferred to have video content. From the Spanish pilot we learned that multi-camera content is also appreciated by many users, and certainly provides added value in a number of cases. Finally, personalized content is essential in the current TV landscape. In the Dutch pilot we were able to identify the surrounding success factors for offering personalized content.


Content control, safety & privacy are essential for user acceptance

From the ‘TV App Gallery’ in the German Pilot we learned from both a technical and a user point of view that safety and security are essential in order to make such portals a success. This requires sophisticated management features and security in terms of privacy and data protection.

Increasing Engagement

For the Spanish pilot there was a clear link between the multi-camera concept and user engagement, especially when the live content has multiple views. On-demand content via multi-camera applications only partly leads to more content being watched: most users will watch content they have previously watched (live) again if it is offered with multi-camera functionality. For ‘Verknallt & Abgedreht’ in the German Pilot we found that most visits occurred during the weekends, when people have more available time. However, the visitors did not stay long and the number of visitors decreased over time. For ‘Sandmännchen’ we found that since the end of the summer holidays the number of users has been increasing, and that most users are using HbbTV. For the DRM application in the Dutch pilot, the engagement was not that good: participants used the portal offering content diversified by the different image qualities, but not to a great extent. The interviews and online questionnaires afterwards identified the reasons behind this result.

Providing Proper Synchronization

For second-screen applications making use of HbbTV, synchronization is an important topic (insights from the second screen applications in the Dutch pilot). There are many technical challenges, but here we want to point to the importance of good synchronization for the user experience. When people are watching TV, they are in a relaxed state of mind. If at a certain time during the show viewers are given the opportunity to provide input (answer a poll, for example), it is important to announce this in a timely fashion: users have to grab their tablets or smartphones, start up the device and the application, and only then can they start providing their input. At the end of the interaction, sufficient time should also be left for users to complete their input. Finally, the right answers should only be shown after everyone has provided their input. In some cases, this synchronization was not fine-tuned properly, leading to some users being able to see the right answer while they could still change their own answer.
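To make the timing considerations concrete, the sketch below (purely illustrative, not the pilot implementation; all durations are hypothetical placeholders) models the windows a second-screen poll needs around the on-air moment of a question:

```python
from dataclasses import dataclass

@dataclass
class PollSchedule:
    on_air_time: float          # seconds into the show when the question is asked
    announce_lead: float = 30   # warn viewers early so they can pick up and start the device
    answer_window: float = 45   # leave enough time for everyone to submit an answer
    reveal_delay: float = 5     # reveal the correct answer only after voting has closed

    def announce_at(self) -> float:
        return self.on_air_time - self.announce_lead

    def close_answers_at(self) -> float:
        return self.on_air_time + self.answer_window

    def reveal_at(self) -> float:
        # Revealing strictly after the answer window closes avoids the situation observed
        # in the pilot, where some users saw the correct answer while voting was still open.
        return self.close_answers_at() + self.reveal_delay

poll = PollSchedule(on_air_time=600)
print(poll.announce_at(), poll.close_answers_at(), poll.reveal_at())  # 570 645 650
```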

Distraction by using a second screen can be limited by careful design

An important research question for second screen HbbTV applications is whether or not they distract people from the show itself. Naturally, we would like the second screen applications to augment the whole experience, not to have the second screen experience replace (part of) the program experience. Our results from the evaluations in the Dutch pilot and the German pilot indicate that this does not seem to be the case. We do stress, however, that possible distraction should be addressed with care during the design phase. Furthermore, we have indications that tablets and smartphones are better suited as second screen devices, because participants reported that when using laptops they more easily switch to other applications such as email clients. The latter, of course, increases distraction.

Raise HbbTV Awareness


Even in Germany, a country with much higher HbbTV adoption than, for example, Belgium, regular viewers do not know of the existence, nor the possibilities, of HbbTV. In the TV-RING project we experienced two implications of this. Firstly, the possibility of using applications that augment the viewing experience and increase interactivity needs to be advertised, both by communicating about HbbTV in general and for each individual program that wants to attract users to these features. When awareness was raised in the German and Dutch pilots, for example, the extent of viewer participation also increased. Secondly, since HbbTV brings more interactivity to the television set, the traditional user interface concepts in viewers’ heads are not sufficient. People experienced difficulties navigating the HbbTV applications because they had never done so before. Therefore, extra attention has to be paid to usability, and more specifically to the learnability of new HbbTV applications.

Providing Easy-to-Use HbbTV Applications

By introducing many HbbTV applications to the greater public, and involving those users throughout the whole development and design cycle, we gained a great deal of insight regarding the usability of the respective applications. The evaluations of the 2nd screen formats and the DRM applications in the Dutch pilot, and the TV App Gallery in the German pilot, form the basis for this section. Firstly, Dutch viewers already used second screen applications that were not investigated in this project, and they reported expecting similar functionality and UI design in other applications. This implies that providing consistency between applications is very important; otherwise people have to learn new things for each application. Secondly, the most prominent issues related to user interaction were a slow response by the TV application (this might be due to the novelty of the technology, but attention should nevertheless be paid to optimizing performance), complicated navigation menus, unclear labels used for the navigation menus, unclear search functionality, the absence of functionality with which users can go back one step, bad synchronization between the second screen application and the TV, and the absence of proper and useful feedback about the system status to the user. It might seem that there are a lot of interaction issues; however, overall, participants were satisfied with the ease-of-use of the pilot applications. We consider the detailed issues mentioned earlier to be very important points of attention for any future HbbTV application. Thirdly, not only usability matters; the user experience also has to be taken into account. The DRM app, for example, was fairly easy to use, but did not stand out visually; it was not that attractive to users. Finally, a concept related to usability, namely ‘usefulness’, can be a key driver for the success of an application. A clear case is the TV App Gallery: where proprietary and broadcast-related portals have different user interface structures and therefore require much effort to learn, offering a central portal such as the TV App Gallery eliminates this need. The TV App Gallery therefore has a clear advantage in terms of usefulness compared to other app stores.

Supporting the Social Experience

A key goal in the Dutch and German pilots was to increase and support social experiences using HbbTV. In the Dutch pilot this was realized via several TV formats that offered extra functionality via second screen applications. We found that most participants enjoyed these social experiences, that they felt more connected, and that they felt there were more discussions and conversations as a result of using these second screen applications. Elements that influence the social experience are the use of competition in the format, the use of avatars, voting on statements or polls, and rankings for the household members. Not every idea related to social experiences was a success: in the German pilot application ‘Verknallt & Abgedreht’, participants did not appreciate the live-blog feature.


7. General Conclusions

In this chapter we will reflect on all evaluation results in TV-RING. To do so, we will first present the overview of the objectives that were set out at the beginning of the project, and the metrics we chose to focus on.

7.1. Overview of the Pilot Objectives

Table 22 presents a high-level overview of the whole project. After this, we created one table for the pilots in each country to present the respective results. This is necessary because we formulated specific objectives for each pilot in the beginning of the project. The information was gathered, discussed and analyzed during a workshop on 9 October 2015 in Leuven, Belgium.

For preparation, presentation templates were created by KUL for all partners. The partners could then fill in their evaluation information in these templates. We created one overview presentation, in which each pilot leader had to present:

the applications in their pilot,

the objectives that were formulated in D4.1.1,

an indication of which objectives were reached with 3 options: Yes/No/Partial

an indication of what happened with the application, with 4 options: Implemented/Deployed/Evaluated/Valorized,

a clarification that briefly summarizes the evaluation in words.

We also created one template for the conclusions per pilot application. For each pilot evaluation the partners presented:

separate conclusions

evaluation material supporting this conclusion

the source of the material (end-users, professional users, technical data)

the moment of evaluation (before, during, immediately after, or after use)


Image 62: The workshop in Leuven, where we processed the results for each pilot


Pilot Application Status

| Pilot | Application | Comments |
|-------|-------------|----------|
| Dutch Pilot | Quality differentiation using DRM | Based on the results of the 1st phase and the literature study we identified the critical success factors for online video. Low user response impaired further investigation in phase 2. |
| Dutch Pilot | In-house recommendations | User research in WP2 and some evaluation results helped us construct a recommender application that incorporates household context. Low user response again in phase 2. |
| Dutch Pilot | 2nd Screen competition | Many 2nd screen applications have been tested. Results indicate they provide a positive, social experience. An important valorization result is see2gather (commercial product). |
| German Pilot | Verknallt & Abgedreht | The HbbTV app was broadcast three times, on different TV channels at regional and national level. Quantitative and qualitative feedback helped a lot in shaping future applications and services. |
| German Pilot | Unser Sandmännchen | Based on experience and qualitative feedback from Verknallt & Abgedreht, the Sandman app was optimized to complement a set of apps for tablets and IP-based TV services like CE developer portals and Amazon Fire. The app has been very successful and will remain online without a foreseen limit. |
| German Pilot | TV App Gallery | The app has been evaluated with many end-users. To gather reactions from professional users, the app was presented at trade fairs, appropriate events, meetings, etc. The evaluation results provide an accurate view on the important aspects of use, as well as its place in the current and future HbbTV landscape. |
| Spanish Pilot | Multicam Live | The app has been piloted in six distinct events; evaluation data fully supports pilot objectives 2, 4 and 5, and provides evidence for at least partial fulfillment of pilot objectives 1 and 3. |
| Spanish Pilot | Multicam VoD | The app has been piloted in six distinct events; evaluation data fully supports pilot objectives 2, 4 and 5, and provides evidence for at least partial fulfillment of pilot objectives 1 and 3. |
| Spanish Pilot | MPEG-DASH | The technology was tested and assessed in-lab. |

For each application, the status reached was recorded on four levels: Implemented, Deployed in pilot, Evaluated, Valorized.

Table 22: General evaluation overview showing the reached status in each pilot


DUTCH PILOT Description of the objective Yes No Partial

Quality differentiation using Digital Rights Management: Can we simplify the encoding process and differentiate the quality of content based on one key with different statuses? Test user perception of service (objective and subjective). Are people willing to pay more for high quality content?

Objective 1 Simplify the encoding process and differentiate quality and content based on one DRM key with different statuses

Objective 2 Investigate if a certain DRM key plays the right content

Objective 3 Test user perception of services and willingness to pay for differentiated content

Objective 4 Investigate maximum based bandwidth

Objective 5 Investigate new DRM models

In-house recommendations for HbbTV and CTV apps: Develop an intelligent recommendation engine data entry that presents both personal and group recommendations on the central HbbTV set, based on variables as time of day, device status and historical data.

Objective 1 Investigate how people watch television content in a household

Objective 2 Investigate what influence the variables mood, device, time of day, and family composition have on viewing habits

Objective 3 Develop a recommendation data entry that presents recommendations for individual persons and groups on the central HbbTV Set

Objective 4 Investigate how the outcome can be integrated in existing recommendation models and tools

HbbTV as a central interface for 2nd screen competition: Stimulate social interaction and create added value on linear TV through an HbbTV app that acts as a central interface for group second screen play-along in a closed home network.

Objective 1 Pair many 2nd screen devices in a household or other closed network to one ‘master’ app and synchronize the results with HbbTV.

Objective 2 Investigate how we can keep the technology scalable

Objective 3 Create and encourage real social interaction

Table 23: Overview of the Dutch pilot objectives and results

Clarification for the objectives that were not or partially reached:

DRM Objective 1: The objective was partly achieved because connection to the NPO premium play-out platform delivered in 2014, which encrypts video and supports DRM licensing (in contrast to NPO’s unencrypted catch-up service), was mandatory. This play-out ecosystem had already streamlined its DRM process as we originally intended, using a single encryption key for multiple-bitrate segmentation.

DRM Objective 2: The objective was partly achieved, again because the NPO premium play-out platform delivered in 2014 already supported DRM segmentation to some extent.


An IP rights issue prevented us from offering a number of popular TV programs that we would have liked to use during the user test period.

DRM Objective 4: The pilot only featured 1 Mbps and 2.5 Mbps adaptive HLS. In the test group both maximum qualities were instantly reached, as there was no bandwidth shortage on the client side. The maximum based bandwidth was therefore 100%, a common situation in the Netherlands.

DRM Objective 5: During the course of the project the uptake of new DRM models in the industry developed faster than foreseen at the start of TV-RING (2013).

GERMAN PILOT Description of the objective Yes No Partial

Verknallt & abgedreht Phases 1 & 3: Do users feel continuously motivated to use the service?

Objective 1 Do the users perceive any difference in UHD and other content?

Objective 2 Do users enjoy the content?

Objective 3 Do users feel continuously motivated to use the content?

Objective 4 Is the service usable by first-time HbbTV users and experienced HbbTV users?

Verknallt & abgedreht Phase 2: Do social media features attract users?

Objective 1 Do users perceive TV show and app as a seamless service or do they feel distracted from the TV show?

Objective 2 Do users feel involved in the TV show?

Objective 3 Do users feel the presence of other users?

Objective 4 Do users feel continuously motivated to use the content?

Sandmännchen: Do very young users continuously use the service?

Objective 1 Do visitors return?

Objective 2 How long do visitors stay in the service?

Objective 3 How do visit figures develop over time?

Objective 4 Do TV ratings correlate to service usage figures?

TV App Gallery: The goal is to use and promote the TV App Gallery during the TV-RING project. The TVAG prototype was developed within the FI-content project.

Objective 1 Do users feel the need for such an application portal?

Objective 2 What do users think about the portal idea?

Objective 3 Do users understand the portal structure?

Objective 4 Do users feel comfortable with the menu structure?


Objective 5 Promote the TV App Gallery

Objective 6 Find a partner for publishing

Table 24: Overview of the German pilot objectives and results

Clarification for the objectives that were not or partially reached:

Verknallt & abgedreht Phase 1 & 3, Objective 2: Users like the content. More specifically, they prefer video over text in this application. However, they do not like to engage in much interaction right after coming home from school: at that moment they want to relax, and broadcast is more suitable.

Verknallt & abgedreht Phase 1 & 3, Objective 3: Users are motivated to use the content over time, but the time-related aspects have a great impact. These are detailed in the results section. For example, most visits to the application were noted when there was no broadcast.

Verknallt & abgedreht Phase 1 & 3, Objective 4: It is usable, but the user evaluation highlighted several areas for improvement. Most users have never heard of the HbbTV concept in general, and therefore also do not really know that a TV can be as interactive as a computer. They are also not used to engaging in such activities with a remote control.

TV App Gallery, Objective 3: During the user evaluation, users were able to use the portal structure. Nevertheless, the evaluation revealed several opportunities for improving the navigation of the application.

TV App Gallery, Objective 4: The usability test clearly indicated that the menu structure was not optimal. Many users experienced difficulties during navigation.

TV App Gallery, Objective 6: Despite the wide interest shown during the many events, a real publishing partner for this app has not yet been identified.

SPANISH PILOT 1 Description of the objective Yes No Partial

Multi-camera Live/On Demand: Evaluate the user experience with a live/on-demand content multi-camera HbbTV app.

Objective 1 Multiple-view on-demand content yields more content being watched

Objective 2 More user enjoyment of on-demand content that has multiple views

Objective 3 On-demand repeated consumption of content previously watched live due to availability of multiple views

Objective 4 Less channel zapping when live content has multiple views

Objective 5 More user enjoyment of live content that has multiple views

Table 25: Overview of the Spanish pilot 1 objectives and results.


Clarification for the objectives that were not or partially reached:

• Multicam, Objective 1: 57% of test users declared having spent more time watching a given multiple view program compared to a similar single-view program. However, 14% did not, and 29% of respondents were unsure of their answer.

• Multicam, Objective 2: 78% of test users declared having gone back, at some point, to re-watch multi-camera fragments of a previously offered pilot show (this figure includes 47% who re-watched once or twice, and 30% who re-watched more than twice). However, a majority of these users are likely to be “dabblers”, just trying it out.

SPANISH PILOT 2 Description of the objective Yes No Partial

MPEG-DASH Assessment: Assess the performance of the LiveMediaStreamer (LMS) DASH configuration deployed in the Catalan pilot.

Objective Run an in-lab battery of tests for assessment

Table 26: Overview of Spanish pilot 2 objectives and results.


7.2. Overview of the Pilot Metrics

In this section we present an overview of the metrics we set out in the beginning. The following tables show the metrics initially defined in D4.1 Evaluation Plan, and whether or not these metrics were used in the end. Below each table is a clarification of why certain metrics were not used.

DUTCH PILOT

Category Parameters used Parameters not used

Quality differentiation using Digital Rights Management

Location City

Engagement Stream starts, page views, duration of visit Archive depth

Traffic Maximum served bitrate played

Devices Type of Device

In-house recommendations for HbbTV and CTV apps

Location City

Engagement Number of unique visits, duration of visits, stream starts

Click through rate

Action Page views Entry page, exit page

Used devices Device type Static PC, Laptop

Traffic VOD absolute and average Size of stream absolute and average

Rating Recommendation accuracy

HbbTV as a central interface for second screen competition

Location Region (partially)

Engagement Duration of visits

Actions Number of unique visitors Entry page, exit page

Devices HbbTV, number of devices Type of second screen device

Rating Engagement, usability

Table 27: Overview of Dutch Pilot Metrics

Clarification metrics not used in the DRM pilot:

Archive depth: We only offered 5 different genres, each with a selection of episodes in HD and SD quality, but not really an archive per genre, so we could not investigate or measure this.

Clarification metrics not used in the Recommender pilot:


Entry page, exit page: All users entered the application on the same page. It was not relevant to measure their exit page; only the duration of visits and the rating of the items was really important.

Static PC, Laptop: We could measure the type of device (TV, tablet, mobile or desktop), but we could not distinguish between PC and laptop.

Size of stream absolute and average: We used the standard maximum bitrate of 1 Mbit/s; our reporting tools do not measure the actual individual bitrate served. From average usage figures it can be concluded that over 97% of the streams served are at the maximum bitrate.

Clarification metrics not used in Second screen pilot:

Duration of visits: This was not relevant for the pilot; what mattered was the number of users playing along.

Entry page, exit page: This is not relevant because everyone enters the same page first. We did not measure the second screen application with GA or Comscore in such detail, so we were not able to register this.

Type of second screen device: We were not able to measure this.

GERMAN PILOT

Category Parameters used Parameters not used

Phase 1: verknallt & abgedreht

UX Accessibility, overall usability, aesthetics/appeal/attractiveness, enjoyment/pleasure, engagement, hedonic quality, flow/immersion, empowerment, sociability, participation, reciprocity, social presence

Technical (for the application) visitors, visits, page views, average generation time of the site, average time on page

(for the video streams) Requests per program, video requests for pilot duration, total bandwidth, accumulated traffic

(for the application) clicks per visit, duration, average duration for returning visitors

(for the video streams) video stream size, measured traffic per program

Phase 2: verknallt & abgedreht (with social interaction)

UX Overall usability, enjoyment/pleasure, engagement, empowerment, sociability, participation, reciprocity, social presence

Accessibility, aesthetics/appeal/attractiveness, perceived usefulness, hedonic quality, flow/immersion, distraction/helpful

Technical (for the application) visitors, visits, page views, average generation time of the site, average time on page

(for the video streams) Requests per program, video requests for pilot duration, total bandwidth, accumulated traffic

(for the application) clicks per visit, duration, average duration for returning visitors

(for the video streams) video stream size, measured traffic per program


Phase 3: verknallt & abgedreht (not planned)

UX - -

Technical (for the application) visitors, visits, page views, average generation time of the site, average time on page

(for the video streams) Requests per program, video requests for pilot duration, total bandwidth, accumulated traffic

(for the application) clicks per visit, duration, average duration for returning visitors

(for the video streams) video stream size, measured traffic per program

Sandmännchen (not planned)

UX - -

Technical (for the application) visitors, visits, page views, average generation time of the site, average time on page

(for the video streams) Requests per program, video requests for pilot duration, total bandwidth, accumulated traffic

(for the application) clicks per visit, duration, average duration for returning visitors

(for the video streams) video stream size, measured traffic per program

TV App Gallery

UX Accessibility, effectiveness, overall usability, aesthetics/appeal/attractiveness, usefulness

*Technical evaluation was not possible. There is no way to offer the TVAG to a wide audience.

Table 28: Overview of German Pilot Metrics

Clarification for planned parameters not used:

Verknallt & abgedreht Phase 1, UX: None of the parameters were used due to the fact that nobody registered as a test panel user and the method could not be transferred to Lab Survey 1.

Verknallt & abgedreht Phase 2, UX: Accessibility, aesthetics, perceived usefulness and hedonic quality were found to be not relevant due to the selection of parameters already covering the related research questions. Flow and distraction were not used due to the lack of applicable methods.

Verknallt & abgedreht Phase 1 and 2, Technical: Clicks per visit, duration and average duration for returning visitors were not implemented due to constraints of the quality and validity of their results. Video stream size and measured traffic per program were found to be not of interest for the research questions.

SPANISH PILOT

Category Parameters used Parameters not used

TV3 a la carta multicamera

Engagement: duration of visits, rating

Actions: number of unique visitors, entry page, exit page

Devices: HbbTV, number of devices

Usability: rating

MPEG DASH Encoder

Performance: resource usage (CPU, memory, etc.), maximum number of live video tracks per DASH stream, maximum transcoding quality

Local Managed CDN

Performance: bandwidth consumption from the origin server, bandwidth consumption from the proxy cache, bandwidth savings, latency

Global CDN

Consumption (usage): GB per month, total requests, origin volume

Throughput (performance): peak requests per second, average requests per second, bandwidth at the 95th percentile (Mbps), cache efficiency (%), peak Mbps, average Mbps, peak origin Mbps, average origin Mbps (an illustrative calculation sketch follows Table 29)

Table 29: Overview of Spanish Pilot Metrics
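As a reading aid for the Global CDN figures above, the following is a minimal, illustrative Python sketch of how cache efficiency, bandwidth savings and the peak, average and 95th-percentile bandwidth can be derived from per-interval traffic samples. The sample values, the 5-minute interval and the percentile convention are assumptions for illustration; this does not reproduce the measurement tooling actually used in the Spanish pilot.

# Minimal, illustrative sketch: derive CDN throughput figures from
# per-interval traffic samples (edge bytes served, origin bytes fetched).
import math

INTERVAL_SECONDS = 300  # 5-minute sampling interval (assumed)

# (bytes served to clients, bytes fetched from the origin) per interval
samples = [
    (1_200_000_000, 150_000_000),
    (2_500_000_000, 200_000_000),
    (900_000_000,   100_000_000),
    (3_100_000_000, 250_000_000),
]

def mbps(byte_count):
    """Convert a byte count per interval into megabits per second."""
    return byte_count * 8 / INTERVAL_SECONDS / 1_000_000

edge = [mbps(total) for total, _ in samples]
origin = [mbps(orig) for _, orig in samples]

total_bytes = sum(t for t, _ in samples)
origin_bytes = sum(o for _, o in samples)

cache_efficiency = 100 * (1 - origin_bytes / total_bytes)  # % of bytes served from cache
bandwidth_savings = total_bytes - origin_bytes             # bytes not fetched from the origin

# "Bandwidth at 95%": the interval value below which 95% of samples fall.
ranked = sorted(edge)
idx = max(0, math.ceil(0.95 * len(ranked)) - 1)
bandwidth_95 = ranked[idx]

print(f"peak {max(edge):.1f} Mbps, average {sum(edge)/len(edge):.1f} Mbps")
print(f"peak origin {max(origin):.1f} Mbps, average origin {sum(origin)/len(origin):.1f} Mbps")
print(f"95th percentile {bandwidth_95:.1f} Mbps, cache efficiency {cache_efficiency:.1f} %")

With the sample values above, the sketch reports a cache efficiency of roughly 91%, i.e. only about 9% of the delivered bytes had to be fetched from the origin server.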

7.3. Conclusions

We conclude this deliverable by reflecting on the results presented above. First, we revisit the objectives set out at the beginning and verify whether they were achieved. By and large, most objectives were achieved; for the few that were not achieved, or only partially, there were valid, mostly practical, reasons. The metrics show more deviations from the initial plan. This is explained by the fact that the initial selection reflected what we thought might be valuable at the time, whereas as the project progressed and the actual pilots came closer, we gained a better idea of what made sense to evaluate.

In the TV-RING project, we carried out a substantial amount of work. In this evaluation document alone, we discussed the evaluation results of 13 applications in 3 countries. The scope of these applications has been very broad and has opened up the traditional TV experience toward what can be achieved with HbbTV and secondary devices. They include second-screen applications for social, interactive experiences with TV, an App Store for HbbTV, new recommender approaches, personalized online video offerings, community-oriented platforms to keep the audience engaged during and between broadcasts, simple and interactive concepts for children, live and on-demand multi-camera applications, and a number of specific technical achievements.


8. References

1. Yvonne A. W. de Kort, Wijnand A. IJsselsteijn, and Karolien Poels. 2007. Digital games as social presence technology: Development of the Social Presence in Gaming Questionnaire (SPGQ). Proceedings of PRESENCE 2007, 195–203. Retrieved January 12, 2016 from http://home.ieis.tue.nl/ydkort/de%20kort%20et%20al%20Digital%20games%20as%20social%20presence%20technology%20PRESENCE%202007.pdf

2. Jeroen Vanattenhoven and David Geerts. 2015. Contextual aspects of typical viewing situations: a new perspective for recommending television and video content. Personal and Ubiquitous Computing 19, 5-6: 761–779. http://doi.org/10.1007/s00779-015-0861-0


9. ANNEX 1: Dutch Pilot Questionnaire – DRM

Which TV, video and Internet products that you have to pay for, do you use?

Which free TV, video and Internet products do you use?

How many HD (high quality) programs did you watch on the test pilot application?

None – 1 – 2 to 5 – 6 to 9 – more than 10

How many SD (standard quality) programs did you watch on the test pilot application?

None – 1 – 2 to 5 – 6 to 9 – more than 10

Indicate how important the following factors are for a product or service that offers TV and video.

Not important at all – not important – neutral – important – very important

A. Image quality

B. Completeness of the offering

C. Speed of availability of the offering

D. How long content remains available for retrieval in the library

E. Ease of use

F. Possibility to watch on other devices (tablet, laptop)

G. Avoid advertising during the program

H. Avoid advertising between programs

I. Having a personalized offering

Describe your experiences with the test panel.

Judge the following elements of the test panel application.

A. The application looks visually attractive.

B. I want to use this application again.

C. Overall, it was a nice experience.

D. Overall, this application was easy to use.

E. Overall, I’m satisfied with the time I needed to use the application.

F. Overall, I’m satisfied with the available help information for the application.

Which are the main reasons you use VoD services (in general)?


How often do you use VoD services?

Never – a couple of times a year – a couple of times a month – a couple of times a week – (almost) daily

What did you think of the price per HD (high quality) item in the test application?

Way too low – too low – about right – too high – way too high

Which formula for paying do you prefer?

Per item – subscription for all content – subscription for genres I usually watch

Why do you prefer this formula for paying?


10. ANNEX 2: Dutch Pilot Questionnaire – IQ Test

How many people were playing along in your household?

Did you participate in the HbbTV/red button test?

Did you participate with Philips Hue lamps?

Indicate for the following items how you felt during the game:

Not at all – slightly – moderately – fairly – extremely

I empathized with the others.

My actions depended on the actions of the others.

The actions of others depended on my actions.

I felt connected with the others.

The others were paying attention to me.

I was paying attention to the others.

I was jealous.

I found it cozy with the others.

When I was happy, the others were happy.

When the others were happy, I was happy.

I influenced the others’ moods.

I was influenced by the others’ moods.

I admired the others.

What the others did, influenced what I did.

What I did, influenced what the others did.

I felt vengeful.

I had schadenfreude (malicious delight).

Do you have other remarks about the second screen, HbbTV or the Philips Hue lights?


11. ANNEX 3: Dutch Pilot Questionnaire – De Rijdende Rechter

How many people played the app in your living room?

Judge the following statements

Not at all – slightly – moderately – fairly – extremely

I empathized with the other(s)

I felt connected to the other(s)

I found it enjoyable to be with the other(s)

When I was happy, the others were happy

When the others were happy, I was happy

I influenced the other's mood

I was influenced by the other's mood

I admired the other(s)

My actions depended on the other's actions

The other's actions were dependent on my actions

The other paid close attention to me

I paid close attention to the other

What the others did affected what I did

What I did affected what the others did

I felt jealous of the other

I felt vengeful

I felt schadenfreude (malicious delight)

I found it a waste of time

I found that I could have done more useful things

I was sorry

I was ashamed

I felt very bad

I felt guilty

I got a mental boost

I saw it as a victory

I felt charged

I felt satisfied

I felt powerful

I felt proud

By playing the game I had the feeling that I was less focused on what happened in the program.

In order to follow the program, I sometimes could not pay enough attention to the game to get a good result.

Do you have any other remarks about your experience or did you run into any problems during the playing of the app?


12. ANNEX 4: Dutch Pilot Questionnaire – Eurosong

Judge the following statements:

Strongly disagree – disagree – neither agree nor disagree – agree – strongly agree

Overall, I am satisfied with the ease-of-use of this application.

Overall, I am satisfied with the user experience of this application.

The application provided a strong social experience.

The application provided a strong sense of competition. I really wanted to win.

The application resulted in many discussions and conversations.

The application distracted me too much from the show.

What can improve the application? Which elements can add something extra to the experience?

For which kinds of programs would you like to see more such applications?

Which elements of the application did you find easy to use? Briefly explain why.

Which elements of the application did you find hard to use? Briefly explain why.

Which elements of the application improved the experience? Briefly explain why.

Which elements of the application worsened the experience? Briefly explain why.


13. ANNEX 5: Dutch Pilot Questionnaire – Een tegen 100

How many people played the app in your living room?

Was it clear that the goal was to play against other household members?

Yes – No

If you were playing with multiple people, what was the effect in the living room?

Which other second screen applications have you used in front of the TV?

How often do you normally watch Een tegen 100?

Never – sometimes – often – always

I usually watch Een tegen 100

Not at all – alone – with people in my family – with friends and family (also people outside my family)

Judge the following statements

Completely disagree (1) – completely agree (5)

Playing along with the second screen distracted me from what was going on in the program.

I found this app easy to use.

I enjoyed using this app.

Using the app caused more conversation and discussion in the living room.

It was hard to play the game and simultaneously follow the program.

I was completely absorbed by playing the app.

Playing the app increased the sense of competition.

I have the sense that I have a better understanding of the program.

I don’t need such an app. I can enjoy the program on its own.

The second screen app exceeded my expectations.

Did playing along with the second screen app have any added value for you?

Yes – No – A little bit

What did you like or what did you find good about the app?

What could be improved, or could be different in the app?


Do you have any other remarks about your experience or did you run into any problems during the playing of the app?

What grade would you give to the app (between 1 and 10)?

Are you also familiar with the stand-alone app of Een tegen 100?

Yes – No

What grade would you give the stand-alone app of Een tegen 100?

For which programs would you enjoy playing along?


14. ANNEX 6: German Pilot Questionnaire – verknallt & abgedreht

Part 1 – TV Usage

1. How often do you watch TV?
2. Are you using a connected Smart TV at home?
3. If you have a TV in your room, is it connected to the Internet?
4. What kind of content do you prefer?

Part 2 – verknallt & abgedreht

1. Through which channel do you watch the program?
2. How many episodes have you seen so far?
3. Would you recommend the show to friends?

Part 3 – Related Content

1. Do you know these additional content items? (Screenshot)
2. Which of them did you like most?
3. Would you recommend them to friends?

Part 4 – Live-Blog

1. Do you know the Live-Blog at www.verknalltundabgedreht.de?
2. Do you know the Live-Blog on the TV screen?

Part 5 – (AttrakDiff)

1. Please evaluate the service by checking the boxes in the below table

2. Questions about the Live-Blog (summarized in Image 30)
3. Would you recommend the Live-Blog to friends?


15. ANNEX 7: German Pilot Questionnaire – verknallt & abgedreht, Phase 1

(Planned) Online questionnaire

1. Who are you?
Your e-mail address
E-mail address* ________________________________
Please use an e-mail address that you check regularly. After registering, we will send you an e-mail with a link. When you click the link, you can complete your registration.
Confirm e-mail address* ________________________________
It is important that you enter your e-mail address twice in a row, because this helps to avoid typing errors.
Your gender: you are*
a boy
a girl
Your date of birth (selection menu): To see whether your opinions perhaps depend on age, we need your date of birth.
Your place of residence: federal state, region

2. Which media do you use?
Do you have a TV at home?
Yes
No
Do you own your own TV?
Yes
No
How often do you use the TV?
daily
weekly
monthly
occasionally/rarely
never
Do you use a TV set with an Internet connection (Smart TV)?
Yes
No
What do you use the TV for?
Watching the TV program
Media libraries (Mediatheken)
Video streaming services (such as Watchever, Maxdome or others)
Skype
Other: ____________________________
Which of these devices do you own?
a computer/laptop
a smartphone (e.g. iPhone)
a tablet PC
What do you use to go online? (You can tick up to three boxes)
Computer
Smartphone
Tablet PC
How often do you use the Internet?
daily
weekly
monthly
at least occasionally/rarely
never
What do you use the Internet for?
Communication (e-mail, chatting, communities, discussion forums, social networks, Skype)
Videos/music (YouTube, media libraries ...)
Information (e.g. news, for school)
Web 2.0/participation (blogs, uploading your own videos, Twitter, photo collections, Wikipedia)
Other: ____________________________
a) If you use social media applications, which ones do you use regularly?
Wikipedia
Video portals (YouTube, Vimeo, ...)
Private networks and communities (Facebook, Google+, ...)
Instagram
Blogs
Twitter
Other: ____________________________


3. Questions about preferences and habits:
Which programs interest you?
Documentaries or culture
I watch often ○ ○ ○ ○ ○ ○ I never watch
TV magazines (politics, culture, advice, etc.)
I watch often ○ ○ ○ ○ ○ ○ I never watch
Daily series (crime, thrillers, soaps, etc.)
I watch often ○ ○ ○ ○ ○ ○ I never watch
Weekly series (crime, thrillers, soaps, etc.)
I watch often ○ ○ ○ ○ ○ ○ I never watch
TV shows (casting, reality TV, game shows, quiz, talk, etc.)
I watch often ○ ○ ○ ○ ○ ○ I never watch
News
I watch often ○ ○ ○ ○ ○ ○ I never watch
What do you do while you watch TV?
Search the Internet for additional information about the program
often ○ ○ ○ ○ ○ ○ never
Household chores (washing up, tidying, hanging up laundry ...)
often ○ ○ ○ ○ ○ ○ never
Eating
often ○ ○ ○ ○ ○ ○ never
Communicating with others about the program currently on (WhatsApp, Skype, ...)
often ○ ○ ○ ○ ○ ○ never
Communicating with others independently of what is currently on
often ○ ○ ○ ○ ○ ○ never
Twitter, Facebook, blogs about the program (e.g. #tatort)
often ○ ○ ○ ○ ○ ○ never
Homework
often ○ ○ ○ ○ ○ ○ never
Other: ____________________________


4. Questions about your expectations:
What might particularly interest you about this series?
a) Chat
... with other viewers
Very interesting ○ ○ ○ ○ ○ ○ not interesting
... with the young people from the series
Very interesting ○ ○ ○ ○ ○ ○ not interesting
... with experts on the topic of love
Very interesting ○ ○ ○ ○ ○ ○ not interesting
... with experts from the production
Very interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to read a chat about the program? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
On which device would you like to actively chat along yourself? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
b) Home stories (videos) of the young people
While the program is on TV
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to watch the home stories? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
Outside the broadcast time
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to watch the home stories? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
c) Picture galleries
While the program is on TV
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to view the picture galleries? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
Outside the broadcast time
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to view the picture galleries? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
d) Profiles of those involved (actors, film crew, experts)
While the program is on TV
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to view the profiles? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
Outside the broadcast time
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to view the profiles? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
e) Funny videos on the topic of "useless love trivia"
While the program is on TV
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to watch these videos? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
Outside the broadcast time
Extremely interesting ○ ○ ○ ○ ○ ○ not interesting
On which device would you like to watch the videos on "useless love trivia"? On ...
a computer/laptop
a smartphone
a tablet PC
a TV
5. Remarks, additions and explanations

Interview guide Lab Survey 1

Introduction
"We have here a TV series for children and young people."
"We will watch a short film together."
◦ Please show it on the PC, so as not to pre-empt the navigation of the HbbTV app
"What do you think of it? Does it sound interesting to you?"
◦ If not: "Which topic would have interested you more?"
"There is more about this TV series on the television."
◦ Turn on the TV, switch to KiKA, open the app
◦ You yourselves know best whether your children are familiar with HbbTV. If not, I would not explain HbbTV; at most mention it once, so that next time they know what is being talked about:
◦ "What we have made here in our department is an HbbTV app. That is a boring word that nobody understands, but it really means that you can not only watch TV, but also find more about this series."
Comprehension test (Explore)
* Let the children discover the application. Each child should explore the application on its own (and individually) with the remote control. Is it self-evident which buttons they have to use for this?
"What exactly you can do here is something you should now try out yourselves. One of you stays here with UVW, the others go with XY through the other rooms and look at everything."
◦ The test child sits in front of the TV; an observer sits next to it and gives only gentle nudges if the child gets stuck.
◦ Ask the child to say what it is doing and why; the testers often find this uncomfortable because they are afraid of saying something wrong or stupid.
If the child falls silent again, you have to decide spontaneously whether to remind it or to observe silently and take notes.
At the latest when you no longer understand what it is doing or trying to do, you should ask again that it says out loud what it wants to do.
◦ Under no circumstances correct the child if it takes an unexpected path! But definitely make a note of it!
◦ Only intervene if it could not continue at all without help. Rather encourage it to keep trying and, in such a case, note down exactly what happened.
There is no right or wrong! Perhaps you also say this out loud and clearly; I would decide that spontaneously in the situation.
Test procedure
◦ The start page is on the screen, see above
◦ The remote control lies in front of the TV
◦ The child sees the start page: what does it do?
◦ If it reaches for the remote control by itself, simply observe what it does.
You only need to take notes if something contradicts our concept. Otherwise notes such as "understood everything quickly" or "everything as expected" are sufficient.
◦ If it does not intuitively reach for the remote control, ask it
what it sees and
what it thinks might be hidden behind all these things (pictures)
Then put the remote control in the child's hand and ask it to say what it is doing and why
◦ Not all pages have to be clicked through; that would take too long! And this is not really a usability test anyway
◦ What is important, however, is that a video is opened (again) at the end
Comparison test (Experience)
* This is about the question of whether the quality difference between HD and SD plays a role. However, this test requires that at least one video in the HbbTV app is also available in a better or worse quality. Remo knows whether and where something like this exists. If not, this part can be dropped.
◦ Open a video from within the HbbTV app
◦ Open the same video again on the PC
"We will now show you the same video twice. The same thing happens in both videos, but do you think one of them looks nicer?"


16. ANNEX 8: German Pilot Questionnaire – verknallt & abgedreht, Phase 2

Interview guide Lab Survey 2 (originally developed as online questionnaire)

Part 1
Questions about device usage
How often do you watch TV? Tick one answer.
daily □
several times a week □
once a week □
less often □
never □
Do you use a TV set with an Internet connection? Tick one answer.
yes □ no □ don't know □
If you have your own TV, is it connected to the Internet? Tick one answer.
yes, it is connected □
no, it is not connected □
don't know □
I don't have a TV □
Which programs do you like watching most? You can tick up to 3 boxes.
Sitcoms/comedy □
Scripted reality □
Comics/animation □
Crime/mystery □
Knowledge magazines/documentaries □
Doctor/hospital series □
Daily soaps □
Info/news □
Casting shows □
Sports programs □
Questions about the program
Where do you watch "verknallt & abgedreht"? You can give as many answers as you like.
on rbb television □
in the media library (Mediathek) □
in the "verknallt & abgedreht" application (HbbTV) □
other _____________________ □
How many episodes have you already seen? Write down the number.
Number from 1-20: __________________
How likely is it that you will recommend this program to your friends? Tick one number. (0 unlikely – 10 very likely)
0 – 1 – 2 – 3 – 4 – 5 – 6 – 7 – 8 – 9 – 10
Questions about the additional content
Do you know the additional content on the TV? Tick one answer.
Yes, I know it and have already looked at it □
Yes, I know it but do not use it □
No, I do not know it □
Which additional content do you like best overall? You can tick up to 3 boxes.
Live-Blog □
Videos □
Voting □
Interviews □
Profiles of the actors □
Profiles of the crew □
Glossary of film terminology □
"Useless love trivia" section □
Picture galleries of the celebrities, locations, coaches □
How likely is it that you will recommend the additional content to your friends? Tick one number. (0 unlikely – 10 very likely)
0 – 1 – 2 – 3 – 4 – 5 – 6 – 7 – 8 – 9 – 10
Questions about the Live-Blog
Do you know the Live-Blog at www.verknalltundabgedreht.de?
Yes, I have already taken part □
Yes, I followed the Live-Blog during the program □
Yes, I looked at the Live-Blog, but not during the program □
No, I do not know the Live-Blog □
Do you know the Live-Blog on the TV?
Yes, I have already taken part □
Yes, I followed the Live-Blog during the program □
Yes, I looked at the Live-Blog, but not during the program □
No, I do not know the Live-Blog □
Part 2
Below you will find pairs of words with which you can rate the verknallt & abgedreht application. Each pair represents extreme opposites, between which gradations are possible.
This rating would mean that the application is rather complicated for you.
Do not think about the word pairs for long; please give the assessment that comes to mind spontaneously. Some word pairs may not seem to fit the application very well, but please always tick an answer anyway. Remember that there are no "right" or "wrong" answers: only your personal opinion counts!


17. ANNEX 9. German Pilot Questionnaire – TV App Gallery

Did you have any problems? If so, please describe them.

What could be improved?

What do you think about the concept of having access to any HbbTV application at a central location?

What do you find on the navigation positive and what is negative?

What is your general impression of the TV App Gallery?

How useful do you think in general, is an open app portal for HbbTV?

Do you use HbbTV also at home? If so, what do you use?


18. ANNEX 10: Spanish Pilot Questionnaire – Oh Happy Day

User Profile

1. Age
2. Sex
3. Generally, I am a person who is...
Showing stress – Tense or stressed – Animated, excited, enthusiastic – Upset, angry, irritated – Satisfied – Unsatisfied – Attentive, careful – Enable – Nervous – Restless, worried – Passive
4. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other
5. How often did you watch TV in the last month?
Once – Two or three times – Once a week – Several times a week – Daily – Several times a day
6. Where do you usually watch?
In my house – At the home of someone else (friends, family ...) – At work or at school – Elsewhere
7. To what extent do you usually watch television alone or with others (friends, family ...)?
I usually watch TV alone – I usually watch TV with others – I watch TV alone and with others about equally
8. When you watch television with others, how many people are usually around you?
1 other – 2 or 3 others – 4 or 5 others – Over five others
9. What other activities do you usually do while you watch TV throughout the week?
While using a laptop – While I use a desktop computer – While I use a tablet – While I play games (e.g. on a mobile device or laptop ...) – While listening to the radio – While I do household chores (e.g. ironing ...) – Other
10. What kinds of programs do you watch most often?
Movies – Sports – Drama series and TV series – Documentaries – News and information – Comedy – Magazines and variety programs – Hobby and lifestyle programs – Musical programs – Cartoons – 'Drama' (Big Brother type) – Competitions – Other
11. How often did you watch TV on the Internet (for example, through the TV3 a la carta website) in the last month?
Once – Two or three times – Once a week – Several times a week – Daily – Several times a day
12. Do you use the Internet to watch programs you missed or could not see live?
Yes – No
13. Do you like to watch TV following the broadcast schedule, or do you prefer to do it at a time of your choice?
Following the schedule – At the time that I choose
14. Do you tend to watch a TV program if it has been recommended by friends or family?
Yes, I always tend to follow their recommendations – Yes, but only after searching for more information – No, rarely

Technical Glitches
15. On a scale of 0 to 10, to what extent have you been affected by technical problems while using the multi-camera app? (0-10)
16. Have there been instances when you were prevented from using the multi-camera app because of technical problems?
Yes, I could not use the app at any time – Yes, at some point I could not use it due to technical problems – No, whenever I wanted to use the app, it worked fine
17. Which of the following technical problems have you experienced on at least one occasion?
I tried to enter the TV-RING tab within the TV3 application, but I did not have access – Within the TV-RING tab, I wanted to see the content but the videos did not load – Other

Evaluation of the application - I
18. How often do you think you would use this app?
Daily – A couple of times a week – A couple of times a month – Less than once a month – Almost never
19. Did you find the multi-camera content easily?
Yes, it was quite clear – Yes, but it took me some effort – No, I searched but did not find it – No, I do not know what the multi-camera content is
20. Have you found it interesting to watch a program from different points of view?
Yes, I think it is a useful feature – Yes, but it does not add much value to the program – No, it does not add enough for me
21. How satisfied are you with the following items of the application?
Video quality – Speed when changing views – Accessibility of content – Ease of use of the application – Selection of available content – Availability of multiple cameras – Clarity of presentation of the information
Very dissatisfied – Dissatisfied – Not sure – Satisfied – Very satisfied
22. How much do you agree with the following statements?
I think I would use the application often – I found the application unnecessarily complex – I found the application easy to use – I would need the help of a technician to use the app – I found that the various elements of the application were well organized and integrated – I found too many inconsistencies in the application – I imagine most people would learn to use the application quickly – I found the application cumbersome to use – I felt comfortable using the application – I had to learn many things before I could start using the application
Strongly disagree – Disagree – Neither agree nor disagree – Agree – Strongly agree
23. Is there something you missed in the application?
24. In general, how satisfied are you with the application? (0-10)
25. Would you recommend this app to your friends and family? (0-10)

Evaluation of the application - II
26. Which of the pictograms in the image below best represents your mood after having used the multi-camera application? (Pictograms A-I)
27. How happy or unhappy did you feel while using the multi-camera application? (1-9)
28. How calm or active did you feel while using the multi-camera application? (1-9)
29. To what extent do the following words describe your emotional state while using the multi-camera application?
Attentive – Pleased – Surprised – Cheerful – Focused – Bored – Involved
Not at all – A little – Moderately – Quite a bit – Extremely

Page 148: Deliverable - TV-RING...D4.3 Evaluation results 2 version 1.0, 01/04/2016 1. Executive Summary This deliverable presents and describes the evaluation results of the pilots conducted

147 version 1.2, 01/04/2016 D4.3 Evaluation results

19. ANNEX 11: Spanish Pilot Questionnaire – FC Barcelona vs PSG Champions League

User Profile

1. Age
2. Sex
3. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other

Technical Glitches
4. Have there been instances when you were prevented from using the multi-camera app because of technical problems?
Yes, I could not use the app at any time – Yes, at some point I could not use it due to technical problems – No, I was always able to use the app, it worked fine
5. On a scale of 0 to 10, to what extent have you been affected by technical problems while using the multi-camera app? (0-10)
6. Which of the following technical problems have you experienced on at least one occasion?
I tried to enter the TV-RING tab within the TV3 application, but I did not have access – Within the TV-RING tab, I wanted to see the content but the videos did not load – Other

Evaluation of the application
7. Have you found it interesting to watch a program from different points of view?
Yes, I think it is a useful feature – Yes, but it does not add much value to the program – No, it does not add enough for me
8. What is your opinion about the interest of the additional views offered in the application? (Messi, Ibrahimovic, Luis Enrique)
Slightly interesting – Uninteresting – Interesting – Very interesting
9. How often do you think you would use this app if it were offered again for a football game?
Frequently (every five minutes or more) – From time to time (every 10-20 minutes) – Once or twice throughout the game – Would not use it
9.1. If you selected "Would not use it" in the previous question, could you explain why?
10. How satisfied are you with the following items of the application?
Video quality – Speed when changing views – Accessibility of content – Ease of use of the application – Selection of available content – Availability of multiple cameras – Clarity of presentation of the information
Very dissatisfied – Dissatisfied – Not sure – Satisfied – Very satisfied

Page 150: Deliverable - TV-RING...D4.3 Evaluation results 2 version 1.0, 01/04/2016 1. Executive Summary This deliverable presents and describes the evaluation results of the pilots conducted

149 version 1.2, 01/04/2016 D4.3 Evaluation results

20. ANNEX 12: Spanish Pilot Questionnaire – Mayoral Elections

Technical assessment

Glitches in the installation of the second screen app
1. Have you had the opportunity to install the second screen app on a tablet or smartphone, following the instructions in the PDF attached to our e-mail ("Pilot Municipal Elections May 2015")?
Yes, I tried to install the second screen app – No, I have not had the chance to install the application
1.1 If so, were you able to control the multi-camera application on your TV via the tablet or smartphone?
Yes, I could switch views through the tablet – No, I was not able to control the application and switch views through the tablet at any time
2. If you experienced any technical problems using the tablet or phone, could you briefly describe the problem?
3. And finally, could you tell us which tablet model and operating system you used?

Page 151: Deliverable - TV-RING...D4.3 Evaluation results 2 version 1.0, 01/04/2016 1. Executive Summary This deliverable presents and describes the evaluation results of the pilots conducted

150 version 1.2, 01/04/2016 D4.3 Evaluation results

21. ANNEX 13: Spanish Pilot Questionnaire – Mayoral Elections UX

User Profile

1. Age
2. Sex
3. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other

Technical Glitches
4. Have there been times when you experienced technical problems that prevented you from using the app to change the view from your tablet or smartphone?
Yes, I could not control the application from my tablet or phone at any time – Yes, at some point I could not control the camera change from my phone or tablet due to technical problems – No, I was always able to control the app from my tablet or phone, it worked fine – I do not know, this time I could not take part in testing the application on a mobile or tablet
4.1 If you attempted to use a phone or tablet to control the application, could you tell us the model and operating system of your tablet or phone?
5. On a scale of 0 to 10, to what extent have you been affected by technical problems while using the multi-camera app? (0-10)
6. Which of the following technical problems have you experienced on at least one occasion?
I tried to enter the TV-RING tab within the TV3 application, but I did not have access – Within the TV-RING tab, I wanted to see the content but the videos did not load – Other


Evaluation of the application
7. Have you found it interesting to watch a program from different points of view?
Yes, I think it is a useful feature – Yes, but it does not add much value to the program – No, it does not add enough for me
8. How often do you think you would use this app if it were offered again for a special news event?
Frequently (every five minutes or more) – From time to time (every 10-20 minutes) – Once or twice throughout the program – Would not use it
8.1. If you selected "Would not use it" in the previous question, could you explain why?
9. How satisfied are you with the following items of the application?
Video quality – Speed when changing views – Accessibility of content – Ease of use of the application – Selection of available content – Availability of multiple cameras – Clarity of presentation of the information
Very dissatisfied – Dissatisfied – Not sure – Satisfied – Very satisfied
10. Is there something you missed in the application?
11. In general, how satisfied are you with the application? (0-10)

22. ANNEX 14: Spanish Pilot Questionnaire – FC Barcelona vs Juventus Champions League

User Profile

1. Age
2. Sex
3. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other

Technical Glitches
4. Have there been times when you experienced technical problems that prevented you from using the app to change the view from your tablet or smartphone?
Yes, I could not control the application from my tablet or phone at any time – Yes, at some point I could not control the camera change from my phone or tablet due to technical problems – No, I was always able to control the app from my tablet or phone, it worked fine – I do not know, this time I could not take part in testing the application on a mobile or tablet
4.1 If you attempted to use a phone or tablet to control the application, could you tell us the model and operating system of your tablet or phone?
5. On a scale of 0 to 10, to what extent have you been affected by technical problems while using the multi-camera app? (0-10)
6. Which of the following technical problems have you experienced on at least one occasion?
I tried to enter the TV-RING tab within the TV3 application, but I did not have access – Within the TV-RING tab, I wanted to see the content but the videos did not load – Other

Evaluation of the application
7. Have you found it interesting to watch a program from different points of view?
Yes, I think it is a useful feature – Yes, but it does not add much value to the program – No, it does not add enough for me
8. What is your opinion about the interest of the additional views offered in the application?
Slightly interesting – Uninteresting – Interesting – Very interesting
9. How often do you think you would use this app if it were offered again for a football game?
Frequently (every five minutes or more) – From time to time (every 10-20 minutes) – Once or twice throughout the game – Would not use it
9.1. If you selected "Would not use it" in the previous question, could you explain why?
10. How satisfied are you with the following items of the application?
Video quality – Speed when changing views – Accessibility of content – Ease of use of the application – Selection of available content – Availability of multiple cameras – Clarity of presentation of the information
Very dissatisfied – Dissatisfied – Not sure – Satisfied – Very satisfied

11. Is there something you've missed in the application?


23. ANNEX 15: Spanish Pilot Questionnaire – FC Barcelona vs Bayer Leverkusen Champions League

User Profile

1. Age
2. Sex
3. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other

Evaluation of the application
4. Did you like the programs and content offered through the application better than the same program without additional views?
Yes – No – I do not know
5. Has the fact that there were multi-camera programs prompted you to spend more time watching TV, compared with the same program without additional views?
Yes – No – I do not know
6. Have you accessed the application to watch again fragments of programs you had already seen live, this time to watch them from different views?
Yes, often – Yes, once or twice – No, never – I do not know


7. Has the possibility to access multiple views in a program made you change channels less, compared with the same program without additional views?
Yes – No – I do not know
8. In conclusion, from 0 to 10, with 0 being the lowest score and 10 the highest, what is your overall assessment of the final multi-camera application of the TV-RING project? (0-10)
9. Finally, what do you think are the positive aspects of having more views of a particular piece of content?


24. ANNEX 16: Spanish Pilot Questionnaire – FC Barcelona vs AS Roma Champions League

User Profile

1. Age
2. Sex
3. Which of the following descriptions fits the characteristics of your home?
Families with children – Couple – Group of friends – Single-person household – Other
4. With how many people did you watch the game?
I watched it by myself – 1 other person – 2 or 3 other people – 4 or 5 other people
5. What is the model of your TV?

Evaluation of the application
6. In general, how satisfied are you with the application? (0-10)

7. Is there something you missed in the application?
8. How satisfied are you with the following items of the application?
Video quality – Speed when changing views – Accessibility of content – Ease of use of the application – Selection of available content – Availability of multiple cameras – Clarity of presentation of the information
Very dissatisfied – Dissatisfied – Not sure – Satisfied – Very satisfied
9. Have you found it interesting to watch a match from different points of view?
Yes, I think it is a useful feature – Yes, but it does not add much value to the program – No, it does not add enough for me
10. What kinds of programs would you like to see in multi-camera?
Football – Basketball – Special news (elections, etc.) – News and current affairs – Magazines and variety programs – Musical programs – 'Drama' – Quiz shows and competitions – Formula 1 and motorcycling – Other