2009 International Conference on Networks Security, Wireless Communications and Trusted Computing
978-0-7695-3610-1/09 $25.00 © 2009 IEEE
DOI 10.1109/NSWCTC.2009.295

The Measurement of the Search Charts of Music

Yue Yang, Changjia Chen, Yishuai Chen, School of Electric and Information Engineering, Beijing Jiaotong University

[email protected], [email protected], [email protected]

Abstract

Nowadays, more and more singers publish songs, and about 50,000 songs are available on the Internet. Because the songs are refreshed every day, a real-time music chart is needed. This paper studies the real-time search charts of music. We studied the algorithm of the charts and analyzed their rankings. From the rankings we found some special characteristics that cannot be seen in other music charts, such as sales charts and weekly charts. However, the real-time charts also have problems that need to be solved, such as the special users of the charts. Once these problems are solved, the charts will be more impartial.

Keywords: search charts, special users, real-time charts

1. Introduction

Music, like movies and pictures, is now an important part of people's lives. Music charts play an important role in the music industry. Giles [1] showed that music charts feed information back to the music market. Many music charts, such as the UK charts and Billboard's charts [2], are based on CD sales. Nowadays, however, the popularity of music cannot be estimated from sales alone. There are many ways to listen to music, and many people listen on the Internet instead of buying CDs, so the most popular music is not necessarily the best-selling music. Today, if we want to know which music is popular, we can search the Internet, and the hit charts tell us the popularity of the music.

There are many kinds of music charts, such as hit charts, download charts, mobile ringtone charts and so on. This paper discusses search charts.

Real-time charts are discussed in the next section. Section 3 introduces the particular chart studied in this paper. Section 4 presents the data and results. Section 5 concludes our study. The last two sections cover future work and acknowledgements.

2. The real-time charts

Most music charts show only the ranking of the previous day or the previous week. They can tell us only which song or singer was the most popular in the last day or week, so the information in those charts is not real-time.

Sometimes we need to know the latest ranking of the music.

For example, when a new single is published this morning, many people want to listen to it right away. If you do not know the latest song, you are not up to date. In particular, when we look at music charts, we want to know the latest and the hottest songs, so we expect the charts to show the latest results. In 2008, when the Sichuan earthquake happened in China, the songs about the earthquake were popular, but many music charts did not show this until one or two weeks later.

3. The naver-charts

In this paper, the data come from the naver-charts, which are real-time common charts. www.naver.com has more than 70% of the Korean market, so we can say that more than 35 million Korean people use it. Information from www.naver.com is translated into many languages and quoted on many Korean entertainment websites.

The naver-charts we studied differ from other charts in several ways. First, the naver-charts are not hit charts but search charts: the rank is based on the number of searches. Second, the naver-charts show the top-10 hit music keywords in real time, so they are real-time charts. Third, the charts are keyword charts of music. The music keywords include not only the names of songs but also the names of singers and the lyrics, so the trends of the singer, the song and the lyrics can be seen in the figures below.

4. Data and results

The data used in this paper were collected over about one week, from June 9th to June 17th, and we use part of the data in this study. Korean people work six days a week and rest only on Sunday, so we consider only the data of working days. The data cover June 9th 12am to 13th 00am and 14th 00am to 17th 12am. The times shown in the figures are Seoul time.
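The ranking rule described in Section 3 (rank by number of searches, show the top 10) can be sketched as follows. This is only an illustration under stated assumptions: the exact algorithm used by the naver-charts is not given here, and the function name `top_k_keywords`, the tie-breaking rule and the sample search log are hypothetical.

```python
from collections import Counter


def top_k_keywords(searches, k=10):
    """Rank keywords by search count and return the k most searched.

    `searches` is a list of searched keywords observed in one sampling
    window (a hypothetical input format, not the chart's real one).
    """
    counts = Counter(searches)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [keyword for keyword, _ in ranked[:k]]


# Hypothetical search log for one sampling window (not real chart data).
window = ["so hot", "circus", "so hot", "wonder girls so hot", "circus", "so hot"]
chart = top_k_keywords(window, k=2)
print(chart)
```

Recomputing such a list for each short sampling window is what makes a chart of this kind real-time; the window length is another parameter the sketch leaves open.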

Because the charts we studied are search charts, the data contain about 800 keywords. These keywords include the names of singers, the names of songs, singers together with their songs, and in particular the lyrics of songs. We sorted the keywords into four kinds. The singers can also be sorted into two kinds according to the search keywords: popular singers, whose names are searched in different forms, and the remaining common singers. Tables 1 and 2 show the classification.

Table 1. The sorts of keywords

Total  Singers  Songs  Singer & song  Lyrics  Others
758    55       578    81             48      6

Table 2. The sorts of singers

Singers  Popular singers  Common singers
55       32               23

4.1. The basic characteristic of the ranking

From the ranking of the music, we can obtain the popularity of each singer and song. Four figures illustrate this.

Figure 1 shows two lines, each giving the rank of one keyword. The keywords that users searched for are the singer's name A (A = Wonder Girls) together with their song a (a = So Hot), and the song a alone. The blue line is the “singer and song” keyword; the other line is the “song” keyword. The data in Figure 1 cover only 12am to 12pm of one day.

Figure 2 also shows the ranks of “A & a” and a, but over one week. The x-axis of Figure 2 runs from 0 to 24 hours, and the y-axis shows the dates of the data. From top to bottom there are eight days; the fourth day is Saturday. The blue line in Figure 2 is the “singer and song” keyword; the yellow line is the “song” keyword only.

We select another famous singer, B (B = MC Mong), to compare with A. The song of B is b (b = Circus). A and B are both famous Korean singers.

Figure 3 illustrates the rankings of “B & b” and b on the 9th, and Figure 4 illustrates them over one week. There are three lines in Figure 3. The blue and green lines have the same meaning as the lines in Figure 1: the blue line shows the dynamics of the hits of “B & b” and the green line is the keyword b. The red line is based only on the “lyrics of b”.

From these figures we can distinguish the popularity of each singer. If a song is popular, not only the song but also the singer will be hot: there is a correlation between the popularity of a singer and that of his song. From the figures we see that b is more popular than a.

4.2. Correlation between the A&a and a

Figure 1. The ranking of A&a and a in 9th

Figure 2. The ranking in one week

Figure 3. The ranking of B&b and b in 9th

Figure 4. The ranking in one week

This relationship can be measured by the correlation between the ranking of the “singer and song” keyword and that of the “song” keyword. Only the data of July 9th are used for this. The expression of the correlation coefficient is:

$$\rho(X,Y) = \frac{E(XY) - E(X)\,E(Y)}{\sqrt{D(X)}\,\sqrt{D(Y)}}$$

In our study, X is the ranking of the “singer and song” keyword and Y is the ranking of the “song” keyword. E(X) and E(Y) are their average ranks in one day, E(XY) is the average of their product, and D(X) and D(Y) are their variances. ρ gives the correlation between the ranking of the “singer and song” and that of the “song”.
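As a minimal sketch of how this coefficient could be computed from two rank series, assuming the ranks of both keywords are sampled at the same times over the day: the function name `rank_correlation` and the sample ranks are hypothetical, and how a keyword that is missing from the top 10 should be scored is not specified in the text, so the sketch assumes every sample has a rank.

```python
from math import sqrt


def rank_correlation(x, y):
    """Correlation coefficient (E(XY) - E(X)E(Y)) / (sqrt(D(X)) * sqrt(D(Y)))."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    ex = sum(x) / n                               # E(X): average rank of X
    ey = sum(y) / n                               # E(Y): average rank of Y
    exy = sum(a * b for a, b in zip(x, y)) / n    # E(XY)
    dx = sum(a * a for a in x) / n - ex * ex      # D(X): variance of X
    dy = sum(b * b for b in y) / n - ey * ey      # D(Y): variance of Y
    if dx <= 0 or dy <= 0:
        raise ValueError("correlation is undefined for a constant series")
    return (exy - ex * ey) / (sqrt(dx) * sqrt(dy))


# Hypothetical ranks sampled at the same times (not the measured data).
singer_and_song = [3, 2, 2, 1, 4]
song = [5, 4, 3, 3, 6]
print(rank_correlation(singer_and_song, song))
```

A value near 1 means the two keywords rise and fall in the chart together; a value near 0 means their ranks move largely independently.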

Finally, we obtained the correlations for A and B: ρ(A&a, a) = 0.0573 and ρ(B&b, b) = 0.2975.

Obviously, ρ(B&b, b) is much bigger than ρ(A&a, a), so the influence of A's fans is smaller than that of B's fans. If a singer's song is popular, the singer will have more fans, and the song will therefore stay in the top-k ranking for a long time.

4.3. The special users

Figure 1 tells us two things: the popularity of the singer, and the time at which users search for their singer. We measured from 12am to 12pm. The fans of “A” searched for their singer from 3pm to 8pm, with a rest from 5pm to 7pm.

In Figure 3, the singer is B (B = MC Mong) and his song is b (b = Circus). The users who search for the song b are common users: their searching is continuous. But those who search for the “singer and song” keyword are not common users. These uncommon users appeared from 2pm to 3pm and from 5pm to 7pm.
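The observation above, that a keyword is present in the chart only during certain hours, can be checked mechanically. The following is a minimal sketch, assuming the chart is stored as hypothetical `(hour, top-10 list)` samples; the function name `hours_in_chart` and the sample data are illustrative and are not the measured data.

```python
def hours_in_chart(samples, keyword):
    """Return the sorted hours of day at which `keyword` was in the chart.

    `samples` is a list of (hour, chart) pairs, where `chart` is the list
    of keywords shown in the top 10 at that hour (hypothetical format).
    """
    return sorted({hour for hour, chart in samples if keyword in chart})


# Hypothetical samples: the song keyword "b" is searched almost continuously,
# while the "singer and song" keyword "B b" shows up only in some hours.
samples = [
    (13, ["b"]),
    (14, ["B b", "b"]),
    (15, ["b"]),
    (17, ["B b", "b"]),
    (18, ["B b", "b"]),
]
print(hours_in_chart(samples, "B b"))
print(hours_in_chart(samples, "b"))
```

A keyword whose hours form a few short bursts, rather than a continuous run, is a candidate for being driven by the special users described here.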

We noticed the same thing in Figures 2 and 4. The “singer and song” keywords appeared only in the afternoon and on Saturday, so the users who search for “singer and song” have time only in the afternoon and at the weekend. Among the users of naver, many care only about the rank of one particular singer. We call these special users fans. Almost all of the special users are 10 to 20 years old.

4.4. The special singers

If a singer has published a song recently, he or she should be in the top-k ranking. If a singer has not published new songs for a long time, he or she is not expected to be in the top 10 of the charts. In the music charts, however, there is a special singer who has no new songs. Since we want to see only the latest singers and songs in the top-k, this is not right for a music chart. Figure 5 shows this case. The keyword in Figure 5 is the name of a very famous singer, TVXQ. Although the last new song was published in September 2006, the name is still in the top-k of the music charts every day. For comparison, Figures 6 to 10 show the rankings of five other singers, named A, B, C, D and E. Singers A and B are the same as before. C published his album on May 22nd, and E published her album on July 14th. The figures show the ranking of each singer from July 9th to 17th, with the data of the 9th at the top of the picture and the data of the 17th at the bottom. Because the data of the 13th are similar to those of the 12th, we consider only the other seven days.

Figure 5. Singer=T

Figure 6. Singer=A

Figure 7. Singer=B

Figure 8. Singer=C

Figure 9. Singer=D

Figure 10. Singer=E

5. Conclusion

The real-time charts have many advantages. They show the most popular songs in time and deliver the newest information to users as soon as possible. On the other hand, the real-time charts also have some problems. The special users are a big problem for real-time charts, especially for common charts, because they affect the ranking results. The effect of special users should be eliminated or weakened.

6. Future works

Why do users search for musical material on the Internet, and how? What kind of application do users like [3]? Are the charts we used impartial or not? These questions remain open.

We need to design a new ranking algorithm that avoids the influence of special users and makes the music charts more reliable.

On the other hand, RSS and news are also popular, so real-time charts of RSS and news will be popular in the future. Their refresh time and the study of their users are our future work.

7. Acknowledgements

This work is supported by:
(1) Chinese Ministry of Science and Technology “973” (2007CB307101)
(2) The National Natural Science Foundation of China (60772043, 60672069)
(3) Chinese Ministry of Education (20050004033)
(4) The Foundation of BJTU (2003SM017)

8. References

[1] David E. Giles, “Increasing returns to information in the U.S. popular music industry”, Econometrics Working Paper, EWP0510.
[2] David E. Giles, “Survival of the Hippest: Life at the Top of the Hot 100”, Econometrics Working Paper, EWP0507.
[3] Joe Futrelle, J. Stephen Downie, “Interdisciplinary Communities and Research Issues in Music Information Retrieval”, IRCAM – Centre Pompidou, 2002.