
Behavioral-Based Cheating Detection in Online First Person Shooters
using Machine Learning Techniques

Hashem Alayed
University of Southern California
[email protected]

Fotos Frangoudes
University of Southern California
[email protected]

Clifford Neuman
Information Sciences Institute
[email protected]

Abstract—Cheating in online games comes with many consequences for both players and companies. Therefore, cheating detection and prevention is an important part of developing a commercial online game. Several anti-cheating solutions have been developed by gaming companies; however, most of them use cheating detection measures that may involve breaches of users' privacy. In this paper, we provide a server-side anti-cheating solution that uses only game logs. Our method first defines the behavior of honest players and of cheaters, then uses machine learning classifiers to train cheating models and detect cheaters. We present our results under several data organizations to show different options for developers, and our methods achieved very high accuracy in most cases. Finally, we provide a detailed analysis of our results with practical suggestions for online game developers.

Keywords—Cheating Detection; Online Games; Machine Learning

I. INTRODUCTION

Security in online games is an important issue for any game that aims to generate revenue. Therefore, cheating detection and prevention is an important part of developing a commercial online game. Cheating in online games comes with many consequences for both players and companies. It gives cheaters an unfair advantage over honest players and reduces the overall enjoyment of playing the game. Developers are also affected monetarily in several ways. First, honest players leave the game if cheaters are not banned, and developers lose their subscription money. Next, the game gains a bad reputation, which leads to reduced revenue. Finally, companies have to spend more resources each time a new cheat is discovered in order to develop and release patches that counter it.

Anti-cheating solutions have been developed by gaming companies to counter these problems. However, most of the solutions use cheating detection measures that may involve breaches of users' privacy, such as the Warden of "World of WarCraft" [1] and PunkBuster [2].

In this paper, we provide a method of detecting cheats by simply monitoring game logs, i.e., by studying player behavior. We chose to apply our method to a First Person Shooter (FPS) game we developed using the Unity3D game engine [3]. The game is fully online, using a client-server model, and game logs are collected on the server side. After the logs are collected, they are pre-processed and several supervised machine learning techniques, such as Support Vector Machines and Logistic Regression, are applied to create detection models. When creating the models, we provide different experimental organizations of our data: multi-class classification, in which each cheat is represented as its own class; two-class classification, in which every cheat is simply labeled 'yes'; a grouping in which similar cheats are combined into one class; and, finally, a separate model for each cheat. The resulting detection models can then be used on new unlabeled data to classify cheats. The classification is based on a set of distinctive features: we define novel features in addition to previously used ones (in [4] and [5]). Finally, we rank the features, analyze the results, and suggest to developers proper ways of using the resulting models.

The remainder of the paper is organized as follows. Section II covers related work. In Section III, we describe the design and implementation of the game, the feature extractor, and the data analyzer. We explain and analyze our experimental results in Section IV. Finally, in Section V we conclude and discuss future work.

II. RELATED WORK

Several papers have showcased techniques for analyzing human players' behavior using machine learning. These techniques can be used either to differentiate a human player from a bot or to detect cheating. Some of the earliest work on machine learning in online games was done by Thurau et al., who used several approaches to develop human-like AI agents by analyzing human player behavior: a neural-network-based bot that learns human behavior [6], and a division of player actions into strategies and tactics using the Neural Gas Waypoint Learning algorithm to represent the virtual world [7]. They later improved [7] by applying Manifold Learning to address the curse of dimensionality [8], and again by using Bayesian imitation learning instead of the Neural Gas Waypoint Learning algorithm [9].

978-1-4673-5311-3/13/$31.00 ©2013 IEEE


Fig. 1: Trojan Battles in-game screenshot

Kuan-Ta Chen et al. [10] provided a bot detection method using Manifold Learning. Although their method was very accurate, it was based only on the character's movement positions and thus only worked for detecting a moving bot. Yeung et al. [4] provided a scalable method that uses a Dynamic Bayesian Network (DBN) to detect AimBots in FPS games. Their method produced good results, although better results could have been obtained with more features. Galli et al. [5] used different classifiers to detect cheating in Unreal Tournament III. Their detection accuracy was very high; however, their results were based on play-tests with up to two human players versus one AI player. Moreover, some of their features relied on there being only two players in the game, which is not the case in commercial online games; for example, they calculate the distance between a player and his target at all times, even when the target is not visible.

In this paper, we present different ways of detecting cheats using more (and more varied) features and different classifiers than [4] and [5].

III. DESIGN AND IMPLEMENTATION

Our system consists of four components: a game client with access to an AimBot system; a game server that handles communication between all the clients as well as logging; a feature extractor that pre-processes the log data; and, finally, a data analyzer that is responsible for training classifiers and generating models for the cheats.

A. The Game: Trojan Battles

Trojan Battles (Figure 1) is the game on which all the tests were run. It is an online multi-player FPS game built at the GamePipe Lab at the University of Southern California [11]. Both the server and the client were developed from scratch, giving us full control to make development changes and modifications with ease. It also gives us the ability to create game logs containing data similar to that of many commercial games available today.

1) Game Server: The game server was developed using C# and is non-authoritative; it is mainly responsible for communication between the clients and for logging game data. The server also manages game time, ensures all clients are synchronized, and notifies all players when each match

Fig. 2: Lobby screen with customization options

Fig. 3: In-game AimBot toggle screen

ends. Communication is event-based and achieved by sending specific messages on in-game actions, such as when a player spawns, dies, or fires. The server also receives a heartbeat message at intervals set before the start of each match, which includes position and state data.
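Concretely, a heartbeat record of this kind might look as follows. This is an illustrative sketch only: the field names are our assumptions, since the paper states only that heartbeats carry position and state data.

```python
import json

# Hypothetical heartbeat payload (field names are illustrative; the paper
# says only that heartbeats include position and state data).
heartbeat = {
    "player_id": "player_1",
    "timestamp": 12.4,                # seconds since match start
    "position": [10.5, 0.0, -3.2],    # world coordinates
    "view_direction": [0.0, 0.0, 1.0],
    "aiming_target_id": 0,            # 0 when not aiming at any target
    "visible_targets": [2, 3],        # IDs of currently visible opponents
    "is_firing": False,
    "movement_animation": "run",
}

# Serialized once per interval and logged server-side for feature extraction.
encoded = json.dumps(heartbeat)
```

Logging such records at a fixed rate is what later allows features to be computed per time frame without any client-side instrumentation.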

2) Game Client: The game client was developed using Unity3D and is used by players to log in to the server with their unique names. Players can then create a new game or find an existing one on the lobby screen (Figure 2), join it, and play a timed death-match. Each game can be customized by choosing a different map (currently a medium-sized and a larger-sized map are available), changing the game duration (3, 5, 10, or 15 minutes), selecting the network protocol (TCP or UDP), and setting the number of heartbeat messages sent each second (from 5 to 30).

The game provides five different weapons each player can choose from: machine gun, rifle, sniper, shotgun, and rocket launcher. Each of these weapons provides different aiming capabilities. Each game client also includes an AimBot utility that lets players activate and deactivate any cheats they want during a match.

3) Types of AimBots: Like commercial AimBots [12], each cheat is implemented as a feature that can be toggled in-game (shown in Figure 3). These features can be combined to apply a certain AimBot, or to create more complicated and harder-to-detect cheats.

The types of AimBots implemented in this game are:

• Lock (L): A typical AimBot that, when enabled, aims at a visible target continuously and instantly.

• Auto-Switch (AS): Must be combined with Lock to be activated. It switches the lock on and off to confuse the cheat detector. This AimBot was used in [4].

• Auto-Miss (AM): Must be combined with Lock to be activated. It creates intentional misses, also to confuse the cheat detector. This AimBot was also used in [4].

• Slow-Aim (SA): Must be combined with Lock to be activated, and it can also be combined with Auto-Switch or Auto-Miss. This well-known cheat works like Lock, but instead of aiming at the target instantly, it eases in from the current aim location.

• Auto-Fire (AF): Also called a TriggerBot; when the player's crosshair is on a target, it fires automatically. It can be combined with Lock to create the ultimate cheat.
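As an illustration of the Slow-Aim cheat described above, easing the aim toward the target instead of snapping can be sketched with simple per-frame interpolation. The easing factor and function name are our assumptions, not taken from the paper.

```python
def slow_aim_step(current, target, ease=0.15):
    # Move the aim a fixed fraction of the remaining way toward the target,
    # so the aim converges smoothly instead of locking on instantly.
    return [c + ease * (t - c) for c, t in zip(current, target)]

# Simulate 20 frames of easing from (0, 0) toward a target at (1, 1).
aim = [0.0, 0.0]
target = [1.0, 1.0]
for _ in range(20):
    aim = slow_aim_step(aim, target)
```

Because each step closes only a fraction of the remaining distance, Slow-Aim is harder to distinguish from skilled human aiming than an instant Lock.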

B. Feature Extractor

The feature extractor is a very important component of a behavior-based cheating detection technique. We need to extract the features that best distinguish the cheating behavior of an unfair player from the normal behavior of an honest one. These features depend on the data sent from the client to the server. In online games, the data sent between clients and server is limited by bandwidth constraints and the need for smooth gameplay. Therefore, in our game, we tried to reduce the size of the exchanged messages to limit bandwidth consumption, while still logging informative data that would produce meaningful results.

The features are extracted over time frames, and some of them are similar to those used in [5] and [4]. However, we used different time-frame sizes to observe their effect on detection. Most of the features are extracted from the HeartBeat messages logged in our system. If a feature depends on a target, it is calculated only when a target is visible; in the case of multiple visible targets, the closest one is considered the current target. For example, 'Mean Distance' calculates the distance between the player and his closest visible target. We explain each extracted feature below, along with the log data it requires, and we identify the most important features in Section IV-B. Features specific to FPS games are marked (FPS); generic features usable in other types of games are marked (G):

1) Number of Records (G): The total number of log rows during the current time frame. It uses data from the HeartBeat messages.

2) Number of Visible-Target Rows (G): The number of rows in which a target was visible. It also uses data from the HeartBeat messages.

3) Visible-To-Total Ratio (G): The ratio of the number of rows in which a target was visible to the total number of rows in the current frame.

4) Aiming Accuracy (FPS): Tracks the aiming accuracy based on the visible targets at each frame. While a target is visible and the player is aiming at it, the aiming accuracy increases exponentially each frame; when the player loses the target, it starts decreasing linearly. This feature uses the aiming-target ID and the visible-targets list from the logs.

5) Mean Aiming Accuracy (FPS): While a target is visible, simply the number of times the player aimed at a target divided by the number of times a target was visible. This feature uses the aiming-target ID and the visible-targets list from the logs.

6) Player's Movement (G): While a target is visible, this feature captures the effect of the player's movement on aiming accuracy. It uses the player's movement animation from the logs.

7) Target's Movement (G): While a target is visible, this feature captures the effect of the target's movement on the player's aiming accuracy. It looks up the target's movement animation from the logs at the current timestamp.

8) Hit Accuracy (FPS): Simply the number of hits divided by the total number of shots within the specified time frame. Head shots are given more weight than other body shots. It uses the target ID from the Actions table in the logs (the target is 0 on a miss).

9) Weapon Influence (FPS): A complex feature combining weapon type, distance, and zooming. While a target is visible, it calculates the distance between the player and the closest target, then the influence of the weapon used at that distance, and whether the player is zooming. This feature uses the weapon type, the player's position, the target's position, and the zooming value from the logs; the target's position is looked up at the current timestamp.

10) Mean View-Direction Change (G): The change of the view-direction vectors during the specified time frame, calculated as the mean of the Euler angles between consecutive vectors. Note that players change view direction drastically when their target dies; we therefore take re-spawn directions into account.

11) Mean Position Change (G): The mean of the distances between each position and the next during the specified time frame. Like the view directions above, this feature takes re-spawns into account.

12) Mean Target Position Change (G): The target's position changes, measured only while a target is visible.

13) Mean Distance (G): While a target is visible, the distance between the player and the target, averaged over the frame. It uses the player's position and the closest target's position from the logs.

14) Fire-On-Aim Ratio (FPS): The ratio of firing while aiming at a target. It uses the firing flag as well as the aiming target from the logs.

15) Fire-On-Visible (FPS): The ratio of firing while a target is visible. It uses the firing flag from the logs.

16) Instant-On-Visible (FPS): The ratio of using instant weapons while a target is visible. It uses the Instant-Weapon flag from the logs.

17) Time-To-Hit Rate (FPS): The time it takes a player from when a target becomes visible until the target is hit. It uses clients' timestamps from the HeartBeat table in addition to the hit target from the Action Result table.

18) Cheat: The labeling feature; it specifies the type of cheat used. The cheat used for more than 50% of the time frame becomes the label; if no cheat was used for more than 50% of the frame, the label is "normal".
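The majority-labeling rule of feature 18 can be sketched as follows (a minimal sketch; the function name and input format are ours):

```python
from collections import Counter

def label_frame(row_labels):
    # `row_labels` holds the cheat active at each log row of the frame,
    # e.g. ["L", "L", "normal", ...]. A cheat becomes the frame label only
    # if it was active in more than 50% of the rows; otherwise "normal".
    counts = Counter(label for label in row_labels if label != "normal")
    if counts:
        cheat, n = counts.most_common(1)[0]
        if n > len(row_labels) / 2:
            return cheat
    return "normal"
```

Note the strict majority: a cheat active in exactly half the rows does not label the frame.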

C. Data Analyzer

The Data Analyzer takes the generated feature file and applies machine learning classifiers to it to detect the cheats used in our game. First, we train the classifiers on a labeled data set using 10-fold cross-validation. Then, we generate detection models and use them in a 3-player test run. Finally, we report the accuracy of each classifier on each AimBot type in the Results section below. We used the data mining tool Weka [13] to apply the machine learning classifiers and create the models.

TABLE I: Number of instances used for each part
(LB = Lock-Based cheats, AF = Auto-Fire)

(a) Parts 1 and 2

Frame Size   Cheats   Normal   Total
10            990     1082     2072
30            421      445      866
60            217      233      450
90            149      159      308

(b) Part 3

Frame Size    LB     AF   Normal   Total
10            708    282   1082    2072
30            300    121    445     866
60            157     60    233     450
90            107     42    159     308

(c) Part 4

Frame Size   Cheats   Normal   Total
10            190     1082     1272
30            100      445      545
60             50      233      283
90             30      159      189

IV. EXPERIMENTS

To produce the training data, we played eighteen different death-matches: eight 10-minute matches and ten 15-minute matches, each played by two players over the network. In total, we collected around 460 minutes of data. In each match, one player used one of the cheats from Section III-A3 and the other played normally, except for two matches that were played without cheats. Messages between the clients and the server were exchanged at a rate of 15 messages/second, a rate chosen to accommodate delays and to avoid excessive bandwidth usage.

Since the feature extractor is flexible, we generated data at different time-frame sizes, namely 10, 30, 60, and 90 seconds/frame. We then classified the data using two commonly used classifiers: Logistic Regression, which is simple, and Support Vector Machines (SVM), which are more complex but more powerful. To apply SVM in Weka, we used the SMO algorithm [14]. For SVM, we used two different kernels: a Linear kernel (SVM-L) and a Radial Basis Function kernel (SVM-RBF). SVM has several parameters that can be tuned to obtain different results. One of them is the complexity parameter (C), which controls the softness of the margin; a larger C leads to a harder margin [15]. For the Linear kernel we set C to 10; for the RBF kernel we set it to 1000. We trained the classifiers and created the models using 10-fold cross-validation [16].
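The 10-fold cross-validation used here partitions the instances into ten folds, each held out once for testing while the remaining nine train the classifier. A classifier-agnostic sketch of the fold logic (we used Weka's built-in implementation; this is for illustration only, and the function name is ours):

```python
def k_fold_indices(n_samples, k=10):
    # Assign every k-th sample to the same fold, then let each fold serve
    # once as the held-out test set while the others form the training set.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for held_out, test_idx in enumerate(folds):
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != held_out for j in fold
        )
        splits.append((train_idx, test_idx))
    return splits
```

The reported accuracy for a classifier is then the average of its accuracy over the ten held-out folds.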

A. Results

To present the results clearly, we organize them in five parts. The first four parts are used to create and train the models; each represents a different data organization. The last part is the testing phase, using 30 minutes of data collected from three different players. Before presenting these parts, Table I shows how many instances were used in each one. Note that Table I(c) contains the average number of cheat instances used in Part 4. Also note that the total number of instances sometimes corresponds to less than 460 minutes, since we omit any feature row with a Visible-To-Total ratio below 5%.

1) All cheats together, using multi-class classification: First, we analyzed the data using multi-class classification, with all instances mixed together and a specific label for each cheat class. The accuracy obtained this way is shown in Table II(a); the best accuracy was achieved by Logistic Regression with frame size 60. The confusion matrix of the best classification test is shown in Table II(b). Notice the many misclassifications among the four Lock-Based cheats (L, AS, AM, and SA), due to the similarity between


TABLE II: Multi-class Classification

(a) Accuracy values for each classifier using different time frames

Frame Size   SVM-L   SVM-RBF   Logistic
10           72.9    73.7      72.3
30           77.9    78.4      76.7
60           80.7    80.7      80.9
90           78.2    79.2      77.5

(b) Confusion matrix for Logistic and frame size 60

Actual        Predicted
          L    AS   AM   SA   AF    No
L        25     6    3   15    1     1
AS        3    20   11    2    0     1
AM        3    10   14    2    1     3
SA       13     2    2   19    0     0
AF        0     0    0    0   60     0
No        1     1    5    0    0   226

those four cheats (they are all based on Locking, with some confusion added in AS, AM, and SA). Therefore, in Section IV-A.3 below, we combined these four cheats into one to observe the difference in the resulting accuracy.

2) All cheats combined, using two classes (yes or no): We analyzed the data with all cheats combined into a single class, i.e., an instance is labeled "yes" if any cheat occurred. As shown in Table III, instead of listing the confusion matrix for each classifier, we report different accuracy measurements: overall accuracy (ACC), true positive rate (TPR), and false positive rate (FPR). The best values were spread across the classifiers; which to prefer therefore depends on what the game developers care about most.

3) Lock-Based cheats classified together versus Auto-Fire (multi-class): Once we combined the Lock-Based cheats, we achieved an 18% jump in overall accuracy compared to Part 1 above. As shown in Table IV(a), the best value was obtained by SVM-RBF with frame size 60; the confusion matrix of this test is shown in Table IV(b).

4) Each cheat classified separately: By separating the cheats, we achieved the highest accuracy. However, each cheat has its favorite

TABLE III: Two-class Classification

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      87.7  85.5  10.2   89.3  87.1  8.6    87.3  85.3  10.8
30      93.5  92.4  5.4    93.7  92.6  5.2    93.4  92.4  5.6
60      97.3  96.8  2.1    97.3  97.2  2.6    97.1  97.2  3
90      97.1  96.6  2.5    97.1  96    1.9    95.1  96    5.7

TABLE IV: Lock-Based vs. AF Classification

(a) Accuracy values for each classifier using different time frames

Frame Size   SVM-L   SVM-RBF   Logistic
10           88.6    89.6      88.4
30           94.9    95.2      94.2
60           98      98.2      94.2
90           98.1    97.1      95.1

(b) Confusion matrix for SVM-RBF and frame size 60

Actual      Predicted
          LB    AF    No
LB       153     1     3
AF         0    60     0
No         1     3   229

classifier, as shown in Table V. Again, the choice of classifier depends on which accuracy measurement the game developer cares about most. We should clarify that the larger the frame size, the smaller the number of instances; this is because we are using the same underlying data.

5) Three-Player Test Set: After generating the models in the previous four parts, we played a 30-minute death-match with three players: one honest player and two cheaters (Lock and Auto-Fire). Based on the results in Parts 1, 2, 3, and 4 above, we chose the SVM-L and SVM-RBF classifiers with frame sizes 30 and 60, and applied the models trained on the earlier data to this new, unseen data (the 30-minute, three-player gameplay).

The results are presented in Table VI as follows: Table VI(a) shows the number of instances collected for each frame size; Table VI(b) shows the best results obtained using the models from Parts 1, 2, and 3 above; and Table VI(c) shows the best results obtained using the models from Part 4. In the last table, we report the detection accuracy as the TPR for each cheat type and for normal gameplay; we chose TPR to capture each model's detection accuracy when it is given an unknown cheat type.
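The accuracy measurements reported throughout the tables (ACC, TPR, FPR) all follow from the two-class confusion counts; a minimal sketch:

```python
def rates_from_confusion(tp, fn, fp, tn):
    # tp: cheating frames flagged as cheats, fn: cheating frames missed,
    # fp: honest frames wrongly flagged, tn: honest frames passed.
    acc = (tp + tn) / (tp + fn + fp + tn)   # overall accuracy (ACC)
    tpr = tp / (tp + fn)                    # true positive rate (TPR)
    fpr = fp / (fp + tn)                    # false positive rate (FPR)
    return acc, tpr, fpr

# Illustration: collapsing the Table IV(b) confusion matrix to cheat vs.
# normal gives tp = 214, fn = 3, fp = 4, tn = 229.
acc, tpr, fpr = rates_from_confusion(214, 3, 4, 229)
```

TPR measures how many cheating frames are caught, while FPR measures how often honest play is wrongly flagged; the trade-off between the two is exactly what a developer's detection policy must choose.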

B. Features Ranking

Before analyzing the results, we show the features that were most important and useful for the prediction process. Here we provide the ranking obtained from the SVM-L models on all of the training parts above. To rank the features, we computed the squares of the weights assigned by the SVM classifier [17]; the feature with the highest squared weight is the most informative one.
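A sketch of this ranking step, with illustrative weights standing in for those of the trained SVM-L model:

```python
def rank_by_squared_weight(weights, names):
    # Sort feature names by the square of the linear-SVM weight assigned
    # to them, largest first: the top entry is the most informative feature.
    return [name for _, name in
            sorted(zip((w * w for w in weights), names), reverse=True)]

# Illustrative weights only; the real values come from the trained model.
names = ["MeanAimAcc", "FireOnAim", "HitAcc", "MeanDistance"]
weights = [2.1, -1.7, 0.4, -0.2]
ranking = rank_by_squared_weight(weights, names)
```

Squaring discards the sign of a weight, so a feature counts as informative whether it pushes toward the cheating class or away from it.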


TABLE V: Separated Cheats Classification

(a) Lock

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      97.5  87.9  1      97.5  89    1.1    97.1  87.9  1.5
30      98.7  95.2  0.7    98.4  95.2  0.9    96    90.5  2.9
60      98.6  94.1  0.4    98.9  94.1  0      96.5  88.2  1.7
90      98.5  94.7  0.6    98.5  94.7  0.6    98.9  94.7  0

(b) Auto-Switch

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      93.9  73.9  2.7    94.2  75.5  2.6    93.4  71.7  2.9
30      96.2  86.8  2.2    96.4  86.8  2      96.4  86.8  2
60      98.9  97.3  0.9    98.9  97.3  0.9    97.4  97.3  2.6
90      99.5  95.8  0      99.5  95.8  0      99.5  95.8  0

(c) Auto-Miss

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      93.5  65.8  2.4    93.2  66.5  2.8    93.3  67.7  2.9
30      96.9  87.9  1.8    96.7  87.9  2      95.9  84.8  2.5
60      98.1  90.9  0.9    97.8  87.9  0.9    95.1  78.8  2.6
90      97.8  86.4  0.6    97.8  86.4  0.6    96.1  81.8  1.9

(d) Slow-Aim

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      97.7  90.5  1      97.8  91.1  1      97.5  91.6  1.4
30      99.4  98.6  0.4    98.8  95.9  0.7    97.3  91.9  1.8
60      100   100   0      100   100   0      99.6  100   0.4
90      100   100   0      100   100   0      100   100   0

(e) Auto-Fire

Frame        SVM-L              SVM-RBF            Logistic
Size    ACC   TPR   FPR    ACC   TPR   FPR    ACC   TPR   FPR
10      92.9  75.9  2.7    94.4  81.6  2.2    92.2  75.5  3.5
30      96.6  92.6  2.2    97.2  94.2  2      96.1  90.9  2.5
60      98.3  96.7  1.3    98.6  98.3  1.3    98.6  98.3  1.3
90      100   100   0      100   100   0      99    100   1.3

Table VII shows the feature ranking for each part in Section IV-A above (training parts 1 to 4 only). We show only the top five features, since there is a large gap in weights between the first 3-5 features and the rest. We also noticed some differences in the rankings between frame sizes; however, the top feature is always the same across frame sizes.

The most common top feature is 'Mean Aim Accuracy', which is the most informative feature for predicting AimBots. For the Auto-Fire cheat, 'Fire-On-Visible' and 'Fire-On-Aim' (both with very high weights) were the most informative features, since we are looking for instant firing whenever a target is under the crosshair.

C. Analysis

As we can observe from the results above, applying supervised machine learning methods requires full knowledge of the data. In our context, FPS AimBots, we noticed that awareness of the type of each cheat can help increase the accuracy of the classifier. However, the multi-class

TABLE VI: Test Set Results

(a) Number of instances

Frame Size    L    AF   Normal   Total
30           59    41     58      158
60           29    24     32       85

(b) Accuracy results for Parts 1, 2, and 3

Part   Accuracy   Frame Size   Classifier
1      78.8%      60           SVM-L
2      94.1%      60           SVM-RBF
3      98.8%      60           SVM-L

(c) Accuracy results for Part 4

Cheat   TPR      TPR      TPR          Frame   Classifier
Type    for L    for AF   for Normal   Size
L       96.6%    2.4%     100%         60      SVM-RBF
AS      100%     0%       100%         60      Both
AM      89.7%    0%       100%         60      SVM-RBF
SA      89.7%    0%       100%         60      Both
AF      55.2%    100%     96.9%        60      SVM-L

classifier did not prove as accurate, because of the confusion among the Lock-Based cheats. On the other hand, by separating the cheats and training one classifier per cheat type, we achieved higher accuracy on the first four parts (the training sets). This shows that separated cheat detection can help developers by applying all the models simultaneously to each instance (explained in the next paragraph). In Part IV-A.5 of the experiments, we noticed that the models obtained from Parts IV-A.3 and IV-A.4 gave the highest accuracy.

TABLE VII: Top Five Features in Each Training Part

(a) Top Five Features for Part 1

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      FireOnAim        FireOnAim
3      FireOnVisible    AimAcc
4      AimAcc           FireOnVisible
5      TrgtPosChange    TrgtPosChange

(b) Top Five Features for Part 2

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      FireOnVisible    FireOnVisible
3      FireOnAim        HitAcc
4      TrgtPosChange    FireOnAim
5      HitAcc           MeanDistance

(c) Top Five Features for Part 3

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      FireOnVisible    FireOnAim
3      FireOnAim        FireOnVisible
4      HitAcc           HitAcc
5      AimAcc           PlayerMov

(d) Top Five Features for Part 4: Lock

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      HitAcc           FireOnVisible
3      PlayerMov        PlayerMov
4      WpnInf           HitAcc
5      FireOnAim        TrgtMov

(e) Top Five Features for Part 4: Auto-Switch

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      TrgtPosChange    TrgtPosChange
3      PlayerMov        PlayerMov
4      FireOnAim        MeanDistance
5      PositionChange   WpnInf

(f) Top Five Features for Part 4: Auto-Miss

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      AimAcc           HitAcc
3      FireOnAim        MeanDistance
4      HitAcc           TrgtPosChange
5      PlayerMov        PlayerMov

(g) Top Five Features for Part 4: Slow-Aim

Rank   Frame Size 30    Frame Size 60
1      MeanAimAcc       MeanAimAcc
2      HitAcc           HitAcc
3      PlayerMov        MeanDistance
4      FireOnAim        TimeToHit
5      MeanDistance     AvgInstntWpnOnVis

(h) Top Five Features for Part 4: Auto-Fire

Rank   Frame Size 30    Frame Size 60
1      FireOnVisible    FireOnVisible
2      FireOnAim        FireOnAim
3      HitAcc           HitAcc
4      TrgtMov          MeanAimAcc
5      WpnInf           MeanDistance

However, when these models are applied to an online game, there is no labeled set (i.e., the data is unseen). Therefore, a method is needed to confirm whether a player is a cheater. A good approach is to specify a threshold: when the number of flagged records reaches it, the player can be determined to be a cheater. The threshold value depends on the detection accuracy of the model and on the developers' policy for cheat detection. It also depends on when detection takes place: online while the cheater is playing (real-time detection), or offline after the game is finished. The following example shows how this detection technique can be tuned by changing the threshold, based on the test data above (Part IV-A.5). For this example, the model from Part IV-A.2 is used with frame size 60, and the threshold is set to 50%, which provides loose detection to avoid some false positives. The actual number of cheating records that will flag a player as a cheater is then determined by:

NumberOfRecords (NR) = MatchLengthInSeconds / FrameSize

NumberOfCheatingRecords (NCR) = NR * Threshold

In this example:

NR = 1800 / 60 = 30

Therefore, NCR = 30 * 0.5 = 15
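The worked threshold example translates directly to code (a sketch; the function name is ours):

```python
def cheating_record_threshold(match_length_s, frame_size_s, threshold):
    # NR  = match length in seconds divided by the frame size.
    # NCR = NR * threshold, the flagged-frame count that marks a cheater.
    nr = match_length_s // frame_size_s
    return nr * threshold

# The paper's example: a 30-minute match, 60-second frames, 50% threshold.
ncr = cheating_record_threshold(30 * 60, 60, 0.5)
```

Lowering the threshold makes detection harsher (cheaters flagged sooner, more false positives); raising it makes detection looser.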

On the other hand, looking at the test-set results for the models from Part IV-A.4, we notice that all the Lock-Based models detected the Lock cheat accurately, while only the Auto-Fire model detected the Auto-Fire cheat accurately. Therefore, applying both kinds of models simultaneously to the gameplay data provides the required detection mechanism. The threshold is again set at 50%, i.e., NCR = 15 using the previous formula; however, the following formulas are needed to detect a cheater:

LockBasedCheatingRecords (LBCR) = max(NCR(L), NCR(AS), NCR(AM), NCR(SA))

Then, OverallNCR = LBCR + NCR(AF)
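Combining the per-model counts in this way can be sketched as follows (the input format and function name are our assumptions; `flagged` maps each model to the number of frames it flagged for a player):

```python
def is_cheater(flagged, ncr):
    # Take the maximum over the four Lock-Based models, since they all
    # respond to Lock-style cheating, then add the Auto-Fire count and
    # compare the overall total against the NCR threshold.
    lbcr = max(flagged[m] for m in ("L", "AS", "AM", "SA"))
    overall = lbcr + flagged["AF"]
    return overall >= ncr

# Hypothetical counts for one player over a 30-frame match with NCR = 15.
counts = {"L": 12, "AS": 9, "AM": 10, "SA": 7, "AF": 5}
verdict = is_cheater(counts, 15)
```

Taking the maximum rather than the sum of the Lock-Based counts avoids double-counting frames that several Lock-Based models flag at once.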


The reason for using an overall threshold is to detect players who use different cheats during the same match.

After analyzing the data, we noticed that some results were occasionally not accurate enough, for several possible reasons. First, in online games, delays (lag) happen all the time, which can cause the message exchange rate to drop (to fewer than 5 messages/second in some cases). This delay reduces the accuracy of data collection and causes some features to take on odd values, so lower accuracy is to be expected. Another source of inaccuracy is the amount of cheating data collected compared to normal-behavior data: in some cases, especially with separated cheats, the ratio between cheat and normal data is more skewed than 40:60 (which is, in our opinion, a reasonable ratio between abnormal and normal data for classification). Finally, improving the feature set, either by adding new features or by modifying existing ones, could also increase accuracy.

Overall, the accuracy achieved is very high, especially considering the added AimBot confusions (Auto-Switch and Auto-Miss). Frame size affects accuracy; a frame size of 60 gave the highest accuracy in most of the experiments. However, as mentioned before, larger frames yield fewer instances. We also believe that any frame larger than 60 seconds is too large, since smart cheaters activate cheats only when they need them, so a larger frame may not accurately capture cheating behavior. We therefore suggest using a frame size between 30 and 60 seconds.

V. CONCLUSION AND FUTURE WORK

In this paper, we presented a behavior-based method to detect cheaters in online First Person Shooter games. The game was developed using the Unity3D game engine and had a client-server architecture. The game logs were stored on the server side, to be parsed by a feature generator and then analyzed by the data analyzer. Data was analyzed using machine learning classifiers implemented in Weka.
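To illustrate the feature-generation stage of this pipeline, the sketch below groups timestamped log events into fixed-size time frames and computes one example per-frame feature. The framing function, the event schema, and the hit-ratio feature are our own illustrative assumptions; the paper's actual feature set is not reproduced here:

```python
# Hedged sketch of frame-based feature generation from server-side
# game logs. Event schema ({'type': ..., 'hit': ...}) is assumed.
from collections import defaultdict

def frame_events(events, frame_size=60.0):
    """Group (timestamp, event) log entries into fixed-size time frames.

    events: iterable of (timestamp_seconds, event_dict) tuples.
    Returns a dict mapping frame index -> list of events in that frame.
    """
    frames = defaultdict(list)
    for ts, ev in events:
        frames[int(ts // frame_size)].append(ev)
    return dict(frames)

def hit_ratio(frame):
    """Hypothetical per-frame feature: fraction of shots that hit."""
    shots = sum(1 for ev in frame if ev['type'] == 'shot')
    hits = sum(1 for ev in frame if ev['type'] == 'shot' and ev.get('hit'))
    return hits / shots if shots else 0.0

# Example: two shots in the first 60-second frame, one of them a hit.
log = [(1.0, {'type': 'shot', 'hit': True}),
       (2.0, {'type': 'shot', 'hit': False}),
       (61.0, {'type': 'shot', 'hit': True})]
print(hit_ratio(frame_events(log)[0]))  # -> 0.5
```

Each frame's feature vector would then become one classification instance, which is why larger frame sizes yield fewer instances, as discussed above.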

After conducting several experiments using different types of detection models, we obtained very high accuracy. We then suggested using thresholds when applying the generated models. The value of the threshold depends on the developers' policy for cheat detection: it could be low (harsh) or high (loose); however, it should not be too harsh, or it will flag many experienced honest players as cheaters.

In general, game data analysis depends on how well you know the game. Therefore, developers could adjust the data in the messages exchanged between client and server to accommodate the generation of the features. Also, using behavior-based cheat detection methods protects players' privacy.

More work could be done in the future to improve the cheat detection models. We could collect more data using a mixture of cheats for each player instead of one cheat for the whole game. Also, more features could be generated to improve cheat identification. Moreover, we could model the behavior of each player separately; that is, building cheating detection models for each player instead of an overall model for each cheat. Other improvements include increasing the number of maps and weapons.

ACKNOWLEDGMENT

We would like to thank the members of the GamePipe lab at the University of Southern California for their continuous support and advice, especially Professor Michael Zyda, Balakrishnan (Balki) Ranganathan, Marc Spraragen, Powen Yao, Chun (Chris) Zhang, and Mohammad Alzaid.

REFERENCES

[1] G. Hoglund and G. McGraw, Exploiting Online Games: Cheating Massively Distributed Systems, 1st ed. Addison-Wesley Professional, 2007.

[2] S. Webb and S. Soh, “A survey on network game cheats and P2P solutions,” Australian Journal of Intelligent Information Processing Systems, vol. 9, no. 4, pp. 34–43, 2008.

[3] Unity3D - game engine. [Online]. Available: http://www.unity3d.com/

[4] S. Yeung and J. C. Lui, “Dynamic Bayesian approach for detecting cheats in multi-player online games,” Multimedia Systems, vol. 14, no. 4, pp. 221–236, 2008. [Online]. Available: http://dx.doi.org/10.1007/s00530-008-0113-5

[5] L. Galli, D. Loiacono, L. Cardamone, and P. Lanzi, “A cheating detection framework for Unreal Tournament III: A machine learning approach,” in CIG, 2011, pp. 266–272.

[6] C. Thurau, C. Bauckhage, and G. Sagerer, “Combining Self Organizing Maps and Multilayer Perceptrons to Learn Bot-Behavior for a Commercial Computer Game,” in Proc. GAME-ON, 2003, pp. 119–123.

[7] C. Thurau, C. Bauckhage, and G. Sagerer, “Learning Human-Like Movement Behavior for Computer Games,” in Proc. Int. Conf. on the Simulation of Adaptive Behavior. MIT Press, 2004, pp. 315–323.

[8] C. Thurau and C. Bauckhage, “Towards manifold learning for gamebot behavior modeling,” in Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, ser. ACE ’05. New York, NY, USA: ACM, 2005, pp. 446–449. [Online]. Available: http://doi.acm.org/10.1145/1178477.1178577

[9] C. Thurau, T. Paczian, and C. Bauckhage, “Is Bayesian Imitation Learning the Route to Believable Gamebots?” in Proc. GAME-ON North America, 2005, pp. 3–9.

[10] K. Chen, H. Pao, and H. Chang, “Game Bot Identification based on Manifold Learning,” in Proceedings of ACM NetGames 2008, 2008.

[11] USC GamePipe Laboratory. [Online]. Available: http://gamepipe.usc.edu/

[12] Artificial Aiming. [Online]. Available: http://www.artificialaiming.net/

[13] Weka 3: Data Mining Software in Java. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/

[14] J. Platt, “Fast training of support vector machines using sequential minimal optimization,” in Advances in Kernel Methods, B. Scholkopf, C. J. C. Burges, and A. J. Smola, Eds. Cambridge, MA, USA: MIT Press, 1999, pp. 185–208. [Online]. Available: http://dl.acm.org/citation.cfm?id=299094.299105

[15] M. Rychetsky, Algorithms and Architectures for Machine Learning Based on Regularized Neural Networks and Support Vector Approaches. Germany: Shaker Verlag GmbH, Dec. 2001.

[16] R. Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2, ser. IJCAI’95. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1995, pp. 1137–1143. [Online]. Available: http://dl.acm.org/citation.cfm?id=1643031.1643047

[17] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene selection for cancer classification using support vector machines,” Mach. Learn., vol. 46, no. 1–3, pp. 389–422, Mar. 2002. [Online]. Available: http://dx.doi.org/10.1023/A:1012487302797