Wade not in unknown waters. Part two.

Author: Andrey Karpov

Date: 01.02.2012

This time I want to talk about the 'printf' function. Everybody has heard of software vulnerabilities and knows that functions like 'printf' are frowned upon. But it is one thing to know that you had better not use these functions, and quite another to understand why. In this article, I will describe two classic software vulnerabilities related to 'printf'. You won't become a hacker after reading it, but perhaps you will take a fresh look at your code. You might be creating similarly vulnerable functions in your project without knowing it.

STOP. Reader, please stop, don't pass by. You have seen the word "printf", I know. And you're sure that you are about to be told the banal story that the function cannot check the types of the arguments passed to it. No! This article deals with the vulnerabilities themselves, not with what you expected. Please stay and read it.

The previous post can be found here: Part one.

Introduction

Have a look at this line:

printf(name);

It seems simple and safe. But it actually hides at least two ways to attack the program.

Let's start our article with a demo sample containing this line. The code might look a bit odd. It is, really. We found it quite difficult to write a program that could actually be attacked. The reason is the optimization performed by the compiler. It turns out that if you write too simple a program, the compiler generates code where nothing can be hacked. It uses registers instead of the stack to store data, replaces calls with intrinsic functions and so on. We could write code with extra actions and loops so that the compiler ran out of free registers and started putting data into the stack. Unfortunately, the code would then be too large and complicated. We could write a whole detective story about all this, but we won't.

The sample below is a compromise between complexity and the need to write code that is not so simple that the compiler "collapses it into nothing". I have to confess that I still helped myself a bit: I disabled some optimization options in Visual Studio 2010. First, I turned off the /GL (Whole Program Optimization) switch. Second, I used the __declspec(noinline) attribute.

Sorry for such a long introduction: I just wanted to explain why my code is such a crock and to head off any debates about how it could be written better. I know that it could. But we didn't manage to make the code short and show you the vulnerability inside it at the same time.

Demo sample

The complete code and project for Visual Studio 2010 can be found here.

#include <windows.h>   // BOOL, FALSE
#include <tchar.h>     // _tmain
#include <cstdio>      // printf, _set_printf_count_output (Visual C++)
#include <cstring>     // strcpy, strcmp
#include <cctype>      // tolower, toupper
#include <iostream>    // cin
#include <string>
using namespace std;

const size_t MAX_NAME_LEN = 60;

enum ErrorStatus {
  E_ToShortName, E_ToShortPass, E_BigName, E_OK
};

void PrintNormalizedName(const char *raw_name)
{
  char name[MAX_NAME_LEN + 1];
  strcpy(name, raw_name);

  for (size_t i = 0; name[i] != '\0'; ++i)
    name[i] = tolower(name[i]);
  name[0] = toupper(name[0]);

  // The user-controlled string is used as the format string:
  // this is the line under discussion.
  printf(name);
}

ErrorStatus IsCorrectPassword(
  const char *universalPassword,
  BOOL &retIsOkPass)
{
  string name, password;
  printf("Name: "); cin >> name;
  printf("Password: "); cin >> password;
  if (name.length() < 1) return E_ToShortName;
  if (name.length() > MAX_NAME_LEN) return E_BigName;
  if (password.length() < 1) return E_ToShortPass;

  retIsOkPass =
    universalPassword != NULL &&
    strcmp(password.c_str(), universalPassword) == 0;
  if (!retIsOkPass)
    retIsOkPass = name[0] == password[0];

  printf("Hello, ");
  PrintNormalizedName(name.c_str());

  return E_OK;
}

int _tmain(int, char *[])
{
  // "%n" is disabled by default in Visual C++; enable it for the demo.
  _set_printf_count_output(1);
  char universal[] = "_Universal_Pass_!";
  BOOL isOkPassword = FALSE;
  ErrorStatus status =
    IsCorrectPassword(universal, isOkPassword);
  if (status == E_OK && isOkPassword)
    printf("\nPassword: OK\n");
  else
    printf("\nPassword: ERROR\n");
  return 0;
}

The _tmain() function calls the IsCorrectPassword() function. If the password is accepted, for instance because it coincides with the magic word "_Universal_Pass_!", the program prints the line "Password: OK". The purpose of our attacks is to make the program print this very line.

The IsCorrectPassword() function asks the user to enter a name and a password. The password is considered correct if it coincides with the magic word passed into the function. It is also considered correct if the password's first letter coincides with the name's first letter.

Regardless of whether the correct password has been entered or not, the application prints a greeting. The PrintNormalizedName() function is called for this purpose.

The PrintNormalizedName() function is the most interesting one. This is where the "printf(name);" call we're discussing lives. Think of how we can exploit this line to cheat the program. If you know how to do it, you don't have to read further.

What does the PrintNormalizedName() function do? It prints the name with the first letter in upper case and the remaining letters in lower case. For instance, if you enter the name "andREy2008", it will be printed as "Andrey2008".

The first attack

Suppose we don't know the correct password. But we know that there is a magic password somewhere. Let's try to find it using printf(). If this password's address is stored somewhere in the stack, we have a fair chance of succeeding. Any ideas how to get this password printed on the screen?

Here is a tip. The printf() function belongs to the family of variable-argument functions. These functions work in the following way. Some amount of data is written into the stack. The printf() function doesn't know how much data has been pushed or what type it has. It follows only the format string. If it reads "%d%s", the function extracts one value of the int type and one pointer from the stack. Since printf() doesn't know how many arguments it has been passed, it can look deeper into the stack and print data that have nothing to do with it. This usually causes an access violation or prints garbage. And we may exploit this garbage.
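To make the mechanism concrete, here is a minimal sketch of a variable-argument function. tiny_format() is a hypothetical mini-formatter written for this illustration, not the real printf(): it understands only "%d" and "%s". Note that the format string alone decides how many arguments are fetched and what type each is assumed to have.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical mini-formatter for illustration only (not the real printf).
// It supports "%d" and "%s" and returns the result as a std::string.
std::string tiny_format(const char *fmt, ...)
{
    std::string out;
    va_list args;
    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%' || p[1] == '\0') {
            out += *p;                  // ordinary character: copy as is
            continue;
        }
        ++p;                            // look at the conversion character
        if (*p == 'd') {
            // The function trusts the format string: it fetches an int
            // whether or not the caller actually passed one.
            out += std::to_string(va_arg(args, int));
        } else if (*p == 's') {
            // Same here: whatever is next is treated as a pointer.
            out += va_arg(args, const char *);
        } else {
            out += '%';                 // unknown specifier: keep it verbatim
            out += *p;
        }
    }
    va_end(args);
    return out;
}
```

A call such as tiny_format("%d-%s", 7, "ok") is fine because the arguments match the format. A call such as tiny_format("%d%s") compiles just as well, but the function would then fetch values that were never passed, which is undefined behavior and exactly the situation described above.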

Let's see how the stack might look at the moment the printf() function is called:

Figure 1. Schematic arrangement of data in the stack.

The "printf(name);" call has only one argument, which is the format string. It means that if we type in "%d" instead of the name, the program will print the data that lie in the stack before the PrintNormalizedName() function's return address. Let's try:

Name: %d

Password: 1

Hello, 37

Password: ERROR

This action makes little sense for now. First of all, we have to at least print the return addresses and the whole contents of the char name[MAX_NAME_LEN + 1] buffer, which is located in the stack too. Only then may we get to something really interesting.

If attackers cannot disassemble or debug the program, they cannot know for sure whether there is anything interesting to be found in the stack. They can still proceed as follows.

First we can enter "%s". Then "%x%s". Then "%x%x%s" and so on. This way, the hacker searches through the data in the stack one item at a time and tries to print each of them as a string. It helps the intruder that all the data in the stack are aligned on at least a 4-byte boundary.

To be honest, we won't succeed if we go this way. We will exceed the limit of 60 characters and get nothing useful printed. "%f" will help us: it is intended to print values of the double type, so we can use it to move along the stack in 8-byte steps.
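The arithmetic behind this choice can be sketched as follows. The sketch assumes a 32-bit x86 build, where "%x" consumes 4 bytes of the stack and "%f" consumes 8, and a hypothetical offset of 180 bytes to the pointer we want to print. make_probe() is an illustrative helper, not part of the demo project.

```cpp
#include <cstddef>
#include <string>

// Assumptions for this sketch (32-bit x86 build):
// "%x" consumes 4 bytes of the stack, "%f" consumes 8 bytes.
const std::size_t kBytesPerX = 4;
const std::size_t kBytesPerF = 8;

// Hypothetical helper: builds a format string that skips `bytes_to_skip`
// bytes of the stack and then prints the next slot as a string.
std::string make_probe(std::size_t bytes_to_skip, bool use_f)
{
    std::string fmt;
    if (use_f) {
        // Take the big 8-byte steps first.
        for (; bytes_to_skip >= kBytesPerF; bytes_to_skip -= kBytesPerF)
            fmt += "%f";
    }
    // Cover whatever is left with 4-byte steps.
    for (; bytes_to_skip >= kBytesPerX; bytes_to_skip -= kBytesPerX)
        fmt += "%x";
    fmt += "(%s)";
    return fmt;
}
```

Under these assumptions, make_probe(180, false) needs 45 "%x" specifiers and is 94 characters long, which does not fit into the 60-character name. make_probe(180, true) needs 22 "%f" and one "%x" and is only 50 characters long.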

Here it is, our dear line:

%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%x(%s)

This is the result:

Figure 2. Printing the password.

Let's try this line as the magic password:

Name: Aaa

Password: _Universal_Pass_!

Hello, Aaa

Password: OK

Hurrah! We have managed to find and print private data which the program didn't intend to give us access to. Note also that we didn't need access to the application's binary code. Diligence and persistence were enough.

Conclusions on the first attack

This method of obtaining private data deserves wider consideration. When developing software that contains variable-argument functions, think over whether there are cases where they may become a source of data leakage. The leak may go through a log file, a packet sent over the network and the like.

In the case we have considered, the attack is possible because the printf() function receives a string that may contain format specifiers. To avoid this, you just need to write the call this way:

printf("%s", name);
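A small check of why this helps, as a sketch: format_greeting() is a hypothetical helper that uses snprintf() so that the result can be inspected instead of printed. The untrusted string is passed as data for "%s" and is never interpreted as a format.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for illustration: formats a greeting for an
// untrusted name. The name is DATA for "%s", never the format string,
// so any '%' characters inside it are copied verbatim.
std::string format_greeting(const char *untrusted_name)
{
    char buffer[128];
    // snprintf never writes past the buffer and always terminates it.
    std::snprintf(buffer, sizeof(buffer), "Hello, %s", untrusted_name);
    return buffer;
}
```

With this helper, a name such as "%x%s%n" produces the harmless text "Hello, %x%s%n".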

The second attack

Did you know that the printf() function can modify memory? You must have read about it and forgotten. We mean the "%n" specifier. It writes the number of characters already printed by the printf() function to a given address.

To be honest, an attack based on the "%n" specifier is mostly of historical interest. Starting with Visual Studio 2005, support for "%n" is off by default. To perform this attack, I had to enable the specifier explicitly. Here is the magic trick:

_set_printf_count_output(1);

To make it clearer, let me give you an example of using "%n":

int i;

printf("12345%n6789\n", &i);

printf( "i = %d\n", i );

The program's output:

123456789

i = 5

We have already found out how to get to the needed pointer in the stack. And now we have a tool that allows us to modify the memory this pointer refers to.

Of course, it is not very convenient to use. To start with, we can write only 4 bytes at a time (the size of the int type). If we need to write a large number, the printf() function has to print a great many characters first. To avoid this, we may use the "%00u" specifier: it affects the current count of output bytes. Let's not go deep into detail.
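How a field width influences the number of output characters can be checked without printing anything. The sketch below is only an illustration: the width of 100 is an arbitrary value chosen here, not the article's "%00u", and the two function names are hypothetical. It relies on the standard behavior of snprintf(), which returns the number of characters the formatting would produce.

```cpp
#include <cstdio>

// snprintf with a null buffer and size 0 writes nothing and returns
// the number of characters the formatted output WOULD contain.

// "%u" applied to 1 produces the single character "1".
int count_plain()
{
    return std::snprintf(nullptr, 0, "%u", 1u);
}

// "%100u" pads the same value with spaces to a field of 100 characters,
// so a five-character specifier yields a hundred output characters.
int count_padded()
{
    return std::snprintf(nullptr, 0, "%100u", 1u);
}
```

Here count_plain() returns 1 and count_padded() returns 100. A short specifier can therefore raise the count that a later "%n" would store, without the format string itself getting long.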

Our case is simpler: we just have to write any non-zero value into the isOkPassword variable. This variable's address is passed into the IsCorrectPassword() function, which means that it is stored somewhere in the stack. Don't be confused by the fact that the variable is passed by reference: at the low level, a reference is an ordinary pointer.

Here is the line that will allow us to modify the isOkPassword variable:

%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f %n

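According to the C standard, "%n" stores the number of all characters written so far, including the output of specifiers such as "%f", so the exact value depends on what precedes it. That does not matter here: any non-zero value written into isOkPassword is enough. The space before "%n" simply adds one more printed character.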

Let's try:

Figure 3. Writing into memory.

Are you impressed? But that's not all. We may write to virtually any address. If the printed string is stored in the stack, we may put the needed characters into it and use them as an address.

For example, we may write a string containing characters with the codes '\xF8', '\x32', '\x01', '\x7F' in a row. The string then holds the bytes of the number 0x7F0132F8 (on a little-endian machine such as x86). We add the "%n" specifier at the end. Using "%x" or other specifiers, we can walk the stack up to the encoded number 0x7F0132F8 and write the number of printed characters to this address. This method has some limitations, but it is still very interesting.
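The byte order in this example can be verified with a short sketch. read_le32() and article_example() are hypothetical helpers; the first one models how a little-endian CPU such as x86 reads a 32-bit value from four consecutive bytes, with the first byte being the least significant one.

```cpp
#include <cstdint>

// Interprets four consecutive bytes the way a little-endian CPU reads
// a 32-bit value from memory: bytes[0] is the lowest byte.
std::uint32_t read_le32(const unsigned char *bytes)
{
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}

// The four character codes from the text, in the order they appear
// in the string.
std::uint32_t article_example()
{
    const unsigned char bytes[] = { 0xF8, 0x32, 0x01, 0x7F };
    return read_le32(bytes);
}
```

article_example() returns 0x7F0132F8, the address-like number mentioned above.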

Conclusions on the second attack

We may say that an attack of the second type is hardly possible nowadays. As you have seen, the Visual C++ runtime library has support for the "%n" specifier switched off by default. But you may create a home-made mechanism subject to this kind of vulnerability. Be careful when external data fed into your program controls what is written into memory and where.

Particularly in our case, we may avoid the problem by writing the code in this way:

printf("%s", name);

General conclusions

We have considered only two simple examples of vulnerabilities here. Surely, there are many more of them. We don't attempt to describe or even enumerate them in this article; we wanted to show you that even such a simple construct as "printf(name)" can be dangerous.

There is an important conclusion to draw from all this: if you are not a security expert, you had better follow all the recommendations you can find. Their point might be too subtle for you to grasp the whole range of dangers on your own. You must have read that the printf() function is dangerous. But I'm sure that many of you reading this article have only now learned how deep the rabbit hole goes.

If you create an application that is a potential target for attacks, be very careful. Code that looks quite safe from your viewpoint might contain a vulnerability. If you don't see a catch in your code, it doesn't mean there isn't one.

Follow all the compiler's recommendations on using the updated versions of string functions. We mean using sprintf_s instead of sprintf and so on.

It's even better if you give up low-level string handling altogether. These functions are a legacy of the C language. Now we have std::string and we have safe methods of string formatting such as boost::format or std::stringstream.
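As a minimal sketch of the stream-based approach, the greeting from the demo could be built like this. make_greeting() is a hypothetical helper, not part of the demo project. Stream insertion has no format string at all, so there is nothing in the user's input for the library to interpret.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper for illustration: builds the greeting with a
// string stream. operator<< copies the std::string as plain data, so
// '%' characters in the name have no special meaning.
std::string make_greeting(const std::string &untrusted_name)
{
    std::ostringstream out;
    out << "Hello, " << untrusted_name;
    return out.str();
}
```

Passing the attack string "%f%f %n" to make_greeting() simply yields "Hello, %f%f %n".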

P.S. Some of you, having read the conclusions, may say: "Well, it's as clear as day." But be honest with yourself. Did you know and remember, before reading this article, that printf() can write into memory? Well, this is a serious vulnerability. At least, it used to be. Now there are others, just as insidious.