cs305/503, spring 2009 introduction michael barnathan

CS305/503, Spring 2009Introduction

Michael Barnathan

Welcome• Who I am:

– Michael Barnathan – Monmouth University CS alumnus, now an adjunct professor.– This is my second semester teaching.– It’s also the first year the course is being taught in Java.– I’ll do my best, but please do voice any concerns.

• What this course will be about:– Data Structures– Algorithms– Java

• Prereq: CS176, CS501B, or equivalent Java experience.• What I expect of you:

– It’s like learning to play an instrument:– Come to class – don’t miss your “lessons”.– Keep programming on your own – “practice”.

– You can eventually learn how to play by practicing without taking lessons (though it will take longer), but you’ll never learn how if you go to lessons but don’t practice!

– Much as I like teaching, this class is offered for your own benefit! Data Structures and Algorithms form the core of CS. You will use these skills whenever you touch software design in any context, and this is the best opportunity you will ever receive to learn them. Past here, everyone will expect you to know this stuff already – or to learn it on your own, without assistance, if you don’t.

• Your goals: Let them guide your learning.

Syllabus

• It’s online.• If changes need to be made to the plan, I will

discuss these changes with you first.– Most changes will probably work in your favor

anyway.

• Let’s review it.

http://rockhopper.monmouth.edu/~mbarnath/2009/CS305/syllabus.doc

Java• This is the first time this course is being taught in Java (it was previously

taught in C++).• Have any of you worked with Java before?

• Because Java may be new to some of you, we will cover it in some detail. However:– You should also make an effort to learn the language on your own (it’s fairly

similar to C++).– You will be expected to know the language on the assignments and exams.– You will also be expected to know how to write, compile, and execute a Java

program without using an IDE.• The systems in this lab are unreliable when it comes to running IDEs. They may not

always work. Even when they do, you may find yourself struggling with the IDE longer than it would have taken you to just write the code!

• I recommend learning how to use vim or emacs, as these are very powerful editors, but failing that, at least know how to use nano (which is like Notepad for the Linux terminal).

• Know how to invoke the javac and java commands to compile and run your programs.

What is a Computer Program?

• What is a computer?• What is a program?• What are the key components of a program?

• Must things be this way?– Must a computer be a machine?– Need a program run on a computer?– Can a program be created while avoiding those

components?

The Class’ Model

• According to the class, a computer:– Is a machine (what’s a machine?)– Performs tasks designated by a user.– Has security in place to protect its integrity.

• And a program:– Interacts with the computer/user (I/O).– Consumes system resources.– Accomplishes a goal.

What is a computer?Given infinite time and space, you CAN build a computer using rocks*.

* Warning: do not attempt in a finite universe unless otherwise recommended by a physicist.

Or neurons…

--xkcd

Computer systems are actually organized in a fairly similar manner to brains.

http://www.xkcd.com/505/

http://www.xkcd.com/505/

First Two Computers

Here the first known computer is seen using the second to perform a calculation.

--Office online clip art.

Turing Machines• Computers can actually be implemented with abstract models that don’t

exist at all…• One such example is a Turing machine.• This model consists of:

– An infinite length of tape (“memory”, “storage”)– A head, which can read or write a single strip of tape and move one strip left

or right at a time.– A stack, containing instructions and a stack pointer.

• Why implement a fake computer?– Because Turing machines are able to perform any calculation a real machine

can.– Therefore, things that cannot be done on a Turing machine can’t be done on

real machines either.• A machine or programming language capable of performing every

computation that can be performed on a Turing machine is called turing complete.

Data Structures and Algorithms: What are they?

• Data structures are models.– They are means of arranging data within your program.– Choosing the right arrangement can make your job easier!

• Algorithms are recipes.– For solving problems in general:

• Programming.• Cooking.• Driving.• Solving math problems.• Many other areas.

– You don’t need a computer to execute them.– In fact, very often you are the one executing them:

• Following driving directions.• Solving puzzles.• Interacting with a program (such as one we’ll discuss today).

What is the difference between an algorithm and a program?

• The differences are slight.• Programs are implementations of data structures and

algorithms.• Data structures and algorithms are abstract concepts.• Just as the Turing machine was an abstract computer.• Programs are generally directly interpretable, and can

somehow be used directly to solve problems.– Very often you will need to “translate” an algorithm, given

in pseudocode or by a diagram, into a program.– Sometimes you’ll derive an algorithm but leave the

implementation to others.– Often you’ll be expected to do both!

How do we measure algorithms?• Correctness:

– Produces the correct answer given the correct input.• “Halting”:

– Stops after it does what it’s supposed to.– Doesn’t crash.

• Performance:– Time (e.g. CPU time)– Space (e.g. Memory)

• Development time – don’t overlook it.• Is saving 5 seconds per run worth a week of your time? Unless the

app. is time-critical, probably not.• Complexity:

– Is the solution simple, intuitive, easy to understand?

Tradeoffs

• As computer scientists, you will make many tradeoffs:– Space vs. time.– Space and time vs. complexity.– Everything vs. dev. time.

• “Optimization” of an algorithm takes time and effort.

• Making these tradeoffs wisely is what separates “good” from “great”.

Measuring Time and Space• An abstract notion, not usually measured in seconds, but “units”

that only have meaning relative to one another.– For example, “2” takes twice as long as “1”, but we have no idea how

long “1” takes.• We care only about the order of growth.

– Not how long something takes.– Instead, how much longer it takes with more data.

• We’ll start with “Worst-case” analysis.– Murphy’s Law: assume that everything that can slow the algorithm

down will.– If your algorithm runs slower when it reads in the number “2”, worst-case

analysis would have you assume every element in the dataset is 2!– However unlikely this may be – it is the worst case, not the worst probable

case.– The purpose of this is to find out the absolute longest amount of

time / largest amount of space that an algorithm might use.• There’s also a “best-case”, which assumes the opposite.

Big-O (“Asymptotic”) Notation.• O(f(n)), where f(n) is some function; e.g.:

– O(1) “Constant time”– O(log n) “Logarithmic”– O(n) “Linear”– O(n log n) “n log n” (less common: “Linearithmic”)– O(n^2) “Quadratic”– O(n^p) “Polynomial”– O(2^n) “Exponential”

• This only indicates the dominant term of the algorithm’s growth.– For example, the function 3n^2 + 10n + 5 is O(n^2).

• We don’t take constant terms into account.– y = .0001x^2 is worse than y = 10000x.– This is because as x gets large, the x^2 term will dominate.

• Therefore, even if algorithm B takes twice as long as algorithm A, they would still belong to the same complexity class.– There is no such thing as “O(2n)”; it’s just O(n). Constants are not counted.

Best

Worst

Big-O visualized

How bad is exponential time?

This bad:

O(n log n)

O(2^n)

Other Symbols• Big-O is only one of the symbols used in asymptotic analysis.• Others are Θ and Ω.

– O represents an upper bound (“no worse than”).– Ω represents a lower bound (“no better than”).– Θ represents a tight bound (“exactly; no better or worse than”).

• It is not wrong to say a linear function is O(n2); it just means we don’t know enough to refine our estimate. O(n) represents a “tighter” bound – and a more complete state of knowledge about the running time of the algorithm.– It is wrong to say the function is anything other than Θ(n), however.

An Θ(n) bound, if correctly proven, will never change.• For the next few weeks, we will use O and Θ interchangeably (many

sources do, although it’s not precise to do so) and will avoid Ω altogether. But do be aware of the difference; we’ll start to distinguish between them later.

Optimality• Unfortunately, exponential time is not always avoidable.• Certain problems can’t be solved any faster.• Thus, an “optimal” solution might not be fast…

– It’s just the fastest one that can exist.• Our subject is the study of algorithms: how to solve problems

efficiently.• The study of how quickly problems can be solved by any algorithm

is called complexity theory.– We unfortunately won’t have time to cover it.– It should get some coverage in CS306/CS512.– There is a famous unsolved problem in this field known as “P=NP?”

– This has to do with whether a certain class of problems whose solutions can be verified in polynomial time can also be solved in polynomial time.

– If not, these algorithms are all exponential (or worse!)• There’s a $1 million prize for solving it, but it’s an extremely difficult question!

The Guessing Game:Writing a simple Java program.

• Problem:– Guess a random number from 1 to 100.– Prompt the user to input guesses.– Print out whether the number is actually higher,

lower, or equal to the user’s guess.– Repeat until the user guesses correctly.

• How would you do this in Java?– The answer is in Thursday’s notes.

Decomposition of the problem• It helps to make problems simpler; to iteratively take

them down to the level at which you can implement them.

• “Higher”/“Lower”/“Equal” translate into “if” comparisons between the number and the guess.

• “Repeat until…” usually means “use a loop”.• “Prompt the user” means you’ll be using Java’s input

facilities (hint: look up the Scanner class).• Random numbers can be generated using

Math.random() or the Random class.• When in doubt about how to use a class, consult the

Java API documentation!

That’s all for now!

• The lesson:– Algorithmic thought exists outside of computer

science. You can apply what you will learn in this class in many unexpected places.

• Next class: The guessing game, review of asymptotic analysis, arrays.

cs305/503, spring 2009 introduction michael barnathan

Documents

java program

computer program

java commands

equivalent java experience

xkcd computer systems

class dont

class model

dont practice