Data Analysis Language (Python) - openwith.net (데이터분석언어_1Python2018.pdf)


Data Analysis Language (Python)

September to December 2018

윤형기 ([email protected])

Lecture 1


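Session | Date | Main topics | Textbook
1 | 9/8 | Course introduction; Python overview; setting up the lab environment (Windows/Linux installation) |
2 | 9/15 | Basic programming (1): Python language structure, data types; variables, expressions and operators |
3 | 9/29 | Basic programming (2): sequences (List, Tuple, …); Dictionary and Set |
4 | 10/6 | Basic programming (3): control statements; function basics |
5 | 10/13 | Basic programming (4): applied functions; exception statements |
6 | 10/20 | Python OOP (1) |
7 | 10/27 | Python OOP (2) |
8 | 11/3 | <Midterm exam> |
9 | 11/10 | Big data and Python; modules and packages |
10 | 11/17 | Strings and regular expressions; files and text |
11 | 11/24 | Standard Library |
12 | 12/1 | Network and Web programming |
13 | 12/8 | NumPy |
14 | 12/15 | Matplotlib and Pandas (1) |
15 | 12/22 | Pandas (2) |
16 | 12/29 | <Final exam> |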


Contents

• How the world is changing
  – Disruptive Technologies
  – Prime Mover: OSS and Python
• Python overview
  – Features and history
  – Python as a Programming Language
  – Python Interpreter and CPython
  – Python Use Cases
• Python and data analysis
• Lab environment
• Python in a Sheet
• Hands-on practice

Software and Open Source (OSS)

Python Background – Programming Languages and Open Source

History of programming languages

• Before C
  – 1957 FORTRAN / 1959 COBOL / 1964 BASIC
• C
  – 1970 Pascal
  – 1972 C
• C++
  – 1983 C++
• http://www.youtube.com/watch?v=JoVQTPbD6UY
• After C/C++
  – 1991 Python
  – 1995 Java, JavaScript
  – 1995 R
  – 2009 Go

History of OSS

• 1960s ARPANET, ...
• 1969 Unix
• 1980 Usenet
• 1983 GNU Project
• 1985 FSF
• 1989 386BSD, FreeBSD, …
• 1991 Linux kernel
• 1994 MySQL
• 1996 Apache web server
• 2001 Open Source declaration:
• 2004 Ubuntu


• http://www.youtube.com/watch?v=POexV1k62_Y


Forerunners

• Bjarne Stroustrup
• Yukihiro Matsumoto
• James Gosling
• Larry Wall
• Rasmus Lerdorf
• Ken Thompson
• Dennis Ritchie
• Linus Torvalds
• Brendan Eich
• Richard Stallman
• Larry Page
• Bill Joy
• Tim Berners-Lee
• Guido van Rossum


Python History

• Conceived in the late 1980s
• Implementation began in 1989 as a successor to the ABC language
  – On the origins of Python, Van Rossum wrote in 1996:
    • "...In December 1989, I was looking for a "hobby" programming project that would keep me occupied during the week around Christmas. My office ... would be closed, but I had a home computer, and not much else on my hands. I decided to write an interpreter for the new scripting language I had been thinking about lately: a descendant of ABC that would appeal to Unix/C hackers. I chose Python as a working title for the project, being in a slightly irreverent mood (and a big fan of Monty Python's Flying Circus)." (Guido van Rossum)
• Process
  – PSF (Python Software Foundation)
    • Python's intellectual property is vested in the PSF
    • Python's reference source repositories (moved from Mercurial to Git in 2017)
  – Python Enhancement Proposals (PEPs): public design documents


• Guido van Rossum
  – Python's inventor and architect
  – Benevolent Dictator For Life (BDFL); he stepped down from the role in July 2018
• Zen of Python (PEP 20)
  – Software principles that influence the design of Python, written by Tim Peters
    • Beautiful is better than ugly.
    • Explicit is better than implicit.
    • Simple is better than complex.
    • Complex is better than complicated.
    • Flat is better than nested.
    • …
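The full list of aphorisms ships with the interpreter. The following is a minimal sketch, assuming CPython's standard `this` module; its `s` attribute, which holds the ROT13-encoded text, is an implementation detail rather than a documented API.

```python
import codecs
import this  # importing the module prints the Zen of Python once

# `this.s` stores the text ROT13-encoded (an implementation detail);
# decode it so the individual lines can be used in a program.
zen_text = codecs.decode(this.s, "rot13")
zen_lines = [line for line in zen_text.splitlines() if line.strip()]

# zen_lines[0] is the title; show the first five aphorisms after it.
for line in zen_lines[1:6]:
    print(line)
```

In an interactive session, typing `import this` alone is enough to read the whole text.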


Python Features

• Features
  – Simple, but not simplistic
  – A general-purpose programming language
  – A very high-level language (VHLL)
  – An OOP language
  – Also supports a functional programming style
  – Batteries included: Standard Library and extension modules
• Python implementations: 4 production-quality implementations
  – CPython
    • Classic Python (often just called "Python"): the reference implementation, written in C
    • Consists of a compiler, an interpreter, and a set of built-in and optional extension modules
  – Jython (runs on the JVM)
  – IronPython (runs on .NET)
  – PyPy: generates native machine code "just in time"
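To see which of these implementations is running a script, the standard library can be queried directly. This is a small sketch using `platform.python_implementation()` and `sys.implementation`; the variable name `implementation` is chosen here for illustration.

```python
import platform
import sys

# Name of the running implementation, for example "CPython" or "PyPy".
implementation = platform.python_implementation()

# sys.implementation.name carries the same information in lowercase form.
print(implementation, sys.implementation.name)
print("version:", platform.python_version())
```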


• Syntax and semantics
  – Indentation
  – Expressions, statements and control flow
  – Typing
    • Strong typing
    • Dynamic typing
    • Duck typing (the duck test): "If it walks like a duck and it quacks like a duck, then it must be a duck"
• Libraries
  – https://pypi.org/
• Development Environments
  – REPL (read–eval–print loop)
  – IDLE
  – IDE
  – IPython
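The duck test mentioned under Typing can be illustrated with a short sketch. The classes `Duck` and `Robot` and the function `make_it_quack` are hypothetical names invented for this example.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    # Not related to Duck by inheritance, but it offers the same method.
    def quack(self):
        return "Beep-quack!"


def make_it_quack(thing):
    # No type check: any object with a quack() method is accepted.
    return thing.quack()


sounds = [make_it_quack(obj) for obj in (Duck(), Robot())]
print(sounds)

# An object without quack() fails only when the call is made.
try:
    make_it_quack(42)
except AttributeError as exc:
    print("not a duck:", exc)
```

What matters is the behaviour an object offers at the moment it is used, not the class it was declared with.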


• Python version
  – (…)
    • Alpha releases, tagged as 3.x a0, 3.x a1, and so on
    • Beta releases, 3.x b1, and after the betas at least one release candidate, 3.x rc1
    • Final release of 3.x (3.x.0)
  – Python 2.7
    • First released in July 2010
    • End-of-life postponed to 2020
  – Python 3.0
    • First released in 2008; initially called Python 3000 (or py3k)
    • Each 3.x minor release adds features
  – In 2017, Google announced work on a Python 2.7 to Go transcompiler to improve performance
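The release stages listed above (alpha, beta, release candidate, final) are visible at run time through `sys.version_info`. This is a brief sketch; the names `info` and `is_python3` are illustrative.

```python
import sys

# Named tuple with the fields: major, minor, micro, releaselevel, serial.
info = sys.version_info

# releaselevel is one of "alpha", "beta", "candidate" or "final".
print(f"Python {info.major}.{info.minor}.{info.micro} ({info.releaselevel})")

# A typical guard for code that relies on Python 3 features.
is_python3 = info >= (3, 0)
print("running on Python 3:", is_python3)
```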


Python Interpreter

• What?
  – Process
    • Lexing: source text → tokens
    • Parsing: tokens → AST (abstract syntax tree)
    • Compiling: AST → code object
    • Interpreting: actually executes the code object
  – "Interpreter" has several meanings
    • = PVM (Python virtual machine) = a stack machine (distinct from the call stack)
    • Bytecode interpreter
      » bytecode = intermediate code = the internal representation of a Python program inside the interpreter
• Ex. Byterun: a Python interpreter written in Python
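The four stages can be reproduced from Python itself with standard-library tools. This is a rough sketch rather than a description of CPython's internals: `ast.parse` runs its own tokenizer, so the lexing step is shown separately only for illustration, and the one-line program in `source` is made up for the example.

```python
import ast
import io
import tokenize

source = "z = x + y\n"

# 1. Lexing: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [tok.string for tok in tokens if tok.string.strip()]
print(token_strings)

# 2. Parsing: build an abstract syntax tree (ast.parse tokenizes internally).
tree = ast.parse(source)
print(ast.dump(tree))

# 3. Compiling: turn the AST into a code object that holds bytecode.
code = compile(tree, "<example>", "exec")
print(type(code).__name__, len(code.co_code), "bytes of bytecode")

# 4. Interpreting: the virtual machine executes the code object.
namespace = {"x": 2, "y": 3}
exec(code, namespace)
print(namespace["z"])
```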


• Intermediate approach: JIT compiler / bytecode

 | Compiled | Interpreted
Pros | Ready to run; often faster; source code is private | Cross-platform; simpler to test; easier to debug
Cons | Not cross-platform; inflexible; extra step | Interpreter required; often slower; source code is public
Examples | C, C++, Objective-C | PHP, JavaScript, Ruby, Perl

Hybrid: Java, C#, VB.NET, Python


• CPython
  – (1) A compiler that converts source code to bytecode
  – (2) A VM that runs the bytecode
    • Stack-based (rather than register-based)
    • The dis module shows most of the details
  – (3) A C interface for interacting with the VM
  – Python/ceval.c
    • PyEval_EvalFrame(PyFrameObject *f)
  – Modules/main.c
    • Py_Main(int argc, wchar_t **argv)
• CPython vs Cython
  – `Cython` is a language in itself that is a superset of `Python` (i.e. (almost) all `Python` syntax is accepted), and `CPython` is one (the most trusted and used) implementation of `Python` in `C`.
  – Cython adds a few extensions to the Python language and lets you compile your code to C extensions, code that plugs into the CPython interpreter.


>>> import dis
>>> def add(x, y):
...     z = x + y
...     return z
...
>>> dis.dis(add)
  2           0 LOAD_FAST                0 (x)
              2 LOAD_FAST                1 (y)
              4 BINARY_ADD
              6 STORE_FAST               2 (z)

  3           8 LOAD_FAST                2 (z)
             10 RETURN_VALUE
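To make the "stack machine" idea concrete, the six instructions in the listing can be executed by a toy interpreter. This is only an illustrative sketch in the spirit of Byterun, not how CPython's ceval.c is written. The function `run` and the list `ADD_PROGRAM` are hypothetical, and the instruction names are copied by hand from the listing (CPython 3.11 and later emit `BINARY_OP` instead of `BINARY_ADD`).

```python
def run(instructions, local_values):
    """Execute a list of (opname, argument) pairs on a value stack."""
    stack = []
    local_vars = dict(local_values)
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])   # push a local variable
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)      # pop two operands, push the sum
        elif opname == "STORE_FAST":
            local_vars[arg] = stack.pop()   # pop into a local variable
        elif opname == "RETURN_VALUE":
            return stack.pop()              # top of stack is the result
        else:
            raise ValueError(f"unsupported instruction: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")


# The instruction sequence of add(x, y), written out by hand.
ADD_PROGRAM = [
    ("LOAD_FAST", "x"),
    ("LOAD_FAST", "y"),
    ("BINARY_ADD", None),
    ("STORE_FAST", "z"),
    ("LOAD_FAST", "z"),
    ("RETURN_VALUE", None),
]

result = run(ADD_PROGRAM, {"x": 2, "y": 3})
print(result)
```

Every instruction only pushes to or pops from `stack`; that is what distinguishes a stack-based VM from a register-based one.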


>>> help(list)
Help on class list in module builtins:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 |
 |  Methods defined here:
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __contains__(self, key, /)

>>> help(list.sort)
Help on method_descriptor:

sort(...)
    L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*


Primitive building blocks

• Overview
  – Elements for describing data and the processes applied to them
• Syntax
  – Grammar
• Semantics
  – The meaning of programs written in the language
• Type System
  – Typed vs. untyped
    • An untyped language allows any operation to be performed on any data, e.g. Tcl
  – Strongly typed vs. weakly typed
  – Static vs. dynamic typing
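Python sits on the "strongly typed" and "dynamically typed" sides of these distinctions. The sketch below illustrates both properties; all variable names are illustrative.

```python
# Dynamic typing: a name can be rebound to objects of different types;
# the type belongs to the object, not to the name.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__
print(first_type, second_type)

# Strong typing: unrelated types are not silently converted.
try:
    result = "1" + 1
except TypeError as exc:
    result = None
    print("TypeError:", exc)

# An explicit conversion states which operation is intended.
as_number = int("1") + 1
as_text = "1" + str(1)
print(as_number, as_text)
```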


Python Use Cases – Some Projects


Python and Data Analysis

• Data analysis
• Expansion of the concept
  – Spreadsheet-centered analysis
  – + BI/OLAP/DB query
  – + Statistical analysis
  – + Text analysis (SNA/sentiment analysis, mining, search)
  – + Machine Learning
  – + Deep Learning


• Data Science

• Python and data analysis

Lab Environment

• Choice

• Course scope

• Lecture notes (?)

Tools: Py/IDLE, pipenv, virtualenv, vi, Spyder/IPython, Anaconda, Eclipse, PyCharm, Atom

Platforms: Windows, Linux


• Lab environment
  – Linux
  – MS Windows
  – Others
    • Raspberry Pi
    • Cloud

Python Programming Language

• Lexical Structure
  – Lines and Indentation
  – Character Sets
  – Tokens
  – Statements
• Data Types
  – Numbers
  – Sequences
  – Sets
  – Dictionaries
  – Callables
  – Boolean Values
• Strings
• Variables and Other References
  – Variables
  – Assignment Statements
• Functions
• Expressions and Operators
  – Numeric Operations
  – Sequence Operations
  – Set Operations
  – Dictionary Operations
• Control Flow Statements
  – if else while
  – for break continue
  – try raise with
• Classes & OOP
• Exceptions
• Core Built-ins and Standard Libraries
• Modules & Packages
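As a compact preview of the topics in this outline, the sketch below touches each of them once. All names and values (`scores`, `mean`, `Counter`, and so on) are invented for illustration and are not taken from the lecture.

```python
import io    # modules: standard-library imports
import math

# Data types: numbers, sequences, sets, dictionaries, strings
count = 3                       # int
scores = [70, 85, 90]           # list (mutable sequence)
point = (1, 2)                  # tuple (immutable sequence)
tags = {"python", "data"}       # set
ages = {"kim": 30, "lee": 25}   # dictionary
name = "Python"                 # string

# Expressions and operators
total = sum(scores) + count * 2
has_tag = "data" in tags


# Functions
def mean(values):
    return sum(values) / len(values)


# Control flow: for, if/else, while, break
labels = []
for score in scores:
    if score >= 80:
        labels.append("high")
    else:
        labels.append("low")

n = 0
while True:
    n += 1
    if n >= count:
        break


# Exceptions: try/except around an operation that may fail
def safe_mean(values):
    try:
        return mean(values)
    except ZeroDivisionError:
        return None


# with statement: the resource is closed automatically
with io.StringIO("a\nb\n") as stream:
    lines = stream.read().splitlines()


# Classes and OOP
class Counter:
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n


counter = Counter()
counter.increment()
print(total, labels, n, safe_mean([]), lines, counter.n, math.sqrt(16))
```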

Hands-on Practice