Python in Web 2.0 Websites - QCon Beijing 2010
Talk given at QCon Beijing 2010 by 洪强宁 (Hong Qiangning)
http://www.flickr.com/photos/arnolouise/2986467632/
About Me
• Python programmer
• First encountered Python in 2002
• Working entirely in Python since 2004
• http://www.douban.com/people/hongqn/
• http://twitter.com/hongqn
Python
• Python is a programming language that lets you work more quickly and integrate your systems more effectively. You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs. (via http://python.org/)
Rapid development

import os
from collections import defaultdict

d = defaultdict(int)

for dirpath, dirnames, filenames in os.walk('.'):
    for filename in filenames:
        path = os.path.join(dirpath, filename)
        ext = os.path.splitext(filename)[1]
        d[ext] += len(list(open(path)))

for ext, n_lines in d.items():
    print ext, n_lines

Counting lines of code per language: 13 lines
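Rich resources
• Batteries included: the standard library ships 200+ modules
• PyPI: 9613 packages currently
• Networking / databases / desktop / games / scientific computing / security / text processing / ...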
• easily extensible
web.py

import web

urls = (
    '/(.*)', 'hello'
)
app = web.application(urls, globals())

class hello:
    def GET(self, name):
        if not name:
            name = 'World'
        return 'Hello, ' + name + '!'

if __name__ == "__main__":
    app.run()
http://webpy.org/
Flask

from flask import Flask
app = Flask(__name__)

@app.route("/<name>")
def hello(name):
    if not name:
        name = 'World'
    return 'Hello, ' + name + '!'

if __name__ == "__main__":
    app.run()
http://flask.pocoo.org/
WSGI
http://www.python.org/dev/peps/pep-0333/
Why so many Python web frameworks?
• Because you can write your own framework in 3 hours and a total of 60 lines of Python code.
• http://bitworking.org/news/Why_so_many_Python_web_frameworks
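To make the "60 lines" claim concrete, here is a minimal sketch of a WSGI application with a tiny URL dispatcher, written for Python 3 with only the standard library. The names `ROUTES`, `route`, `application`, and `call_app` are illustrative assumptions and do not come from any framework mentioned in the talk.

```python
from wsgiref.util import setup_testing_defaults

ROUTES = {}  # hypothetical registry: exact path -> handler function


def route(path):
    """Register a handler function for an exact URL path."""
    def deco(func):
        ROUTES[path] = func
        return func
    return deco


@route('/hello')
def hello(environ):
    return 'Hello, World!'


def application(environ, start_response):
    """WSGI entry point: dispatch on PATH_INFO."""
    handler = ROUTES.get(environ.get('PATH_INFO', '/'))
    if handler is None:
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']
    body = handler(environ).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [body]


def call_app(path):
    """Call the app in-process, without a server; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(application(environ, start_response))
    return captured['status'], body
```

To serve it for real, `wsgiref.simple_server.make_server('', 8000, application).serve_forever()` from the standard library should be enough for local experiments.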
doctest

def cube(x):
    """
    >>> cube(10)
    1000
    """
    return x * x * x

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()
nose http://somethingaboutorange.com/mrl/projects/nose/
from cube import cube

def test_cube():
    result = cube(10)
    assert result == 1000
numpy
>>> from numpy import *
>>> A = arange(4).reshape(2, 2)
>>> A
array([[0, 1],
       [2, 3]])
>>> dot(A, A.T)
array([[ 1,  3],
       [ 3, 13]])
http://numpy.scipy.org/
ipython
$ ipython -pylab
In [1]: X = frange(0, 10, 0.1)
In [2]: Y = [sin(x) for x in X]
In [3]: plot(X, Y)
http://numpy.scipy.org/
virtualenv
$ python go-pylons.py --no-site-packages mydevenv
$ cd mydevenv
$ source bin/activate
(mydevenv)$ paster create -t new9 helloworld
http://virtualenv.openplans.org/
Creates a clean, isolated Python environment
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
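Chinese translation by 赖勇浩, rendered here in English:
http://bit.ly/pyzencn
Beautiful beats ugly
Clear beats obscure
Concise beats complex
Complex beats messy
Flat beats nested
Spaced out beats cramped
Readability matters
Even in the name of practicality for special cases, these rules must not be broken
Do not tolerate every error unless you are sure you need to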
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
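When several possibilities exist, do not try to guess
Instead, look for one obvious solution, preferably the only one
Although that is not easy, because you are not the father of Python
Doing may be better than not doing, but acting without thinking is worse than not acting
If you cannot describe your solution to others, it is certainly not a good one; and vice versa
Namespaces are a wonderful idea, and we should make more use of them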
Simple is better than complex
class HelloWorld
{
    public static void main(String args[])
    {
        System.out.println("Hello World!");
    }
}
Readability counts
• Mandatory block indentation; no {} and no end
• No cryptic characters (except "@" for decorators)

if limit is not None and len(ids) > limit:
    ids = random.sample(ids, limit)
TOOWTDI
• There (should be) Only One Way To Do It.
• vs. Perlish TIMTOWTDI (There Is More Than One Way To Do It)

a = [1, 2, 3, 4, 5]
b = []
for i in range(len(a)):
    b.append(a[i]*2)

b = []
for x in a:
    b.append(x*2)

b = [x*2 for x in a]
http://twitter.com/hongqn/status/9883515681
http://twitter.com/robbinfan/status/9879724095
A picture is the proof
Python C
http://www.flickr.com/photos/nicksieger/281055485/ http://www.flickr.com/photos/nicksieger/281055530/
The picture speaks for itself
Ruby
http://www.flickr.com/photos/nicksieger/280661836/
The picture speaks for itself
Java
http://www.flickr.com/photos/nicksieger/280662707/
config.py:

MEMCACHED_ADDR = ['localhost:11211']

from local_config import *

local_config.py:

MEMCACHED_ADDR = [
    'frodo:11211',
    'sam:11211',
    'pippin:11211',
    'merry:11211',
]

When the file name does not end in .py, exec can be used instead.
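The exec variant could look roughly like the following Python 3 sketch. The helper names `load_config` and `demo`, and the file name `local_config.conf`, are assumptions made for illustration; the talk does not show this code. Executing a file this way is only appropriate for configuration you fully trust.

```python
import os
import tempfile


def load_config(path, defaults):
    """Run the file at `path` as Python code; return defaults plus overrides."""
    namespace = dict(defaults)
    with open(path) as f:
        code = compile(f.read(), path, 'exec')
    exec(code, namespace)
    # Drop the __builtins__ entry that exec() adds to the namespace.
    namespace.pop('__builtins__', None)
    return namespace


def demo():
    """Write a throwaway override file and load it over the defaults."""
    defaults = {'MEMCACHED_ADDR': ['localhost:11211']}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'local_config.conf')  # hypothetical file name
        with open(path, 'w') as f:
            f.write("MEMCACHED_ADDR = ['frodo:11211', 'sam:11211']\n")
        return load_config(path, defaults)
```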
class GroupUI(object):
    def new_topic(self, request):
        if self.group.can_post(request.user):
            return new_topic_ui(self.group)
        else:
            request.response.set_status(403, "Forbidden")
            return error_403_ui(msg="Only group members can post")

    def join(self, request):
        if self.group.can_join(request.user):
            ...

class Group(object):
    def can_post(self, user):
        return self.group.has_member(user)

    def can_join(self, user):
        return not self.group.has_banned(user)
class GroupUI(object):
    @check_permission('post', msg="Only group members can post")
    def new_topic(self, request):
        return new_topic_ui(self.group)

    @check_permission('join', msg="Cannot join this group")
    def join(self, request):
        ...

class Group(object):
    def can_post(self, user):
        return self.group.has_member(user)

    def can_join(self, user):
        return not self.group.has_banned(user)
decorator
def print_before_exec(func):
    def _(*args, **kwargs):
        print "decorated"
        return func(*args, **kwargs)
    return _

@print_before_exec
def double(x):
    print x*2

double(10)
Output:

decorated
20
class check_permission(object):
    def __init__(self, action, msg=None):
        self.action = action
        self.msg = msg

    def __call__(self, func):
        def _(ui, req, *args, **kwargs):
            # look up can_<action> on the object that owns the permission
            f = getattr(ui.perm_obj, 'can_' + self.action)
            if f(req.user):
                # pass the request through to the wrapped method
                return func(ui, req, *args, **kwargs)
            raise BadPermission(ui.perm_obj, self.action, self.msg)
        return _
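The pieces above can be assembled into a runnable Python 3 sketch. `BadPermission`, the `Request` stand-in, the `perm_obj` wiring in `GroupUI.__init__`, and the set-based `Group` are assumptions added so the example is self-contained; the slides do not show those definitions.

```python
import functools


class BadPermission(Exception):
    """Hypothetical exception; the slides do not show its definition."""
    def __init__(self, perm_obj, action, msg=None):
        super().__init__(msg or 'permission denied: ' + action)
        self.perm_obj = perm_obj
        self.action = action
        self.msg = msg


class check_permission(object):
    def __init__(self, action, msg=None):
        self.action = action
        self.msg = msg

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(ui, req, *args, **kwargs):
            can = getattr(ui.perm_obj, 'can_' + self.action)
            if can(req.user):
                return func(ui, req, *args, **kwargs)
            raise BadPermission(ui.perm_obj, self.action, self.msg)
        return wrapper


class Group(object):
    def __init__(self, members):
        self.members = set(members)

    def can_post(self, user):
        return user in self.members


class GroupUI(object):
    def __init__(self, group):
        self.group = group
        self.perm_obj = group  # the object whose can_* methods are consulted

    @check_permission('post', msg='Only group members can post')
    def new_topic(self, request):
        return 'new topic form'


class Request(object):
    """Stand-in for the framework's request object."""
    def __init__(self, user):
        self.user = user
```

A single handler higher up in the framework can then catch `BadPermission` and render the 403 page, which keeps the permission logic out of every UI method.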
def send_notification_mail(email, subject, body):
    msg = MSG_SEND_MAIL + '\0' + email + '\0' + subject + '\0' + body
    mq.put(msg)

def async_worker():
    msg = mq.get()
    msg = msg.split('\0')
    cmd = msg[0]
    if cmd == MSG_SEND_MAIL:
        email, subject, body = msg[1:]
        fromaddr = '[email protected]'
        email_body = make_email_body(fromaddr, email, subject, body)
        smtp = smtplib.SMTP('mail')
        smtp.sendmail(fromaddr, email, email_body)
    elif cmd == MSG_xxxx:
        ...
    elif cmd == MSG_yyyy:
        ...
@async
def send_notification_mail(email, subject, body):
    fromaddr = '[email protected]'
    email_body = make_email_body(fromaddr, email, subject, body)
    smtp = smtplib.SMTP('mail')
    smtp.sendmail(fromaddr, email, email_body)

def async(func):
    mod = sys.modules[func.__module__]
    fname = 'origin_' + func.__name__
    mod.__dict__[fname] = func
    def _(*a, **kw):
        body = cPickle.dumps((mod.__name__, fname, a, kw))
        mq.put(body)
    return _

def async_worker():
    modname, fname, a, kw = cPickle.loads(mq.get())
    __import__(modname)
    mod = sys.modules[modname]
    mod.__dict__[fname](*a, **kw)
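A Python 3 sketch of the same idea follows. Two things differ from the slide and are assumptions of this sketch: the decorator is called `run_async` because `async` became a reserved word in Python 3.7, and the original function is kept in a plain dictionary (`_registry`) instead of being stored back into the module's `__dict__`. An in-process `queue.Queue` stands in for the real message queue.

```python
import pickle
import queue

mq = queue.Queue()   # stand-in for a real message queue
_registry = {}       # hypothetical table: name -> original function


def run_async(func):
    """Replace func with a stub that enqueues the call instead of running it."""
    name = '%s.%s' % (func.__module__, func.__qualname__)
    _registry[name] = func

    def stub(*args, **kwargs):
        # Only the function's name and arguments travel through the queue.
        mq.put(pickle.dumps((name, args, kwargs)))
    return stub


def async_worker():
    """Take one message off the queue and run the function it names."""
    name, args, kwargs = pickle.loads(mq.get())
    return _registry[name](*args, **kwargs)


sent = []  # records "sent" mail so the example has no network side effects


@run_async
def send_notification_mail(email, subject, body):
    sent.append((email, subject, body))
```

With a worker in a separate process, the registry approach only works if the worker imports the same modules, which is what the `__import__(modname)` call in the slide's version takes care of.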
def get_latest_review_id():
    review_id = mc.get('latest_review_id')
    if review_id is None:
        review_id = exc_sql("select max(id) from review")
        mc.set('latest_review_id', review_id)
    return review_id

def cache(key):
    def deco(func):
        def _(*args, **kwargs):
            r = mc.get(key)
            if r is None:
                r = func(*args, **kwargs)
                mc.set(key, r)
            return r
        return _
    return deco

def get_review(id):
    key = 'review:%s' % id
    review = mc.get(key)
    if review is None: # cache miss
        id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
        review = Review(id, author_id, text)
        mc.set(key, review)
    return review
What if the cache key has to be generated dynamically?
How do you write the decorator when the cache key must be generated dynamically?

@cache('review:{id}')
def get_review(id):
    id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
    return Review(id, author_id, text)
def cache(key_pattern, expire=0):
    def deco(f):
        arg_names, varargs, varkw, defaults = inspect.getargspec(f)
        if varargs or varkw:
            raise Exception("not support varargs")
        gen_key = gen_key_factory(key_pattern, arg_names, defaults)

        def _(*a, **kw):
            key = gen_key(*a, **kw)
            r = mc.get(key)
            if r is None:
                r = f(*a, **kw)
                mc.set(key, r, expire)
            return r
        return _
    return deco
inspect.getargspec

>>> import inspect
>>> def f(a, b=1, c=2):
...     pass
...
>>> inspect.getargspec(f)
ArgSpec(args=['a', 'b', 'c'], varargs=None, keywords=None, defaults=(1, 2))
>>>
>>> def f(a, b=1, c=2, *args, **kwargs):
...     pass
...
>>> inspect.getargspec(f)
ArgSpec(args=['a', 'b', 'c'], varargs='args', keywords='kwargs', defaults=(1, 2))
hint:
• str.format in python 2.6: '{id}'.format(id=1) => '1'
• dict(zip(['a', 'b', 'c'], [1, 2, 3])) => {'a': 1, 'b': 2, 'c': 3}
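The slides leave `gen_key_factory` as an exercise. One possible implementation, following the two hints, is sketched below for Python 3. It is an assumption, not the speaker's code: `inspect.getfullargspec` replaces `inspect.getargspec` (which newer Python versions no longer provide), and `FakeCache` is an in-memory stand-in for the memcached client `mc`.

```python
import functools
import inspect


class FakeCache(object):
    """In-memory stand-in for a memcached client (assumption for the demo)."""
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, expire=0):
        self.data[key] = value


mc = FakeCache()


def gen_key_factory(key_pattern, arg_names, defaults):
    """Build a function that maps call arguments to a cache key."""
    defaults = defaults or ()
    # Pair each default with the trailing argument it belongs to.
    default_map = dict(zip(arg_names[len(arg_names) - len(defaults):], defaults))

    def gen_key(*a, **kw):
        values = dict(default_map)
        values.update(zip(arg_names, a))   # positional arguments by name
        values.update(kw)                  # keyword arguments win
        return key_pattern.format(**values)
    return gen_key


def cache(key_pattern, expire=0):
    def deco(f):
        spec = inspect.getfullargspec(f)
        if spec.varargs or spec.varkw:
            raise Exception("not support varargs")
        gen_key = gen_key_factory(key_pattern, spec.args, spec.defaults)

        @functools.wraps(f)
        def wrapper(*a, **kw):
            key = gen_key(*a, **kw)
            r = mc.get(key)
            if r is None:
                r = f(*a, **kw)
                mc.set(key, r, expire)
            return r
        return wrapper
    return deco


calls = []  # records real invocations so cache hits are observable


@cache('review:{id}')
def get_review(id, lang='en'):
    calls.append(id)
    return 'review %s' % id
```

Because the key is built from argument names, `get_review(7)` and `get_review(id=7)` map to the same cache entry.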
class Feed(object):
    def get_entries(self, limit=10):
        ids = exc_sqls("select id from entry where feed_id=%s order by id desc limit %s", (self.id, limit))
        return [Entry.get(id) for id in ids]

class FeedCollection(object):
    def get_entries(self, limit=10):
        mixed_entries = []
        for feed in self.feeds:
            entries = feed.get_entries(limit=limit)
            mixed_entries += entries
        mixed_entries.sort(key=lambda e: e.id, reverse=True)
        return mixed_entries[:limit]
Rows fetched from the database = len(self.feeds) * limit
Wasted Entry.get calls = (len(self.feeds) - 1) * limit
iterator and generator

def fib():
    x, y = 1, 1
    while True:
        yield x
        x, y = y, x+y

def odd(seq):
    return (n for n in seq if n%2)

def less_than(seq, upper_limit):
    for number in seq:
        if number >= upper_limit:
            break
        yield number

print sum(odd(less_than(fib(), 4000000)))
itertools
• count([n]) --> n, n+1, n+2, ...
• cycle(p) --> p0, p1, ... plast, p0, p1, ...
• repeat(elem [,n]) --> elem, elem, elem, ... endless or up to n times
• izip(p, q, ...) --> (p[0], q[0]), (p[1], q[1]), ...
• islice(seq, [start,] stop [, step]) --> elements from seq[start:stop:step]
• ... and more ...
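A few of these can be tried directly. The sketch below is for Python 3, where `izip` no longer exists and the built-in `zip` is already lazy; the variable names are only for illustration.

```python
from itertools import count, cycle, islice, repeat

# count() and cycle() are infinite, so islice() is used to take a finite prefix.
first_counts = list(islice(count(10), 3))
cycled = list(islice(cycle('ab'), 5))
repeated = list(repeat('x', 3))

# zip() stops at the shortest input, so pairing with count() is safe.
pairs = list(zip('abc', count(1)))

# islice(seq, start, stop, step) behaves like seq[start:stop:step].
sliced = list(islice(range(10), 2, 8, 3))
```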
class Feed(object):
    def iter_entries(self):
        start_id = sys.maxint
        while True:
            entry_ids = exc_sqls("select id from entry where feed_id=%s and id<%s order by id desc limit 5", (self.id, start_id))
            if not entry_ids:
                break
            for entry_id in entry_ids:
                yield Entry.get(entry_id)
            start_id = entry_ids[-1]

class FeedCollection(object):
    def iter_entries(self):
        return imerge(*[feed.iter_entries() for feed in self.feeds])

    def get_entries(self, limit=10):
        return list(islice(self.iter_entries(), limit))
Rows fetched from the database = len(self.feeds) * 5 ~ len(self.feeds) * 5 + limit - 5
Wasted Entry.get calls = 0 ~ len(self.feeds) - 1
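The slides do not show `imerge`. A possible Python 3 sketch is given below, assuming it lazily merges iterators that are each already sorted newest-first; here it is built on `heapq.merge` with `key` and `reverse`, which needs Python 3.5 or later. The in-memory `Feed`, its `_query` method, and the `fetched_rows` log are stand-ins for the database so that the row count can be observed.

```python
import heapq
from collections import namedtuple
from itertools import islice

Entry = namedtuple('Entry', 'id feed_id')
fetched_rows = []   # records every row "read from the database"


def imerge(*iterables):
    """Lazily merge iterators that are each sorted by id, newest first."""
    return heapq.merge(*iterables, key=lambda e: e.id, reverse=True)


class Feed(object):
    def __init__(self, feed_id, entry_ids):
        self.id = feed_id
        self._ids = sorted(entry_ids, reverse=True)  # fake table, newest first

    def _query(self, before_id, batch=5):
        """Stand-in for the SQL query: ids below before_id, newest first."""
        rows = [i for i in self._ids if i < before_id][:batch]
        fetched_rows.extend(rows)
        return rows

    def iter_entries(self):
        start_id = float('inf')
        while True:
            entry_ids = self._query(start_id)
            if not entry_ids:
                break
            for entry_id in entry_ids:
                yield Entry(entry_id, self.id)
            start_id = entry_ids[-1]


class FeedCollection(object):
    def __init__(self, feeds):
        self.feeds = feeds

    def iter_entries(self):
        return imerge(*[feed.iter_entries() for feed in self.feeds])

    def get_entries(self, limit=10):
        return list(islice(self.iter_entries(), limit))
```

Since each feed only runs its next query when the merge actually needs another entry from it, asking for a few entries touches one batch per feed rather than `limit` rows per feed.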
class User(object):
    def __init__(self, id, username, screen_name, sig):
        self.id = id
        self.username = username
        self.screen_name = screen_name
        self.sig = sig

user = User('1002211', 'hongqn', 'hongqn', "巴巴布、巴巴布巴布巴布!")
cPickle vs. marshal

$ python -m timeit -s '
> from user import user
> from cPickle import dumps, loads
> s = dumps(user, 2)' \
> 'loads(s)'
100000 loops, best of 3: 6.6 usec per loop

$ python -m timeit -s '
> from user import user
> from marshal import dumps, loads
> d = (user.id, user.username, user.screen_name, user.sig)
> s = dumps(d, 2)' 'loads(s)'
1000000 loops, best of 3: 0.9 usec per loop

7x speedup
cPickle vs. marshal

$ python -c '
> import cPickle, marshal
> from user import user
> print "pickle:", len(cPickle.dumps(user, 2))
> print "marshal:", len(marshal.dumps((user.id, \
> user.username, user.screen_name, user.sig), 2))'
pickle: 129
marshal: 74

43% space saving
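marshal only handles built-in value types, so an object has to be flattened to a tuple before dumping and rebuilt after loading. A minimal Python 3 sketch of that round trip follows; the `FIELDS` attribute and the `dumps`/`loads` method names are assumptions of this sketch. Note also that the marshal format is not guaranteed to stay compatible across Python versions, which matters for data kept in a long-lived cache.

```python
import marshal


class User(object):
    FIELDS = ('id', 'username', 'screen_name', 'sig')  # hypothetical field list

    def __init__(self, id, username, screen_name, sig):
        self.id = id
        self.username = username
        self.screen_name = screen_name
        self.sig = sig

    def dumps(self):
        """Flatten to a tuple of plain values; marshal cannot serialize instances."""
        values = tuple(getattr(self, name) for name in self.FIELDS)
        return marshal.dumps(values, 2)

    @classmethod
    def loads(cls, data):
        """Rebuild a User from bytes produced by dumps()."""
        return cls(*marshal.loads(data))
```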
namedtuple
from collections import namedtuple
User = namedtuple('User', 'id username screen_name sig')
user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")
user.username --> 'hongqn'
__metaclass__
class User(tuple):
    __metaclass__ = NamedTupleMetaClass
    __attrs__ = ['id', 'username', 'screen_name', 'sig']

user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")

s = marshal.dumps(user.__marshal__())
User.__load_marshal__(marshal.loads(s))
from operator import itemgetter

class NamedTupleMetaClass(type):
    def __new__(mcs, name, bases, dict):
        assert bases == (tuple,)
        attrs = dict['__attrs__']
        # one read-only property per attribute, backed by the tuple index
        for i, a in enumerate(attrs):
            dict[a] = property(itemgetter(i))
        dict['__slots__'] = ()
        # a bare `tuple` here would not receive self, so wrap it
        dict['__marshal__'] = lambda self: tuple(self)
        dict['__load_marshal__'] = classmethod(tuple.__new__)
        dict['__getnewargs__'] = lambda self: tuple(self)
        # generate __new__ with the attribute names as its argument list
        argtxt = repr(tuple(attrs)).replace("'", "")[1:-1]
        template = """def newfunc(cls, %(argtxt)s):
    return tuple.__new__(cls, (%(argtxt)s))""" % locals()
        namespace = {}
        exec template in namespace
        dict['__new__'] = namespace['newfunc']
        return type.__new__(mcs, name, bases, dict)
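For readers on Python 3, the same metaclass idea can be sketched as follows. This is an adaptation, not the speaker's code: the class is named `NamedTupleMeta` here, the `metaclass=` keyword replaces `__metaclass__`, `exec` is called as a function, and `__load_marshal__` is written as an explicit lambda.

```python
import marshal
from operator import itemgetter


class NamedTupleMeta(type):
    def __new__(mcs, name, bases, namespace):
        assert bases == (tuple,)
        attrs = namespace['__attrs__']
        # One read-only property per attribute, backed by the tuple index.
        for i, attr in enumerate(attrs):
            namespace[attr] = property(itemgetter(i))
        namespace['__slots__'] = ()
        # marshal only accepts exact built-in types, so convert to a plain tuple.
        namespace['__marshal__'] = lambda self: tuple(self)
        namespace['__load_marshal__'] = classmethod(
            lambda cls, values: tuple.__new__(cls, values))
        # Generate __new__ so that it accepts the attributes by name.
        argtxt = ', '.join(attrs)
        code = ('def newfunc(cls, %s):\n'
                '    return tuple.__new__(cls, (%s,))' % (argtxt, argtxt))
        scope = {}
        exec(code, scope)
        namespace['__new__'] = scope['newfunc']
        return super().__new__(mcs, name, bases, namespace)


class User(tuple, metaclass=NamedTupleMeta):
    __attrs__ = ['id', 'username', 'screen_name', 'sig']
```

Instances stay plain tuples in memory, so they get the compact marshal encoding while still offering attribute access such as `user.username`.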
Case 6
• Simplify the way request.get_environ(key) is written
• e.g. request.get_environ('REMOTE_ADDR') --> request.remote_addr
descriptor
• An object that defines a __get__, __set__, or __delete__ method

class Descriptor(object):
    def __get__(self, instance, owner):
        return 'descriptor'

class Owner(object):
    attr = Descriptor()

owner = Owner()
owner.attr --> 'descriptor'
Commonly used descriptors
• classmethod
• staticmethod
• property

class C(object):
    def get_x(self):
        return self._x
    def set_x(self, x):
        self._x = x
    x = property(get_x, set_x)
class environ_getter(object):
    def __init__(self, key, default=None):
        self.key = key
        self.default = default

    def __get__(self, obj, objtype):
        if obj is None:
            return self
        return obj.get_environ(self.key, self.default)

class HTTPRequest(quixote.http_request.HTTPRequest):
    for key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME', 'REQUEST_URI', 'HTTP_HOST']:
        locals()[key.lower()] = environ_getter(key)
    del key
locals()
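A self-contained Python 3 sketch of the descriptor approach follows. Two parts are assumptions of this sketch: a minimal `HTTPRequest` replaces the Quixote base class, and the descriptors are attached with `setattr` after the class is created rather than by writing into `locals()` inside the class body, since the Python documentation advises against relying on modifications to `locals()`.

```python
class environ_getter(object):
    def __init__(self, key, default=None):
        self.key = key
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor itself
        return obj.get_environ(self.key, self.default)


class HTTPRequest(object):
    """Minimal stand-in for quixote.http_request.HTTPRequest."""
    def __init__(self, environ):
        self.environ = environ

    def get_environ(self, key, default=None):
        return self.environ.get(key, default)


# Attach one descriptor per CGI variable after the class is created.
for _key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME',
             'REQUEST_URI', 'HTTP_HOST']:
    setattr(HTTPRequest, _key.lower(), environ_getter(_key))
del _key
```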
import httplib
orig_connect = httplib.HTTPConnection.connect
def _patched_connect(self):
    if HOSTS_BLOCKED.match(self.host):
        return _connect_via_socks_proxy(self)
    else:
        return orig_connect(self)
def _connect_via_socks_proxy(self): ...
httplib.HTTPConnection.connect = _patched_connect
Things to watch out for when using Python
• Pythonic!
• Avoid gotchas http://www.ferg.org/projects/python_gotchas.html
• Unicode / Character Encoding
• GIL (Global Interpreter Lock)
• Garbage Collection
Development environment
• Editors: Vim / Emacs / Ulipad
• Version control: Subversion / Mercurial / Git
• Wiki / issue tracking / code browsing: Trac
• Continuous integration: Bitten
Python Implementations
• CPython http://www.python.org/
• Unladen Swallow http://code.google.com/p/unladen-swallow/
• Stackless Python http://www.stackless.com/
• IronPython http://ironpython.net/
• Jython http://www.jython.org/
• PyPy http://pypy.org/