Meetup announcement: the October meetup will be held on Saturday, October 25 at 2 p.m. at MocaMona / Speaker:

September 25, 2008
» pyinstall is pretty nice

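Most Python users know perfectly well what is good and what is bad about distutils and setuptools.
Still, besides liking things pythonic,
Python programmers generally follow the standard format when publishing Python packages:
they release setuptools eggs and distutils source tarballs.
Not every Python user follows the standard procedure for the same reason,
but by now it seems to be a convention nobody can avoid.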

Recently two new options have appeared alongside distutils and setuptools:
Tarek Ziadé's distribute and Ian Bicking's pyinstall.

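The more complete of the two, and the one that caught my eye, is Ian Bicking's pyinstall.
First, Ian Bicking's tools have always been simple and easy to use.
Second, pyinstall really does solve some tedious problems.
Third, Ian Bicking carries a lot of weight in the community, so pyinstall has a very good chance of becoming the new standard.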

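pyinstall is largely compatible with setuptools' easy_install and adds some new features and fixes.
To me, the most important thing in pyinstall
is that it offers, on unix, a new format capable of replacing the egg format: the bundle.
The bundle format has two properties I consider important:
First, dependencies are included: every package the bundle depends on is placed in the same bundle file.
Second, it is source based.
A bundle is a bit like a whole dependency tree of FreeBSD ports or Gentoo
ebuilds packed together with their distfiles into a single source format.
All the source, along with the build and install rules, lives in one file, and compilation (.pyc and .so) happens at install time.
So unlike binary eggs, which have to be split by version into 2.4 eggs and 2.5 eggs, you download one and the same file and install it.
Of course, whether a source-based distro/package system is good or bad is a matter of opinion, and it has its limits.
But at least the drawback of long compile times should not apply to Python,
because most Python packages depend on little C code, and generating .pyc files is not slow either.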
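
To give a rough feel for the "compile at install time" step, here is a minimal sketch in modern Python that uses the standard py_compile module. It is not how pyinstall itself works; the file name hello.py and the helper install_and_compile are made up for illustration.

```python
import os
import py_compile
import tempfile


def install_and_compile(source_text, target_dir, name="hello.py"):
    """Write one source file into target_dir and byte-compile it there.

    Hypothetical helper for illustration only: a real installer would
    unpack a whole bundle and run each package's own install rules.
    """
    source_path = os.path.join(target_dir, name)
    with open(source_path, "w") as f:
        f.write(source_text)
    # The bytecode is produced by whichever interpreter runs the install,
    # so the same source file can serve several Python versions.
    compiled_path = py_compile.compile(
        source_path, cfile=source_path + "c", doraise=True
    )
    return source_path, compiled_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as site_packages:
        src, pyc = install_and_compile("GREETING = 'hi'\n", site_packages)
        print(os.path.basename(src), os.path.basename(pyc))
```

Only the source travels with the package; the .pyc is a by-product of installing, which is why a source-based format does not need one download per Python version.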

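Besides supporting virtualenv, by the same author,
pyinstall also proposes a solution for the Python environment as a whole: Requirements.
The biggest fear with easy_install-based tools is not knowing whether the dependencies will resolve to the same versions the next time you install.
Installing the latest version is not always what we want, since the latest version may well have changed the feature we rely on; I would rather have some control over versions.
With Requirements you can not only write a requirements file to pin the versions of the whole dependency set,
you can also run pyinstall.py --freeze=require.txt to record the version of every Python package in your development environment,
which makes it easy to move to every machine that needs a fresh install. Combined with bundles, this is very nearly a perfect deployment solution.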
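
The idea behind a frozen requirements file can be sketched in a few lines of modern Python. This is only an illustration under assumptions: it takes one exact "name==version" pin per line, which may not match pyinstall's actual file format, and the helpers parse_pins and diff_pins as well as the package versions are invented for the example.

```python
def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep or not name.strip() or not version.strip():
            raise ValueError("not an exact pin: %r" % raw)
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_pins(wanted, installed):
    """Return {name: (wanted_version, installed_version)} for every mismatch."""
    problems = {}
    for name, version in wanted.items():
        have = installed.get(name)
        if have != version:
            problems[name] = (version, have)
    return problems


if __name__ == "__main__":
    # Hypothetical frozen environment and hypothetical target machine.
    frozen = "# recorded on the dev box\nPaste==1.7.1\nsimplejson==2.0.1\n"
    installed = {"paste": "1.7.1", "simplejson": "1.9.2"}
    print(diff_pins(parse_pins(frozen), installed))
```

Comparing the recorded pins with what a target machine actually has is exactly the check that "install the latest version" cannot give you.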

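pyinstall is only at version 0.1.1 so far, and more improvements seem to be on the way, but I think the convenience and potential of this tool are great, and it deserves a recommendation.

For more details, see: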

pyinstall:
http://www.openplans.org/projects/topp-engineering/blog/2008/09/24/pyinstall-a-new-hope/
http://pypi.python.org/pypi/pyinstall

distribute:
http://tarekziade.wordpress.com/2008/09/24/distribute-a-setuptools-fork/
http://mail.python.org/pipermail/distutils-sig/2008-September/010031.html
http://bazaar.launchpad.net/~tziade/distribute/trunk/files

July 16, 2008
» [tips] A first taste of lift

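lift is a web framework written in Scala. Since I wanted to try Scala, which reportedly even James Gosling plays with, I decided to install lift and see whether I could build a web app on it. Scala is said to run on both .NET and the Java platform; the platform I tried was Sun JDK 1.5.

First install the JDK and maven2, then type in this long command: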


mvn archetype:create -U \
-DarchetypeGroupId=net.liftweb \
-DarchetypeArtifactId=lift-archetype-basic \
-DarchetypeVersion=0.9 \
-DremoteRepositories=http://scala-tools.org/repo-releases \
-DgroupId=mytestorm.group -DartifactId=mytestorm.app


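This automatically creates a web application that can connect to a Derby database, with an ORM skeleton that includes models. Best of all, maven even installs the dependencies you are missing, such as Scala and Jetty.

Next, edit mytestorm.app/src/main/scala/bootstrap/liftweb/Boot.scala and change the db connection string to "jdbc:derby:mytest;create=true". This will later create a Derby db named mytest inside my project directory mytestorm.app. Then run mvn jetty:run to start the web server (running mvn tomcat:run here would install Tomcat for you instead). Because lift has already created the model for you, connecting to port 8080 on the server, or to http://127.0.0.1:8080, now shows a welcome page where you can log in: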





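At this point it already feels a bit like django's admin module: you can register accounts, log in and so on.
In django terms the overall lift architecture is not hard to explain: django's urlconf and settings live in bootstrap/Boot.scala,
models.py goes into the scala/your-proj's-group/model directory, templates are in webapp, template tags are in scala/your-proj's-group/snippet, and views are in scala/your-proj's-group/view/.
In the end it is the same medicine in a different bottle; by and large this is how web development works today.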

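If you are still interested, download the lift 0.9 release tarball, unpack it, go into lift-0.9/sites/example and run mvn jetty:run to bring up all the examples and play with them. A few of the samples are quite fun, including a comet chat room implementation.

After playing with it a little I came away without much of an impression. Mostly I found maven, the tool lift relies on, too complicated: it makes development feel like a magic show, and I kept having to hunt for where things had been installed. I hardly got to play with Scala itself, since most of the time went into configuration. Also, when maven installs dependencies, every download goes to the main site overseas, and there are quite a few packages to install, so installation is rather slow. A Taiwan mirror would probably help.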

July 14, 2008
» [tips] rewrite debian/ubuntu's lighttpd conf scripts from perl to python

Today I wanted to port lighttpd to another platform, which is basically a debian sarge system but without perl or the dpkg package system. Since it is a debian-based platform, I started from debian's binary lighttpd package; however, I found some perl scripts in /usr/share/lighttpd that are run when lighttpd starts up.

While I could easily dump the output of the perl scripts into a text file
and start lighttpd correctly that way, I thought "maybe porting them to python is not a bad idea" (since my target platform has python!), so here is the result:
create-mime.assign.py

#!/usr/bin/python
#
# This script is a direct translation of debian's lighttpd perl script:
# create-mime.assign.pl
#
# Author: timchen119.at.nospam.gmail.com
# License: Public Domain
#
import sys

try:
    f = open("/etc/mime.types", 'r')
    extensions = {}
    print "mimetype.assign = ("
    for line in f:
        line = line.strip()
        # skip comment lines and lines without any extension
        if line.startswith('#'): continue
        if line != "":
            splitlist = line.split()
            if len(splitlist) < 2: continue
            mime = splitlist[0]
            for ext in splitlist[1:]:
                # mime.types can list the same extension under several
                # mime types; only the first one is emitted
                if ext in extensions: continue
                extensions[ext] = 1
                print '".%s" => "%s",' % (ext, mime)
    f.close()
    print ")"
except Exception, e:
    print e
    sys.exit(1)


include-conf-enabled.py
#!/usr/bin/python
#
# This script is a direct translation of debian's lighttpd perl script:
# include-conf-enabled.pl
#
# Author: timchen119.at.nospam.gmail.com
# License: Public Domain
#

import os, glob

confdir = "/etc/lighttpd/"
enabled = "conf-enabled/*.conf"

os.chdir(confdir)

# emit one include line per enabled config file, in sorted order
for file in sorted(glob.glob(enabled)):
    print 'include "%s"' % file

use-ipv6.py
#!/usr/bin/python
#
# This script is a direct translation of ubuntu's lighttpd perl script:
# use-ipv6.pl
#
# Author: timchen119.at.nospam.gmail.com
# License: Public Domain
#

import socket

## this is sometimes not accurate (e.g. in vserver mode)
#if socket.has_ipv6:
#

# try to actually create an IPv6 socket instead of trusting has_ipv6
try:
    if socket.socket(socket.AF_INET6, socket.SOCK_STREAM, 0):
        print 'server.use-ipv6 = "enable"'
except:
    pass

All of these files can be found at http://kalug.linux.org.tw/~tim/lighttpd-debian-python-script/
Something quite interesting happened when I ported debian's create-mime.assign.pl to python: my python script's output is not equivalent to the perl one's and contains more mime types:
--- perlmime.txt    2008-07-14 15:29:23.000000000 +0800
+++ pymime.txt 2008-07-14 15:29:33.000000000 +0800
@@ -114,6 +114,11 @@
".dvi" => "application/x-dvi",
".rhtml" => "application/x-httpd-eruby",
".flac" => "application/x-flac",
+".pfa" => "application/x-font",
+".pfb" => "application/x-font",
+".gsf" => "application/x-font",
+".pcf" => "application/x-font",
+".pcf.Z" => "application/x-font",
".mm" => "application/x-freemind",
".gnumeric" => "application/x-gnumeric",
".sgf" => "application/x-go-sgf",
@@ -193,6 +198,11 @@
".pk" => "application/x-tex-pk",
".texinfo" => "application/x-texinfo",
".texi" => "application/x-texinfo",
+".~" => "application/x-trash",
+".%" => "application/x-trash",
+".bak" => "application/x-trash",
+".old" => "application/x-trash",
+".sik" => "application/x-trash",
".t" => "application/x-troff",
".tr" => "application/x-troff",
".roff" => "application/x-troff",
@@ -282,6 +292,7 @@
".tgf" => "chemical/x-mdl-tgf",
".mcif" => "chemical/x-mmcif",
".mol2" => "chemical/x-mol2",
+".b" => "chemical/x-molconn-Z",
".gpt" => "chemical/x-mopac-graph",
".mop" => "chemical/x-mopac-input",
".mopcrt" => "chemical/x-mopac-input",

So I started digging into why this happened, and found that a strange perl regex filters all of these mime types out. I believe it is a minor bug in the original perl program (or is it implicitly doing something meaningful? I couldn't figure it out).
--- create-mime.assign.pl    2008-07-14 15:35:58.000000000 +0800
+++ create-mime.assign.pl.new 2008-07-14 15:36:07.000000000 +0800
@@ -7,7 +7,7 @@
chomp;
s/\#.*//;
next if /^\w*$/;
- if(/^([a-z0-9\/+-.]+)\s+((?:[a-z0-9.+-]+[ ]?)+)$/) {
+ if(/^([A-Za-z0-9\/+-.~%]+)\s+((?:[A-Za-z0-9.+-~%]+[ ]?)+)$/) {
foreach(split / /, $2) {
# mime.types can have same extension for different
# mime types

Replace this line and the perl script will produce the same results as mine.
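
To see which lines the regex rejects, the two patterns from the patch can be tried in Python's re module, which happens to accept the same syntax for these expressions. This is only a sketch: the sample lines are modelled on the entries in the output diff rather than copied from a real /etc/mime.types, and accepted_by is a helper made up for the demonstration.

```python
import re

# Both patterns are copied from the perl patch.
ORIGINAL = re.compile(r'^([a-z0-9\/+-.]+)\s+((?:[a-z0-9.+-]+[ ]?)+)$')
PATCHED = re.compile(r'^([A-Za-z0-9\/+-.~%]+)\s+((?:[A-Za-z0-9.+-~%]+[ ]?)+)$')

# Hypothetical sample lines in mime.types layout.
SAMPLES = [
    "text/html\t\t\thtml htm shtml",
    "application/x-font\t\tpfa pfb gsf pcf pcf.Z",
    "application/x-trash\t\t~ % bak old sik",
    "chemical/x-molconn-Z\t\tb",
]


def accepted_by(pattern, lines):
    """Return the mime types of the lines that the pattern matches."""
    return [m.group(1) for m in map(pattern.match, lines) if m]


if __name__ == "__main__":
    print("original:", accepted_by(ORIGINAL, SAMPLES))
    print("patched: ", accepted_by(PATCHED, SAMPLES))
```

The original pattern only allows lowercase letters, digits and a few punctuation characters, so a single uppercase letter ("pcf.Z", "x-molconn-Z") or a "~" or "%" extension makes the whole line fail, and every extension on that line is dropped.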

Usage:
just copy these py scripts to /usr/share/lighttpd
and change these lines of the lighttpd configuration if you're using a debian-based system
#### external configuration files
## mimetype mapping
#include_shell "/usr/share/lighttpd/create-mime.assign.pl"
include_shell "/usr/share/lighttpd/create-mime.assign.py"

## load enabled configuration files,
## read /etc/lighttpd/conf-available/README first
#include_shell "/usr/share/lighttpd/include-conf-enabled.pl"
include_shell "/usr/share/lighttpd/include-conf-enabled.py"

June 10, 2008
» [tips] evaluate python dictionaries from file safely.

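Programmers sometimes have odd fits of fastidiousness,
and this config-reading module is a product of exactly that.
Honestly, python ships with parsers for csv, ini, xml and the like,
and 3rd-party parsers are everywhere. In some situations there is nothing wrong with execfile, exec or eval either, and if the config file can end in .py you can simply import it. On top of that, python 2.6 is about to support modifying the AST directly... so I really see no need to insist on using python's own parser to read a config file. That said, if all you want is to safely pull a dictionary out of a file, this small module is not a bad way to do it.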

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
    safe_dict
    ~~~
    The `safe_dict` module helps you read a dictionary from a file using python syntax.

    The keys and values in the dictionary are strings only.

    File `dict.file` (the file we read the dict from) should contain only an anonymous dictionary.

    Supports Python 2.5+ only.

    reference:
    http://docs.python.org/dev/library/_ast
    http://dev.pocoo.org/hg/sandbox/file/08541da989dd/ast/ast.py
    http://pyside.blogspot.com/2008/03/ast-compilation-from-python.html
    ~~~
    :Author: http://timchen119.blogspot.com
    :license: Python License
"""

from __future__ import with_statement
import _ast
#need python 2.5+

def safe_eval_literal(node_or_string):
    """
    Safely evaluate a literal.
    """
    _safe_names = {'None': None, 'True': True, 'False': False}
    if isinstance(node_or_string, basestring):
        node_or_string = compile(node_or_string, "<unknown>", "eval", _ast.PyCF_ONLY_AST)
    if isinstance(node_or_string, _ast.Expression):
        node_or_string = node_or_string.body
    def _convert(node):
        if isinstance(node, _ast.Str):
            return node.s
        elif isinstance(node, _ast.Dict):
            return dict((_convert(k), _convert(v)) for k, v
                        in zip(node.keys, node.values))
        elif isinstance(node, _ast.Name):
            if node.id in _safe_names:
                return _safe_names[node.id]
        # anything else (calls, attribute access, numbers, ...) is rejected
        raise ValueError('malformed string')
    return _convert(node_or_string)


def safe_read_dict_from(file):
    """
    Safely evaluate a dictionary from a file.
    """
    with open(file, 'r') as f:
        source = f.read()
    node = compile(source, "<unknown>", "eval", _ast.PyCF_ONLY_AST)

    if isinstance(node.body, _ast.Dict):
        return safe_eval_literal(node.body)
    # a bare `raise` has no active exception to re-raise here,
    # so raise an explicit error instead
    raise ValueError('file does not contain a dictionary literal')

if __name__ == '__main__':
    try:
        dict_we_want = safe_read_dict_from('dict.file')
    except Exception, e:
        print e



Usage: put a single python dictionary in your dict.file and you can read it in with this module. For safety, only strings are read in.
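
On Python 2.6 and later the standard library offers ast.literal_eval, which covers the same ground. The following is a minimal sketch in modern Python, not a drop-in replacement for the module above: read_str_dict is a name made up here, and the string-only check is added by hand because literal_eval on its own also accepts numbers, lists and other literals.

```python
import ast
import os
import tempfile


def read_str_dict(path):
    """Read a dict literal from `path`, accepting only str keys and values."""
    with open(path, "r") as f:
        # literal_eval only builds literals; it never calls functions.
        value = ast.literal_eval(f.read().strip())
    if not isinstance(value, dict):
        raise ValueError("file does not contain a dictionary literal")
    for key, val in value.items():
        if not isinstance(key, str) or not isinstance(val, str):
            raise ValueError("only string keys and values are allowed")
    return value


if __name__ == "__main__":
    # Write a throwaway config file, then read it back.
    with tempfile.NamedTemporaryFile("w", suffix=".file", delete=False) as tmp:
        tmp.write("{'host': 'localhost', 'port': '8080'}\n")
    try:
        print(read_str_dict(tmp.name))
    finally:
        os.remove(tmp.name)
```

A file containing something like a function call is rejected with ValueError instead of being executed, which is the whole point of avoiding eval here.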

June 3, 2008
» [tips] add bzr sftp support when you have no compiler on target platform (pure python)

bzr depends on paramiko to provide sftp support. While paramiko itself is pure python, its dependency pycrypto is not. PyCrypto has lots of C extensions and you need a compiler to install it. However, since we only use part of pycrypto (to get sftp support for bzr), we can just add some stub files to avoid the deployment problem.

I have made a modified pure-python version of pycrypto and packaged it with paramiko 1.7.3, so after you have installed bzr (with the standard python setup.py or easy_install), just extract paramiko-1.7.3-bzr-sftp-purepy.tgz
into your python site-packages directory (make sure paramiko and pycrypto are not already there; if they are, you don't need this package anyway) and happy bzr...!

This also makes bzr depend only on python, so you can easily deploy it on a machine that has no C compiler and still have sftp support.

Warning: this package only adds bzr sftp support and provides nothing else, and it COULD break other python packages that also use paramiko and pycrypto, so don't use it if you don't really need it. The only tests I've done were on my own (embedded linux) machine. Basically it's just for my own use; you have been warned.

May 14, 2008
» [tips] How to keep your ext2/ext3 alive on a battlefield of elusive landmines.

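People who like free software have probably stepped on plenty of landmines,
and have usually trained themselves into real-time human minesweepers.
But sometimes the mine is just too small (yet it blows up hard),
and it sits where you would never think to look, so it is hard not to cry out in despair.
Just as on the road others will overtake even if you don't,
others will run the new version even if you don't want to;
software compatibility problems tend to show up uninvited.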

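Today lloyd told me about a landmine he stepped on recently.
He took a 400 GB hard disk formatted as ext2
and used it on his brother's Windows machine, which has an ext2 driver installed.
For quite a while reading and writing both worked without problems,
but lately no amount of reformatting would make a disk usable
(Windows would ask whether you want to format it again).
Switching to a smaller disk did not help either. Only after digging deeper did he find that a recent update of the e2fsprogs package had changed the mkfs.ext2 program: the default inode size is now 256 bytes. So you have to use
mkfs.ext2 -I 128 to set the inode size back to the original 128 bytes.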

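OK, problem solved. It sounds like just a compatibility problem between the Windows ext2 driver and e2fsprogs, right?
But think about it and the problem may be much bigger: say you format an ext2 disk on debian lenny and put it into a stable, important server (which happens to run debian sarge), and it cannot be read.
If you had not "happened" to read this passage,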


E2fsprogs 1.40.5 (January 27, 2008)

Fix a potential overflow big in e2image if the device name is too long.

Mke2fs will now create new filesystems with 256 byte inodes and the ext_attr feature flag by default.
This allows for much better future compatibity with ext4 and speeds up extended attributes even on ext3 filesystems.

and kept it in mind, you could very well have been blown up.
(Though in fact you may get blown up even if you did read it...)

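Also, many people use ext2 for /boot rather than formatting it as xfs, reiser3 or the like (some don't even use ext3, for the sake of stability), so they can hardly avoid this mine either.
Here "happens" to be a painful real-world example. (GRUB vs. the Inodes: Who Needs a Bootable System, Anyway? ) Oh, it just won't boot, that's all... orz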

Notes:
e2fsprogs version:
Gentoo-stable: 1.40.8
lenny (next debian stable): 1.40.8
etch (debian stable): 1.39+1.40

Important commands:

mkfs.ext2 -I 128 /dev/???
mkfs.ext3 -I 128 /dev/???

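If you still need compatibility with older systems, from now on don't forget to add -I 128 when running mkfs.ext3, otherwise... enjoy the explosion... XD

Thanks to lloyd for the correction: debian etch (kernel 2.6.18) should still be able to read the 256-byte inode format; sarge, with its 2.4 kernel, probably cannot. (According to the mkfs.ext2 man page, a 2.4 kernel will not be able to mount it.)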
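
To find out which inode size an existing filesystem or disk image uses, the value can be read straight from the superblock. The sketch below is written in modern Python and rests on assumptions: the field offsets (magic at byte 56, revision at byte 76, inode size at byte 88, superblock starting at byte 1024) are my reading of the ext2 on-disk layout, and inode_size is a helper made up here. When in doubt, trust the "Inode size" line printed by tune2fs -l instead.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # assumed: superblock starts 1024 bytes in
EXT2_MAGIC = 0xEF53        # assumed: s_magic, little-endian u16 at offset 56
GOOD_OLD_INODE_SIZE = 128  # revision 0 filesystems always use 128 bytes


def inode_size(path):
    """Return the inode size in bytes of the ext2/ext3 filesystem at `path`."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        sb = dev.read(1024)
    if len(sb) < 90:
        raise ValueError("too short to hold an ext2 superblock")
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("no ext2/ext3 superblock found")
    (rev_level,) = struct.unpack_from("<I", sb, 76)
    if rev_level == 0:
        # the inode size field is only meaningful from revision 1 on
        return GOOD_OLD_INODE_SIZE
    (size,) = struct.unpack_from("<H", sb, 88)
    return size
```

Calling inode_size on a partition device usually needs root; running it on an image file works as a normal user. A result of 256 means an old 2.4 kernel or an old driver may refuse the filesystem, while 128 is the size that -I 128 produces.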

update: fix link.

April 22, 2008