Wednesday, May 7, 2008

Converting other encodings into Unicode with Python

Recently I needed to convert some files encoded in GB2312 to UTF-8. The way to do this in Python is to decode the bytes into a Unicode string and then encode that string as UTF-8:
# Suppose gb2312_line is a byte string encoded in GB2312; utf8_line is the converted result.
utf8_line = gb2312_line.decode('gb2312').encode('utf8')
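Since the goal was to convert whole files, here is a minimal sketch of the same decode-then-encode idea applied to a file, assuming Python 3. The names convert_bytes and convert_file, and the paths passed to them, are hypothetical and only for illustration. In Python 3 the one-liner above still works as long as gb2312_line is a bytes object, because a Python 3 str has no decode method.

```python
import io


def convert_bytes(data, src_encoding="gb2312", dst_encoding="utf-8"):
    """Decode bytes from src_encoding and re-encode them as dst_encoding."""
    return data.decode(src_encoding).encode(dst_encoding)


def convert_file(src_path, dst_path, src_encoding="gb2312", dst_encoding="utf-8"):
    """Copy src_path to dst_path, re-encoding the text line by line.

    newline="" is passed on both sides so that line endings are written
    back exactly as they were read.
    """
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src, \
            io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)
```

A call such as convert_file("input_gb2312.txt", "output_utf8.txt") would then write a UTF-8 copy of the input. If the input contains bytes that are not valid GB2312, the decode step raises UnicodeDecodeError, which is usually preferable to writing corrupted text silently.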
