Saturday, February 22, 2014

UnicodeEncodeError exception in Python

Error message:
UnicodeEncodeError: codec can't encode character  in position : ordinal not in range(128)

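In Python 2 this error is raised when a unicode string containing non-ASCII characters is encoded with the default 'ascii' codec. That often happens implicitly, when byte strings (str) and unicode values are mixed, because ASCII only covers ordinals 0 to 127.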
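
A minimal sketch of how the error arises, assuming a text value with a single non-ASCII character. The variable names are illustrative and not from the original post; the snippet should behave the same way on Python 2 and Python 3.

```python
# Illustrative value: a text (unicode) string containing U+00E9.
text = u'caf\xe9'

try:
    # ASCII only covers code points 0-127, so U+00E9 cannot be encoded.
    text.encode('ascii')
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

print(message)
```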

Scenario: a string holds non-ASCII characters as a byte string (str, UTF-8 encoded) but has to be a unicode object.

  • Check the type to see whether the value is a byte string (str) or a unicode object; the type itself does not tell you which encoding the bytes use:
           type(str_var)

  • If the type is 'str', decode it from UTF-8 to unicode:
           str_var.decode('utf-8')

  • If the type is 'unicode', encode it to UTF-8:
           univar.encode('utf-8')
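
The steps above can be sketched as two small helpers. The names to_unicode and to_utf8 are hypothetical, not part of any library, and the sketch assumes the byte strings really are UTF-8. It checks against bytes, which is an alias of str on Python 2, so it should run on both Python 2 and Python 3.

```python
def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):  # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value


def to_utf8(value):
    """Return value as UTF-8 bytes, encoding it first if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode('utf-8')


raw = b'caf\xc3\xa9'      # UTF-8 bytes: 5 bytes for 4 characters
text = to_unicode(raw)    # text of length 4
round_trip = to_utf8(text)
print(round_trip == raw)
```

Decoding once at the input boundary and encoding once at the output boundary keeps everything in between as unicode, which avoids the implicit ASCII conversion that triggers the exception.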

Note: you can avoid the exception by passing 'ignore', but every character that ASCII cannot represent is then silently dropped, so if the encoding is wrong you might end up with an empty value:
          str_val.encode('ascii', 'ignore')
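
A short sketch of that data loss, using illustrative values that are not from the original post:

```python
# One non-ASCII character: only that character is lost.
mostly_ascii = u'caf\xe9'
stripped = mostly_ascii.encode('ascii', 'ignore')

# Only non-ASCII characters: nothing survives the encode.
no_ascii = u'\u65e5\u672c\u8a9e'
emptied = no_ascii.encode('ascii', 'ignore')

print(len(stripped), len(emptied))
```

No exception is raised in either case, so the loss is easy to miss unless the result is checked.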

Refer to the unicode link to understand more about Unicode.

