Ensure UTF8 encoding
Avoid future problems with database fields encoded as latin1 or other non-UTF-8 encodings. This fixes a livestatus error (shinken-monitoring/mod-livestatus#33) raised when building a response that contains fields imported by this module and encoded as latin1.
dgilm committed Sep 25, 2014
1 parent 64217c0 commit 50bafd6
Showing 1 changed file with 15 additions and 1 deletion.
16 changes: 15 additions & 1 deletion module/module.py
@@ -30,6 +30,8 @@
 except ImportError:
     MySQLdb = None

+import chardet
+
 from shinken.basemodule import BaseModule
 from shinken.log import logger

@@ -96,6 +98,17 @@ def init(self):
             raise
         logger.info("[MySQLImport]: Connection opened")

+    def ensure_encoding(self, value):
+        try:
+            value.decode('utf-8')
+        except UnicodeDecodeError, e:
+            encoding = chardet.detect(value)['encoding'] or 'latin1'
+            logger.warning("[MySQLImport]: Error decoding string " + \
+                           "(value='%s', encoding='%s')" % (value, encoding) + \
+                           ". Failing back to utf-8")
+            value = value.decode(encoding).encode('utf-8')
+        return value


gst Jan 23, 2015

In the normal case this function returns a unicode string, but if the first decode happens to fail it returns a byte string!

We should either always return a unicode string or always return a byte string.
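
A minimal sketch of a helper with a single return type, written in Python 3 rather than the Python 2 of the commit. `ensure_unicode` and its `fallback` parameter are hypothetical names, not part of the module, and the `chardet`-based detection from the diff is replaced here by a fixed latin-1 fallback to keep the sketch dependency-free. Note that in the committed code the result of `value.decode('utf-8')` is discarded, so when the input is valid UTF-8 the function returns the byte string it was given.

```python
def ensure_unicode(value, fallback="latin-1"):
    """Return `value` as text (str), never as bytes.

    Hypothetical helper: tries UTF-8 first, then a fixed fallback encoding.
    """
    if isinstance(value, str):
        return value  # already text, nothing to decode
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every possible byte to a code point,
        # so decoding with the default fallback cannot fail.
        return value.decode(fallback)


utf8_result = ensure_unicode(b"caf\xc3\xa9")   # valid UTF-8 bytes
latin1_result = ensure_unicode(b"caf\xe9")     # latin-1 bytes, invalid as UTF-8
assert utf8_result == latin1_result == "caf\u00e9"
assert isinstance(utf8_result, str) and isinstance(latin1_result, str)
```

Both branches return text, so callers never have to check which path was taken. A detector such as `chardet` could supply the fallback encoding instead of the fixed default, at the cost of a less predictable result on short inputs.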


     # Main function that is called in the CONFIGURATION phase
     def get_objects(self):
         if not hasattr(self, 'conn'):
@@ -126,7 +139,8 @@ def get_objects(self):
                 h = {}
                 for column in row:
                     if row[column]:
-                        h[column] = str(row[column])
+                        value = str(row[column])
+                        h[column] = self.ensure_encoding(value)
                 r[k].append(h)

         cursor.close()
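
The per-row loop in the last hunk can be sketched as a standalone Python 3 function. This is an illustration under assumptions, not the module's code: `normalize_row` and `to_text` are hypothetical names, rows are assumed to be dictionaries keyed by column name (as the `row[column]` lookups suggest), and a fixed latin-1 fallback stands in for the `chardet` detection used in the commit.

```python
def to_text(value):
    """Return `value` as str; bytes are decoded as UTF-8, then latin-1."""
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            return value.decode("latin-1")  # cannot fail: every byte is valid
    return str(value)


def normalize_row(row):
    """Copy a row, dropping falsy values and converting the rest to text."""
    normalized = {}
    for column, value in row.items():
        if value:  # like the loop in the diff, this also skips 0 and ""
            normalized[column] = to_text(value)
    return normalized


# Example row with made-up column values.
row = {
    "host_name": b"srv-caf\xe9",   # latin-1 bytes
    "address": "10.0.0.1",
    "alias": None,                 # dropped by the truthiness check
    "max_check_attempts": 3,       # converted to "3"
}
clean = normalize_row(row)
assert clean["host_name"] == "srv-caf\u00e9"
assert "alias" not in clean
```

The truthiness check mirrors `if row[column]:` in the diff, which means a column holding `0` or an empty string is omitted from the result rather than imported.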