delavnog - 2 months ago
Python Question

Automatically shorten long strings when dumping with pretty print

I have the following test program:

from random import choice

d = {}

def data(length):
    # Build a random lowercase string of the given length
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    res = ''
    for _ in xrange(length):
        res += choice(alphabet)
    return res

# Create the test data
for cnt in xrange(10):
    key = 'key-%d' % (cnt)
    d[key] = data(30)

def pprint_shorted(d, max_length):
    import pprint
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(d)  # max_length is not used yet

pprint_shorted(d, 10)

Currently the output is something like:

{   'key-0': 'brnneqgetvanmggyayppxevwcnxvue',
    'key-1': 'qjzrklrdkykililenwcyhaexuylgub',
    'key-2': 'ayddiaxhvgxpszutnjdwlgojqaluhr',
    'key-3': 'rmjpzxrmbogezorigkycqhpsctinzq',
    'key-4': 'botfczymszkzwuiecyarknnrvwavnr',
    'key-5': 'norifblhtvfnwblcyeipjmteznylfy',
    'key-6': 'tiiubgdwxnogdmbafvnujbwpfdopjl',
    'key-7': 'badgwbrrqunivylutbxqkaeuctrykt',
    'key-8': 'wulrfkqfqqecxmscayzdbatyispwtu',
    'key-9': 'gzlwfvjrevlyvbmrvuisnyhhbbwtdd'}

In my production data, sometimes the strings are really long (several thousand chars, coming from
encoded attachments for example), and I do not want that filling up my logs. I would like something like:

{   'key-0': 'brnneqgetv...',
    'key-1': 'qjzrklrdky...',
    'key-2': 'ayddiaxhvg...',
    'key-3': 'rmjpzxrmbo...',
    'key-4': 'botfczymsz...',
    'key-5': 'norifblhtv...',
    'key-6': 'tiiubgdwxn...',
    'key-7': 'badgwbrrqu...',
    'key-8': 'wulrfkqfqq...',
    'key-9': 'gzlwfvjrev...'}

That is, string values in the dict that are longer than max_length should be truncated and followed by an ellipsis. Is there any built-in support in pprint for this, or must I create a copy of the dict by manually walking it and shortening the strings myself?


You can subclass PrettyPrinter and override its _format method (a private method, so this depends on pprint's internals):

import pprint

class P(pprint.PrettyPrinter):
  def _format(self, object, *args, **kwargs):
    if isinstance(object, basestring):
      if len(object) > 20:
        object = object[:20] + '...'
    return pprint.PrettyPrinter._format(self, object, *args, **kwargs)

P().pprint('x' * 1000)

This prints:

'xxxxxxxxxxxxxxxxxxxx...'
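
If you would rather not depend on a private method, the other option mentioned in the question, walking the data and printing a shortened copy, can be sketched roughly as below. This is only a minimal sketch, assuming Python 3 (where str takes the place of basestring). The helper name shorten is hypothetical and not part of pprint, and it only descends into dicts, lists and tuples; dict keys are left as they are.

```python
import pprint


def shorten(obj, max_length):
    """Return a copy of obj with every str longer than max_length cut short.

    Hypothetical helper (not part of pprint). Only dict values, lists and
    tuples are walked; anything else is returned unchanged.
    """
    if isinstance(obj, str):
        if len(obj) > max_length:
            return obj[:max_length] + '...'
        return obj
    if isinstance(obj, dict):
        # Keys are kept as-is; only the values are shortened.
        return {key: shorten(value, max_length) for key, value in obj.items()}
    if isinstance(obj, list):
        return [shorten(item, max_length) for item in obj]
    if isinstance(obj, tuple):
        return tuple(shorten(item, max_length) for item in obj)
    return obj


def pprint_shorted(d, max_length):
    # Pretty-print a shortened copy; the original data is not modified.
    pprint.PrettyPrinter(indent=4).pprint(shorten(d, max_length))


# Made-up sample data, only for illustration.
sample = {'key-0': 'brnneqgetvanmggyayppxevwcnxvue', 'nested': ['x' * 1000, 42]}
pprint_shorted(sample, 10)
```

The trade-off is an extra copy of the structure on every call, but long strings are shortened no matter how pprint decides to lay out the output, and the data you log from stays untouched.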