Differences in how the maketrans and translate functions are used in Python 2.X and 3.X

2020-05-09 18:50:25
OfStack

The maketrans and translate functions are commonly used to map the characters of a string to other characters. This article demonstrates their basic usage and the differences in how they are used under different Python versions. Here, 2.X refers to version 2.6 and above, and 3.X refers to version 3.1 and above.
2.X basically divides strings into two types: the unicode string and the 8-bit str string. 3.X re-divides strings into the byte string bytes and the text string str. Both are immutable, so a mutable byte string type, bytearray, is also provided.
In 2.X, many functions of the string module duplicate methods of the str and unicode types, so in 3.X the string module functions that duplicate str methods are no longer recommended. The string module still contains many useful constants, such as string.digits, which are convenient when working with strings.
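As a small illustration of the points above under 3.X, the following sketch shows that bytearray can be modified in place and that a constant such as string.digits can be passed straight to str.maketrans. The variable names data and strip_digits are made up for this example.

```python
import string

# bytes and str are immutable; bytearray is the mutable byte string type.
data = bytearray(b"abc")
data[0] = ord("x")                 # in-place modification works on bytearray
print(data)                        # bytearray(b'xbc')

# string.digits is the constant '0123456789'.
# Here it is used as the set of characters to delete.
strip_digits = str.maketrans("", "", string.digits)
print("a1b22c333".translate(strip_digits))   # abc
```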

Signatures of the maketrans and translate functions in 2.X:


  string.maketrans(from, to)

  string.translate(s, table[, deletechars])
  str.translate(table[, deletechars])
  unicode.translate(table)

Signatures of the maketrans and translate functions in 3.X:


  static str.maketrans(x[, y[, z]])
  static bytes.maketrans(from, to)
  static bytearray.maketrans(from, to)

  str.translate(map)
  bytes.translate(table[, delete])
  bytearray.translate(table[, delete])
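To make the 3.X signatures more concrete, here is a short sketch of the one-, two- and three-argument forms of str.maketrans, plus bytes.maketrans. The names t1, t2, t3 and bt are only illustrative.

```python
# One argument: a dict whose keys are single characters or ordinals.
# A value of None deletes the character.
t1 = str.maketrans({"1": "a", "2": "b", "3": None})

# Two arguments: equal-length strings, mapped position by position.
t2 = str.maketrans("123", "abc")

# Three arguments: the third string lists the characters to delete.
t3 = str.maketrans("123", "abc", "78")

print("123".translate(t1))      # ab
print("12378".translate(t2))    # abc78
print("12378".translate(t3))    # abc

# bytes.maketrans takes two equal-length bytes objects
# and returns a 256-byte translation table.
bt = bytes.maketrans(b"123", b"abc")
print(type(bt), len(bt))        # <class 'bytes'> 256
```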

As the signatures show, where 2.X has a single maketrans function in the string module, 3.X provides three static methods for creating the mapping table.
Let's look at a simple example to illustrate the string conversion process.
Demonstration under 2.X:


  >>> import string                        # import the string module
  >>> map = string.maketrans('123', 'abc') # create a mapping table that replaces '1', '2', '3' with 'a', 'b', 'c'
  >>> s = '54321123789'                    # the string before conversion
  >>> s.translate(map)                     # convert the string using the mapping table map
  '54cbaabc789'                            # the converted string

Demonstration under 3.X:


  >>> map = str.maketrans('123','abc')
  >>> s = '54321123789'
  >>> s.translate(map)
  '54cbaabc789'

2.X uses the maketrans function of the string module, while 3.X uses the maketrans method of str. Apart from this, the usage is basically the same. When you also specify characters to delete from the string, the usage differs slightly, as follows.
Demonstration under 2.X:


  >>> import string
  >>> map = string.maketrans('123', 'abc')
  >>> s = '54321123789'
  >>> s.translate(map, '78')        # besides the conversion, also delete the characters '7' and '8'
  '54cbaabc9'               # the converted string no longer contains '7' or '8'

Demonstration under 3.X:


  >>> map = str.maketrans('123','abc', '78') # the characters to delete must be specified here
  >>> s = '54321123789'
  >>> s.translate(map)
  '54cbaabc9'
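For byte strings in 3.X, bytes.translate still accepts the bytes to delete as a second argument, much like str.translate in 2.X. A minimal sketch, with illustrative variable names:

```python
table = bytes.maketrans(b"123", b"abc")
s = b"54321123789"

# Translate and delete in one call, as in the 2.X str.translate usage.
converted = s.translate(table, b"78")
print(converted)                 # b'54cbaabc9'

# Passing None as the table deletes bytes without translating anything.
deleted_only = s.translate(None, b"78")
print(deleted_only)              # b'543211239'
```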

While reading the Python Cookbook, I came across an example based on the 2.X version, as follows:


  import string
  def translator(frm='', to='', delete='', keep=None):
    if len(to) == 1:
      to = to * len(frm)
    trans = string.maketrans(frm, to)
    if keep is not None:
      allchars = string.maketrans('', '')
      delete = allchars.translate(allchars, keep.translate(allchars,delete))
    def translate(s):
      return s.translate(trans, delete)
    return translate
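The function above relies on string.maketrans and on the two-argument str.translate, so it only runs under 2.X. One possible port to 3.X str objects is sketched below. It is not from the Cookbook: the handling of keep through a set and the sample string are assumptions made for this illustration.

```python
import string

def translator(frm="", to="", delete="", keep=None):
    """Return a function that translates str objects (a 3.X sketch)."""
    if len(to) == 1:
        to = to * len(frm)
    table = str.maketrans(frm, to)
    if keep is None:
        # Deletion takes priority over translation, as in the 2.X version.
        for ch in delete:
            table[ord(ch)] = None
        return lambda s: s.translate(table)
    # With keep, only characters in keep (minus delete) survive,
    # and the survivors are then translated.
    keep_set = set(keep) - set(delete)
    def translate(s):
        return "".join(ch for ch in s if ch in keep_set).translate(table)
    return translate

digits_only = translator(keep=string.digits)
no_digits = translator(delete=string.digits)
digits_to_hash = translator(frm=string.digits, to="#")

print(digits_only("tel: 555-0100"))      # 5550100
print(no_digits("tel: 555-0100"))        # tel: -
print(digits_to_hash("tel: 555-0100"))   # tel: ###-####
```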

allchars is the mapping table returned by maketrans, so why can the translate method still be called on it? It must therefore be of type str. A test confirms this:


  >>> import string
  >>> map = string.maketrans('123', 'abc')
  >>> type(map)
  <type 'str'>

Under 3.X, the same test gives a different type:


  >>> map = str.maketrans('123','abc')
  >>> type(map)
  <class 'dict'>

Now that we know the type of the mapping table, we can "post-process" it, as the Python Cookbook example above does, to meet our coding requirements.
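Since the 3.X mapping table is an ordinary dict keyed by ordinals, post-processing can be as simple as adding or changing entries. A minimal sketch, where the extra entries for '7' and '9' are invented for illustration:

```python
table = str.maketrans("123", "abc")
print(table)                      # {49: 97, 50: 98, 51: 99}

table[ord("7")] = None            # None deletes the character '7'
table[ord("9")] = "<nine>"        # a value may also be a longer string

result = "54321123789".translate(table)
print(result)                     # 54cbaabc8<nine>
```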

The strings used in the examples above consist of ASCII characters. For a bytes object, the operation in 3.X is the same as that in 2.X. For a unicode string, 2.X requires the translate method of unicode. Note the following code:


  >>> print u"hallo".translate({97:u'e'})
  hello
  >>> print u"hallo".translate({'a':u'e'})
  hallo
  >>> print u"hallo".translate({u'a':u'e'})
  hallo

The results differ because, according to the manual, the keys of the mapping table passed to the unicode translate method must be Unicode ordinals, while the values may be Unicode ordinals, unicode strings, or None. A key such as 'a' or u'a' is not an ordinal, so it never matches and the string is left unchanged.
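The same rule applies to str.translate in 3.X, where every str is a Unicode string. The sketch below repeats the experiment and shows that str.maketrans, given a dict, converts single-character keys to ordinals. The variable names are illustrative.

```python
by_ordinal = "hallo".translate({97: "e"})
print(by_ordinal)        # hello

# The key 'a' is not an ordinal, so nothing matches.
by_char = "hallo".translate({"a": "e"})
print(by_char)           # hallo

# str.maketrans turns the key 'a' into the ordinal 97.
via_maketrans = "hallo".translate(str.maketrans({"a": "e"}))
print(via_maketrans)     # hello

# ord() gives the same result by hand.
via_ord = "hallo".translate({ord("a"): ord("e")})
print(via_ord)           # hello
```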