Dynamically adding attributes and generating objects in Python classes

  • 2020-05-10 18:25:47
  • OfStack

This article covers the following topics one by one:

  1. Main function of the program

  2. Implementation process

  3. Class definition

  4. Updating each object dynamically with a generator and returning the object

  5. Using strip to remove unnecessary characters

  6. Using re.match to match strings

  7. Using time.strptime to extract strings and convert them into time objects

  8. Complete code

The main function of the program

Suppose you have a table-like document that stores user information: the first line lists the attributes, separated by commas (,), and from the second line onward each row holds the attribute values for one user. How do you read this document and output one user object per line?
There are four additional requirements:

Each document is large, and the program will run out of memory if all rows are stored at once as a list of objects. Only one row at a time may be held in memory.

Each comma-separated string may be wrapped in double quotation marks (") or single quotation marks ('), such as "3". If it is a number like +000000001.24, strip the + and the leading zeros to extract 1.24.

The document contains times, in either the form 2013-10-29 or the form 2013/10/29 2:23:56, and these strings should be converted to a time type.

There are several such documents, each with different attributes. For example, one holds user information and another holds call records. The specific attributes in the class therefore have to be generated dynamically from line 1 of the document.

The implementation process

1. Class definition

Because the attributes are added dynamically, the attribute-value pairs are added dynamically as well. The class therefore needs only two member functions, updateAttributes() and updatePairs(), plus a list, __attributes, that stores the attribute names and a dictionary, attrilist, that stores the mapping. The __init__() function is the constructor. The leading double underscore in __attributes marks it as a private variable: Python mangles the name, so it cannot be accessed under that name from outside the class. To create an instance, a=UserInfo() is enough; no parameters are required.


class UserInfo(object):
 'Class to store user information'
 def __init__ (self):
  self.attrilist={}
  self.__attributes=[]
 def updateAttributes(self,attributes):
  self.__attributes=attributes
 def updatePairs(self,values):
  for i in range(len(values)):
   self.attrilist[self.__attributes[i]]=values[i]
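
As a usage sketch (written for Python 3; the attribute names and values are made up for illustration), the class can be filled in two steps: the header row first, then one row of values. The class is repeated here so that the snippet runs on its own.

```python
class UserInfo(object):
    """Store one user's attribute-value pairs."""

    def __init__(self):
        self.attrilist = {}
        self.__attributes = []

    def updateAttributes(self, attributes):
        self.__attributes = attributes

    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]


a = UserInfo()
a.updateAttributes(['name', 'age'])   # hypothetical header row
a.updatePairs(['Alice', '30'])        # hypothetical data row
print(a.attrilist)

# The double underscore triggers name mangling, so the list is not
# reachable under its plain name from outside the class.
print(hasattr(a, '__attributes'))           # False
print(hasattr(a, '_UserInfo__attributes'))  # True
```

Calling updatePairs() again with a new row overwrites the values in the same dictionary, which is what lets the program hold only one row at a time.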

2. Update each object dynamically with a generator and return the object

A generator is like a function that is initialized once and can then be run repeatedly, producing one result per iteration. A regular function uses return to hand back its result, whereas a generator uses yield. Each time execution reaches yield, the generator returns a value and pauses; the next run resumes from the statement after yield. For example, we implement the Fibonacci sequence with a function and with a generator, respectively:


def fib(max):
 n, a, b = 0, 0, 1
 while n < max:
  print(b)
  a, b = b, a + b
  n = n + 1
 return 'done'

We calculate the first six numbers of the sequence:


>>> fib(6)
1
1
2
3
5
8
'done'

To turn this into a generator, just replace print with yield, as follows:


def fib(max):
 n, a, b = 0, 0, 1
 while n < max:
  yield b
  a, b = b, a + b
  n = n + 1

Usage:


>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>
>>> for i in f:
...  print(i)
... 
1
1
2
3
5
8
>>> 

As you can see, the generator fib is itself an object. Each time execution reaches yield, it pauses and returns one result; the next time, execution continues from the line after yield. A generator can also be advanced one step at a time with generator.next() (in Python 3, next(generator)).
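
The pause-and-resume behaviour can be observed by stepping through the generator by hand. The following is a small sketch for Python 3, which uses the built-in next() rather than the generator.next() method of Python 2:

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1


f = fib(3)
first = next(f)    # runs until the first yield
second = next(f)   # resumes right after the yield
third = next(f)
print(first, second, third)

exhausted = False
try:
    next(f)        # the loop has ended, so nothing is left to yield
except StopIteration:
    exhausted = True
print(exhausted)
```

A for loop performs the same steps and handles StopIteration automatically, which is why the earlier example simply ends after the sixth number.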

In my program, the generator part of the code is as follows:


def ObjectGenerator(maxlinenum):
 filename='/home/thinkit/Documents/usr_info/USER.csv'
 attributes=[]
 linenum=1
 a=UserInfo()
 file=open(filename)
 while linenum < maxlinenum:
  values=[]
  line=str.decode(file.readline(),'gb2312')#linecache.getline(filename, linenum,'gb2312')
  if line=='':
   print'reading fail! Please check filename!'
   break
  str_list=line.split(',')
  for item in str_list:
   item=item.strip()
   item=item.strip('\"')
   item=item.strip('\'')
   item=item.strip('+0*')
   item=catchTime(item)
   if linenum==1:
    attributes.append(item)
   else:
    values.append(item)
  if linenum==1:
   a.updateAttributes(attributes)
  else:
   a.updatePairs(values)
   yield a.attrilist # yield a instead to return the UserInfo object itself
  linenum = linenum +1

Here, a=UserInfo() is an instance of the class UserInfo. Because the document is gb2312-encoded, the corresponding decoding method is used above. Because line 1 holds the attributes, it is stored in the UserInfo attribute list by a function, namely updateAttributes(); each following line is read into a dictionary as attribute-value pairs. P.S. A Python dictionary is equivalent to a mapping (map).
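
The same row-at-a-time idea can be sketched compactly in Python 3. This is an illustrative variant, not the program above: row_generator and the sample data are hypothetical, and an in-memory io.StringIO stands in for the real file. For an actual gb2312 file, open(filename, encoding='gb2312') would supply the lines.

```python
import io


def row_generator(lines):
    """Yield one dict per data row; `lines` is any iterable of text lines."""
    attributes = None
    for line in lines:
        # Split on commas, then drop surrounding whitespace and quotes.
        fields = [f.strip().strip('"').strip("'") for f in line.split(',')]
        if attributes is None:
            attributes = fields                   # first line: attribute names
        else:
            yield dict(zip(attributes, fields))   # a fresh dict for each row


sample = io.StringIO('name,age\n"Alice",30\n\'Bob\',25\n')  # made-up data
rows = row_generator(sample)
print(next(rows))
print(next(rows))
```

Unlike the program above, which yields the same dictionary object each time and overwrites its values, this sketch builds a new dictionary per row, so a row that the caller keeps is not changed by later iterations.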

3. Use strip to remove unnecessary characters

As the code above shows, str.strip(chars) removes from both ends of str every character that appears in chars. Note that chars is treated as a set of individual characters, not as a regular expression, so strip('+0*') removes any leading and trailing +, 0 and * characters. This also strips trailing zeros, so a value such as 100 becomes 1. The calls used above are:


item=item.strip()# Remove whitespace before and after the string, such as \t and \n
item=item.strip('\"')# Remove leading and trailing "
item=item.strip('\'')# Remove leading and trailing '
item=item.strip('+0*')# Remove leading and trailing +, 0 and * characters, so +000000001.24 becomes 1.24
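
The following sketch (Python 3) shows the character-set behaviour of strip and one possible alternative based on re.sub that only touches a leading + and leading zeros. The helper name strip_sign_and_leading_zeros is hypothetical and not part of the original program.

```python
import re

stripped_number = '+000000001.24'.strip('+0*')
stripped_hundred = '100'.strip('+0*')   # trailing zeros are removed too
print(stripped_number, stripped_hundred)


def strip_sign_and_leading_zeros(text):
    """Drop a leading '+' and any zeros that are followed by another digit."""
    return re.sub(r'^\+?0*(?=\d)', '', text)


for raw in ['+000000001.24', '100', '0.5', '000']:
    print(raw, '->', strip_sign_and_leading_zeros(raw))
```

The lookahead (?=\d) keeps one digit in front of the decimal point, so 0.5 and 000 are reduced to 0.5 and 0 rather than to .5 and an empty string.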

4. Use re.match to match the string

Function syntax:


re.match(pattern, string, flags=0)

Function parameter description:

Parameter        Description

pattern          the regular expression to match.

string           the string to match.

flags            flag bits that control how the regular expression is matched, such as case-insensitive matching, multi-line matching, etc.

The re.match method returns a match object if the pattern matches at the beginning of the string; otherwise it returns None.

>>> import re
>>> s='2015-09-18'
>>> matchObj=re.match(r'\d{4}-\d{2}-\d{2}',s, flags= 0)
>>> print matchObj
<_sre.SRE_Match object at 0x7f3525480f38>
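
Because re.match only anchors the pattern at the start of the string, text after the matched part is ignored. The sketch below (Python 3) contrasts it with re.fullmatch, which requires the whole string to match; the sample strings are made up.

```python
import re

pattern = r'\d{4}-\d{2}-\d{2}'

at_start = re.match(pattern, '2015-09-18')
not_at_start = re.match(pattern, 'on 2015-09-18')     # None: pattern is not at the start
with_suffix = re.match(pattern, '2015-09-18 extra')   # matches the date prefix only
whole = re.fullmatch(pattern, '2015-09-18 extra')     # None: trailing text is left over

print(at_start.group())
print(not_at_start, whole)
print(with_suffix.group())
```

This matters when the match result is used to decide which time format to parse, since a prefix match does not guarantee that the whole string is a date.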

5. Extract the string and convert it into a time object using time.strptime

In the time module, time.strptime(str, format) parses str according to format and returns a time object (a time.struct_time). Commonly used format directives are:

        %y two-digit year (00-99)

        %Y four-digit year (for example, 2013)

        %m month (01-12)

        %d day of the month (01-31)

        %H hour, 24-hour clock (00-23)

        %I hour, 12-hour clock (01-12)

        %M minutes (00-59)

        %S seconds (00-59)
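
A short sketch (Python 3) of how these directives are used with time.strptime, taking the sample time string from the requirements above. It also shows that a string that does not fit the format raises ValueError.

```python
import time

t = time.strptime('2013/10/29 2:23:56', '%Y/%m/%d %H:%M:%S')
print(t.tm_year, t.tm_mon, t.tm_mday)   # date fields
print(t.tm_hour, t.tm_min, t.tm_sec)    # time fields
print(time.strftime('%Y-%m-%d', t))     # format the struct_time back to text

mismatch = False
try:
    time.strptime('2013-10-29', '%Y/%m/%d')   # separators do not match the format
except ValueError:
    mismatch = True
print(mismatch)
```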

In addition, the re module is needed: a regular expression checks whether the string is in a typical time format, such as YYYY/MM/DD H:M:S or YYYY-MM-DD.

In the code above, the function catchTime determines whether item is a time string, and if so, converts it to a time object.

The code is as follows:


import time
import re

def catchTime(item):
 # Return a time object if item looks like a time string;
 # otherwise return item unchanged.
 # Form 1: YYYY-MM-DD. The trailing $ rejects strings with extra text,
 # which strptime would otherwise refuse with a ValueError.
 matchObj=re.match(r'\d{4}-\d{2}-\d{2}$',item)
 if matchObj is not None:
  return time.strptime(item,'%Y-%m-%d')
 # Form 2: YYYY/MM/DD H:M:S
 matchObj=re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$',item)
 if matchObj is not None:
  return time.strptime(item,'%Y/%m/%d %H:%M:%S')
 return item

Complete code:


import collections
import time
import re

class UserInfo(object):
 'Class to store user information'
 def __init__ (self):
  self.attrilist=collections.OrderedDict()# ordered
  self.__attributes=[]
 def updateAttributes(self,attributes):
  self.__attributes=attributes
 def updatePairs(self,values):
  for i in range(len(values)):
   self.attrilist[self.__attributes[i]]=values[i]

def catchTime(item):
 # Return a time object if item looks like a time string;
 # otherwise return item unchanged.
 # Form 1: YYYY-MM-DD. The trailing $ rejects strings with extra text,
 # which strptime would otherwise refuse with a ValueError.
 matchObj=re.match(r'\d{4}-\d{2}-\d{2}$',item)
 if matchObj is not None:
  return time.strptime(item,'%Y-%m-%d')
 # Form 2: YYYY/MM/DD H:M:S
 matchObj=re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$',item)
 if matchObj is not None:
  return time.strptime(item,'%Y/%m/%d %H:%M:%S')
 return item


def ObjectGenerator(maxlinenum):
 filename='/home/thinkit/Documents/usr_info/USER.csv'
 attributes=[]
 linenum=1
 a=UserInfo()
 file=open(filename)
 while linenum < maxlinenum:
  values=[]
  line=str.decode(file.readline(),'gb2312')#linecache.getline(filename, linenum,'gb2312')
  if line=='':
   print'reading fail! Please check filename!'
   break
  str_list=line.split(',')
  for item in str_list:
   item=item.strip()
   item=item.strip('\"')
   item=item.strip('\'')
   item=item.strip('+0*')
   item=catchTime(item)
   if linenum==1:
    attributes.append(item)
   else:
    values.append(item)
  if linenum==1:
   a.updateAttributes(attributes)
  else:
   a.updatePairs(values)
   yield a.attrilist # yield a instead to return the UserInfo object itself
  linenum = linenum +1

if __name__ == '__main__':
 for n in ObjectGenerator(10):
  print n  # Output the dictionary to see if it is correct 
