Dynamically add properties and generate objects in Python classes
- 2020-05-10 18:25:47
- OfStack
This article covers the following topics:
1. Main function of the program
2. Implementation process
3. Class definition
4. Updating each object dynamically with a generator and returning the object
5. Using strip to remove unnecessary characters
6. Matching strings with re.match
7. Using time.strptime to extract strings and convert them into time objects
8. Complete code
The main function of the program
Suppose you have a table-like document that stores user information: the first line lists the attributes, separated by commas (,), and each line from the second onward holds the values of those attributes, with each row representing one user. How do you read in this document and output one user object per line?
There are four additional requirements:
Each document is large, so storing every row as an object in one list would exhaust memory. The program may hold only one row at a time.
Each comma-separated string may be wrapped in double quotation marks (") or single quotation marks ('), such as "3". If it is a number like +000000001.24, drop the + and the leading zeros to extract 1.24.
The document contains times, written either as 2013-10-29 or as 2013/10/29 2:23:56, and these strings should be converted to a time type.
There are several such documents, each with different attributes. For example, one holds user information and another holds call records. The specific attributes in the class are therefore generated dynamically from line 1 of the document.
The implementation process
1. Class definition
Because the attributes are added dynamically, the attribute-value pairs must be added dynamically as well. The class therefore needs two member functions, updateAttributes() and updatePairs(), plus a list, __attributes, that stores the attribute names and a dictionary, attrilist, that stores the mapping. The __init__() function is the constructor. The two leading underscores in __attributes mark it as a private variable: Python mangles the name, so it is not meant to be accessed directly from outside the class. To create an instance, simply write a=UserInfo(); no parameters are required.
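class UserInfo(object):
    'Class to store user information'
    def __init__(self):
        self.attrilist = {}        # maps attribute name -> value for the current row
        self.__attributes = []     # attribute names taken from the header line
    def updateAttributes(self, attributes):
        self.__attributes = attributes
    def updatePairs(self, values):
        # pair each value with the attribute name at the same position
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]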
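As a quick illustration of how the class is meant to be used, here is a minimal sketch. The header names and row values are hypothetical and were invented for this example; the class itself is the one defined above, repeated so the snippet runs on its own.

```python
class UserInfo(object):
    'Class to store user information'
    def __init__(self):
        self.attrilist = {}
        self.__attributes = []
    def updateAttributes(self, attributes):
        self.__attributes = attributes
    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]

# Hypothetical header and rows, invented for this example
a = UserInfo()
a.updateAttributes(['name', 'age', 'city'])
a.updatePairs(['Alice', '30', 'Beijing'])
print(a.attrilist)

# The same object can be reused for the next row
a.updatePairs(['Bob', '25', 'Shanghai'])
print(a.attrilist)

# Name mangling: the list is stored as _UserInfo__attributes
print(hasattr(a, '__attributes'))           # False
print(hasattr(a, '_UserInfo__attributes'))  # True
```

Because the attribute names arrive at run time, the same class can serve a user file and a call-record file without any change.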
2. Update each object dynamically with a generator and return the object
A generator is like a function that is initialized once and can then be resumed many times, producing one result each time it is resumed. An ordinary function returns its result with return, whereas a generator hands back its result with yield. Each run pauses at yield and returns a value, and the next run resumes from the statement after that yield. For example, here is the Fibonacci sequence implemented first as a function and then as a generator:
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b
        n = n + 1
    return 'done'
We calculate the first six numbers of the sequence:
>>> fib(6)
1
1
2
3
5
8
'done'
To use a generator instead, just change print to yield, as follows:
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
Usage:
>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>
>>> for i in f:
...     print(i)
...
1
1
2
3
5
8
>>>
As you can see, the generator fib is itself an object. Each time execution reaches yield, it pauses and returns one result; the next time, it continues from the line after yield. A generator can also be advanced manually with generator.next() in Python 2, or with the built-in next(generator) in Python 3.
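To make the pause-and-resume behaviour concrete, the following sketch steps through the same fib generator by hand with the built-in next(). It assumes Python 3; the variable names are only for illustration.

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1

f = fib(3)
first = next(f)    # runs until the first yield
second = next(f)   # resumes right after the yield
third = next(f)
print(first, second, third)   # 1 1 2

# Once the loop ends, the generator raises StopIteration
exhausted = False
try:
    next(f)
except StopIteration:
    exhausted = True
    print('generator is exhausted')
```

A for loop does the same thing internally and stops quietly when StopIteration is raised.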
In my program, the generator part of the code is as follows:
def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # the file is gb2312 encoded, so decode it while reading (Python 3)
    with open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                # readline() returns '' only at end of file
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()      # surrounding whitespace
                item = item.strip('\"')
                item = item.strip('\'')
                # drop a leading + and leading zeros; strip('+0*') would also eat trailing zeros
                item = re.sub(r'^\+?0*(?=\d)', '', item)
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to yield the object itself
            linenum = linenum + 1
Here, a=UserInfo() creates an instance of the class UserInfo. Because the document is gb2312 encoded, each line is decoded with that encoding as it is read. Because line 1 holds the attributes, the function updateAttributes() stores the attribute list in the UserInfo object; each subsequent line is read into a dictionary as attribute-value pairs.
P.S. A dictionary (dict) in Python is equivalent to a map.
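The row-at-a-time idea can be tried without the real file. The sketch below is a simplified stand-in, not the program above: row_generator is a hypothetical helper, the data lives in an in-memory io.StringIO, and a fresh dict is built for every row instead of reusing one object.

```python
import io

def row_generator(lines):
    """Yield one dict per data row; `lines` is any iterable of text lines."""
    attributes = None
    for line in lines:
        items = [item.strip().strip('"').strip("'") for item in line.split(',')]
        if attributes is None:
            attributes = items            # first line: attribute names
        else:
            yield dict(zip(attributes, items))

# Hypothetical in-memory data standing in for the real file
sample = io.StringIO("name,age\n'Alice',\"30\"\nBob,25\n")
rows = row_generator(sample)

first_row = next(rows)     # only one line has been read so far
print(first_row)
remaining = list(rows)
print(remaining)
```

Since a file object is itself an iterable of lines, the same helper could be handed an open file, and only the current line would be held in memory.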
3. Use strip to remove unnecessary characters
As the code above shows, str.strip(somechar) removes the somechar characters from the beginning and end of str. somechar is treated as a set of individual characters, not as a regular expression: every leading or trailing character that appears in the set is removed. For example:
item=item.strip()       # remove whitespace before and after the string, such as \t and \n
item=item.strip('\"')   # remove " before and after
item=item.strip('\'')   # remove ' before and after
item=item.strip('+0*')  # remove every +, 0 and * at both ends; * is a literal character here, not a quantifier, so trailing zeros are removed too
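The character-set behaviour of strip is easy to check interactively. The sketch below shows where strip('+0*') works and where it does not, and offers one possible alternative: strip_sign_and_zeros is a hypothetical helper based on re.sub, not part of the original program.

```python
import re

print('+000000001.24'.strip('+0*'))   # 1.24
print('100'.strip('+0*'))             # 1   (trailing zeros are lost)
print('  "3"\n'.strip().strip('"'))   # 3

def strip_sign_and_zeros(text):
    # Hypothetical helper: drop a leading + and leading zeros,
    # but always leave at least one digit in place
    return re.sub(r'^\+?0*(?=\d)', '', text)

print(strip_sign_and_zeros('+000000001.24'))  # 1.24
print(strip_sign_and_zeros('100'))            # 100
print(strip_sign_and_zeros('0.5'))            # 0.5
```

The lookahead (?=\d) keeps the final digit from being consumed, so values such as 0 or 0.5 survive unchanged.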
4. Matching strings with re.match
Function syntax:
re.match(pattern, string, flags=0)
Function parameters:
Parameter   Description
pattern     the regular expression to match.
string      the string to match.
flags       flag bits that control how the regular expression is matched, such as whether matching is case sensitive or multi-line.
The re.match method returns a match object if the match succeeds; otherwise it returns None.
>>> s='2015-09-18'
>>> matchObj=re.match(r'\d{4}-\d{2}-\d{2}',s, flags= 0)
>>> print matchObj
<_sre.SRE_Match object at 0x7f3525480f38>
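One detail worth knowing is that re.match only anchors the pattern at the start of the string. The following sketch, written for Python 3, shows group extraction and contrasts re.match with re.fullmatch; the sample strings are made up for illustration.

```python
import re

s = '2015-09-18'
m = re.match(r'(\d{4})-(\d{2})-(\d{2})', s)
if m is not None:
    year, month, day = m.groups()
    print(year, month, day)   # 2015 09 18

# re.match anchors only at the start, so trailing text is still accepted
prefix_ok = re.match(r'\d{4}-\d{2}-\d{2}', '2015-09-18 10:00:00') is not None
print(prefix_ok)              # True

# re.fullmatch requires the whole string to match
full_ok = re.fullmatch(r'\d{4}-\d{2}-\d{2}', '2015-09-18 10:00:00') is not None
print(full_ok)                # False

# A pattern that starts later in the string is not found by re.match
later_ok = re.match(r'\d{4}-\d{2}-\d{2}', 'date: 2015-09-18') is not None
print(later_ok)               # False
```

When a field must be a date and nothing else, re.fullmatch or a trailing $ in the pattern is the safer check.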
5. Extract the string and convert it into a time object using time.strptime
In the time module, time.strptime(str, format) parses str according to format and returns a time object. Commonly used format codes are:
%y two-digit year (00-99)
%Y four-digit year (0000-9999)
%m month (01-12)
%d day of the month (01-31)
%H hour, 24-hour clock (00-23)
%I hour, 12-hour clock (01-12)
%M minute (00-59)
%S second (00-59)
In addition, the re module is needed: a regular expression checks whether the string looks like a time format, such as YYYY/MM/DD H:M:S or YYYY-MM-DD.
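As a small illustration of the format codes, the sketch below parses the two sample strings from the requirements and reads back a few fields. It is only a demonstration of time.strptime, not part of the original program.

```python
import time

t = time.strptime('2013-10-29', '%Y-%m-%d')
print(t.tm_year, t.tm_mon, t.tm_mday)     # 2013 10 29

t2 = time.strptime('2013/10/29 2:23:56', '%Y/%m/%d %H:%M:%S')
print(t2.tm_hour, t2.tm_min, t2.tm_sec)   # 2 23 56

# A string that does not fit the format raises ValueError
failed = False
try:
    time.strptime('2013-10-29 2:23:56', '%Y-%m-%d')
except ValueError as err:
    failed = True
    print('not parsed:', err)

# time.strftime goes the other way, from time object to string
formatted = time.strftime('%Y-%m-%d %H:%M:%S', t2)
print(formatted)                          # 2013-10-29 02:23:56
```

The ValueError case is the reason a regular expression check is useful before calling strptime.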
In the generator code shown earlier, the function catchTime checks whether item is a time string and, if so, converts it to a time object.
The code is as follows:
import time
import re
def catchTime(item):
    # check whether item is a date; $ rejects trailing text that strptime could not parse
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        #print("returned time: %s " % item)
        return item
    else:
        matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$', item, flags=0)
        if matchObj is not None:
            item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
            #print("returned time: %s " % item)
        # return the converted time, or the original string when nothing matched
        return item
Complete code:
import collections
import time
import re

class UserInfo(object):
    'Class to store user information'
    def __init__(self):
        self.attrilist = collections.OrderedDict()  # ordered
        self.__attributes = []
    def updateAttributes(self, attributes):
        self.__attributes = attributes
    def updatePairs(self, values):
        for i in range(len(values)):
            self.attrilist[self.__attributes[i]] = values[i]

def catchTime(item):
    # check whether item is a date; $ rejects trailing text that strptime could not parse
    matchObj = re.match(r'\d{4}-\d{2}-\d{2}$', item, flags=0)
    if matchObj is not None:
        item = time.strptime(item, '%Y-%m-%d')
        #print("returned time: %s " % item)
        return item
    else:
        matchObj = re.match(r'\d{4}/\d{2}/\d{2}\s\d+:\d+:\d+$', item, flags=0)
        if matchObj is not None:
            item = time.strptime(item, '%Y/%m/%d %H:%M:%S')
            #print("returned time: %s " % item)
        # return the converted time, or the original string when nothing matched
        return item

def ObjectGenerator(maxlinenum):
    filename = '/home/thinkit/Documents/usr_info/USER.csv'
    attributes = []
    linenum = 1
    a = UserInfo()
    # the file is gb2312 encoded, so decode it while reading (Python 3)
    with open(filename, encoding='gb2312') as file:
        while linenum < maxlinenum:
            values = []
            line = file.readline()
            if line == '':
                # readline() returns '' only at end of file
                print('reading fail! Please check filename!')
                break
            str_list = line.split(',')
            for item in str_list:
                item = item.strip()      # surrounding whitespace
                item = item.strip('\"')
                item = item.strip('\'')
                # drop a leading + and leading zeros; strip('+0*') would also eat trailing zeros
                item = re.sub(r'^\+?0*(?=\d)', '', item)
                item = catchTime(item)
                if linenum == 1:
                    attributes.append(item)
                else:
                    values.append(item)
            if linenum == 1:
                a.updateAttributes(attributes)
            else:
                a.updatePairs(values)
                yield a.attrilist  # change to 'a' to yield the object itself
            linenum = linenum + 1

if __name__ == '__main__':
    for n in ObjectGenerator(10):
        print(n)  # output the dictionary to see if it is correct
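One consequence of this design is that the generator yields the same dictionary object on every iteration and only updates its contents. That satisfies the one-row-at-a-time requirement, but it matters if rows are kept for later. The sketch below imitates the pattern with a hypothetical reuse_rows generator and made-up data.

```python
def reuse_rows(rows):
    # Mimics the pattern above: one dict is updated in place and yielded each time
    current = {}
    for name, age in rows:
        current['name'] = name
        current['age'] = age
        yield current

data = [('Alice', '30'), ('Bob', '25')]   # hypothetical rows

# Processing one row at a time works as expected
names = [row['name'] for row in reuse_rows(data)]
print(names)                  # ['Alice', 'Bob']

# Keeping references does not: every entry is the same dict
kept = list(reuse_rows(data))
print(kept[0] is kept[1])     # True
print(kept[0]['name'])        # Bob

# Copy each row if it must outlive the next iteration
copies = [dict(row) for row in reuse_rows(data)]
print(copies[0]['name'])      # Alice
```

If each row is consumed before the next one is requested, as in the main loop above, no copy is needed.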
conclusion