The concept of a state machine and a tutorial on using a state machine under Python
- 2020-05-09 18:46:55
- OfStack
What is a state machine?
An extremely accurate description of a state machine is that it is a directed graph consisting of a set of nodes and a corresponding set of transition functions. The state machine "runs" in response to a series of 1 events. Each event is within the control of the transition function that belongs to the "current" node, where the scope of the function is a subset of the node. The function returns the "next" (or perhaps the same) node. At least one of these nodes must be an end state. When the final state is reached, the state machine stops.
But an abstract mathematical description (like the one I just gave you) doesn't really tell you when using a state machine can solve a real programming problem. Another strategy is to define the state machine as a mandatory programming language, where the nodes are also source lines. From a practical point of view, this definition, though precise, is as abstract and impractical as the first description. (for declarative, functional, or constraint-based languages, such as Haskell, Scheme, or Prolog, this is bound to happen.)
Let's try to use examples that are more appropriate for the actual tasks around us. Logically, each regular expression is equivalent to a state machine, and the parser for each regular expression implements this state machine. In fact, most programmers don't really think about this point when they write state machines.
In the following example, we will examine the true exploratory definition of the state machine. In general, we have a number of different ways to respond to a finite number of events. In some cases, the response depends only on the event itself. In other cases, however, the appropriate action depends on previous events.
The state machines discussed in this article are high-level machines whose purpose is to demonstrate a programmatic solution to a class 1 problem. If it is necessary to discuss programming problems by category of response event behavior, your solution is likely to be an explicit state machine.
Text processing state machine
One of the programming problems most likely to invoke an explicit state machine involves processing text files. Processing text files usually involves reading units of information (usually called characters or lines) and then performing the appropriate action on the unit that was just read. In some cases, the processing is "stateless" (that is, each such unit contains enough information to determine exactly what to do). In other cases, even if the text file is not completely stateless, the data has only a limited context (for example, the operation depends on no more information than the line number). However, in other common text processing problems, the input file is extremely "stateful." The meaning of each block of data depends on the string that precedes it (and perhaps the string that follows it). Reports, mainframe data entry, readable text, programmed source files, and other types of text files are stateful. A simple example is the 1 line of code that might appear in the Python source file:
myObject = SomeClass(this, that, other)
This line indicates that some of the content is different if there happen to be the following lines surrounding this line:
"""How to use SomeClass:
myObject = SomeClass(this, that, other)
"""
We should know that we are in the "block reference" state to make sure that this line of code is a part 1 comment and not an Python operation.
When not to use a state machine
When you start the task of writing a processor for any stateful text file, ask yourself what type of input you want to find in the file. Each type of input item is a candidate for one state. There are several of these types. If the number is large or uncertain, the state machine may not be the right solution. (in this case, some database solutions might be more appropriate.)
Also consider whether you need to use a state machine. In many cases, it's best to start with a simpler approach. You may find that even if the text file is stateful, there is a simple way to read it in chunks (where each chunk is a type of input value). In fact, in a single 1 state block, it is only necessary to implement a state machine if the transfer between text types requires content-based computation.
The following simple example illustrates the need to use a state machine. Consider two rules for dividing a column of Numbers into chunks. In rule 1, a zero in the list represents a break between blocks. In rule 2, a break between blocks occurs when the sum of the elements in a block exceeds 100. Because it USES an accumulator variable to determine whether a threshold has been reached, you cannot see the boundaries of the sublist "immediately." Therefore, the second rule may be more suitable for a mechanism similar to a state machine.
An example of a slightly stateful text file that is not well suited for processing with a state machine is the Windows-style.ini file. This file includes section headers, comments, and many assignments. Such as:
; set the colorscheme and userlevel
[colorscheme]
background=red
foreground=blue
title=green
[userlevel]
login=2
title=1
Our example has no real meaning, but it shows some interesting features of.ini format 1.
in a sense, the type of each line is determined by its first character (which may be a semicolon, an open curly brace, or a letter).
From another perspective, this format is "stateful" because the keyword "title" presumably means that if it appears in each section, there is a separate content.
You can write a text processor program with an COLORSCHEME state and an USERLEVEL state, which still handles the assignment of each state. But that doesn't seem to be the right way to deal with it. For example, you can use the Python code to create only natural blocks in this text file, such as:
Handles the chunking Python code for the.INI file
import
string
txt = open(
'hypothetical.ini').read()
sects = string.split(txt,
'[')
for
sect
in
sects:
# do something with sect, like get its name
# (the stuff up to ']') and read its assignments
Or, if desired, you can use a single current_section variable to determine the location:
Process the calculation Python code for the.INI file
for
line
in
open(
'hypothetical.ini').readlines():
if
line[0] ==
'[':
current_section = line(1:-2)
elif
line[0] ==
';':
pass
# ignore comments
else
:
apply_value(current_section, line)
When to use the state machine
Now that we've decided not to use a state machine if the text file is "too simple," let's look at situations where we need to use a state machine. The last article in this column discussed the utility Txt2Html, which converts "smart ASCII" (including this article) into HTML. Let's recap.
Smart ASCII is a text format that USES some spacing conventions to distinguish between text block types, such as headers, regular text, quotes, and code samples. While the reader or author can easily analyze the transitions between these text block types by looking at them, there is no easy way for a computer to split the "smart ASCII" file into its constituent text blocks. Unlike the.ini file example, text block types can appear in any order. In no case is there a single 1 delimiter to separate blocks (empty lines usually separate text blocks, but empty lines in a code sample do not necessarily end the code sample, and text blocks do not need to be separated by empty lines). Since each text block needs to be reformatted in a different way to produce the correct HTML output, a state machine seems like the natural solution.
The functions of Txt2Html reader 1 are as follows:
is started in the initial state. reads 1 line of input. Depending on the input and the current state, either moves to a new state or processes the line as appropriate to the current state.
This example is about the simplest case you will encounter, but it illustrates the following pattern we described:
A simple state machine input loop in Python
global
state, blocks, bl_num, newblock
#-- Initialize the globals
state = "HEADER"
blocks = [""]
bl_num = 0
newblock = 1
for
line
in
fhin.readlines():
if
state ==
"HEADER":
# blank line means new block of header
if
blankln.match(line): newblock = 1
elif
textln.match(line): startText(line)
elif
codeln.match(line): startCode(line)
else
:
if
newblock: startHead(line)
else
: blocks[bl_num] = blocks[bl_num] + line
elif
state ==
"TEXT":
# blank line means new block of text
if
blankln.match(line): newblock = 1
elif
headln.match(line): startHead(line)
elif
codeln.match(line): startCode(line)
else
:
if
newblock: startText(line)
else
: blocks[bl_num] = blocks[bl_num] + line
elif
state ==
"CODE":
# blank line does not change state
if
blankln.match(line): blocks[bl_num] = blocks[bl_num] + line
elif
headln.match(line): startHead(line)
elif
textln.match(line): startText(line)
else
: blocks[bl_num] = blocks[bl_num] + line
else
:
raise
ValueError,
"unexpected input block state: "+state
You can download the source file from which this code is extracted using Txt2Html (see resources). Note that the variable state is declared as global, and change its value in a function such as startText(). Transition conditions, such as textln.match (), are regular expression patterns, but they may also be custom functions. In fact, the formatting is done later in the program. The state machine only parses the text file into labeled chunks in the blocks list.
Abstract state machine class
It is easy to implement an abstract state machine using Python in forms and functions. This makes the state machine model of the program stand out more than the simple condition block in the previous example (at first glance, the conditions are no different than the others). Furthermore, the following classes and their associated handlers do a good job of operating in isolated state. In many cases, this improves encapsulation and readability.
File: statemachine py
from
string
import
upper
class
StateMachine
:
def
__init__
(self):
self.handlers = {}
self.startState = None
self.endStates = []
def
add_state
(self, name, handler, end_state=0):
name = upper(name)
self.handlers[name] = handler
if
end_state:
self.endStates.append(name)
def
set_start
(self, name):
self.startState = upper(name)
def
run
(self, cargo):
try
:
handler = self.handlers[self.startState]
except
:
raise
"InitializationError",
"must call .set_start() before .run()"
if
not
self.endStates:
raise
"InitializationError",
"at least one state must be an end_state"
while
1:
(newState, cargo) = handler(cargo)
if
upper(newState)
in
self.endStates:
break
else
:
handler = self.handlers[upper(newState)]
The StateMachine class is actually just what the abstract state machine needs. Because passing function objects using Python is so easy, this class requires very few lines compared to similar classes in other languages.
To actually use the StateMachine class, you need to create one handler for each state you want to use. The handler must conform to the pattern. It loops through the events until it wants to move to another state, at which point the handler should pass back a byte group (which includes the name of the new state and any cargo needed by the new state handler).
Using cargo as a variable in the StateMachine class encapsulates the data needed by the state handler (which does not have to call its cargo variable). The state handler USES cargo to pass the content required by the next handler, so the new handler can take over the work left by the previous handler. cargo typically includes a file handle that allows the next handler to read more data after the previous handler has stopped. cargo may also be a database connection, a complex class instance, or a list with several items.
Now, let's look at the test sample. In this case (outlined in the following code example), cargo is just a number that keeps feeding feedback to the iteration function. As long as val is in a range, the next value of val is always just math_func(val). Once a function returns a value out of range, the value is passed to another handler, or the state machine exits after calling a dead-state handler that does nothing. The example illustrates one thing: an event need not be an input event. It can also be a computed event (which is rare). The only difference between state handlers is that they use different tags when outputting the events they handle. This function is simple and does not require a state machine. But it's a good illustration of the concept. The code may be easier to understand than the explanation!
File: statemachine_test py
from
statemachine
import
StateMachine
def
ones_counter
(val):
print
"ONES State: ",
while
1:
if
val <= 0
or
val >= 30:
newState =
"Out_of_Range" ;
break
elif
20 <= val < 30:
newState =
"TWENTIES";
break
elif
10 <= val < 20:
newState =
"TENS";
break
else
:
print
" @ %2.1f+" % val,
val = math_func(val)
print
" >>"
return
(newState, val)
def
tens_counter
(val):
print
"TENS State: ",
while
1:
if
val <= 0
or
val >= 30:
newState =
"Out_of_Range";
break
elif
1 <= val < 10:
newState =
"ONES";
break
elif
20 <= val < 30:
newState =
"TWENTIES";
break
else
:
print
" #%2.1f+" % val,
val = math_func(val)
print
" >>"
return
(newState, val)
def
twenties_counter
(val):
print
"TWENTIES State:",
while
1:
if
val <= 0
or
val >= 30:
newState =
"Out_of_Range";
break
elif
1 <= val < 10:
newState =
"ONES";
break
elif
10 <= val < 20:
newState =
"TENS";
break
else
:
print
" *%2.1f+" % val,
val = math_func(val)
print
" >>"
return
(newState, val)
def
math_func
(n):
from
math
import
sin
return
abs(sin(n))*31
if
__name__==
"__main__":
m = StateMachine()
m.add_state(
"ONES", ones_counter)
m.add_state(
"TENS", tens_counter)
m.add_state(
"TWENTIES", twenties_counter)
m.add_state(
"OUT_OF_RANGE", None, end_state=1)
m.set_start(
"ONES")
m.run(1)