The concept of a state machine and a tutorial on using a state machine under Python

  • 2020-05-09 18:46:55
  • OfStack

What is a state machine?

An extremely accurate description of a state machine is that it is a directed graph consisting of a set of nodes and a corresponding set of transition functions. The state machine "runs" in response to a series of 1 events. Each event is within the control of the transition function that belongs to the "current" node, where the scope of the function is a subset of the node. The function returns the "next" (or perhaps the same) node. At least one of these nodes must be an end state. When the final state is reached, the state machine stops.

But an abstract mathematical description (like the one I just gave you) doesn't really tell you when using a state machine can solve a real programming problem. Another strategy is to define the state machine as a mandatory programming language, where the nodes are also source lines. From a practical point of view, this definition, though precise, is as abstract and impractical as the first description. (for declarative, functional, or constraint-based languages, such as Haskell, Scheme, or Prolog, this is bound to happen.)

Let's try to use examples that are more appropriate for the actual tasks around us. Logically, each regular expression is equivalent to a state machine, and the parser for each regular expression implements this state machine. In fact, most programmers don't really think about this point when they write state machines.

In the following example, we will examine the true exploratory definition of the state machine. In general, we have a number of different ways to respond to a finite number of events. In some cases, the response depends only on the event itself. In other cases, however, the appropriate action depends on previous events.

The state machines discussed in this article are high-level machines whose purpose is to demonstrate a programmatic solution to a class 1 problem. If it is necessary to discuss programming problems by category of response event behavior, your solution is likely to be an explicit state machine.

Text processing state machine

One of the programming problems most likely to invoke an explicit state machine involves processing text files. Processing text files usually involves reading units of information (usually called characters or lines) and then performing the appropriate action on the unit that was just read. In some cases, the processing is "stateless" (that is, each such unit contains enough information to determine exactly what to do). In other cases, even if the text file is not completely stateless, the data has only a limited context (for example, the operation depends on no more information than the line number). However, in other common text processing problems, the input file is extremely "stateful." The meaning of each block of data depends on the string that precedes it (and perhaps the string that follows it). Reports, mainframe data entry, readable text, programmed source files, and other types of text files are stateful. A simple example is the 1 line of code that might appear in the Python source file:


myObject = SomeClass(this, that, other)

This line indicates that some of the content is different if there happen to be the following lines surrounding this line:


"""How to use SomeClass:
myObject = SomeClass(this, that, other)
"""

We should know that we are in the "block reference" state to make sure that this line of code is a part 1 comment and not an Python operation.

When not to use a state machine

When you start the task of writing a processor for any stateful text file, ask yourself what type of input you want to find in the file. Each type of input item is a candidate for one state. There are several of these types. If the number is large or uncertain, the state machine may not be the right solution. (in this case, some database solutions might be more appropriate.)

Also consider whether you need to use a state machine. In many cases, it's best to start with a simpler approach. You may find that even if the text file is stateful, there is a simple way to read it in chunks (where each chunk is a type of input value). In fact, in a single 1 state block, it is only necessary to implement a state machine if the transfer between text types requires content-based computation.

The following simple example illustrates the need to use a state machine. Consider two rules for dividing a column of Numbers into chunks. In rule 1, a zero in the list represents a break between blocks. In rule 2, a break between blocks occurs when the sum of the elements in a block exceeds 100. Because it USES an accumulator variable to determine whether a threshold has been reached, you cannot see the boundaries of the sublist "immediately." Therefore, the second rule may be more suitable for a mechanism similar to a state machine.

An example of a slightly stateful text file that is not well suited for processing with a state machine is the Windows-style.ini file. This file includes section headers, comments, and many assignments. Such as:


; set the colorscheme and userlevel
[colorscheme]
background=red
foreground=blue
title=green
[userlevel]
login=2
title=1

Our example has no real meaning, but it shows some interesting features of.ini format 1.

      in a sense, the type of each line is determined by its first character (which may be a semicolon, an open curly brace, or a letter).
From another perspective, this format is "stateful" because the keyword "title" presumably means that if it appears in each section, there is a separate content.

You can write a text processor program with an COLORSCHEME state and an USERLEVEL state, which still handles the assignment of each state. But that doesn't seem to be the right way to deal with it. For example, you can use the Python code to create only natural blocks in this text file, such as:
Handles the chunking Python code for the.INI file


import
 
   string
txt = open(
  'hypothetical.ini').read()
sects = string.split(txt, 
  '[')
  for
 
   sect 
  in
 
   sects:
 
  # do something with sect, like get its name
 # (the stuff up to ']') and read its assignments

Or, if desired, you can use a single current_section variable to determine the location:
Process the calculation Python code for the.INI file


for
 
   line 
  in
 
   open(
  'hypothetical.ini').readlines():
 
  if
 
   line[0] == 
  '[':
 current_section = line(1:-2)
 
  elif
 
   line[0] == 
  ';':
 
  pass
 
  # ignore comments

 
  
 else
 
  :
 apply_value(current_section, line)

When to use the state machine

Now that we've decided not to use a state machine if the text file is "too simple," let's look at situations where we need to use a state machine. The last article in this column discussed the utility Txt2Html, which converts "smart ASCII" (including this article) into HTML. Let's recap.

Smart ASCII is a text format that USES some spacing conventions to distinguish between text block types, such as headers, regular text, quotes, and code samples. While the reader or author can easily analyze the transitions between these text block types by looking at them, there is no easy way for a computer to split the "smart ASCII" file into its constituent text blocks. Unlike the.ini file example, text block types can appear in any order. In no case is there a single 1 delimiter to separate blocks (empty lines usually separate text blocks, but empty lines in a code sample do not necessarily end the code sample, and text blocks do not need to be separated by empty lines). Since each text block needs to be reformatted in a different way to produce the correct HTML output, a state machine seems like the natural solution.

The functions of Txt2Html reader 1 are as follows:

      is started in the initial state.       reads 1 line of input. Depending on the input and the current state,       either moves to a new state or processes the line as appropriate to the current state.

This example is about the simplest case you will encounter, but it illustrates the following pattern we described:
A simple state machine input loop in Python


global
 
   state, blocks, bl_num, newblock
#-- Initialize the globals
state = "HEADER"
blocks = [""]
bl_num = 0
newblock = 1
  for
 
   line 
  in
 
   fhin.readlines():
 
  if
 
   state == 
  "HEADER": 
  # blank line means new block of header
 
   if
  
   
 
   blankln.match(line): newblock = 1
 
  elif
 
   textln.match(line): startText(line)
 
  elif
 
   codeln.match(line): startCode(line)
 
  else
 
  :
 
  if
 
   newblock: startHead(line)
 
  else
 
  : blocks[bl_num] = blocks[bl_num] + line
 
  elif
 
   state == 
  "TEXT": 
  # blank line means new block of text
 
   if
  
   
 
   blankln.match(line): newblock = 1
 
  elif
 
   headln.match(line): startHead(line)
 
  elif
 
   codeln.match(line): startCode(line)
 
  else
 
  :
 
  if
 
   newblock: startText(line)
 
  else
 
  : blocks[bl_num] = blocks[bl_num] + line
 
  elif
 
   state == 
  "CODE": 
  # blank line does not change state
 
   if
  
   
 
   blankln.match(line): blocks[bl_num] = blocks[bl_num] + line
 
  elif
 
   headln.match(line): startHead(line)
 
  elif
 
   textln.match(line): startText(line)
 
  else
 
  : blocks[bl_num] = blocks[bl_num] + line
 
  else
 
  :
 
  raise
 
   ValueError, 
  "unexpected input block state: "+state

You can download the source file from which this code is extracted using Txt2Html (see resources). Note that the variable state is declared as global, and change its value in a function such as startText(). Transition conditions, such as textln.match (), are regular expression patterns, but they may also be custom functions. In fact, the formatting is done later in the program. The state machine only parses the text file into labeled chunks in the blocks list.

Abstract state machine class

It is easy to implement an abstract state machine using Python in forms and functions. This makes the state machine model of the program stand out more than the simple condition block in the previous example (at first glance, the conditions are no different than the others). Furthermore, the following classes and their associated handlers do a good job of operating in isolated state. In many cases, this improves encapsulation and readability.
File: statemachine py


from
 
   string 
  import
 
   upper
  class 
   StateMachine
 
  :
 
  def 
   __init__
 
  (self):
 self.handlers = {}
 self.startState = None
 self.endStates = []
 
  def 
   add_state
 
  (self, name, handler, end_state=0):
 name = upper(name)
 self.handlers[name] = handler
 
  if
 
   end_state:
 self.endStates.append(name)
 
  def 
   set_start
 
  (self, name):
 self.startState = upper(name)
 
  def 
   run
 
  (self, cargo):
 
  try
 
  :
 handler = self.handlers[self.startState]
 
  except
 
  :
 
  raise
 
  "InitializationError", 
  "must call .set_start() before .run()"


 
   if 
 
  not
 
   self.endStates:
 
  raise
 
  "InitializationError", 
  "at least one state must be an end_state"

 
   while
 
   1:
 (newState, cargo) = handler(cargo)
 
  if
 
   upper(newState) 
  in
 
   self.endStates:
 
  break

 
   else
 
  :
 handler = self.handlers[upper(newState)]

The StateMachine class is actually just what the abstract state machine needs. Because passing function objects using Python is so easy, this class requires very few lines compared to similar classes in other languages.

To actually use the StateMachine class, you need to create one handler for each state you want to use. The handler must conform to the pattern. It loops through the events until it wants to move to another state, at which point the handler should pass back a byte group (which includes the name of the new state and any cargo needed by the new state handler).

Using cargo as a variable in the StateMachine class encapsulates the data needed by the state handler (which does not have to call its cargo variable). The state handler USES cargo to pass the content required by the next handler, so the new handler can take over the work left by the previous handler. cargo typically includes a file handle that allows the next handler to read more data after the previous handler has stopped. cargo may also be a database connection, a complex class instance, or a list with several items.

Now, let's look at the test sample. In this case (outlined in the following code example), cargo is just a number that keeps feeding feedback to the iteration function. As long as val is in a range, the next value of val is always just math_func(val). Once a function returns a value out of range, the value is passed to another handler, or the state machine exits after calling a dead-state handler that does nothing. The example illustrates one thing: an event need not be an input event. It can also be a computed event (which is rare). The only difference between state handlers is that they use different tags when outputting the events they handle. This function is simple and does not require a state machine. But it's a good illustration of the concept. The code may be easier to understand than the explanation!
File: statemachine_test py


from
 
   statemachine 
  import
 
   StateMachine
  def
   ones_counter
 
  (val):
 
  print
 
  "ONES State: ",
 
  while
 
   1:
 
  if
 
   val <= 0 
  or
 
   val >= 30:
 newState = 
  "Out_of_Range" ; 
  break
 elif
 
   20 <= val < 30:
 newState = 
  "TWENTIES"; 
  break
 elif
 
   10 <= val < 20:
 newState = 
  "TENS"; 
  break
 else
 
  :
 
  print
 
  " @ %2.1f+" % val,
 val = math_func(val)
 
  print
 
  " >>"

 
   return
 
   (newState, val)
  def 
   tens_counter
 
  (val):
 
  print
 
  "TENS State: ",
 
  while
 
   1:
 
  if
 
   val <= 0 
  or
 
   val >= 30:
 newState = 
  "Out_of_Range"; 
  break
 elif
 
   1 <= val < 10:
 newState = 
  "ONES"; 
  break
 elif
 
   20 <= val < 30:
 newState = 
  "TWENTIES"; 
  break
 else
 
  :
 
  print
 
  " #%2.1f+" % val,
 val = math_func(val)
 
  print
 
  " >>"

 
   return
 
   (newState, val)
  def 
   twenties_counter
 
  (val):
 
  print
 
  "TWENTIES State:",
 
  while
 
   1:
 
  if
 
   val <= 0 
  or
 
   val >= 30:
 newState = 
  "Out_of_Range"; 
  break
 elif
 
   1 <= val < 10:
 newState = 
  "ONES"; 
  break
 elif
 
   10 <= val < 20:
 newState = 
  "TENS"; 
  break
 else
 
  :
 
  print
 
  " *%2.1f+" % val,
 val = math_func(val)
 
  print
 
  " >>"

 
   return
 
   (newState, val)
  def 
   math_func
 
  (n):
 
  from
 
   math 
  import
 
   sin
 
  return
 
   abs(sin(n))*31
  if
 
   __name__== 
  "__main__":
 m = StateMachine()
 m.add_state(
  "ONES", ones_counter)
 m.add_state(
  "TENS", tens_counter)
 m.add_state(
  "TWENTIES", twenties_counter)
 m.add_state(
  "OUT_OF_RANGE", None, end_state=1)
 m.set_start(
  "ONES")
 m.run(1)


Related articles: