Tutorial: dynamic reloading in Python

  • 2020-05-09 18:46:42
  • OfStack

Consider a scenario: you want to run a process on a local machine, but some of the program logic lives somewhere else. In particular, assume that the program logic is updated from time to time, and that you want to use the latest version whenever you run the process. There are many ways to meet these requirements; this article shows several of them.

As the "Charming Python" column has progressed, it has discussed the ongoing enhancements to my public-domain utility, Txt2Html. This utility converts "smart ASCII" text files to HTML. Previous articles discussed a Web proxy version of the utility and a curses interface to it. Along the way, I have occasionally noticed that some ASCII markup could be converted in a better way, or that an error in handling a particular markup construct could be fixed.

In fact, the articles in this column are written in ASCII and then converted during the editing process into the HTML format that you read. Before submitting a draft, I run a command similar to the following:
Command line to convert an article to HTML


txt2html charming_python_7.txt > charming_python_7.html

If I want, I can specify some flags to modify the operation; either way, the latest version of the converter is on my local drive and in my path. If I am working on another machine, or if a reader wants to use the utility, the process is more troublesome: visit my website, compare version numbers and file dates (sometimes a change is so small that I do not change the version number), download the current release, copy it to the correct directory, and then run it with the command-line switches (see the resources later in this article).

The above process involves several manual and time-consuming steps. It should be easier, and it can be done.
Command line Web access

Most people think of the Web as a way to browse pages interactively in a GUI environment. That is all well and good, but there is a lot of power on the command line, too. On a system with the text-mode Web browser lynx, the entire Web can be treated as just another set of files for command-line tools to work with. For example, I have found these commands useful:
Use lynx for command line Web browsing


lynx -dump http://gnosis.cx/publish/.
lynx -dump http://ibm.com/developerworks/. > ibm_developer.txt
lynx -dump http://gnosis.cx/publish | wc | sed "s/\( *[0-9]* *\)\([0-9]*\)\(.*\)/\2/g"

The first line says: "display the David Mertz home page (as ASCII text) on the console." The second says: "save an ASCII version of IBM's current developerWorks home page to a file." The third says: "display the number of words on David Mertz's home page." (Don't worry about the details; it just shows command-line tools working together in a pipeline.)

One thing to note about lynx (with the -dump option) is that it does almost exactly the opposite of Txt2Html: lynx converts HTML to text, while Txt2Html converts text to HTML. But there is no reason Txt2Html should not be as versatile as lynx. A very short Python script makes it so:
'fetch_txt2html.py' command line converter


import sys
from urllib import urlopen, urlencode

if len(sys.argv) == 2:
    # Send the source URL to the remote Txt2Html CGI and print the result
    cgi = 'http://gnosis.cx/cgi/txt2html.cgi'
    opts = urlencode({'source': sys.argv[1], 'proxy': 'NONE'})
    print urlopen(cgi, opts).read()
else:
    print "Please specify URL for Txt2Html conversion"

To run the script, do the following:


python fetch_txt2html.py http://gnosis.cx/publish/programming/charming_python_7.txt

This does not give you all the switches of a local Txt2Html run, but they are easy to add if necessary. You can pipe and redirect the output just as you would with any other command-line tool. In the version above, however, you can only process data files that are reachable by URL, not local files.
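As a rough sketch of how further switches might be passed along, the request body can be built by a small helper. This version uses Python 3's urllib.parse rather than the Python 2 urllib module used in the listing; the helper name build_post_data and its extra_options parameter are hypothetical, and only the 'source' and 'proxy' fields come from the script itself. Which additional field names the CGI accepts is not stated here, so none are assumed.

```python
from urllib.parse import urlencode


def build_post_data(source_url, extra_options=None):
    """Return the URL-encoded POST body for the Txt2Html CGI as bytes.

    'source' and 'proxy' are the fields used by fetch_txt2html.py;
    extra_options is a hypothetical dict of further form fields.
    """
    fields = {'source': source_url, 'proxy': 'NONE'}
    if extra_options:
        fields.update(extra_options)
    # urlopen() in Python 3 expects the POST body as bytes
    return urlencode(fields).encode('ascii')
```

In Python 3 the result would be handed to urllib.request.urlopen(cgi, build_post_data(url)), mirroring the urlopen(cgi, opts) call in the script.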

In fact, fetch_txt2html.py does something that lynx cannot (and neither can Txt2Html itself): it fetches not only the data source from a URL, but the program logic as well. With fetch_txt2html.py there is no need to install Txt2Html on the local machine; the remote call is handled by the latest version, and the results are sent back just as if a local process had run. Neat, isn't it? A local copy of Txt2Html can access remote URLs just as easily as local files, but nothing guarantees that the copy itself is up to date.

Dynamic initialization

Using fetch_txt2html.py ensures that the latest program logic is always used in the conversion. However, this approach also shifts the processor (and memory) requirements to the gnosis.cx Web server. The load from this particular process is not high, but it is easy to imagine other kinds of processes that would be handled more efficiently, and more appropriately, on the client.

Txt2Html is organized the way most programs are: core flow-control functions rely on a variety of utility functions. The utility functions are the ones that are updated frequently; the core functions (main() and a few others) change only in a major rewrite. Ideally, then, the utility functions would be refreshed each time the program runs. Even so, most of the time most of the functions in the main Txt2Html module, dmTxt2Html, are good enough.
't2h_textfuncs.py' dynamic Txt2Html update


"""Hot-pluggable replacement functions for Txt2Html"""
import re

#-- Functions to massage blocks by type
#def Titleify(block):
#def Authorify(block):
# ... [more block massaging functions] ...

#-- Utility functions for text transformation
#def AdjustCaps(txt):
#def capwords(txt):
#def URLify(txt):
def Typographify(txt):
    # [module] names
    r = re.compile(r"""([\(\s'/">]|^)\[(.*?)\]([<\s\.\),:;'"?!/-])""", re.M | re.S)
    txt = r.sub('\\1<em><code>\\2</code></em>\\3', txt)
    # *strongly emphasize* words
    r = re.compile(r"""([\(\s'/"]|^)\*(.*?)\*([\s\.\),:;'"?!/-])""", re.M | re.S)
    txt = r.sub('\\1<strong>\\2</strong>\\3', txt)
    # ... [more text massaging] ...
    return txt
# ... [more text transformation functions] ...

Using the latest version of this support module takes a few preparatory steps. First, download the main Txt2Html module to the local system (a one-time step). Second, create a Python script on the local system similar to the following example:
'dyn_txt2html.py' command line converter


from dmTxt2Html import *   # Import the body of 'Txt2Html' code
from urllib import urlopen
import sys

# Check for updated functions (fail gracefully if not fetchable)
try:
    updates = urlopen('http://gnosis.cx/download/t2h_textfuncs.py').read()
    fh = open('t2h_textfuncs.py', 'w')
    fh.write(updates)
    fh.close()
except:
    sys.stderr.write('Cannot currently download Txt2Html updates')

# Import the updated functions (if available)
try:
    from t2h_textfuncs import *
except:
    sys.stderr.write('Cannot import the updated Txt2Html functions')

# Set options based on runmode (shell vs. CGI)
if len(sys.argv) >= 2:
    cfg_dict = ParseArgs(sys.argv[1:])
    main(cfg_dict)
else:
    print "Please specify URL (and options) for Txt2Html conversion"

In the dyn_txt2html.py script, note that when the from t2h_textfuncs import * statement executes, every function previously defined in dmTxt2Html (such as Typographify()) is replaced by the function of the same name from t2h_textfuncs. Of course, a function that is commented out in t2h_textfuncs is not replaced.
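The replacement behavior of two successive star-imports can be seen in a small, self-contained sketch. It is written for Python 3 and uses in-memory stand-in modules; the names demo_main and demo_updates and the helper make_module are hypothetical and exist only for this illustration, not in Txt2Html.

```python
import sys
import types


def make_module(name, source):
    """Build an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module  # lets the import statement find it
    return module


# Stand-in for the main module: defines both functions
make_module(
    "demo_main",
    "def Typographify(txt):\n    return 'main:' + txt\n"
    "def URLify(txt):\n    return 'main-url:' + txt\n",
)

# Stand-in for the update module: URLify is commented out
make_module(
    "demo_updates",
    "#def URLify(txt):\n"
    "def Typographify(txt):\n    return 'updated:' + txt\n",
)

# Star-imports are only legal at module level, so run them in a namespace
namespace = {}
exec("from demo_main import *\nfrom demo_updates import *", namespace)

typographify_result = namespace["Typographify"]("x")  # from demo_updates
urlify_result = namespace["URLify"]("x")              # still from demo_main
```

The later import wins for every name both modules define, while names that exist only in the first module are left alone.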

One small matter to note is that different systems handle writes to STDERR differently. On UNIX-like systems, you can redirect STDERR when you run the script. But under the current OS/2 and Windows/DOS shells, STDERR messages are simply appended to the console output. You might want to write the errors and warnings above to a log file instead, or just get used to redirecting STDOUT to a file (which is probably what you want anyway). For example:
Command line session for 'dyn_txt2html'


G:\txt2html> python dyn_txt2html.py test.txt > test.html
Cannot currently download Txt2Html updates

The error message goes to the console; the converted output goes to the file.
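If a log file is preferred over STDERR, one possible approach is the standard logging module. The following is a minimal sketch for Python 3; the function name make_update_logger and the logger name "txt2html.updates" are assumptions made for illustration and are not part of Txt2Html.

```python
import logging


def make_update_logger(log_path):
    """Return a logger that appends warnings to log_path instead of STDERR."""
    logger = logging.getLogger("txt2html.updates")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.propagate = False       # keep messages off the console
    if not logger.handlers:        # avoid duplicate handlers on repeat calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

A script along the lines of dyn_txt2html.py could then call make_update_logger('txt2html.log').warning('Cannot currently download Txt2Html updates') in its except clauses, leaving the console free of messages on every platform.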

A more interesting question is why dyn_txt2html.py downloads only the support module rather than the entire dmTxt2Html module. There are reasons, of course. The t2h_textfuncs support module is much smaller than the main dmTxt2Html module, especially since most of its functions have been trimmed or commented out, so over a modem connection the download is significantly faster. But download size is not the main reason.

For Txt2Html, it would not matter much if users automatically downloaded the entire latest module. But what happens when program logic is distributed, and especially when maintenance responsibilities are distributed? You might have Alice, Bob, and Charlie responsible for the modules Funcs_A, Funcs_B, and Funcs_C, respectively. Each of them changes the functions they are responsible for periodically (and independently) and uploads the latest and best version to their own website (for example, http://alice.com/Funcs_A.py). In this situation, it is not practical to have all three programmers modify the same main module. However, a script like dyn_txt2html.py extends directly to try importing Funcs_A, Funcs_B, and Funcs_C at startup, falling back to the versions in the main program (MainProg) if those resources are unavailable.
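The import-with-fallback idea might look roughly like the following Python 3 sketch. The function import_with_fallback is hypothetical, and the download step is left out; the sketch assumes any updated module files have already been fetched into a directory on the import path.

```python
import importlib


def import_with_fallback(module_names, namespace):
    """Overlay public names from each importable module onto namespace.

    Modules that cannot be imported are skipped, so the definitions
    already present in namespace stay in effect.
    """
    loaded = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing or broken module: keep the main program's versions
            continue
        for attr in dir(module):
            if not attr.startswith("_"):
                namespace[attr] = getattr(module, attr)
        loaded.append(name)
    return loaded
```

A main program could call import_with_fallback(["Funcs_A", "Funcs_B", "Funcs_C"], globals()) after defining its own versions of the functions; the return value lists the modules that were actually picked up.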

A long running dynamic process

So far, the tools we have looked at pick up dynamic program logic by downloading an updated resource at initialization time. That makes sense for command-line or batch processing, but what about long-running applications? Such applications are most likely to be server processes that continually respond to client requests. Here, though, we will use curses_txt2html.py, developed for the previous article, to illustrate Python's reload() function. The program curses_txt2html is a wrapper around a local copy of dmTxt2Html. This is not the place to revisit curses programming; it is enough to know that curses_txt2html provides a set of interactive menus for configuring and running several successive Txt2Html conversions.

curses_txt2html might be left running in the background, and we want it to use the latest program logic whenever we switch to its session and run a conversion. For this simple example, shutting down and restarting the application is neither difficult nor particularly disruptive. But it is easy to imagine other processes that really do need to keep running (perhaps processes that track the state of operations performed during the session).

For this article, a new File/Update submenu item has been added. When activated, it simply calls the new function update_txt2html(). Apart from the curses calls that report what is happening, we have already seen these steps in other examples in this article:
'curses_txt2html.py' dynamic update function


def update_txt2html():
    # Check for updated functions (fail gracefully if not fetchable)
    s = curses.newwin(6, 60, 4, 5)
    s.box()
    s.addstr(1, 2, "* PRESS ANY KEY TO CONTINUE *", curses.A_BOLD)
    s.addstr(3, 2, "...downloading...")
    s.refresh()
    try:
        from urllib import urlopen
        updates = urlopen('http://gnosis.cx/download/dmTxt2Html.py').read()
        fh = open('dmTxt2Html.py', 'w')
        fh.write(updates)
        fh.close()
        s.addstr(3, 2, "Module [dmTxt2Html] downloaded to current directory")
    except:
        s.addstr(3, 2, "Download of updated [dmTxt2Html] module failed!")
    reload(dmTxt2Html)
    s.addstr(4, 2, "Module [dmTxt2Html] reloaded from current directory ")
    s.refresh()
    c = s.getch()
    s.erase()

There are two important differences between dyn_txt2html.py and the update_txt2html() function. The first is that update_txt2html() goes ahead and downloads the main dmTxt2Html module rather than just the support functions. This is mainly to keep the import simple. The issue is that curses_txt2html accesses the module with import dmTxt2Html, not from dmTxt2Html import *. In many ways this is the safer practice, but it also makes it harder to override functions in dmTxt2Html (whether accidentally or deliberately). If we wanted to attach the functions from t2h_textfuncs, we would have to perform a dir() on the imported support module and attach its members as attributes of the "dmTxt2Html" namespace. Overriding in this style is left as an exercise for the reader.
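For readers who want a starting point, one way the dir()-and-attach approach could be written is sketched below in Python 3. The function attach_overrides and the stand-in modules demo_dmTxt2Html and demo_textfuncs are hypothetical; they only imitate the shape of the real modules.

```python
import types


def attach_overrides(target_module, support_module):
    """Copy public members of support_module onto target_module."""
    replaced = []
    for name in dir(support_module):
        if name.startswith("_"):
            continue  # skip private and dunder attributes
        setattr(target_module, name, getattr(support_module, name))
        replaced.append(name)
    return replaced


# Stand-in for the main module: convert() calls Typographify() by name
main_mod = types.ModuleType("demo_dmTxt2Html")
exec(
    "def Typographify(txt):\n    return txt\n"
    "def convert(txt):\n    return Typographify(txt)\n",
    main_mod.__dict__,
)

# Stand-in for the support module with an updated Typographify()
support_mod = types.ModuleType("demo_textfuncs")
exec(
    "def Typographify(txt):\n    return '<em>' + txt + '</em>'\n",
    support_mod.__dict__,
)

replaced_names = attach_overrides(main_mod, support_mod)
converted = main_mod.convert("hi")  # convert() now finds the replacement
```

Because functions look up global names in their own module at call time, code inside the main module that calls Typographify() picks up the attached version without being modified. The attached function itself still runs with the support module's globals, so the support module must import whatever it needs.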

The main difference in update_txt2html() is its use of Python's built-in reload() function. Simply executing import dmTxt2Html again does not overwrite the previously imported functions. Pay close attention to this point! Many beginners believe that re-importing a module updates the in-memory version. It does not. The way to update the in-memory image of the functions in a module is to reload() the module.
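The difference between re-importing and reloading can be checked with a throwaway module. This sketch targets Python 3, where reload() lives in importlib rather than among the built-ins; the module name hotmod_demo and the function demo_reload are made up for the illustration.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so this quick demo always re-reads the source file
sys.dont_write_bytecode = True


def demo_reload():
    """Return version() results after import, re-import, and reload."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "hotmod_demo.py")
    with open(path, "w") as fh:
        fh.write("def version():\n    return 'old'\n")
    sys.path.insert(0, workdir)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module("hotmod_demo")
        first = mod.version()

        # Simulate a downloaded update overwriting the module file
        with open(path, "w") as fh:
            fh.write("def version():\n    return 'new logic'\n")

        # A second import just returns the module cached in sys.modules
        mod = importlib.import_module("hotmod_demo")
        after_import = mod.version()

        # reload() re-executes the source in the existing module object
        importlib.reload(mod)
        after_reload = mod.version()
    finally:
        sys.path.remove(workdir)
        sys.modules.pop("hotmod_demo", None)
    return first, after_import, after_reload
```

Calling demo_reload() returns ('old', 'old', 'new logic'): the second import changes nothing, and only the reload picks up the rewritten file.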

One more trick is used in the example above. The updated dmTxt2Html module is downloaded to the local working directory, which may or may not be the directory from which dmTxt2Html was originally loaded. Indeed, if the original lives in the Python library directory, you are probably not working in that directory (and may not have user permission to write to it). But the reload() call tries to load from the current directory first and then tries the rest of the Python path. So whether or not the download succeeds, reload() should be a safe operation (although it may or may not load a new module).

