Python directory operations: traversing a folder and storing the result as XML

  • 2020-04-02 13:25:20
  • OfStack

Linux distributions such as CentOS and Fedora ship with Python preinstalled, with versions ranging from 2.4 to 2.5, and most Windows servers have Python installed as well. So you only need to write the script on your own machine, upload it to the target server, and adjust the parameters at run time.

Python uses the os module to work with files and folders. The code below mainly relies on the following functions:

os.listdir: lists the files and folders in a directory
os.path.join: joins path components to get the full path of a file or folder
os.path.isfile: checks whether a path is a file
os.path.splitext: splits a name into its base part and its extension
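To make the four functions concrete, here is a minimal sketch that applies each of them to a throwaway folder. The helper name describe_entries and the temporary demo folder are made up for illustration and are not part of the original script.

```python
import os
import tempfile

def describe_entries(folder):
    """Return (name, is_file, extension) for each entry directly inside folder."""
    entries = []
    for name in sorted(os.listdir(folder)):   # listdir returns bare names, not full paths
        full = os.path.join(folder, name)     # build the full path from folder + name
        is_file = os.path.isfile(full)        # False for sub-folders
        ext = os.path.splitext(name)[1]       # extension including the dot, "" if none
        entries.append((name, is_file, ext))
    return entries

# Build a small temporary folder so the example does not depend on a real web root.
demo = tempfile.mkdtemp()
open(os.path.join(demo, "index.html"), "w").close()
os.mkdir(os.path.join(demo, "images"))

for entry in describe_entries(demo):
    print(entry)
```

Note that os.listdir only looks one level deep, which is why the traversal code has to call itself for each sub-folder.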

The following is the directory traversal code:


import os

def search(folder, filter, allfile):
    # Collect every file under folder whose extension is listed in filter.
    for name in os.listdir(folder):
        curname = os.path.join(folder, name)
        if os.path.isfile(curname):
            ext = os.path.splitext(curname)[1]
            if ext in filter:
                cur = myfile()
                cur.name = curname
                allfile.append(cur)
        elif os.path.isdir(curname):
            # Recurse into sub-folders; skip anything else (e.g. broken links).
            search(curname, filter, allfile)
    return allfile
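The same traversal can also be written with os.walk, which the standard library provides for visiting a directory tree. The following is a minimal sketch rather than the article's own method: the function name search_with_walk and the temporary demo tree are assumptions for illustration, and it returns plain path strings instead of myfile objects.

```python
import os
import tempfile

def search_with_walk(folder, extensions):
    """Return the full paths under folder whose extension is in extensions."""
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory in the tree.
    for dirpath, dirnames, filenames in os.walk(folder):
        for name in filenames:
            if os.path.splitext(name)[1] in extensions:
                matches.append(os.path.join(dirpath, name))
    return matches

# Throwaway tree: one match at the top level, one in a sub-folder, one non-match.
top = tempfile.mkdtemp()
sub = os.path.join(top, "docs")
os.mkdir(sub)
for path in [os.path.join(top, "index.html"),
             os.path.join(sub, "page.php"),
             os.path.join(sub, "notes.txt")]:
    open(path, "w").close()

found = search_with_walk(top, [".html", ".htm", ".php"])
print(sorted(found))
```

The explicit recursion gives more control over which sub-folders are entered, while os.walk keeps the code shorter; either approach visits the same files here.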

The file information is returned in instances of the custom class myfile. The program only uses the full path of each file; if you also need to record the size, time, type, or other information about a file, you can extend the class in the same way.


class myfile:
    def __init__(self):
        self.name = ""
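As mentioned, the record class can be extended to hold more than the path. One possible extension is sketched below; the class name FileInfo and the attribute names size, mtime, and ext are assumptions for illustration and do not appear in the original script.

```python
import os
import tempfile

class FileInfo(object):
    """Hypothetical extended record: path plus size, modification time, and extension."""
    def __init__(self, path):
        self.name = path                       # full path, as in the original class
        self.size = os.path.getsize(path)      # size in bytes
        self.mtime = os.path.getmtime(path)    # last modification time, seconds since the epoch
        self.ext = os.path.splitext(path)[1]   # extension including the dot

# Create a five-byte temporary file to try the class on.
fd, demo_path = tempfile.mkstemp(suffix=".html")
os.write(fd, b"hello")
os.close(fd)

info = FileInfo(demo_path)
print(info.name, info.size, info.ext)
os.remove(demo_path)
```

Filling the attributes in the constructor keeps the search function unchanged apart from the line that creates the record.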

Once you have the list that stores the file information, you can also save it in XML format. To do so, you need to import Document from xml.dom.minidom.

Here is the code that saves the list as XML:


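from xml.dom.minidom import Document

def generate(allfile, xml):
    doc = Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    # Add one <file><name>full path</name></file> element per matched file.
    for item in allfile:
        file_node = doc.createElement("file")
        root.appendChild(file_node)
        name = doc.createElement("name")
        file_node.appendChild(name)
        namevalue = doc.createTextNode(item.name)
        name.appendChild(namevalue)
    output = doc.toprettyxml(indent="")
    print(output)
    # Open with 'w' so a second run overwrites the file instead of
    # appending another XML document to it.
    f = open(xml, 'w')
    f.write(output)
    f.close()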
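To check the saved file, it can be read back with the same xml.dom.minidom module. The sketch below is an illustration under stated assumptions: the function name read_names is hypothetical, and the sample document is built inline in the root/file/name layout used above so that the example runs on its own.

```python
import os
import tempfile
from xml.dom.minidom import Document, parse

def read_names(xml_path):
    """Return the text of every <name> element in the file at xml_path."""
    doc = parse(xml_path)
    names = []
    for node in doc.getElementsByTagName("name"):
        # Join the text children and strip whitespace, since pretty-printing
        # may surround the text with newlines in some minidom versions.
        text = "".join([child.data for child in node.childNodes
                        if child.nodeType == child.TEXT_NODE])
        names.append(text.strip())
    return names

# Build a small sample document with two entries.
doc = Document()
root = doc.createElement("root")
doc.appendChild(root)
for path in ["/tmp/a.html", "/tmp/b.php"]:
    file_node = doc.createElement("file")
    root.appendChild(file_node)
    name_node = doc.createElement("name")
    file_node.appendChild(name_node)
    name_node.appendChild(doc.createTextNode(path))

# Write it to a temporary file, then read the names back.
fd, xml_path = tempfile.mkstemp(suffix=".xml")
os.close(fd)
f = open(xml_path, "w")
f.write(doc.toprettyxml(indent=""))
f.close()

names = read_names(xml_path)
print(names)
os.remove(xml_path)
```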

The code that runs the search and writes the XML file is as follows:


if __name__ == '__main__':
    folder = "/usr/local/apache/htdocs"
    filter = [".html", ".htm", ".php"]
    allfile = []
    allfile = search(folder, filter, allfile)
    # Use a separate name so the built-in len() is not overwritten.
    count = len(allfile)
    print("found: " + str(count) + " files")
    xml = "folder.xml"
    generate(allfile, xml)

On the Linux command line, run python filesearch.py to generate a file named folder.xml.

If you want to run the program on Windows, change the folder variable to a Windows-style path, such as "c:\\apache2\\htdocs" (with each backslash doubled inside the string), and then run c:\python25\python.exe filesearch.py (assuming Python is installed in c:\python25).
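Backslashes in Windows paths are easy to get wrong inside Python string literals, so a short sketch of the usual spellings may help. It uses ntpath, the Windows flavour of os.path, so that it behaves the same on any platform; the file name index.html is made up for illustration.

```python
import ntpath  # Windows implementation of os.path, importable on any platform

escaped = "c:\\apache2\\htdocs"   # every backslash doubled
raw = r"c:\apache2\htdocs"        # raw string: backslashes are taken literally
forward = "c:/apache2/htdocs"     # forward slashes, normalised below

# The first two spellings produce the same string.
print(escaped == raw)

# normpath converts forward slashes to backslashes under Windows rules.
print(ntpath.normpath(forward) == raw)

# Joining works as in the traversal code, using the Windows separator.
print(ntpath.join(raw, "index.html"))
```

On an actual Windows machine, os.path already behaves like ntpath, so the script itself needs no change other than the folder value.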

