Extracting file names and paths from a folder with Linux commands

  • 2021-08-12 04:14:40
  • OfStack

Recently I needed to automate a search for specific files in specific folders, saving the directory path and the file name of each match separately. This can be done with os.walk in Python, but that felt more complicated than necessary, so I wanted to see whether Linux's own commands could do the job.

Environment

The directory structure to search is as follows:

.
|____test
| |____test2.txt
| |____test.py
| |____test.txt
| |____regex.py
|____MongoDB
| |____.gitignore
| |____cnt_fail.py
| |____db

Goal 1: Get all py file names

If only find . -name '*.py' is used for the lookup, each result includes its path:

./test/test.py
./test/regex.py
./MongoDB/cnt_fail.py

If we only need the file name, we can use the basename command provided by Linux.

To run basename on every result of find, we use find's -exec option.

The final command is:

find . -name '*.py' -exec basename {} \;

Results:

test.py
regex.py
cnt_fail.py

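Here {} is used together with the -exec option: find replaces it with the path of each match, so basename runs once per file and prints its name. The escaped semicolon (\;) marks the end of the command passed to -exec.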
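To try Goal 1 without touching real data, the example tree can be rebuilt in a scratch directory. This is only a sketch: the temporary location comes from mktemp, db is assumed to be a directory (the listing does not say), and the variable names demo_dir and names are made up for illustration. The output is piped through sort so that the order does not depend on the file system.

```shell
#!/bin/sh
# Recreate the example tree in a temporary directory (scratch location from mktemp).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/test" "$demo_dir/MongoDB/db"   # assumption: db is a directory
touch "$demo_dir/test/test2.txt" "$demo_dir/test/test.py" \
      "$demo_dir/test/test.txt" "$demo_dir/test/regex.py" \
      "$demo_dir/MongoDB/.gitignore" "$demo_dir/MongoDB/cnt_fail.py"

# -exec runs basename once per match: {} is replaced by the matched path,
# and the escaped semicolon ends the command given to -exec.
names=$(find "$demo_dir" -name '*.py' -exec basename {} \; | sort)
printf '%s\n' "$names"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

Because basename strips everything up to the last slash, it makes no difference here that find is started from an absolute path instead of the current directory.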

Goal 2: Get the directories of all py files, remove duplicates, and delete the leading "./"

Linux also has a command for getting the directory part of a path: dirname.

Slightly modifying the previous command displays the directory of every match:

find . -name '*.py' -exec dirname {} \;

Search results:

./test
./test
./MongoDB

As you can see, the output contains duplicate directories. They can be removed with sort and its -u option, which keeps only one copy of each identical line in the sorted result.
We need to pass the output of the previous command to sort as its input, so pipes are the natural choice.

The pipe operator is |. It only handles the normal output of the preceding command, that is, its standard output; it has no direct way of processing standard error. That standard output is then passed to the next command as its standard input.
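The difference between standard output and standard error in a pipe can be checked with a small experiment. This is a sketch under assumptions: /nonexistent-demo-path is a made-up path that is assumed not to exist, and the variable names are illustrative. find then prints nothing on standard output and an error message on standard error.

```shell
#!/bin/sh
# Hypothetical path, assumed not to exist, so find only produces an error message.
missing=/nonexistent-demo-path
err_file=$(mktemp)

# Standard error is sent to a file; only standard output enters the pipe,
# so wc -l has nothing to count. tr strips the padding some wc versions add.
piped_lines=$(find "$missing" -name '*.py' 2>"$err_file" | wc -l | tr -d ' ')

# 2>&1 merges standard error into standard output before the pipe,
# so the error message is now counted as well.
merged_lines=$(find "$missing" -name '*.py' 2>&1 | wc -l | tr -d ' ')

echo "lines through the pipe (stdout only): $piped_lines"
echo "lines through the pipe (stdout + stderr): $merged_lines"

rm -f "$err_file"
```

The first count is 0 because the error message never enters the pipe; the second is larger because 2>&1 redirects it there.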

The command after adding sort is

find . -name '*.py' -exec dirname {} \; | sort -u

The result of running is:

./MongoDB
./test

Finally, we use cut to delete the ./ at the start of each line. The option -c3- extracts the substring from the third character (positions start at 1) to the end of the line.
The final command is:

find . -name '*.py' -exec dirname {} \; | sort -u | cut -c3-

Run results:

MongoDB
test
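Since the original aim was to save directories and file names separately, the two goals can be wrapped into small shell functions. This is a minimal sketch, not a finished tool: the function names list_py_dirs and list_py_names are hypothetical, and each function takes the directory to search as its only argument.

```shell
#!/bin/sh
# list_py_dirs DIR: print the unique directories below DIR that contain .py
# files, without the leading "./". (Hypothetical helper name.)
list_py_dirs() {
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" && find . -name '*.py' -exec dirname {} \; | sort -u | cut -c3- )
}

# list_py_names DIR: print the names of all .py files below DIR, sorted.
# (Hypothetical helper name.)
list_py_names() {
    ( cd "$1" && find . -name '*.py' -exec basename {} \; | sort )
}
```

For example, list_py_dirs . > dirs.txt and list_py_names . > names.txt would write the two lists to separate files (the file names are arbitrary). One caveat: a py file located directly in the search directory has the dirname ".", which cut -c3- turns into an empty line.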

