Efficiency comparison of different methods of deleting files in Linux

  • 2021-08-28 21:39:24
  • OfStack

Let's test the efficiency of several ways of deleting a large number of files under Linux.

First, create 500,000 files:

$ for i in $(seq 1 500000); do echo text >> $i.txt; done
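For reference, the same setup can be sketched in Python. This is only an illustration, not what was used for the timings in this article; the function name create_files and the scaled-down file count are assumptions.

```python
import os
import tempfile

def create_files(directory, count):
    """Create 1.txt .. count.txt in directory (hypothetical helper)."""
    for i in range(1, count + 1):
        # Append mode mirrors the shell's >> redirection
        with open(os.path.join(directory, f"{i}.txt"), "a") as fh:
            fh.write("text\n")

if __name__ == "__main__":
    # Scaled down from 500,000 so the sketch runs quickly
    with tempfile.TemporaryDirectory() as tmp:
        create_files(tmp, 1000)
        print(len(os.listdir(tmp)), "files created")
```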

1. rm delete

$ time rm -f *
zsh: sure you want to delete all the files in /home/hungerr/test [yn]? y
zsh: argument list too long: rm
rm -f * 3.63s user 0.29s system 98% cpu 3.985 total

rm fails: the shell expands * into 500,000 file names, and the resulting argument list is too long to pass to rm.
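The "argument list too long" error comes from the kernel's limit on the total size of the arguments handed to a new process. As a rough check, the sketch below estimates how many bytes the 500,000 file names need and prints the system's limit next to it. The helper name expanded_size is an assumption, and the estimate ignores the environment and per-argument pointers, so the real requirement is somewhat larger.

```python
import os

def expanded_size(count, suffix=".txt"):
    """Bytes for the names 1.txt .. count.txt, each NUL-terminated (hypothetical helper)."""
    return sum(len(f"{i}{suffix}") + 1 for i in range(1, count + 1))

if __name__ == "__main__":
    needed = expanded_size(500_000)
    # SC_ARG_MAX is system-dependent; the value differs between machines
    limit = os.sysconf("SC_ARG_MAX")
    print(f"file names need {needed} bytes, ARG_MAX is {limit} bytes")
```

The names alone come to about 5.4 MB. On Linux with default settings the limit is commonly around 2 MiB, though that depends on the stack size limit, so treat the comparison as indicative only.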

2. find delete

$ time find ./ -type f -exec rm {} \;
find ./ -type f -exec rm {} \; 49.86s user 1032.13s system 41% cpu 43:19.17 total

About 43 minutes on my computer... although I was watching a video while it ran.
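A likely reason for the poor result is that -exec rm {} \; starts a separate rm process for every single file. The sketch below illustrates that overhead on a small scale by comparing one rm process per file with unlinking inside a single process. All function names and the file count are assumptions for illustration, and the numbers it prints will vary from machine to machine.

```python
import os
import subprocess
import tempfile
import time

def make_files(directory, count):
    for i in range(count):
        with open(os.path.join(directory, f"{i}.txt"), "w") as fh:
            fh.write("text\n")

def delete_with_rm_per_file(directory):
    # One rm process per file, similar in spirit to find's -exec with \;
    for name in os.listdir(directory):
        subprocess.run(["rm", os.path.join(directory, name)], check=True)

def delete_in_process(directory):
    # Unlink each file directly, with no extra processes
    for name in os.listdir(directory):
        os.unlink(os.path.join(directory, name))

def timed(delete, count=200):
    """Return (elapsed seconds, files left) for one deletion strategy."""
    with tempfile.TemporaryDirectory() as tmp:
        make_files(tmp, count)
        start = time.perf_counter()
        delete(tmp)
        elapsed = time.perf_counter() - start
        remaining = len(os.listdir(tmp))
    return elapsed, remaining

if __name__ == "__main__":
    for fn in (delete_with_rm_per_file, delete_in_process):
        elapsed, remaining = timed(fn)
        print(f"{fn.__name__}: {elapsed:.3f}s, {remaining} files left")
```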

3. find with -delete

$ time find ./ -type f -delete
find ./ -type f -delete 0.43s user 11.21s system 2% cpu 9:13.38 total

It takes 9 minutes.

4. rsync delete

# Create an empty folder blanktest first
$ time rsync -a --delete blanktest/ test/
rsync -a --delete blanktest/ test/ 0.59s user 7.86s system 51% cpu 16.418 total

16 seconds. Fast and powerful.

5. Python delete


import os
import timeit

def main():
    # Walk the directory tree and remove every file found
    for pathname, dirnames, filenames in os.walk('/home/username/test'):
        for filename in filenames:
            file = os.path.join(pathname, filename)
            os.remove(file)

if __name__ == '__main__':
    # Run main() once and print the elapsed time in seconds
    t = timeit.Timer('main()', 'from __main__ import main')
    print(t.timeit(1))

$ python test.py
529.309022903

It takes about nine minutes.
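A variant worth trying uses os.scandir, which yields directory entries one at a time instead of building the full list of names first. This is a minimal sketch for a flat directory like the one in this test; the function name remove_files is an assumption, and it was not part of the measurements here, so no speed claim is made for it.

```python
import os
import tempfile

def remove_files(directory):
    """Delete regular files directly under directory; return how many were removed."""
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files are removed
            if entry.is_file(follow_symlinks=False):
                os.unlink(entry.path)
                removed += 1
    return removed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(1000):
            open(os.path.join(tmp, f"{i}.txt"), "w").close()
        print(remove_files(tmp), "files removed")
```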

6. Perl delete

$ time perl -e 'for(<*>){((stat)[9]<(unlink))}'
perl -e 'for(<*>){((stat)[9]<(unlink))}' 1.28s user 7.23s system 50% cpu 16.784 total

16 seconds. This is supposed to be the fastest method; here it is on par with rsync.

7. Results:

  • rm: fails, too many files (argument list too long)
  • find with -exec: 43 minutes for 500,000 files
  • find with -delete: 9 minutes
  • Perl: 16 s
  • Python: 9 minutes
  • rsync with --delete: 16 s
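To put the results on one scale, the sketch below converts the wall-clock times reported above to seconds and expresses each as a multiple of the fastest one. The helper name to_seconds is an assumption; the figures are copied from the runs in this article, each of which was measured only once.

```python
def to_seconds(stamp):
    """Convert 'MM:SS.ss' or plain 'SS.ss' to seconds (hypothetical helper)."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

# Wall-clock totals as reported in the sections above
timings = {
    "find -exec rm": "43:19.17",
    "find -delete": "9:13.38",
    "Python os.remove": "529.309022903",
    "Perl unlink": "16.784",
    "rsync --delete": "16.418",
}

fastest = min(to_seconds(t) for t in timings.values())

if __name__ == "__main__":
    for method, stamp in timings.items():
        secs = to_seconds(stamp)
        print(f"{method:18s} {secs:9.2f} s  {secs / fastest:6.1f}x")
```

On these numbers, find with -exec comes out roughly 158 times slower than rsync.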

Conclusion: rsync is the fastest and most convenient way to delete a large number of small files.

