
Copying Files With Python

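I'm relatively new to Python, and am part of the way through 'Learn Python the Hard Way,' but have a question. So, from what I've read, if you want to make a copy of a text file, you can just open and read its contents to a variable, then write this variable to a different file. I've tested it out with images, and it actually seems to work. Are there downsides to this method of copying that I'll run into later, and are there any file types it specifically won't work for?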

Solution 1:

You should use shutil.copyfile() or shutil.copyfileobj() instead; both copy the data efficiently and correctly using a buffer.
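As a minimal sketch of how the two functions are called (the temporary directory, file names, and the read_bytes helper are made up for illustration and are not part of the original answer):

```python
import os
import shutil
import tempfile

# Hypothetical demo files created in a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")
dst_by_name = os.path.join(workdir, "copy_by_name.bin")
dst_by_object = os.path.join(workdir, "copy_by_object.bin")

# Write some arbitrary binary data to copy.
with open(src, "wb") as f:
    f.write(bytes(range(256)) * 1000)

# copyfile() takes two paths and copies the file contents.
shutil.copyfile(src, dst_by_name)

# copyfileobj() takes two already-open file objects; open both in binary mode.
with open(src, "rb") as fsrc, open(dst_by_object, "wb") as fdst:
    shutil.copyfileobj(fsrc, fdst)


def read_bytes(path):
    """Return the full contents of a (small) file as bytes."""
    with open(path, "rb") as f:
        return f.read()


# Both copies should be byte-for-byte identical to the source.
print(read_bytes(dst_by_name) == read_bytes(src))
print(read_bytes(dst_by_object) == read_bytes(src))
```

copyfile() is the simpler choice when you have two paths; copyfileobj() is useful when the source or destination is already an open file-like object.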

Not that it is particularly hard to do yourself; shutil.copyfileobj() is implemented as:

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

Reading the file in chunks makes sure that a big file doesn't fill up your memory. Also, .read() is not guaranteed to return all of the file's data in one call, so you could end up with an incomplete copy if you don't loop until .read() returns an empty string.

Solution 2:

One caveat is that .read() isn't guaranteed to read the entire file at once, so you must repeat the read/write cycle until all of the data has been copied. Another is that there may not be enough memory to hold all of the data at once, in which case you'll need to perform multiple partial reads and writes to complete the copy.
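A rough sketch of such a read/write cycle follows. The function name copy_in_chunks and the 64 KiB chunk size are arbitrary choices for illustration, and the b"" sentinel assumes both file objects are in binary mode:

```python
import io
from functools import partial


def copy_in_chunks(fsrc, fdst, chunk_size=64 * 1024):
    """Copy fsrc to fdst one chunk at a time; return the number of bytes copied."""
    total = 0
    # iter(callable, sentinel) keeps calling fsrc.read(chunk_size)
    # until it returns b"", which signals end of file in binary mode.
    for chunk in iter(partial(fsrc.read, chunk_size), b""):
        fdst.write(chunk)
        total += len(chunk)
    return total


# In-memory file objects stand in for real files opened with "rb" and "wb".
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
copied = copy_in_chunks(source, target)
print(copied)
```

Only one chunk is held in memory at a time, so the size of the file being copied does not matter.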

Solution 3:

So, from what I've read, if you want to make a copy of a text file, you can just open and read its contents to a variable, then write this variable to a different file. I've tested it out with images, and it actually seems to work. Are there downsides to this method of copying that I'll run into later, and are there any file types it specifically won't work for?

If you open both files in binary mode, read with .read(), and write with .write(), then you get an exact copy. If you use other mechanisms, you could strip out line endings or run into other trouble, particularly when you work cross-platform.

From the docs

On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.
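The difference between the two modes can be seen with a small experiment. This is only a sketch, assuming Python 3 (where text mode also translates line endings on read, on every platform); the file names and sample bytes are invented for the demonstration:

```python
import os
import tempfile

# Sample bytes with mixed line endings: \r\n, a lone \r, and \n.
original = b"line1\r\nline2\rline3\n"

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src.dat")
with open(src, "wb") as f:
    f.write(original)

# Binary-mode copy: the bytes pass through untouched.
binary_dst = os.path.join(workdir, "binary_copy.dat")
with open(src, "rb") as fin, open(binary_dst, "wb") as fout:
    fout.write(fin.read())

# Text-mode copy: line endings are normalized to "\n" on read,
# and "\n" is written back out as the platform's os.linesep.
text_dst = os.path.join(workdir, "text_copy.dat")
with open(src, "r", encoding="ascii") as fin, \
        open(text_dst, "w", encoding="ascii") as fout:
    fout.write(fin.read())

# Read both copies back as raw bytes for comparison.
with open(binary_dst, "rb") as f:
    binary_copy = f.read()
with open(text_dst, "rb") as f:
    text_copy = f.read()

print(binary_copy == original)  # the binary copy matches the source
print(text_copy == original)    # the text copy has rewritten line endings
```

The whole file is read at once here only because the sample is tiny; for real files the chunked approach still applies.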

In any case, use other approaches to file copy, like the ones suggested by others.
