You are here: Home > Dive Into Python > Exceptions and File Handling > Working with Directories | << >> | ||||
Dive Into PythonPython from novice to pro |
The os.path module has several functions for manipulating files and directories. Here, we're looking at handling pathnames and listing the contents of a directory.
Example 6.16. Constructing Pathnames
>>> import os >>> os.path.join("c:\\music\\ap\\", "mahadeva.mp3") 'c:\\music\\ap\\mahadeva.mp3' >>> os.path.join("c:\\music\\ap", "mahadeva.mp3") 'c:\\music\\ap\\mahadeva.mp3' >>> os.path.expanduser("~") 'c:\\Documents and Settings\\mpilgrim\\My Documents' >>> os.path.join(os.path.expanduser("~"), "Python") 'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'
os.path is a reference to a module -- which module depends on your platform. Just as getpass encapsulates differences between platforms by setting getpass to a platform-specific function, os encapsulates differences between platforms by setting path to a platform-specific module. | |
The join function of os.path constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing with pathnames on Windows is annoying because the backslash character must be escaped.) | |
In this slightly less trivial case, join will add an extra backslash to the pathname before joining it to the filename. I was overjoyed when I discovered this, since addSlashIfNecessary is one of the stupid little functions I always need to write when building up my toolbox in a new language. Do not write this stupid little function in Python; smart people have already taken care of it for you. | |
expanduser will expand a pathname that uses ~ to represent the current user's home directory. This works on any platform where users have a home directory, like Windows, UNIX, and Mac OS X; it has no effect on Mac OS. | |
Combining these techniques, you can easily construct pathnames for directories and files under the user's home directory. |
Example 6.17. Splitting Pathnames
>>> os.path.split("c:\\music\\ap\\mahadeva.mp3") ('c:\\music\\ap', 'mahadeva.mp3') >>> (filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3") >>> filepath 'c:\\music\\ap' >>> filename 'mahadeva.mp3' >>> (shortname, extension) = os.path.splitext(filename) >>> shortname 'mahadeva' >>> extension '.mp3'
The split function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use multi-variable assignment to return multiple values from a function? Well, split is such a function. | |
You assign the return value of the split function into a tuple of two variables. Each variable receives the value of the corresponding element of the returned tuple. | |
The first variable, filepath, receives the value of the first element of the tuple returned from split, the file path. | |
The second variable, filename, receives the value of the second element of the tuple returned from split, the filename. | |
os.path also contains a function splitext, which splits a filename and returns a tuple containing the filename and the file extension. You use the same technique to assign each of them to separate variables. |
Example 6.18. Listing Directories
>>> os.listdir("c:\\music\\_singles\\") ['a_time_long_forgotten_con.mp3', 'hellraiser.mp3', 'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3', 'spinning.mp3'] >>> dirname = "c:\\" >>> os.listdir(dirname) ['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'cygwin', 'docbook', 'Documents and Settings', 'Incoming', 'Inetpub', 'IO.SYS', 'MSDOS.SYS', 'Music', 'NTDETECT.COM', 'ntldr', 'pagefile.sys', 'Program Files', 'Python20', 'RECYCLER', 'System Volume Information', 'TEMP', 'WINNT'] >>> [f for f in os.listdir(dirname) ... if os.path.isfile(os.path.join(dirname, f))] ['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'IO.SYS', 'MSDOS.SYS', 'NTDETECT.COM', 'ntldr', 'pagefile.sys'] >>> [f for f in os.listdir(dirname) ... if os.path.isdir(os.path.join(dirname, f))] ['cygwin', 'docbook', 'Documents and Settings', 'Incoming', 'Inetpub', 'Music', 'Program Files', 'Python20', 'RECYCLER', 'System Volume Information', 'TEMP', 'WINNT']
The listdir function takes a pathname and returns a list of the contents of the directory. | |
listdir returns both files and folders, with no indication of which is which. | |
You can use list filtering and the isfile function of the os.path module to separate the files from the folders. isfile takes a pathname and returns 1 if the path represents a file, and 0 otherwise. Here you're using os.path.join to ensure a full pathname, but isfile also works with a partial path, relative to the current working directory. You can use os.getcwd() to get the current working directory. | |
os.path also has a isdir function which returns 1 if the path represents a directory, and 0 otherwise. You can use this to get a list of the subdirectories within a directory. |
Example 6.19. Listing Directories in fileinfo.py
def listDirectory(directory, fileExtList): "get list of file info objects for files of particular extensions" fileList = [os.path.normcase(f) for f in os.listdir(directory)] fileList = [os.path.join(directory, f) for f in fileList if os.path.splitext(f)[1] in fileExtList]
Whenever possible, you should use the functions in os and os.path for file, directory, and path manipulations. These modules are wrappers for platform-specific modules, so functions like os.path.split work on UNIX, Windows, Mac OS, and any other platform supported by Python. |
There is one other way to get the contents of a directory. It's very powerful, and it uses the sort of wildcards that you may already be familiar with from working on the command line.
Example 6.20. Listing Directories with glob
>>> os.listdir("c:\\music\\_singles\\") ['a_time_long_forgotten_con.mp3', 'hellraiser.mp3', 'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3', 'spinning.mp3'] >>> import glob >>> glob.glob('c:\\music\\_singles\\*.mp3') ['c:\\music\\_singles\\a_time_long_forgotten_con.mp3', 'c:\\music\\_singles\\hellraiser.mp3', 'c:\\music\\_singles\\kairo.mp3', 'c:\\music\\_singles\\long_way_home1.mp3', 'c:\\music\\_singles\\sidewinder.mp3', 'c:\\music\\_singles\\spinning.mp3'] >>> glob.glob('c:\\music\\_singles\\s*.mp3') ['c:\\music\\_singles\\sidewinder.mp3', 'c:\\music\\_singles\\spinning.mp3'] >>> glob.glob('c:\\music\\*\\*.mp3')
Further Reading on the os Module
- Python Knowledge Base answers questions about the os module.
- Python Library Reference documents the os module and the os.path module.
<< Using sys.modules |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
Putting It All Together >> |