Xtralien Scientific Python: Files

 

File storage is central to many applications as some application state or configuration must often be preserved to be read, written or logged.

open files

In Python the concept of files is heavily used, borrowed from the UNIX family of operating systems. This means that a file must be opened to be read or written too.

The method to do this is by using the open function. This function takes one required parameter filename, and one optional parameter mode. This filename can either be relative or explicit:

  • A relative file will be opened from the location of the file that is running.
  • A full path is the full path from the root of the device. e.g. C:\\ on Windows.

Note: When using windows, the \ character will need to be escaped (\\) because \ is the escape character.

Modes

When opening files a mode can be supplied to allow writing or appending to the file.

Mode modifier Description
r Read mode, open file for reading
a Append mode, add data to a file without removing any
w Write mode, overwrite the file and open for writing
+ Open the file for both reading and writing
b Binary mode, used to write and read bytes rather than strings

Both + and b are used with another mode. If no mode is supplied then r is used as the default value.

Path handling

To handle multi-platform issues, and enable sharing of code it is recommended that file paths use the os.path module.

Note: The os module also contains utilities for checking if files exist and are readable, however for this simple tutorial we will just concern ourselves with os.path

from os import path

some_file_path = path.join('C:', 'Users', 'YOUR_NAME', 'some_file.txt')

The above example would store the string 'C:\\Users\\YOUR_NAME\\some_file.txt' in the variable some_file_path on Windows.

The os.path.join function joins the arguments to the function similar to string formatting. This makes sure that the paths are correct in your strings.

Error handling

OSError is the most common exception that file access may cause. This is usually caused when a file cannot be opened. It is common to see file-handling enclosed within a try/except block to ensure incorrect file access will not cause any data loss.

try:
    f = open('filename.txt')
    # Use the file
except OSError:
    print("An error occurred opening the file")

The example above highlights this usage. The main body of the code would be contained in the try block. Alternatively it is possible to use the with keyword to handle this automatically.

with open('filename.txt') as f:
    # Use the file

This has the same effect as the previous example, with the main content of the code\ located inside the with block. This typically looks cleaner and is more understandable.

CSV files

CSV (Comma Separated Value) files are one of the many open standards that have become useful to many data processing tasks.

The CSV filetype is powerful due to two main features.

  1. The standard is simple and human-readable.
  2. The standard is open and can be easily read by many programs (including Microsoft Excel).

Whilst there is the csv python module included in Python already, we have added two functions to our Xtralien Scientific Python distribution that allow easy handling of numpy arrays.

array_to_csv is one of these provided functions, which takes a numpy array and a filename. This is called like below.

array_to_csv(numpy_array, 'filename.csv')

This would save the array stored in numpy_array into the file filename.csv.

The is a counterpart to this which is load_csv function, which will load a csv from the provided filename argument.

data = load_csv(`filename.csv')

The example above highlights this usage. data will be allocated a numpy array containing the data found inside 'filename.csv'.