Filed Under (Python) by Marcin Kuźmiński on January-2-2010

Reading a huge file (e.g. 70 GB) in Python by just calling open() and reading it all at once can get you into trouble :).
There is a clean and nice solution for reading huge, complex files or lists.
Using the new with open() as statement and a Python generator function, you can write a simple function
that reads such files without eating up all of the computer's resources. Each iteration over such a function performs a read of
the given size, and the with statement makes sure that the file is closed when iteration is finished.

An example of such a function:

def read_large_file(filename, mode='rb', size=1024):
    '''
    A lazy generator function that reads a file in chunks of a given size
    USAGE:
    for data_chunk in read_large_file('/tmp/huge.file','rb',10240):
        print(data_chunk)
    @param filename: a filename
    @param mode: read mode
    @param size: size to read at one iteration
    '''
    with open(filename, mode) as f:
        while True:
            data = f.read(size)
            if not data:
                break
            yield data
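
To show how such a generator might be used, here is a minimal sketch that hashes a file chunk by chunk, so only one chunk sits in memory at a time. The helper sha256_of_file and the temporary test file are my own assumptions for illustration, and read_large_file is repeated only so the snippet runs on its own:

```python
import hashlib
import os
import tempfile


def read_large_file(filename, mode='rb', size=1024):
    # Same generator as above, repeated so this sketch is self-contained.
    with open(filename, mode) as f:
        while True:
            data = f.read(size)
            if not data:
                break
            yield data


def sha256_of_file(filename, size=64 * 1024):
    # Hypothetical helper: feeds each chunk into the hash object,
    # so the whole file is never loaded at once.
    digest = hashlib.sha256()
    for chunk in read_large_file(filename, 'rb', size):
        digest.update(chunk)
    return digest.hexdigest()


# A small temporary file stands in for the huge one.
fd, path = tempfile.mkstemp()
os.write(fd, b'x' * 100000)
os.close(fd)

# 100000 bytes read 40000 at a time: two full chunks and a shorter last one.
chunk_sizes = [len(c) for c in read_large_file(path, 'rb', 40000)]
file_digest = sha256_of_file(path)

os.remove(path)
```

The chunk size is a trade-off you would have to tune yourself: bigger chunks mean fewer reads but more memory per iteration.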

Remember that the with statement is supported out of the box from Python 2.6 on; in Python
2.5 you have to do

from __future__ import with_statement

before using it. If you have trouble using the with statement, I recommend reading
this link
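
If the with statement still feels like magic, it may help to compare it with a plain try/finally. This is only a rough sketch of the idea (the real statement goes through the file object's __enter__ and __exit__ methods), and the temporary file is just an assumption so the snippet can run anywhere:

```python
import os
import tempfile

# Small temporary file to read from (illustration only).
fd, path = tempfile.mkstemp()
os.write(fd, b'abc')
os.close(fd)

# With the with statement: the file is closed when the block ends.
with open(path, 'rb') as f:
    first = f.read(1)
closed_after_with = f.closed

# Roughly the same thing written by hand with try/finally.
g = open(path, 'rb')
try:
    second = g.read(1)
finally:
    g.close()
closed_after_finally = g.closed

os.remove(path)
```

In both cases the file ends up closed even if the read raises an exception, which is what the generator above relies on.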



Comments
Marcin on January 6th, 2010 at 23:31 #

nice :)

Bart on January 8th, 2010 at 11:33 #

Great :)
