How to Read Specific Lines From a File in Python
- fileobject.readlines() to Read Specific Lines for Small-Size File
- the for Loop in fileobject to Read Specific Lines in Python
- the linecache Module to Read the Specific Lines in Python
- enumerate During Reading Specific Lines From a Large File in Python
A common way to read a file in Python is to read it entirely and then process the specific line. File I/O in Python is fast; for example, it takes roughly 0.67 seconds to write a 100 MiB file. But if the file is large (for example, more than 100 MB), reading it entirely into memory may cause memory issues.
Python offers several built-in ways to read specific lines from a file, as introduced in the following sections.
fileobject.readlines() to Read Specific Lines for Small-Size File
fileobject.readlines() reads the whole file into memory and returns its lines as a list. We can then use list indexing or slicing to pick out specific lines.
If we only need to read line 10 (list indexes start at 0, so line 10 is at index 9),
with open("file.txt") as f:
    data = f.readlines()[9]  # index 9 is the 10th line
print(data)
If we need to read lines 10 to 100 (the end index of a slice is exclusive),
with open("file.txt") as f:
    data = f.readlines()[9:100]  # indexes 9..99 are lines 10..100
print(data)
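The off-by-one between line numbers and list indexes is easy to get wrong, so here is a minimal sketch that wraps the slice in a helper and checks it against a throwaway file. The function name `read_line_range` and the sample file are assumptions made for illustration; they are not part of any library.

```python
import os
import tempfile


def read_line_range(path, first, last):
    """Return lines first..last (1-based, inclusive) via readlines() and a slice."""
    with open(path) as f:
        # Slice indexes are 0-based and the end index is exclusive,
        # so lines first..last map to the slice [first - 1:last].
        return f.readlines()[first - 1:last]


# Build a small throwaway file whose content makes the line numbers obvious.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {n}\n" for n in range(1, 21)))
    sample_path = tmp.name

single = read_line_range(sample_path, 10, 10)
several = read_line_range(sample_path, 10, 12)
os.remove(sample_path)

print(single)   # ['line 10\n']
print(several)  # ['line 10\n', 'line 11\n', 'line 12\n']
```

Like the snippets above, this still loads every line into memory, so it is only suitable for small files.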
the for Loop in fileobject to Read Specific Lines in Python
for line in fileobject is also a quick solution for small files.
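lines = [10, 100]  # line numbers to read, counting from 1
data = []
i = 1  # number of the line currently being read
with open("file.txt", "r") as f:  # read-only mode is enough here
    for line in f:
        if i in lines:
            data.append(line.strip())  # strip() must be called, not just referenced
        i = i + 1
print(data)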
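The loop above keeps reading until the end of the file even after the last wanted line has been collected. The following sketch, under the assumption that line numbers are 1-based, stops as soon as nothing is left to collect. The helper name `read_selected_lines` and the sample file are hypothetical and exist only for this example.

```python
import os
import tempfile


def read_selected_lines(path, line_numbers):
    """Return {line_number: text} for the requested 1-based line numbers."""
    wanted = set(line_numbers)  # set membership tests are fast
    if not wanted:
        return {}
    last_wanted = max(wanted)
    found = {}
    with open(path) as f:
        current = 1  # manual 1-based counter, as in the loop above
        for line in f:
            if current in wanted:
                found[current] = line.rstrip("\n")
            if current >= last_wanted:
                break  # nothing left to collect, stop reading
            current += 1
    return found


# Throwaway 10-line file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {n}\n" for n in range(1, 11)))
    sample_path = tmp.name

selected = read_selected_lines(sample_path, [3, 7, 99])
os.remove(sample_path)

print(selected)  # {3: 'line 3', 7: 'line 7'}
```

Line 99 does not exist in the sample file, so it is simply absent from the result rather than raising an error.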
the linecache Module to Read the Specific Lines in Python
The linecache module can be used when reading from many files, possibly repeatedly, or when extracting many lines. Its getline() function takes a line number that counts from 1:
import linecache
data = linecache.getline("file.txt", 10).strip()
The string method strip() returns a copy of the string with whitespace, including the trailing newline, removed from both ends.
The linecache module allows you to get any line from a Python source file while using a cache internally to optimize the common case where many lines are read from a single file. It also works on ordinary text files. The traceback module uses it to retrieve the source lines included in a formatted traceback.
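As a small illustration of the behavior described above, the sketch below reads a few lines from a throwaway file with `linecache.getline()`, shows what happens for a line number outside the file, and clears the cache afterwards. The sample file and the variable names are assumptions made for this example.

```python
import linecache
import os
import tempfile

# Throwaway 5-line file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {n}\n" for n in range(1, 6)))
    sample_path = tmp.name

# getline() takes a 1-based line number and keeps the trailing newline.
third = linecache.getline(sample_path, 3)
# A line number outside the file does not raise; it returns an empty string.
missing = linecache.getline(sample_path, 50)
# Later lookups in the same file are served from the cache.
wanted = [linecache.getline(sample_path, n).strip() for n in (1, 5)]

# Drop the cached lines once they are no longer needed.
linecache.clearcache()
os.remove(sample_path)

print(repr(third))    # 'line 3\n'
print(repr(missing))  # ''
print(wanted)         # ['line 1', 'line 5']
```

Because the cache holds the whole file in memory, this approach trades memory for convenience and is best kept for files of moderate size.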
enumerate During Reading Specific Lines From a Large File in Python
A large file may not fit into memory if it is read all at once. In this case, we can iterate over the file object with enumerate(), which reads one line at a time and keeps track of its index:
with open("file.txt") as f:
    for i, line in enumerate(f):
        pass  # process line i
Note that i starts at 0, so for the n-th line, i = n-1.
The enumerate() function wraps an iterable (such as a list, tuple, string, or file object) and yields pairs of an index and the corresponding item. The for loop in the above example unpacks each pair into i and line.
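To make the idea concrete, here is one possible sketch that uses `enumerate()` with `start=1` so the counter matches the line number directly, and returns as soon as the wanted line is found. The helper name `read_nth_line` and the sample file are hypothetical.

```python
import os
import tempfile


def read_nth_line(path, n):
    """Return the n-th line (1-based) without loading the whole file, or None."""
    with open(path) as f:
        # start=1 makes the counter match human line numbers directly.
        for number, line in enumerate(f, start=1):
            if number == n:
                return line.rstrip("\n")  # stop reading once the line is found
    return None  # the file has fewer than n lines


# Throwaway 1000-line file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {n}\n" for n in range(1, 1001)))
    sample_path = tmp.name

hit = read_nth_line(sample_path, 500)
miss = read_nth_line(sample_path, 5000)
os.remove(sample_path)

print(hit)   # line 500
print(miss)  # None
```

Only one line is held in memory at a time, which is why this pattern suits large files.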