A binary file is a file that stores data as raw bytes, which can represent any type of data, such as images, audio, video, or executable code. Its contents are not meant to be read as characters, so a text editor usually displays them as unreadable symbols; working with a binary file requires a program that understands its format.
Opening a binary file
To open binary files in Python, we need to use the “b” character in the mode argument of the open() function.
The file open modes for binary files are similar to the text file open modes, except that they use the “b” character to indicate binary mode. The meaning of each mode is as follows:
| Mode | Description |
| --- | --- |
| “rb” | Open a binary file for reading only. The file pointer is placed at the beginning of the file. |
| “rb+” | Open a binary file for both reading and writing. The file pointer is placed at the beginning of the file. |
| “wb” | Open a binary file for writing only. Overwrites the file if it exists. Creates a new file if it does not exist. |
| “wb+” | Open a binary file for both writing and reading. Overwrites the existing file if it exists. Creates a new file if it does not exist. |
| “ab” | Open a binary file for appending. The file pointer is at the end of the file if it exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing. |
| “ab+” | Open a binary file for both appending and reading. The file pointer is at the end of the file if it exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing. |
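The behaviour of the three basic modes can be checked with a short experiment. The sketch below is only an illustration: the file name demo.bin and the temporary directory are assumptions made for the example, and the with statement is used so that each file is closed automatically.

```python
import os
import tempfile

# Work in a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # hypothetical file name

    # "wb" creates the file (or empties an existing one) and writes bytes.
    with open(path, "wb") as f:
        f.write(b"\x01\x02\x03")

    # "ab" keeps the existing content and writes at the end.
    with open(path, "ab") as f:
        f.write(b"\x04")

    # "rb" reads the raw bytes back as a bytes object.
    with open(path, "rb") as f:
        content = f.read()

    # Opening with "wb" again discards the previous content.
    with open(path, "wb") as f:
        pass
    size_after_reopen = os.path.getsize(path)

print(content)
print(size_after_reopen)
```

Here content holds all four bytes, while size_after_reopen is 0 because “wb” truncated the file.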
Closing a binary file
This is done using the close() method of the file object, just like in text files. The close() method writes any buffered data to the file and releases the resources associated with it. It is important to close a file after using it to avoid errors or data loss.
For example, to close the file object f, we can write:
f.close()
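An explicit close() call is skipped if an error occurs before it is reached. A common alternative is the with statement, which closes the file when the block ends. The sketch below is a minimal illustration; the temporary file it creates is an assumption made only so that the example can run anywhere.

```python
import os
import tempfile

# Create an empty temporary file to work with (hypothetical example file).
fd, path = tempfile.mkstemp(suffix=".bin")
os.close(fd)

try:
    with open(path, "wb") as f:
        f.write(b"abc")
        closed_inside = f.closed   # still open inside the block
    closed_after = f.closed        # closed automatically on leaving the block
finally:
    os.remove(path)

print(closed_inside, closed_after)
```

The closed attribute of the file object is False inside the block and True after it, without any call to close().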
Pickle Module
The pickle module is a built-in module that provides functions for serializing and deserializing Python objects.
Serialization is the process of converting an object into a stream of bytes that can be stored in a file or transmitted over a network.
Deserialization is the reverse process of converting a stream of bytes back into an object.
The pickle module can handle most Python objects, such as lists, dictionaries, classes, functions, etc., but not all.
To use the pickle module, we need to import it first:
import pickle
The pickle module provides two methods, dump() and load(), to work with binary files for pickling and unpickling, respectively.
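The statement that pickle handles most objects "but not all" can be seen without touching a file. The sketch below uses pickle.dumps() and pickle.loads(), the in-memory counterparts of dump() and load(); the choice of a generator as the failing object is an assumption picked for illustration, since generators are one of the types pickle refuses.

```python
import pickle

# A list of built-in values pickles without trouble.
record = [1, "Geetika", "F", 26]
payload = pickle.dumps(record)       # serialized bytes, no file involved
restored = pickle.loads(payload)     # a new, equal list

# A generator holds live execution state and cannot be pickled.
squares = (n * n for n in range(3))
try:
    pickle.dumps(squares)
    generator_pickled = True
except (TypeError, pickle.PicklingError):
    generator_pickled = False

print(type(payload).__name__)
print(restored == record)
print(generator_pickled)
```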
The dump() method
The dump() method takes an object and a file object as arguments and writes the serialized bytes of the object to the file.
The file to which the data are dumped must be opened in binary write mode (wb).
Syntax of dump() is as follows:
dump(data_object, file_object)
where data_object is the object that has to be dumped to the file with the file handle named file_object.
For example, the program given below writes the record of a student (roll_no, name, gender and marks) to the binary file named mybinary.dat using the dump() method. The file must be closed after pickling:
import pickle
listvalues=[1,"Geetika",'F', 26]
fileobject=open("mybinary.dat", "wb")
pickle.dump(listvalues,fileobject)
fileobject.close()
The load() method
The load() method takes a file object as an argument and returns the deserialized object from the bytes read from the file.
The file to be loaded is opened in binary read (rb) mode.
Syntax of load() is as follows:
store_object = load(file_object)
Here, the pickled Python object is loaded from the file having a file handle named file_object and is stored in a variable called store_object.
For example, the program given below demonstrates how to read data from the file mybinary.dat using the load() method:
import pickle
print("The data that were stored in file are: ")
fileobject=open("mybinary.dat","rb")
objectvar=pickle.load(fileobject)
fileobject.close()
print(objectvar)
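A file can hold more than one pickled object: each dump() call appends one, and each load() call returns the next. When nothing is left, load() raises EOFError, which can serve as the stop signal. The following is a sketch under assumptions: the second student record and the file name students.dat are invented for the example.

```python
import os
import pickle
import tempfile

records = [
    [1, "Geetika", "F", 26],
    [2, "Arjun", "M", 31],   # hypothetical second record
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "students.dat")  # hypothetical file name

    # Write the records one after another.
    with open(path, "wb") as f:
        for record in records:
            pickle.dump(record, f)

    # Read them back until load() signals the end of the file.
    loaded = []
    with open(path, "rb") as f:
        while True:
            try:
                loaded.append(pickle.load(f))
            except EOFError:
                break

print(loaded)
```

Note that load() can execute code embedded in the data, so it should only be used on files from a trusted source.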
Read, write/create, search, append and update operations in a binary file
These are some common operations that can be performed on a binary file using different methods and functions. Some examples are:
1. read
To read data from a binary file, we can use methods like read(), readline(), or readlines(), just like in text files. However, these methods return bytes objects instead of strings. We can also use struct.unpack() to convert bytes into other data types, such as integers and floats.
For example, to read an integer from a binary file, we can write:
import struct
# open a binary file in read mode
f = open("number.bin", "rb")
# read 4 bytes from the file
data = f.read(4)
# unpack the bytes into an integer
number = struct.unpack("i", data)[0]
# close the file
f.close()
# print the number
print(number)
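The format string "i" uses the machine's native byte order and size. If a file may be read on a different machine, a byte-order prefix makes the layout explicit. The sketch below shows the idea on an in-memory bytes value; the sample bytes are an assumption chosen for illustration.

```python
import struct

# "<i" means: little-endian, standard-size (4-byte) signed integer.
fmt = "<i"
size = struct.calcsize(fmt)            # number of bytes to read per value

raw = bytes([0x2A, 0x00, 0x00, 0x00])  # sample bytes, as f.read(size) might return

little = struct.unpack("<i", raw)[0]   # least significant byte first
big = struct.unpack(">i", raw)[0]      # same bytes, most significant byte first

print(size, little, big)
```

The same four bytes decode to 42 as little-endian and to 704643072 as big-endian, which is why the writer and the reader must agree on the format string.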
2. write/create
To write or create data in a binary file, we can use methods like write() or writelines(), just like in text files. However, these methods take bytes objects instead of strings. We can also use struct.pack() to convert other data types, such as integers and floats, into bytes.
For example, to write an integer to a binary file, we can write:
import struct
# open a binary file in write mode
f = open("number.bin", "wb")
# pack an integer into 4 bytes
data = struct.pack("i", 42)
# write the bytes to the file
f.write(data)
# close the file
f.close()
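Several values can be packed in a single call by putting a repeat count in the format string. The sketch below writes three integers at once and reads them back; the values and the file name are assumptions made for the example.

```python
import os
import struct
import tempfile

values = [10, 42, 99]  # sample data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.bin")  # hypothetical file name

    # "<3i" packs three little-endian 4-byte integers back to back.
    with open(path, "wb") as f:
        written = f.write(struct.pack(f"<{len(values)}i", *values))

    with open(path, "rb") as f:
        raw = f.read()

# The number of integers follows from the file size.
restored = list(struct.unpack(f"<{len(raw) // 4}i", raw))

print(written, restored)
```

write() returns the number of bytes written, here 12, which is a convenient check that the whole record reached the file.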
3. search
To search for specific data in a binary file, we can use a loop to iterate over the bytes or records in the file and compare them with the target data. We can also use tell() to get the position of the file pointer and seek() to set it.
For example, to search for an integer in a binary file, we can write:
import struct
# open a binary file in read mode
f = open("numbers.bin", "rb")
# define the target integer to search for
target = 42
# define a flag to indicate whether the target is found or not
found = False
# loop until the end of the file is reached
while True:
    # read 4 bytes from the file
    data = f.read(4)
    # stop at the end of the file (or at a trailing partial record)
    if len(data) < 4:
        break
    # unpack the bytes into an integer
    number = struct.unpack("i", data)[0]
    # compare the number with the target
    if number == target:
        # get the current position of the file pointer
        pos = f.tell()
        # print a message with the position of the target
        print(f"Found {target} at position {pos - 4}")
        # set the flag to True
        found = True
        # break the loop (optional)
        break
# close the file
f.close()
# if the flag is still False, print a message that the target is not found
if not found:
    print(f"{target} is not found in the file")
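The same search can be wrapped in a function that returns the byte offset of the first match, which makes it reusable. This is a sketch, not a definitive implementation: find_int is a hypothetical helper name, and the fixed little-endian format is an assumption.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<i")  # one little-endian 4-byte signed integer

def find_int(path, target):
    """Return the byte offset of the first matching integer, or None."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:   # end of file or partial record
                return None
            if RECORD.unpack(chunk)[0] == target:
                return offset
            offset += RECORD.size

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(struct.pack("<4i", 7, 13, 42, 99))
    hit = find_int(path, 42)
    miss = find_int(path, 5)

print(hit, miss)
```

Here 42 is the third integer, so it starts at byte offset 8, while the search for 5 returns None.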
4. append
To append data to a binary file, we can use methods like write() or writelines(), just like in text files. However, we need to open the file in append mode (“ab” or “ab+”) instead of write mode (“wb” or “wb+”).
For example, to append an integer to a binary file, we can write:
import struct
# open a binary file in append mode
f = open("numbers.bin", "ab")
# pack an integer into 4 bytes
data = struct.pack("i", 42)
# write the bytes to the end of the file
f.write(data)
# close the file
f.close()
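That append mode keeps the existing data can be confirmed by comparing the file size before and after. The sketch below assumes a small file of two integers created just for the example.

```python
import os
import struct
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.bin")  # hypothetical file name

    # Start with two integers.
    with open(path, "wb") as f:
        f.write(struct.pack("<2i", 1, 2))
    size_before = os.path.getsize(path)

    # "ab" leaves them in place and adds a third at the end.
    with open(path, "ab") as f:
        f.write(struct.pack("<i", 3))
    size_after = os.path.getsize(path)

    with open(path, "rb") as f:
        raw = f.read()

numbers = list(struct.unpack(f"<{len(raw) // 4}i", raw))
print(size_before, size_after, numbers)
```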
5. update
To update data in a binary file, we can use methods like write() or writelines(), just like in text files. However, we need to open the file in read and write mode (“rb+”) instead of read mode (“rb”) or write mode (“wb”). The mode “wb+” also allows reading and writing, but it empties the file on opening, so it is not suitable for updating existing data. We also need tell() to get the position of the file pointer and seek() to set it.
For example, to update an integer in a binary file, we can write:
import struct
# open a binary file in read and write mode
f = open("numbers.bin", "rb+")
# define the target integer to update and its new value
target = 42
new_value = 99
# loop until the end of the file is reached
while True:
    # read 4 bytes from the file
    data = f.read(4)
    # stop at the end of the file (or at a trailing partial record)
    if len(data) < 4:
        break
    # unpack the bytes into an integer
    number = struct.unpack("i", data)[0]
    # compare the number with the target
    if number == target:
        # get the current position of the file pointer
        pos = f.tell()
        # move the file pointer back by 4 bytes
        f.seek(pos - 4)
        # pack the new value into 4 bytes
        data = struct.pack("i", new_value)
        # write the bytes to the file, overwriting the old value
        f.write(data)
        # print a message that the target is updated
        print(f"Updated {target} to {new_value} at position {pos - 4}")
        # break the loop (optional)
        break
# close the file
f.close()
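When every record has the same size, a record can also be updated by its index, without scanning the file: the byte offset is simply the index times the record size. The sketch below illustrates this; update_at is a hypothetical helper name and the sample data are assumptions.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<i")  # one little-endian 4-byte signed integer

def update_at(path, index, new_value):
    """Overwrite the integer stored at the given record index."""
    with open(path, "rb+") as f:
        f.seek(index * RECORD.size)       # jump straight to the record
        f.write(RECORD.pack(new_value))   # overwrite exactly 4 bytes

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(struct.pack("<3i", 7, 42, 13))

    update_at(path, 1, 99)                # replace the second integer

    with open(path, "rb") as f:
        raw = f.read()

numbers = [value for (value,) in RECORD.iter_unpack(raw)]
print(numbers)
```

The file still holds three integers after the update, and only the second has changed. The helper does not check that the index lies within the file, so a real program would validate it first.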