After some searching on Google, I concluded that there is no Python library that lets me manipulate the data in a JPEG’s Exif block. I need this because I’m involved in a project called Syncropated [1], and this software needs to embed a thumbnail in JPEG files. As you can see in Exif_2-1_V1.PDF, section 2.5.5, this is possible.
But if I’m going to write code to store this thumbnail anyway, why not add some code to parse the data and do a few other things too?
Below you can read how Exif works, and in another post I will talk more about the Python Exif library that I’m coding.
A little bit about Exif format.
First we want to know whether the file is a JPEG or not, so we can simply check the first two bytes of the file. If byte[0] equals ‘FF’ and the second one is ‘D8’ (see Figure 1), the file can be considered a JPEG candidate. To be a JPEG, it must also follow the other rules shown below.
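A minimal sketch of this first check in Python, assuming the file content has already been read into a `bytes` object. The function name `looks_like_jpeg` is only illustrative.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the data starts with the two bytes FF D8."""
    # Slicing is safe even when the data is shorter than two bytes.
    return data[:2] == b"\xff\xd8"
```

Passing this test only makes the file a candidate, so the later checks are still needed.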
JPEG files that contain Exif data must have the word `Exif` in the header. To be more specific, this word is supposed to start at byte offset 6 of the file (counting from zero), forming the sequence highlighted in green in Figure 1.
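As a rough sketch, the check for the `Exif` word could look like this in Python. It assumes the Exif segment comes directly after the first two bytes of the file, so that the word sits at index 6 when counting from zero; a file that stores another segment first would need a real segment scan. The name `has_exif_word` is hypothetical.

```python
def has_exif_word(data: bytes) -> bool:
    """Check for the word 'Exif' at the position expected in an Exif JPEG."""
    # Assumed layout: FF D8 | FF E1 | 2-byte segment length | 'Exif' ...
    # That places the 'E' at index 6 (zero-based).
    return data[:2] == b"\xff\xd8" and data[6:10] == b"Exif"
```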
The 49 49 (see Figure 1) represents the byte order of the data, that is, whether it is Big-endian or Little-endian. Little-endian is represented by `II` (Intel format, bytes 49 49) and Big-endian by `MM` (Motorola format, bytes 4D 4D).
Figure 1.
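The byte-order mark maps nicely onto the prefixes used by Python's `struct` module. The sketch below assumes it receives the bytes starting at the `II`/`MM` mark; the function name `read_byte_order` is made up for this example.

```python
import struct


def read_byte_order(header: bytes) -> str:
    """Return the struct prefix for the byte-order mark: '<' for II, '>' for MM."""
    mark = header[:2]
    if mark == b"II":  # 49 49, Intel, little-endian
        return "<"
    if mark == b"MM":  # 4D 4D, Motorola, big-endian
        return ">"
    raise ValueError("unknown byte-order mark: %r" % mark)


# The prefix can then be used to decode any 16-bit or 32-bit field.
order = read_byte_order(b"II")
(number,) = struct.unpack(order + "H", b"\x2a\x00")
```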
Another important thing to know about Exif is how the data is organized. Exif is divided into:
- JPEG HEADER
- 0th IFD
- 0th IFD Values
- 1st IFD
- 1st IFD Values
- 1st Thumbnail – Image Data
- 0th (Primary) – Image Data
An IFD (Image File Directory) is used to store tags, with their values and data types. IFDs work like a linked list: IFD0 points to IFD1, and so on.
All IFDs have the same structure. The first two bytes give the number of tags in the directory (Figure 2).
Every tag in an IFD is stored in the same way:
- Tag (Bytes 0 – 1)
- Type (Bytes 2 – 3)
- Count (Bytes 4 – 7)
- Value Offset (Bytes 8 – 11)
If the value fits in 4 bytes or fewer, it is stored directly in the Value Offset field. Otherwise, the field holds an offset that points to the data, which is stored in the corresponding “IFD Values” block.
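To make the 12-byte tag layout more concrete, here is a small Python sketch that splits one entry into its fields and decides whether the value is stored inline. The names `IfdEntry`, `parse_ifd_entry` and `value_is_inline` are hypothetical, and the size table only covers the data types defined by the Exif specification. The last field occupies bytes 8 to 11 of the entry.

```python
import struct
from dataclasses import dataclass

# Size in bytes of one value for each Exif data type.
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 7: 1, 9: 4, 10: 8}


@dataclass(frozen=True)
class IfdEntry:
    tag: int
    type_id: int
    count: int
    raw_value: bytes  # the 4 bytes of the Value Offset field, undecoded


def parse_ifd_entry(entry: bytes, byte_order: str) -> IfdEntry:
    """Split a 12-byte entry; byte_order is '<' (II) or '>' (MM)."""
    tag, type_id, count = struct.unpack(byte_order + "HHI", entry[:8])
    return IfdEntry(tag, type_id, count, entry[8:12])


def value_is_inline(entry: IfdEntry) -> bool:
    """True if the value fits in the 4-byte field instead of being an offset."""
    return TYPE_SIZES[entry.type_id] * entry.count <= 4
```

The raw field is kept as bytes on purpose, because how to decode it depends on the type and on whether it is a value or an offset.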
The pointer to the next IFD is stored at the end of the IFD block, just before the IFD Values block. If you take the number of tags (the first 2 bytes), multiply it by 12 (the size of one IFD tag entry) and add the 2 bytes used by the tag count itself, you get the position, relative to the start of the IFD, of the 4-byte offset to the next IFD.
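The arithmetic can be sketched in Python as follows. The position is counted from the start of the IFD, so the 2 bytes of the tag count are included. The sketch assumes `tiff` holds the Exif data starting at the byte-order mark, which is the point all offsets are measured from, and that an offset of 0 means there is no further IFD. The function name `next_ifd_offset` is illustrative.

```python
import struct


def next_ifd_offset(tiff: bytes, ifd_offset: int, byte_order: str) -> int:
    """Return the offset of the IFD that follows the one at ifd_offset."""
    # The first 2 bytes of the IFD hold the number of tags.
    (n_tags,) = struct.unpack_from(byte_order + "H", tiff, ifd_offset)
    # Skip the count (2 bytes) and all tag entries (12 bytes each).
    pointer_pos = ifd_offset + 2 + n_tags * 12
    # The next 4 bytes are the offset of the following IFD (0 if none).
    (offset,) = struct.unpack_from(byte_order + "I", tiff, pointer_pos)
    return offset
```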
I have a particular interest in the thumbnail. As you can see in the explanation above (Exif organization), I need both an IFD0 and an IFD1 to have a thumbnail. IFD1 has two special tags that point to the thumbnail. The first one is JPEGInterchangeFormat (0x0201) and the other is JPEGInterchangeFormatLength (0x0202).
The JPEGInterchangeFormat value is an offset to the beginning of the thumbnail, and JPEGInterchangeFormatLength contains the thumbnail size in bytes. With these two pieces of information we are able to get any thumbnail embedded in JPEG Exif files ;P
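Putting the pieces together, a thumbnail reader might look roughly like the sketch below. It is not the library mentioned in this post, only an illustration under a few assumptions: `tiff` is the Exif data starting at the byte-order mark, the offset of IFD0 is read from bytes 4 to 7 of that data, and both thumbnail tags are stored as 32-bit values. The function name `extract_thumbnail` is hypothetical.

```python
import struct
from typing import Optional

JPEG_INTERCHANGE_FORMAT = 0x0201         # offset of the thumbnail
JPEG_INTERCHANGE_FORMAT_LENGTH = 0x0202  # size of the thumbnail in bytes


def extract_thumbnail(tiff: bytes) -> Optional[bytes]:
    """Return the embedded thumbnail bytes, or None if there is none."""
    order = "<" if tiff[:2] == b"II" else ">"
    # Offset of IFD0, relative to the byte-order mark.
    (ifd0,) = struct.unpack_from(order + "I", tiff, 4)
    # Follow the pointer stored after the last tag of IFD0 to reach IFD1.
    (n0,) = struct.unpack_from(order + "H", tiff, ifd0)
    (ifd1,) = struct.unpack_from(order + "I", tiff, ifd0 + 2 + n0 * 12)
    if ifd1 == 0:
        return None  # no IFD1, so no thumbnail
    # Collect the 4-byte value field of every tag in IFD1.
    (n1,) = struct.unpack_from(order + "H", tiff, ifd1)
    values = {}
    for i in range(n1):
        tag, _type, _count, value = struct.unpack_from(
            order + "HHII", tiff, ifd1 + 2 + i * 12)
        values[tag] = value
    start = values.get(JPEG_INTERCHANGE_FORMAT)
    length = values.get(JPEG_INTERCHANGE_FORMAT_LENGTH)
    if start is None or length is None:
        return None
    return tiff[start:start + length]
```

Writing a thumbnail is the harder direction, since every offset after the inserted data has to be updated.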
[1] I still need to talk more about Syncropated. To read more about it, visit: http://syncropated.garage.maemo.org

3 Responses to Exchangeable image file format for Digital Still Cameras: Exif
Cleiton
January 10th, 2007 at 4:56 am
Hi fyer,
csm@mordor:~$ which exif.py
/usr/bin/exif.py
csm@mordor:~$ head -10 `which exif.py`
#! /usr/bin/env python
# Library to extract EXIF information in digital camera image files
#
# Contains code from "exifdump.py" originally written by Thierry Bousch
# and released into the public domain.
#
# Updated and turned into general-purpose library by Gene Cash
#
#
# NOTE: This version has been modified by Leif Jensen
Hope it helps
felipe
January 10th, 2007 at 6:49 am
y0 Creyssú,
I know this one, but it doesn’t allow me to change the content of the Exif data,
and I will need to change it at some point in the future