Re-Adding EXIF Metadata to JPGs Exported from Google Photos

Abstract

Basically, I love Google Photos. It is just great! Unlimited storage for all my photos (in a really high resolution), they are automatically grouped using machine learning, and I can put them in albums and share them easily. However, there is one major disadvantage: if you add information (such as geolocation data) or change the date/time a photo was taken (because you digitized old photos and want them to show up at the right point in time), that information is stored in Google Photos but is not written to the EXIF data of the photos.

Even though I love Google Photos and use it as my main photo storage service, it is not smart to rely solely on such a service. What happens if you lose your password or get hacked? What if Google has a major incident and all photos are deleted (highly unlikely)? What if you accidentally delete photos?

It is better to back up your data and store it on your own physical devices (NAS or hard drives) or upload it to another cloud provider. Because I'm an Amazon Prime user, I also have unlimited photo storage on Amazon Photos, which is why I wanted to download all photos with all metadata and upload them there. However, the metadata that has changed since the initial upload is stored in a JSON file that you have to access through the Google export service (Google Takeout). This post describes how to export all pictures and how to use a small and fast Python script to add the metadata from each JSON file to its respective photo.

First step: use the Google Takeout service

First, navigate to the Google Takeout page. Next, select only Google Photos as the service you want to export and choose one or several albums. In the next step, you can select the file type (I recommend ZIP) and the size at which the archive is split into several files. My Google Photos library was about 600 gigabytes in size, so make sure you have enough local storage to download everything. After starting the export, it takes some time (in my case some hours) until you get a notification email and can start the download.

Next, unpack the ZIP file(s) to a folder on your PC. You will notice that several folders are created and that each image/video file has its own JSON file, named after the image/video file with .json appended (the script below looks for files ending in .jpg.json). It is of course possible to modify my Python script to iterate through each folder, but for me it was sufficient to apply the script to one folder at a time.
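
If you would rather process every album folder in one run, the folder iteration mentioned above could be sketched roughly as follows. This is only a minimal sketch: the helper name find_sidecar_files is my own invention and not part of the script in this post, and it assumes that the JSON files end in .jpg.json as described above.

```python
import os

def find_sidecar_files(root_dir):
    """Hypothetical helper: return the paths of all '.jpg.json' files below root_dir."""
    matches = []
    # os.walk visits root_dir and every subfolder below it
    for folder, _subfolders, files in os.walk(root_dir):
        for name in files:
            if name.lower().endswith('.jpg.json'):
                matches.append(os.path.join(folder, name))
    return sorted(matches)
```

Each returned path includes its folder, so the pictures belonging to a JSON file would have to be searched in os.path.dirname() of that path instead of the current working directory.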

Second step: write the Python script

The main features I wanted target the metadata that can be changed in Google Photos, namely the date/time when the photo was taken and the geolocation (GPS data). I uploaded a lot of old photos that I had scanned/digitized, and obviously they had no metadata. That's why I changed this data after the upload to Google Photos.
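
To make this more concrete, here is a small sketch of the parts of a JSON file that the script reads. The key names (title, photoTakenTime, timestamp, geoData, latitude, longitude) are the ones used in the code of this post; the values are made up for illustration, a real Takeout file contains more fields, and storing the timestamp as a string is an assumption based on the float() conversion the script performs.

```python
import json

# Hypothetical JSON content: key names as used by the script, values invented
sample_sidecar = '''
{
  "title": "IMG_0001.jpg",
  "photoTakenTime": {"timestamp": "1577836800"},
  "geoData": {"latitude": 48.8584, "longitude": 2.2945}
}
'''

json_data = json.loads(sample_sidecar)
timestamp = float(json_data['photoTakenTime']['timestamp'])  # seconds since 1970-01-01 (Unix time)
latitude = json_data['geoData']['latitude']                  # decimal degrees, negative = south
longitude = json_data['geoData']['longitude']                # decimal degrees, negative = west
print(json_data['title'], timestamp, latitude, longitude)
```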

Let's talk about the libraries used:

import json
import os
import glob
import re
from fractions import Fraction
import piexif
from datetime import datetime

json is required because we load the JSON file of each respective image file to use its metadata. os is used for the path that is entered as user input. glob helps us to search for all images that are related to a metadata file. re is used for some simple string replacements. fractions is used to convert the GPS data to a format that can be written to EXIF. piexif is the main library; it handles the read and write operations on the EXIF data and is the only one that is not part of the standard library (it can be installed with pip install piexif). datetime is used to transform the date and time from the JSON file to a format that is suitable for EXIF.

In a first step, we ask the user for the location of the folder with the files. I decided to copy and paste the path from Windows Explorer. However, this requires replacing the backslashes in the path with forward slashes. The input folder is then set as the new working directory:

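path = input('Please enter the path of the Google Photo Takeout directory:') #do not name the variable "input", that would shadow the built-in function
path = re.sub('\\\\', '/', path) #replace every backslash with a forward slash
os.chdir(path)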

Next, we collect all JSON files in that folder in a list so that we can iterate over it. This is done in the following line, which lists the new working directory and keeps the files that end in .jpg.json:

json_files = [pos_json for pos_json in os.listdir(path) if pos_json.endswith('.jpg.json')]

Afterwards, we iterate over all those JSON files and check for each one whether one or several pictures exist (e.g., there are two versions if the photo was edited in Google Photos). Next, we use the information from each JSON file to update the metadata in the JPG's EXIF. After the information is updated, the EXIF data is written to the JPGs. (Don't worry, I will describe the functions used in more detail below.)

for js in json_files:
    with open(os.path.join(path, js), encoding='utf-8') as json_file:
        json_data = json.load(json_file) #load each json file data
    picture_names = get_picture_names(json_data['title']) #get the names of the jpgs from the json file
    for i in picture_names: #iterate over each picture
        try:
            exif_dict = piexif.load(i) #load the EXIF metadata
            exif_dict = modify_exif(exif_dict , i, json_data) #update the EXIF metadata
            exif_bytes = piexif.dump(exif_dict)
            piexif.insert(exif_bytes, i) #write the new EXIF metadata to the jpg file
        except Exception as e: #report which picture failed and why, then continue with the next one
            print('error:', i, e)
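
Because the new EXIF data is written directly into the JPG files, it can be reassuring to keep a copy of each picture before it is touched. The following is a minimal sketch of that idea. The helper backup_before_edit and the folder name exif_backup are my own assumptions and not part of the script above.

```python
import os
import shutil

def backup_before_edit(picture_path, backup_dir='exif_backup'):
    """Hypothetical helper: copy picture_path into backup_dir and return the path of the copy."""
    os.makedirs(backup_dir, exist_ok=True)  # create the backup folder on first use
    target = os.path.join(backup_dir, os.path.basename(picture_path))
    shutil.copy2(picture_path, target)      # copy2 also keeps the file's timestamps
    return target
```

One way to use it would be to call backup_before_edit(i) inside the loop, right before the EXIF data of picture i is loaded.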

We have to get the relevant file names for each JSON file, which we do with the following function. It takes the title stored in the JSON file and reduces it to the file stem (the name without any file endings) by removing .json and .jpg. Next, we search the folder for all files whose names start with this stem and drop the JSON files, so that only the pictures remain. These files are saved in a list, which is then returned:

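def get_picture_names(json_filename):
    #remove both file endings; the calls are chained so that the result of the first one is not discarded
    jpg_filename_stemmed = json_filename.replace('.json','').replace('.jpg','')
    all_files = glob.glob(jpg_filename_stemmed+'*') #all files in the working directory that start with the stem
    only_pictures = [x for x in all_files if ".json" not in x] #drop the JSON files
    return only_pictures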
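
To see what the stem matching picks up, here is a small self-contained sketch that runs on empty placeholder files in a temporary folder. The function pictures_for_title is a hypothetical variant of get_picture_names that takes the folder as an argument instead of relying on the working directory, and the file names (including the -edited suffix) are only examples.

```python
import glob
import os
import tempfile

def pictures_for_title(folder, title):
    """Hypothetical variant of get_picture_names that searches in an explicit folder."""
    stem = title.replace('.json', '').replace('.jpg', '')
    # glob.escape keeps characters such as '[' in names from being treated as wildcards
    pattern = os.path.join(glob.escape(folder), glob.escape(stem) + '*')
    return sorted(p for p in glob.glob(pattern) if '.json' not in os.path.basename(p))

with tempfile.TemporaryDirectory() as folder:
    # create empty placeholder files, no real image data is needed for the matching
    for name in ('IMG_0001.jpg', 'IMG_0001-edited.jpg', 'IMG_0001.jpg.json', 'IMG_0002.jpg'):
        open(os.path.join(folder, name), 'w').close()
    found = [os.path.basename(p) for p in pictures_for_title(folder, 'IMG_0001.jpg')]
    print(found)
```

Keep in mind that this is a prefix match: a stem such as IMG_0001 would also match a file called IMG_00012.jpg, so it is worth checking the printed file names when the numbering in a folder is dense.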

Next, we have two functions to get the geo data from the JSON file and convert this information to a format that we can insert into the EXIF of the JPGs. The second function (geo_degrees_conv) has one input variable (decimal coordinates) and returns a new format (degrees, minutes, seconds) in line with the EXIF standard. It is important to note that the format has to match the syntax ((X, X), (X, X), (X, X)), where each pair holds the numerator and the denominator of a fraction. To get these individual pairs, we use the first function (get_tuples), which iterates over the fractions for degrees, minutes, and seconds and returns each of them as such a pair, e.g., (X, X):

def get_tuples(combined):
    return [(item.numerator,item.denominator) for item in combined]

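def geo_degrees_conv(degrees):
    #note: EXIF stores these fractions as unsigned values, so a negative coordinate cannot be written as is;
    #pass abs(value) and record the hemisphere in the GPSLatitudeRef/GPSLongitudeRef tags
    is_positive = degrees >= 0
    degrees = abs(degrees)
    minutes,seconds = divmod(degrees*3600,60) #total arc seconds -> whole minutes and remaining seconds
    seconds = round(seconds,2)
    degrees,minutes = divmod(minutes,60) #whole minutes -> whole degrees and remaining minutes
    degrees = degrees if is_positive else -degrees
    degrees2 = Fraction(degrees).limit_denominator(1000)
    minutes = Fraction(minutes).limit_denominator(1000)
    seconds = Fraction(seconds).limit_denominator(1000)
    combined = [degrees2,minutes,seconds]
    combined = get_tuples(combined) #convert each Fraction to a (numerator, denominator) pair
    return combined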
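
As a worked example of the conversion, the following sketch turns decimal degrees into the degrees/minutes/seconds pairs and additionally returns the hemisphere letter, which EXIF keeps in separate reference tags because the fractions themselves are unsigned. The helper to_exif_dms is a hypothetical illustration and not the function used in this post. It works in hundredths of an arc second with integers, which is a design choice of this sketch to avoid floating-point rounding surprises.

```python
from fractions import Fraction

def to_exif_dms(decimal_degrees, positive_ref, negative_ref):
    """Hypothetical helper: return (ref_letter, ((deg, 1), (min, 1), (sec_num, sec_den)))."""
    ref = positive_ref if decimal_degrees >= 0 else negative_ref
    # total angle in hundredths of an arc second, as an integer
    total = round(abs(decimal_degrees) * 3600 * 100)
    degrees, remainder = divmod(total, 360000)     # 360000 hundredths per degree
    minutes, hundredths = divmod(remainder, 6000)  # 6000 hundredths per arc minute
    seconds = Fraction(hundredths, 100)            # reduces e.g. 3024/100 to 756/25
    return ref, ((degrees, 1), (minutes, 1), (seconds.numerator, seconds.denominator))

print(to_exif_dms(48.8584, 'N', 'S'))    # ('N', ((48, 1), (51, 1), (756, 25)))
print(to_exif_dms(-122.4194, 'E', 'W'))  # ('W', ((122, 1), (25, 1), (246, 25)))
```

Reading the first result back: 48 degrees, 51 minutes, and 756/25 = 30.24 seconds, i.e. 48 + 51/60 + 30.24/3600 = 48.8584.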

Last, we have a look at the function that changes the EXIF data of the JPGs. It takes the dictionary from the JSON file, calls the respective functions, and writes the new date/time taken and the geolocation to the EXIF dictionary, which is then returned to the main loop.

def modify_exif(exif_dict, filename, json_data): #input1: exif dictionary, input2: the filename, input3: the json data
    #date/time taken (fromtimestamp converts to the local time zone of this machine)
    photo_date = float(json_data['photoTakenTime']['timestamp'])
    photo_date = datetime.fromtimestamp(photo_date)
    photo_date = photo_date.strftime("%Y:%m:%d %H:%M:%S")
    #tag 36867 is DateTimeOriginal; fall back to a placeholder if the JPG had no value
    original_datetime = exif_dict['Exif'].get(piexif.ExifIFD.DateTimeOriginal, '0:0:0 0:0:0')
    exif_dict['Exif'][piexif.ExifIFD.DateTimeOriginal] = photo_date #write the json data to the EXIF dictionary
    print('json_file:', json_data['title'], 'picture:' , filename, 'date stored in exif was:', original_datetime, 'and is changed to:' , photo_date)
    #geo data: EXIF stores unsigned values, so the sign goes into the reference tags (N/S and E/W)
    latitude = json_data['geoData']['latitude']
    longitude = json_data['geoData']['longitude']
    exif_dict['GPS'][piexif.GPSIFD.GPSLatitudeRef] = 'N' if latitude >= 0 else 'S'
    exif_dict['GPS'][piexif.GPSIFD.GPSLatitude] = geo_degrees_conv(abs(latitude)) #write the latitude information from the JSON to the EXIF dictionary
    exif_dict['GPS'][piexif.GPSIFD.GPSLongitudeRef] = 'E' if longitude >= 0 else 'W'
    exif_dict['GPS'][piexif.GPSIFD.GPSLongitude] = geo_degrees_conv(abs(longitude)) #write the longitude information from the JSON to the EXIF dictionary
    print('json_file:', json_data['title'], 'picture:' , filename, 'latitude set to:',exif_dict['GPS'][piexif.GPSIFD.GPSLatitude],
          'longitude set to:',exif_dict['GPS'][piexif.GPSIFD.GPSLongitude])
    return exif_dict #return the exif dictionary to the main loop
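
One detail of the date handling is worth a closer look: datetime.fromtimestamp converts the timestamp to the local time zone of the machine the script runs on, and the EXIF date string itself carries no time zone. In addition, as far as I know, fromtimestamp can reject negative timestamps (dates before 1970) on some platforms, notably Windows, which matters for scanned old photos. The following sketch shows one possible alternative that adds the seconds to the epoch instead. The helper exif_datetime_utc is hypothetical and not part of the script, and it deliberately produces UTC rather than local time.

```python
from datetime import datetime, timedelta, timezone

# start of Unix time, used as the reference point for the timestamp
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def exif_datetime_utc(timestamp):
    """Hypothetical helper: format a Unix timestamp as 'YYYY:MM:DD HH:MM:SS' in UTC."""
    moment = EPOCH + timedelta(seconds=float(timestamp))  # also works for negative values
    return moment.strftime('%Y:%m:%d %H:%M:%S')

print(exif_datetime_utc('1577836800'))  # 2020:01:01 00:00:00
print(exif_datetime_utc('-86400'))      # 1969:12:31 00:00:00
```

Whether UTC or local time is the better choice depends on how the result will be used; the script in this post uses the local time of the machine.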

As we can see, it was quite easy to implement a Python script that extracts the JSON data from Google Photos and updates the EXIF data for each JPG.

Complete Code

# -*- coding: utf-8 -*-
"""
Created on Thu Jul 30 16:52:15 2020


"""
import json
import os
import glob
import re
from fractions import Fraction
import piexif
from datetime import datetime


def get_tuples(combined):
    return [(item.numerator,item.denominator) for item in combined]

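def geo_degrees_conv(degrees):
    #note: EXIF stores these fractions as unsigned values, so a negative coordinate cannot be written as is;
    #pass abs(value) and record the hemisphere in the GPSLatitudeRef/GPSLongitudeRef tags
    is_positive = degrees >= 0
    degrees = abs(degrees)
    minutes,seconds = divmod(degrees*3600,60) #total arc seconds -> whole minutes and remaining seconds
    seconds = round(seconds,2)
    degrees,minutes = divmod(minutes,60) #whole minutes -> whole degrees and remaining minutes
    degrees = degrees if is_positive else -degrees
    degrees2 = Fraction(degrees).limit_denominator(1000)
    minutes = Fraction(minutes).limit_denominator(1000)
    seconds = Fraction(seconds).limit_denominator(1000)
    combined = [degrees2,minutes,seconds]
    combined = get_tuples(combined) #convert each Fraction to a (numerator, denominator) pair
    return combined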

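def get_picture_names(json_filename):
    #remove both file endings; the calls are chained so that the result of the first one is not discarded
    jpg_filename_stemmed = json_filename.replace('.json','').replace('.jpg','')
    all_files = glob.glob(jpg_filename_stemmed+'*') #all files in the working directory that start with the stem
    only_pictures = [x for x in all_files if ".json" not in x] #drop the JSON files
    return only_pictures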

def modify_exif(exif_dict, filename, json_data): #input1: exif dictionary, input2: the filename, input3: the json data
    #date/time taken (fromtimestamp converts to the local time zone of this machine)
    photo_date = float(json_data['photoTakenTime']['timestamp'])
    photo_date = datetime.fromtimestamp(photo_date)
    photo_date = photo_date.strftime("%Y:%m:%d %H:%M:%S")
    #tag 36867 is DateTimeOriginal; fall back to a placeholder if the JPG had no value
    original_datetime = exif_dict['Exif'].get(piexif.ExifIFD.DateTimeOriginal, '0:0:0 0:0:0')
    exif_dict['Exif'][piexif.ExifIFD.DateTimeOriginal] = photo_date #write the json data to the EXIF dictionary
    print('json_file:', json_data['title'], 'picture:' , filename, 'date stored in exif was:', original_datetime, 'and is changed to:' , photo_date)
    #geo data: EXIF stores unsigned values, so the sign goes into the reference tags (N/S and E/W)
    latitude = json_data['geoData']['latitude']
    longitude = json_data['geoData']['longitude']
    exif_dict['GPS'][piexif.GPSIFD.GPSLatitudeRef] = 'N' if latitude >= 0 else 'S'
    exif_dict['GPS'][piexif.GPSIFD.GPSLatitude] = geo_degrees_conv(abs(latitude)) #write the latitude information from the JSON to the EXIF dictionary
    exif_dict['GPS'][piexif.GPSIFD.GPSLongitudeRef] = 'E' if longitude >= 0 else 'W'
    exif_dict['GPS'][piexif.GPSIFD.GPSLongitude] = geo_degrees_conv(abs(longitude)) #write the longitude information from the JSON to the EXIF dictionary
    print('json_file:', json_data['title'], 'picture:' , filename, 'latitude set to:',exif_dict['GPS'][piexif.GPSIFD.GPSLatitude],
          'longitude set to:',exif_dict['GPS'][piexif.GPSIFD.GPSLongitude])
    return exif_dict #return the exif dictionary to the main loop

    
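path = input('Please enter the path of the Google Photo Takeout directory:') #do not name the variable "input", that would shadow the built-in function
path = re.sub('\\\\', '/', path) #replace every backslash with a forward slash
os.chdir(path)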

json_files = [pos_json for pos_json in os.listdir(path) if pos_json.endswith('.jpg.json')]
for js in json_files:
    with open(os.path.join(path, js), encoding='utf-8') as json_file:
        json_data = json.load(json_file) #load each json file data
    picture_names = get_picture_names(json_data['title']) #get the names of the jpgs from the json file
    for i in picture_names: #iterate over each picture
        try:
            exif_dict = piexif.load(i) #load the EXIF metadata
            exif_dict = modify_exif(exif_dict , i, json_data) #update the EXIF metadata
            exif_bytes = piexif.dump(exif_dict)
            piexif.insert(exif_bytes, i) #write the new EXIF metadata to the jpg file
        except Exception as e: #report which picture failed and why, then continue with the next one
            print('error:', i, e)