Exporting Google Doc Revision History Content

Today I was working with what I’d call a common task in Google Drive revolving around revision history. Revision history is a great way to always have a way to go back in time to a different version of a document. These points in time also are really handy for if you want to see the development of the writing process. However, one challenge is finding a way to export these changes. What happens if you want to be able to view everything in a document? You have to go step-by-step reverting and saving the document.

Similar to the idea of AppSmashing (where the output of one application on iOS becomes the input to another) I was looking for a way to ToolSmash. I’m going to call ToolSmashing using the outputs of one technical tool as the input to another. Just like when coding, this turned into a few iterations to get the entire process down each time pulling in a different tool. In the end, this used:

Google APIs

BBEdit
(Text Wrangler)

Excel
(Google Sheets)

Script Editor

What did all these tools allow me to do?

  1. The Google APIs allowed me to determine the revision IDs of a document
  2. BBEdit allowed me to do some find and replaces to convert the list into a CSV. In addition, find and replace was used for formatting into Script Editor
  3. Excel was used to import in the Revision IDs and create a formula to fill in and generate URLs of the specific revision.
  4. Script Editor was used to paste in the final commands and automate the execution / download of the revisions.

What did all this not allow me to do? A big disadvantage to exporting the revision history is you lose the ability to see which individual made a specific change. Keep that in mind.

How do we do all this magic with revision history? Well, lets get started! I’ve created and shared a Sample Google Doc where anyone with the link can Edit. That means this is interactive. The link to the Demo document is:

https://docs.google.com/document/d/1myNvURQOizb1NOju-GKzLOJouuAE_mFBO5CjXpgiHAA/

Go ahead, feel free to make a change and add some content to the document. What you will need from that URL is the unique document ID. In the case of this document that is:

1myNvURQOizb1NOju-GKzLOJouuAE_mFBO5CjXpgiHAA
Next, we are going to go Google’s Drive API on the Revisions list command. Using the Document ID above, we will retrieve the revisions/id data on the file by authorizing the API access to Drive, providing the fileID and clicking execute.
After clicking execute you should see a list of the Revision IDs existing in the sample document. You will want to copy and paste these to use in the next step. Note we will begin the copy the line below “revisions”

Not so bad, right? That is just step one of our ToolSmash. Next we will be working to manipulate the revision IDs. I used BBEdit but TextWrangler or another tool allowing you to easily find and replace content.

Once you have copied your returned data of revision IDs we are going to begin eliminating unneeded characters by finding them and replacing them with a null value (nothing as the replace with). You will want to do a find and replace for the following characters:

  1. {
  2. },
  3. }
  4. ]
  5. ” (Note: This is a single quote)
  6. id:
  7. ”  ” (Note: This is two spaces without surrounding quotes)

 

 

Our next find is going to be a find and replace. We will want to find all occurrences of nn and replace them with n

This will find all occurrences of multiple new-line characters and replace them with a single new line. You may need to perform this search more than once. You will know this is done when you are only left with a single column of revision ID numbers as displayed below.


We are now half way home! As you recall Step One of this ToolSmash used the Google Drive APIs to retrieve the revision ID numbers of a specific document. Step Two was to manipulate the returned data and leave us with only the revision ID numbers themselves. Our next step will be to use a tool like Excel or Google Sheets to create the URLs of each document ID we would like to export.

To start, I’ve created a shared Google Sheet with all the necessary URL formatting and equations to generate the URLs for you. That is linked below and opening the document will prompt you to make a copy.

https://docs.google.com/spreadsheets/d/10ODeJmj1TeT6b9Lq4fCTyPwF79FLURI0KDIV8iOMagI/copy

Using this sheet you will only need to provide the Document ID to obtain the revisions of and the revision ID numbers themselves. Paste those into the respective places in the Data to Provide tab and view the generated URLs on the Built URLs tab. For reference, the URL components are provided as well. In addition, only the first 200 fields of the sheet are populated with equations.

Once all the data is provided, you should see a result similar to the screenshot below. If you use the sample revision history document references earlier in this guide, you’ll see what your peers reading this guide have provided. In addition, these links will download as a TXT file.


Time for the last part of this guide. Our final step will be to take these links and automate the download. While 4 revision histories aren’t too bad to work with manually, if you have 50 revision histories, the process is quite tedious. Since I’m around Apple devices, this will use the Script Editor application on OS X. But, before we jump into this code, we want to make another visit to our text editor friend.

Copy all the URLs with data (using the screenshot above, this was the first four URLs) and paste them into your text editor. You will want to perform another find and replace. This time, however we are going to use the tool to add in a delay so we wait 2 seconds between each URL opening in our web browser. You may increase this number if you would like. To do this, we will do a find and replace with the following:

Lastly, open Script Editor and create a new script. I’m using Google Chrome but you could easily use Safari or FireFox. You will need to take the freshly formatted text and paste it into the window where indicated in the image below.


Congratulations! You made it to the end! That should mean as soon as you run the Script Editor Document above you will be able to download each revision history of a file, by Google Drive File ID. Before running the command, there are a few items to keep in mind

  • The raw text of the revision history is being exported. This will not include information about WHO made the change. At the moment this is really only viewable in the Google Document itself.
  • Your browser may have a limit as to the number of items it will auto-name as duplicates. For example File (1), File (2), etc. Using Chrome this limit was a total of 100 items, so it is advisable to break the downloads into smaller chunks, move and rename them, and then begin again.
  • Please test the code with a small number of files before using it for 100 files. It will save you a lot of tabs, system sounds and frustration.
  • Make sure you are signed into the browser with the account in Google Drive that can view revision history.
  • If you find a better way to do parts, please let me know. I’d be happy to update this and make it easier for everyone else.