User Data Export Approaches of SaaS / Cloud Service / Social Media Providers — An Overview

Continuing with the backup theme of this week, I decided to create a GitHub repository documenting the user backup export options of common SaaS and cloud providers.

These are all the services I have taken my own backups of this week — and they vary quite significantly in methodology and ease.

Here’s an alphabetical runthrough:


cPanel

Backing up a cPanel web hosting account is relatively straightforward. Although there are a variety of proprietary tools on the market, such as JetBackup, cPanel includes a default option for initiating full account backups, including files and MySQL databases.

These can be taken cloud to cloud or delivered to email and moved elsewhere:


Facebook

I give Facebook top marks for the granularity and ease of its data export options.

As with Google Takeout, Facebook users can control exactly what goes into their export archive, including:

  • The date range
  • The format (HTML/JSON)
  • The media quality

You can use simple checkboxes to indicate which services you want Facebook to compress into your archive on the backend:

And you will receive a Facebook notification and email when the compression is over and the archive is ready to be downloaded:

The archive is neatly organized — although trying to decipher statuses from the HTML clutter can be a little tricky!


GitHub

Exporting your repositories from GitHub is pretty straightforward.

Users can clone their repositories to local or cloud storage by using the Git CLI. Alternatively, repositories can be manually downloaded as archives and uploaded to another cloud storage repository.
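As a rough sketch of the Git CLI route, the Python snippet below shells out to `git clone --mirror`, which copies every branch and tag rather than just the default branch. The helper names, the repository URL, and the backup folder are all placeholders of my own, and it assumes `git` is installed and on your PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def mirror_command(repo_url: str, backup_dir: str) -> List[str]:
    """Build the git command that mirrors repo_url into backup_dir."""
    # Derive a folder name such as "your-repo.git" from the URL.
    name = repo_url.rstrip("/").split("/")[-1]
    if not name.endswith(".git"):
        name += ".git"
    return ["git", "clone", "--mirror", repo_url, str(Path(backup_dir) / name)]


def mirror_repo(repo_url: str, backup_dir: str) -> None:
    """Run the clone; raises CalledProcessError if git exits with an error."""
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(mirror_command(repo_url, backup_dir), check=True)


# Example call (placeholder URL, replace with one of your own repositories):
# mirror_repo("https://github.com/your-user/your-repo.git", "github-backups")
```

A mirror clone is a bare repository, so you won't see a working tree in the folder, but it can be cloned from again later to get one back.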

G Suite

Thanks to its Google Takeout backup engine, Google makes it very easy for users to export a copy of their data.

Google users can select precisely which services they wish to include in the export — although be warned, when YouTube videos are added to the mix these files can easily end up constituting tens of gigabytes.

Users can retrieve their generated Takeouts for some time after they have been created:


LastPass

Users can back up LastPass by navigating to the ‘Export’ functionality in advanced options.

Then click ‘Export’, and finally save the .php output to another cloud storage repository.


LinkedIn

LinkedIn has decent user data export functionality nested under the privacy tab:

After clicking ‘getting a copy of your data’, users can choose exactly what they would like to bundle in the archive:

The final archive will be delivered via email:

Medium

Dear Medium Staff,

I love this platform and your clean-cut UI. I liked how easy it was to find the ‘Download your Information’ button:

However, in the spirit of this article, I had to dig a little deeper and take a look at what you presented in the Zip archive:

The articles are .HTML files. That’s fine. But where are the images?

Answer: in the Medium CDN!

Your CDN and progressive image loading technology are amazing. But would you mind giving us access to our images, please?
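In the meantime, one possible workaround is to scan the exported HTML for image links so you at least know which files are still sitting on the CDN. The sketch below uses only Python's standard library; the function names are my own, and it assumes you point it at an unzipped folder containing the exported .html articles. Actually downloading the images is left out.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Dict, List


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def image_urls(html_text: str) -> List[str]:
    """Return the image URLs referenced in one HTML document."""
    parser = ImageCollector()
    parser.feed(html_text)
    return parser.urls


def image_urls_in_export(posts_dir: str) -> Dict[str, List[str]]:
    """Map each exported .html file name to the image URLs it references."""
    return {
        path.name: image_urls(path.read_text(encoding="utf-8"))
        for path in sorted(Path(posts_dir).glob("*.html"))
    }
```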


Quora

Better late than never, right?

Quora’s lack of a built-in backup functionality has been something of a sticking point among the user community for many years.

Thankfully, Quora seems to have seen the light. Although it doesn’t yet have a native backup export functionality, you can ask its support people to generate a download archive for you.

Proceed as follows:

Follow the link for data privacy:

You should eventually find this help center resource:

https://help.quora.com/hc/en-us/articles/360000839503-Can-I-get-a-copy-of-my-data-

Follow the instructions and within 72 hours, as stated, you will get your data by email.

My archive came in a little over 24 hours.

Quora will give you two archives: data and content. Content contains your answers and images:

The HTML file which the process outputs looks like this in a web browser:

Of course, if you keep the images in the same folder, they will also show up wherever you used them in your answers:


Reddit

Reddit has also never developed self-service backup functionality. As with Quora, you need to ask their support people for an archive of your data from their storage.

Do a little bit of digging in the help center:

And ask them for an archive. You can choose to download all your Reddit activity or just your activity over a specific date range:

A short while after you ask for your data, /u/RedditDataRequests will get back to you with your archive:

All your account activity should be in the archive:

This includes:

  • Your posts
  • Your comments
  • Your IP log
  • Your chat history
  • Your hidden posts
  • The subreddits

Todoist

I give my favorite task manager top marks for how easy it makes downloading backups of your own data.

True, todo lists are basically just text files, but Todoist creates a bunch of snapshots and allows you to download whichever dates you like.

The data is presented as a .CSV file for each project. However, I couldn’t find task attachments — like images or audio clips — anywhere in the archive. This could be improved upon.
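To sanity-check a downloaded snapshot, something like the following could count the rows in each project's CSV. This is only a sketch: the function name is made up, it assumes the backup has been unzipped into a folder of lowercase `.csv` files, and it deliberately makes no assumptions about Todoist's column names.

```python
import csv
from pathlib import Path
from typing import Dict


def rows_per_project(backup_dir: str) -> Dict[str, int]:
    """Return {csv file name: number of data rows} for an unzipped backup."""
    counts = {}
    for csv_path in sorted(Path(backup_dir).glob("*.csv")):
        # utf-8-sig tolerates a byte-order mark at the start of the file.
        with csv_path.open(newline="", encoding="utf-8-sig") as handle:
            # DictReader treats the first row as the header, so it is not counted.
            counts[csv_path.name] = sum(1 for _ in csv.DictReader(handle))
    return counts
```

A project that reports zero rows, or a project that is missing from the result altogether, is a quick hint that the snapshot is not what you expected.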


Twitter

Finally, the Twitter machine.

Twitter ranks towards the top of the table. It doesn’t take much digging to locate the data export functionality.

Once you’ve done that it’s a couple more clicks to download the archive:

Deciphering your own tweets isn’t too hard:

Twitter actually does a really good job of packaging these. All the media you have tweeted is in there too, at /data/tweet_media:
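If you want to confirm that the media really made it into your copy, a small script along these lines can total up what is in that folder. It assumes the archive has been unzipped and that the media lives under `data/tweet_media` as it did in mine; the function name is hypothetical.

```python
from pathlib import Path
from typing import Tuple


def tweet_media_summary(archive_root: str) -> Tuple[int, int]:
    """Return (number of files, total size in bytes) under data/tweet_media."""
    media_dir = Path(archive_root) / "data" / "tweet_media"
    files = [p for p in media_dir.iterdir() if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

If the folder is missing, `iterdir()` raises `FileNotFoundError`, which is itself a useful signal that the archive layout differs from the one described here.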

Concluding Thoughts

SaaS and cloud providers vary enormously in how well they allow users to download a copy of their own data.

I would suggest various criteria to judge a provider by:

  • Is the data export self-service, or do you need to contact support? Self-service is obviously optimal.
  • How is the data export packaged? Take a look inside an archive to find out
  • Are your images included or are these trapped in the provider’s CDN or simply not given out at all?
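The second and third checks can be partly automated. As a minimal sketch, the snippet below opens an export zip, counts the files by extension, and reports whether any common image types are present. The function names and the list of image extensions are my own choices, not anything a provider documents.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Extensions treated as images here; extend the set to suit your archive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def extension_counts(archive) -> Counter:
    """Count the files in a zip archive by lowercase extension."""
    with zipfile.ZipFile(archive) as zf:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in zf.infolist()
            if not info.is_dir()
        )


def has_images(archive) -> bool:
    """True if the archive contains at least one file with an image extension."""
    return any(ext in IMAGE_EXTENSIONS for ext in extension_counts(archive))
```

An archive that is all `.html` and no image files is exactly the "trapped in the CDN" situation described above.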

Hopefully, as awareness about the importance of data protection, governance, and portability continues to grow, we can look forward to an era in which every internet user has on-demand access to a well-structured and complete archive of all the data they have created in every cloud service that they use.