In the wake of the Cambridge Analytica scandal, I downloaded my Facebook personal data file, in an attempt to better understand the information which similar companies could have on me.

I found the experience to be a troubling one – albeit not exactly surprising. Suffice it to say, the file contained more or less everything about me. From the photos I had uploaded to the site to reams of messaging history, it was a reminder that the internet – or at least Facebook – apparently never forgets.

How can I download my personal data?

If you want to take the plunge and download your personal data file, you can do so by logging in to Facebook, clicking ‘settings’, and then ‘download a copy of your Facebook data’.

[Image: general account settings]

Depending on the size of the data file, you should receive a copy by email, hopefully in less than 30 minutes. It should look something like this:

[Image: data file]

Files of note

There are several folders and files in this download. In my opinion, the ones which are of most interest are:

  • Index – a general snapshot of the information on your profile
  • Messages – a repository of the conversations, images, videos, files, etc. shared through Facebook
  • HTML – there are several interesting files located here, namely ads, contact_info, and messages. The last of these is a thorough history of conversations.
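
If you would rather get a rough overview before opening files by hand, a short script can tally what the download contains. The following is only a minimal sketch in Python: the function name summarise_export and the folder name "facebook-export" are my own placeholders rather than anything Facebook provides, and it assumes you have already unzipped the download somewhere on your computer.

```python
from collections import defaultdict
from pathlib import Path


def summarise_export(root):
    """Return {top-level entry: (file count, total bytes)} for an unzipped export."""
    root = Path(root)
    summary = defaultdict(lambda: [0, 0])
    for path in root.rglob("*"):
        if path.is_file():
            # Group each file under the first folder (or file name) below the root.
            top = path.relative_to(root).parts[0]
            summary[top][0] += 1
            summary[top][1] += path.stat().st_size
    return {name: tuple(totals) for name, totals in summary.items()}


if __name__ == "__main__":
    # "facebook-export" is a placeholder: point it at wherever you unzipped the file.
    export_dir = Path("facebook-export")
    if export_dir.is_dir():
        for name, (count, size) in sorted(summarise_export(export_dir).items()):
            print(f"{name}: {count} files, {size / 1024:.1f} KB")
    else:
        print(f"No folder named {export_dir} found")
```

It only counts files and adds up their sizes per top-level folder, so it makes no assumptions about what is inside each file. It is simply a quick way to see where the bulk of the data sits.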

Reviewing these, I learned several interesting things about my personal data.

There are false records

The index file contains more information than I would expect to find. For example, I wouldn’t expect previous relationships to be an important factor worth storing. Reviewing this field, I was surprised to see the name of someone I had never heard of. For the purposes of privacy, I will refer to this woman as Megan.


Looking into her own presence on Facebook, I was surprised to see Megan went to the same university I did.

Then it hit me.

Back in university, I was ‘Facebook hacked’. Rather than some elaborate cybercrime, it was my own stupid fault for staying logged in on a public computer. Some other students changed a lot of my personal information, including my relationship status – adding the name of someone I had never met.

Back at home, I corrected the information and changed my password – thinking nothing of it. Years later, Facebook still thinks Megan and I were in a relationship.

‘Facebook hacking’ was – and probably still is – common at my university. As a result, we can only guess at the number of profiles containing false information. It is also possible that this false information influences which display adverts appear.

Facebook has an extensive store of my conversation history

I live approximately 200 miles from my parents, so I use Facebook Messenger to communicate with my family and friends. Although I shouldn’t be shocked that Facebook has stored all these conversations, the record is possibly too thorough.

Located under the HTML section, the ‘messages’ file contains all the information you’ve typed into Facebook Messenger. For example, I was able to locate this conversation I had back in 2012:

This folder also contains call histories from Messenger – each one detailing how long the conversation lasted:


Although this is trivial information, it stands to reason that Cambridge Analytica could have obtained full transcripts of the conversations people were having through Facebook – personal details which the participants thought would remain private.

At the very least, it doesn’t seem necessary for Facebook to keep in-depth records of these conversations. The purpose of hoarding all this information is also unclear, but it seems likely that these transcripts influenced the construction of buyer and user personas.

Slightly worryingly, Facebook keeps a record of all the telephone numbers of my contacts – many of which I no longer have myself. The list is located under contact_info in the HTML folder, and it would be troubling if advertisers received this information.

There’s a list of keywords to help advertisers target me

Located in the HTML section, the ads file contains several interesting insights. The first section, ‘Ads Topics’, is a list of keywords which appear to influence the adverts displayed on my feed:

[Image: ad topics]


I’d expect to see some of those phrases, such as the Onion and Sarcasm Society, as they are accounts which I’ve subscribed to. However, as anyone who knows me well enough will tell you – I hate gardening.

There are other disparities in this list as well. Although the organisation seems to think I like Guitar Hero, I haven’t played the game since my early twenties. It certainly wouldn’t be something I’d be interested in now.

Therefore, at least in my case, the information available to advertisers is out of date.

Which advertisers have my contact information?

Available in the same file is a list of advertisers which have my contact information. Most notable are the different branches of Sony PlayStation:


Why profiling needs context

For marketing professionals, the data collected by Facebook is a goldmine. However, it is only useful if we understand context. For example, if the ads topics were assigned to me just because of articles I clicked on out of boredom during my commute, that data isn’t as useful as it could be.

Furthermore, the incorrect information identified earlier gives advertisers false details about me – skewing their customer analysis. Understanding why a person performed those actions is far more useful than seeing the actions themselves.

The data collected by Facebook is extensive but, without analysis and auditing, is largely irrelevant to marketers. Instead, many users are only now realising how much information is held on them. Understandably, the reaction has been largely negative.

Although the scandal is still developing and will probably lead to additional regulation for social media companies, this exploration into my personal data has highlighted just how much of my data is available online.

Would I delete my Facebook account?

Following the Cambridge Analytica revelations, a #deletefacebook campaign has started to build momentum. As for myself, the answer is no. I use it to keep in touch with people and – as demonstrated – my data is already out there.

Most worrying of all though is this – I barely use Facebook. I dread to think how large the data files of more regular users are.