I can remember the first time that I built a machine learning model - I used it to solve a real problem.

It was a binary classifier (that is, it gave a Yes / No answer to a question) and it worked brilliantly.

The Why: Why I did it

When I was working on my PhD we wrote an article and submitted it to a journal.

We got feedback suggesting that we should reference a couple of papers, mainly by one particular author. We didn’t necessarily agree with the additional references, but it didn’t change the nature of the work, so we made the changes and resubmitted, and the paper was published.

However, I wanted the process to be smoother. I was researching in a new field: apart from the team at the CSSM, there were only a few other researchers worldwide, and we knew them all. It was a fairly safe bet that my reviewer was someone I knew.

The How: How I did it

So I built a model to try to identify which of the researchers in my suspect pool were most likely the author of the review.

It’s similar to how researchers have used machine learning to identify how much of a Shakespeare play was written by someone else. I didn’t use any of the more modern techniques; I based my approach on email spam detection.

I wrote a spam detector in C (which I subsequently used successfully for years on my incoming mail) and then trained it on a corpus of email. For each suspected reviewer I (effectively) flagged their email as spam and everything else as ham (i.e. not-spam). There was a wrinkle: emails from physicists usually use different terminology from emails from the rest of your correspondents (how many people use renormalisation or Gedanken experiment in their daily messages?), so I trained mostly on physics-type messages rather than on everything.

So now I had a set of detection models, one for each potential reviewer, and I simply applied each of them to the anonymous feedback from the reviewer. These models looked at the common phrases and word usage of the various authors.
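
Here is a minimal sketch of the one-detector-per-suspect idea, in Python rather than C. It assumes a naive Bayes word-count filter with add-one smoothing, which is a common spam-detection technique and not necessarily what the original code did. The class and function names, the authors and the messages are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class AuthorDetector:
    """One-vs-rest filter: 'spam' is the target author, 'ham' is everyone else."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, is_target):
        label = "spam" if is_target else "ham"
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def score(self, text):
        """Log-odds that the target author wrote the text (> 0 means flagged)."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_odds = 0.0
        for label, sign in (("spam", 1), ("ham", -1)):
            n_words = sum(self.counts[label].values())
            # Class prior, smoothed so an empty class never gives log(0).
            log_p = math.log((self.docs[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                # Add-one smoothing, with one extra slot for unseen words.
                seen = self.counts[label][word]
                log_p += math.log((seen + 1) / (n_words + len(vocab) + 1))
            log_odds += sign * log_p
        return log_odds


def build_detectors(corpus):
    """corpus maps author -> list of messages; returns one detector per author."""
    detectors = {}
    for target in corpus:
        detector = AuthorDetector()
        for author, messages in corpus.items():
            for message in messages:
                detector.train(message, is_target=(author == target))
        detectors[target] = detector
    return detectors


# Invented toy corpus standing in for the physics-heavy email archive.
corpus = {
    "alice": [
        "the lattice results clearly indicate chiral symmetry",
        "clearly the lattice spacing matters",
    ],
    "bob": [
        "one should perhaps cite the earlier renormalisation work",
        "perhaps one should renormalise differently",
    ],
    "carol": [
        "we find the quark mass dependence is mild",
        "we find good agreement with experiment",
    ],
}

# Invented stand-in for the anonymous review text.
review = "perhaps one should cite the earlier work"

detectors = build_detectors(corpus)
scores = {name: det.score(review) for name, det in detectors.items()}
for name, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:+.2f}")
```

With this toy corpus only the detector for the hypothetical author `bob` gives a positive score. A real corpus would need far more text per author before a result like that meant anything.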

The result was compelling. One potential author was flagged; the others weren’t.

The Action: How I used the result

Building a machine learning model can be fun, but there’s really not much point unless you put it to use. How did I do that?

I used it to build relationships. I knew who was likely to be a reviewer of my work (and perhaps my PhD when it was finished). So the next conference I was at, I made sure to buy this person some drinks, get to know them, talk about my work and theirs.

I think this made me a better physicist as I understood more of other people’s work and was able to extend my own. I think it also helped me in my post-academic life - building models is fine, but you need to get them into production.