How important is peer-to-peer loan diversification?

Andrey Kamenov, Ph.D. Probability and Statistics

In recent posts, we have seen that both the Prosper and Lending Club loan grading systems do their job quite well, allowing you to assess your risks with each particular borrower.

What a loan grade doesn’t tell us, however, is whether loan charge-offs are correlated with one another. As you may know, one of the reasons for the 2007 subprime mortgage crisis was that the probability of a widespread wave of defaults was underestimated.

Now, we should probably provide an important disclaimer: there is no way to estimate the probability of so-called “black swan” events given the data we have. We can only manage systematic risk, and any smart investor should keep in mind that there will always be risks not present in his or her model.

So, with that in mind, let us assume that the individual default events are all independent. We’ll now see how well this assumption fits the data we can download from the Lending Club website.


Until the middle of 2012, our average estimate is a bit off, most likely due to some minor changes in the loan grading system.

After that, however, the observed default rate fits the confidence range quite well, rarely showing significantly better- or worse-than-expected default numbers.
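The independence check described above can be sketched in a few lines of Python. This is only an illustration, not the calculation behind the article: the function name, the 95 percent normal approximation, and the example cohort are all assumptions. If defaults are independent, the number of defaults in a cohort has mean equal to the sum of the per-loan default probabilities and variance equal to the sum of p(1 - p).

```python
import math

def default_count_band(probs, z=1.96):
    """Expected number of defaults and an approximate normal band,
    assuming defaults are independent with per-loan probabilities `probs`."""
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)  # variance of a sum of independent Bernoulli trials
    sd = math.sqrt(var)
    return mean, mean - z * sd, mean + z * sd

# Hypothetical cohort (not Lending Club data): 1,000 loans, each with a 10% default probability.
probs = [0.10] * 1000
mean, low, high = default_count_band(probs)

observed = 108  # made-up observed default count
inside = low <= observed <= high
print(f"expected {mean:.1f}, band [{low:.1f}, {high:.1f}], observed inside band: {inside}")
```

An observed count that repeatedly falls outside such a band would be evidence against the independence assumption, or against the estimated default probabilities themselves.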

Another thing that we should address, though, is the presence of any geographic correlations. One can argue that you would be safer if you tried to select borrowers from different states.

As the map below shows, statewide default counts that deviate from the expected value by more than three standard deviations are quite rare; the largest deviations seen are around six standard deviations (Texas ’14 and Florida ’13).
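A deviation "in standard deviations" for one state and one year can be computed roughly as follows. This is a minimal sketch under the same independence assumption; the function name and the numbers are hypothetical and are not taken from the Lending Club data.

```python
import math

def state_z_score(observed_defaults, probs):
    """Standardized deviation of a state's observed default count from the
    count expected under independent defaults with probabilities `probs`."""
    expected = sum(probs)
    sd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return (observed_defaults - expected) / sd

# Hypothetical state-year: 400 loans at a 10% default probability, 58 observed defaults.
z = state_z_score(58, [0.10] * 400)
print(f"z-score: {z:.2f}")
```

Here the expected count is 40 with a standard deviation of 6, so 58 defaults sits three standard deviations above expectation.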

What does this mean for an investor? According to our calculations, funding 10 different loans in the same state is only 2.5 percent riskier, in terms of variance, than choosing borrowers randomly. So one can argue that (again, considering only systematic risk) there is little value in paying close attention to geographic loan diversification.
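To get a feel for what a 2.5 percent variance increase implies, here is a back-of-the-envelope sketch. It is not the calculation used for the figure above; it assumes, purely for illustration, that every pair of same-state loans shares one pairwise default correlation rho and that randomly chosen loans are uncorrelated. Under that assumption the variance of the number of defaults among n loans grows by a factor of 1 + (n - 1) * rho.

```python
import math

def variance_ratio(n, rho):
    """Variance of the default count for n equally correlated loans,
    relative to n independent loans (assumed equicorrelation model)."""
    return 1.0 + (n - 1) * rho

def implied_correlation(n, ratio):
    """Pairwise correlation that would produce the given variance ratio."""
    return (ratio - 1.0) / (n - 1)

rho = implied_correlation(10, 1.025)   # 10 loans, 2.5 percent more variance
sd_ratio = math.sqrt(variance_ratio(10, rho))
print(f"implied pairwise correlation: {rho:.4f}")
print(f"standard deviation ratio: {sd_ratio:.4f}")
```

Under these assumptions the implied pairwise correlation is below 0.003, and the standard deviation rises by only a little over 1 percent.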

Source: Lending Club downloadable data

About Andrey Kamenov

Andrey Kamenov is a data scientist working for Advameg Inc. His background includes teaching statistics, stochastic processes and financial mathematics in Moscow State University and working for a hedge fund. His academic interests range from statistical data analysis to optimal stopping theory. Andrey also enjoys his hobbies of photography, reading and powerlifting.
