Markov Decision Process in Account Management of Consumer Lending
Recently, one of the main questions on my mind has been how to adjust a user's credit limit given a predicted risk. I consider myself a practitioner in the industry with limited experience. Currently, the account management process relies heavily on expert judgement. We may decrease the limit or freeze accounts that are not profitable, and increase the limit for users with low default risk. But the process is, I would say, 50% qualitative, relying on expert judgement, and 50% quantitative, relying on a behaviour scorecard. The decision-making process sometimes feels like an art, and there is tension between the risk team, which wants to be as conservative as possible, and the business team, which wants to be as aggressive as possible. Is there a single truth, or at least a set of better options that can be demonstrated? That question motivates my attempt to use portfolio management techniques to determine an optimal account management policy.
It is so true that the more you know, the more you know you don't know. While researching online, I discovered that this topic is a very old one, dating back to the 1960s and 70s, and I learned about several methods for solving the problem. I am still reading on the topic, and hopefully I can provide a literature summary of its development later. In this article, however, I would like to focus on one particular method: the Markov Decision Process.
The optimization goal is profitability. Simple as it sounds, it is sometimes difficult to pursue if a team focuses on only one factor of the equation, namely risk or revenue. The definition of profitability also requires a time frame and a unit of measurement. For example, do we mean immediate profitability, profitability within a year, or lifetime profitability? Do we mean the profitability of a single product, or profitability that takes into account the possibility of selling other products to the customer? Aligning on a definition of profitability is an important step in helping the different groups within an organization make coherent decisions.
Let's consider a simple case where profitability is measured over the user's lifetime and involves only one product. We ask ourselves, what affects profitability? The default risk, i.e., the probability of default. Assume that the probability of default is stationary, meaning the prediction we made is accurate and will not change in response to our actions or the external environment. Then the policy that maximizes the profitability of the portfolio is to grant every user as high a limit as they want: since their default risk is unchanged, the extra interest income is guaranteed profit. However, it is definitely not true that a user's default risk is unchanged. For example, if we grant a user a 1 million USD limit, he may choose to default immediately and run away. The probability of default is therefore not stationary; it is a dynamic process.
The Markov Decision Process takes the transition probabilities into account in order to optimize profitability over a user's lifetime. The probability of default is called a state in the model. There can be other states, such as account balance and utilization. A state contains information about the user, and the word Markov means that all relevant information is contained in the current state; past states provide no extra information. For example, whether a user was in jail before carries no extra information, given that he is not in jail right now. We then create a state transition matrix, which gives the probability of a user transitioning from one state to another. With the transition matrices, we are able to write a policy function like the following:
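One way to sketch the one-step profit function described below, in my own (assumed) notation with b for the account balance, s for the credit score, l for the assigned limit, p_d for the probability of default, and r for the interest rate:

```latex
\pi(b_{t-1}, s_{t-1}, l)
  = \sum_{b_t,\, s_t} P\bigl(b_t, s_t \mid b_{t-1}, s_{t-1}, l\bigr)
    \Bigl[ (1 - p_d)\, r\, b_t \;-\; p_d\, b_t \Bigr]
```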
The profit (pi) is a function of the account balance (b), the credit score (s), and the limit we are going to assign (l). The balance and score are taken at t-1 because they are the information available before a limit is assigned at t. The second part of the equation denotes the loss if a user defaults (the probability of default times the balance) and the profit if the user does not default (one minus the probability of default, times the interest rate, times the balance). The first term denotes the probability of transitioning to a new balance and a new credit score, given the current balance, the current score, and the assigned limit. We repeat this process over the lifetime of the user (here assumed to be infinite) and optimize the total profit.
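To make the one-step calculation concrete, here is a minimal sketch in Python. Every number in it (the balance levels, the default probability per score band, the interest rate, and the transition probabilities) is a made-up illustration, not a calibrated value:

```python
import numpy as np

# Hypothetical discretization: 3 balance levels x 2 score bands = 6 states.
balances = np.array([1_000.0, 5_000.0, 10_000.0])   # possible balances b_t
pd_by_score = np.array([0.20, 0.02])                # default prob. per score band
r = 0.15                                            # annual interest rate

# Per-state profit: (1 - p_d) * r * b  -  p_d * b, for each (balance, score) pair.
profit = np.array([
    (1 - pd) * r * b - pd * b
    for b in balances
    for pd in pd_by_score
])

# Hypothetical transition probabilities P(b_t, s_t | b_{t-1}, s_{t-1}, l)
# for one current state and one candidate limit l; each limit would get its own row.
p_next = np.array([0.05, 0.15, 0.10, 0.40, 0.05, 0.25])
assert np.isclose(p_next.sum(), 1.0)

# One-step expected profit of assigning this limit from this state.
expected_profit = p_next @ profit
print(round(expected_profit, 2))
```

In a full model, we would evaluate this expectation for every candidate limit l and every current state, which is what the transition matrices make possible.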
The result of solving the equation is an intuitive one: because all information is contained in the current state, the optimal policy simply maximizes, in each state, the immediate profit of an action plus the (discounted) value of the next state.
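This is exactly what value iteration computes. The sketch below solves a toy version of the problem with three customer states and two actions (keep the limit or raise it); every transition matrix and profit figure is a hypothetical illustration:

```python
import numpy as np

# Toy MDP: 3 customer states, 2 actions (0 = keep limit, 1 = raise limit).
# Transition matrices P[a][i, j] and immediate profits R[a][i] are made up.
P = np.array([
    [[0.80, 0.15, 0.05],   # action 0: keep limit
     [0.30, 0.60, 0.10],
     [0.10, 0.30, 0.60]],
    [[0.60, 0.25, 0.15],   # action 1: raise limit (more profit, more risk)
     [0.20, 0.60, 0.20],
     [0.05, 0.25, 0.70]],
])
R = np.array([
    [10.0, 25.0, -40.0],   # action 0
    [18.0, 40.0, -90.0],   # action 1
])
gamma = 0.95               # discount factor for lifetime profit

# Value iteration: V(s) = max_a [ R(a, s) + gamma * sum_j P(a, s, j) V(j) ]
V = np.zeros(3)
for _ in range(1000):
    Q = R + gamma * (P @ V)        # Q[a, s]: value of action a in state s
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=0)          # best action in each state
print(V.round(2), policy)
```

The fixed point V is the lifetime profit of each state under the optimal policy, and the greedy action in each state is the limit decision we were after.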
I hope this short introduction inspires the reader to explore the topic further. Applications of the Markov Decision Process are wide-ranging and not limited to account management in consumer lending.