Abstract
Peer-to-peer (P2P) lending platforms have reshaped personal finance, but credit risk management remains a central challenge for them. This project analyzes the Lending Club dataset to identify the key drivers of loan default and to develop a framework for predictive risk assessment. The methodology covers data preparation, Exploratory Data Analysis (EDA), and the implementation of multiple machine learning models.
The EDA indicates that loan grade, debt-to-income (DTI) ratio, interest rate, and loan purpose are the strongest predictors of default. The resulting profile of a high-risk borrower combines a low loan grade (D or worse), a high DTI ratio, and a loan purpose of 'small business' or 'debt consolidation'.