Team
Qianjin Zhou
Data Science Student
Brandon Dioneda
Data Science Student
Mert Ozer
Data Science Student
Introduction
Traditional credit scoring models exclude individuals without credit history, limiting financial access. Our project develops a Cash Score, an alternative credit measure using financial behavior.
Assessing creditworthiness is a longstanding challenge. Modern credit scores have been in wide use since 1989, yet these traditional models often fail to account for the financial profiles of individuals who lack conventional credit histories. As a result, millions are excluded from fair access to credit.
Moreover, early credit evaluations were frequently marred by discriminatory practices, factoring in age, race, and marital status. Even today, reliance on conventional credit history can reinforce socioeconomic biases.
We analyze bank transactions, account activity, and income patterns for better credit assessment. With advancements in data infrastructure and open banking, we now have the technology to efficiently leverage financial data, making this the ideal moment to redefine credit assessment.
This allows us to extend loans to more newcomers, including immigrants and students, while also generating greater profits for our partners.
Why Cash Score?
- More inclusive than traditional credit scores
- Uses real-time transaction data
- Provides access to credit for underserved populations
- Reduces bias in credit assessment
- Captures day-to-day financial behavior
- Complements traditional credit scores
Research Question
How can machine learning be applied to develop a "Cash Score" that accurately reflects financial behavior and promotes equal access to credit?
The Problem
We faced two main challenges in this project:
- Transaction Categorization: Accurately categorizing bank transaction memos to understand spending patterns
- Default Prediction: Using categorized transactions and account data to predict loan default probability
Recent studies have shown that alternative data sources, including transaction data, can significantly improve credit scoring accuracy, especially for individuals with limited or no traditional credit history, thus addressing the issue of financial inclusion.
Data Overview
We used bank transaction data from 2017-2023 provided by PrismData to develop our models. The data includes detailed information about consumers' financial activities, account balances, and transaction histories.
Sample of Consumer Data
prism_consumer_id | evaluation_date | credit_score | DQ_TARGET |
---|---|---|---|
0 | 2021-09-01 | 726.0 | 0.0 |
1 | 2021-07-01 | 626.0 | 0.0 |
... | ... | ... | ... |
Sample of Account Data
prism_consumer_id | prism_account_id | account_type | balance_date | balance |
---|---|---|---|---|
3023 | 0 | SAVINGS | 2021-08-31 | 90.57 |
3023 | 1 | CHECKING | 2021-08-31 | 225.95 |
... | ... | ... | ... | ... |
Sample Transaction Data
prism_consumer_id | amount | credit_or_debit | posted_date | category |
---|---|---|---|---|
3023 | 0.05 | CREDIT | 2021-04-16 | MISCELLANEOUS |
10533 | 4.96 | DEBIT | 2021-03-11 | BILLS_UTILITIES |
... | ... | ... | ... | ... |
Transaction Categorization Data
prism_consumer_id | amount | memo | posted_date | category |
---|---|---|---|---|
3023 | 0.05 | INTEREST PAYMENT | 2021-04-16 | MISCELLANEOUS |
10533 | 4.96 | ELECTRIC BILL PAYMENT | 2021-03-11 | BILLS_UTILITIES |
... | ... | ... | ... | ... |
Our dataset includes 47 different transaction categories. For the memo categorization task, we focused on 9 key categories:
- FOOD_AND_BEVERAGES
- GENERAL_MERCHANDISE
- GROCERIES
- PETS
- TRAVEL
- MORTGAGE
- OVERDRAFT
- EDUCATION
- RENT
We kept only rows where the memo text differs from the category label, so that our models learn from informative memo text rather than from memos that simply repeat the label.
Each of the four datasets plays a distinct role:
- Consumer Data: Indicates whether a consumer defaulted on credit
- Account Data: Record of consumers' bank accounts
- Transaction Data: Tracks consumers' bank transactions
- Transaction Categorization Data: Contains transaction memos used for categorization
Data Preparation
We filtered out null or inconsistent records, removed outlier transactions with extreme values, and merged the datasets on consumer IDs to create a comprehensive view of each individual's financial behavior. For transaction categorization, we cleaned memo text by converting to lowercase, removing special characters and numbers, and trimming extra spaces.
Transaction Categorization
Memo Cleaning & Processing
To prepare our dataset for analysis, we cleaned the transaction memo text by:
- Converting all bank memos to lowercase
- Removing special characters and numbers
- Removing placeholder sequences (e.g., "xxx")
- Trimming extra spaces
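The cleaning steps above can be sketched in a few lines of Python. This is an illustrative example rather than the project's exact code: the function name `clean_memo`, the regular expressions, and the rule that a placeholder is a run of three or more "x" characters are all assumptions.

```python
import re

def clean_memo(memo: str) -> str:
    """Normalize a raw bank memo (hypothetical helper, not the project's code)."""
    # 1. Convert to lowercase.
    text = memo.lower()
    # 2. Replace digits and special characters with spaces (keep letters only).
    text = re.sub(r"[^a-z\s]", " ", text)
    # 3. Remove placeholder runs such as "xxx" or "xxxxxx".
    #    Assumption: any run of three or more x's is a masking placeholder.
    text = re.sub(r"x{3,}", " ", text)
    # 4. Collapse repeated whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()

# The memo below is a made-up example.
print(clean_memo("POS DEBIT XXXX1234 Starbucks #0423  Seattle WA"))
```

Replacing removed characters with a space, rather than deleting them, keeps neighboring words from being glued together before the final whitespace cleanup.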
We also extracted additional features from transaction data:
- TF-IDF: Identifying distinctive words in transaction memos
- Date Features: Month, day of week, weekend indicator
- Amount Features: Whether the amount is a whole number
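The date and amount features listed above can be derived with the standard library alone. The sketch below is hedged: the function name and the output keys are hypothetical, and it assumes `posted_date` is an ISO `YYYY-MM-DD` string as in the sample tables. TF-IDF is left out because it depends on the vectorizer settings, which are not specified here.

```python
from datetime import date

def date_amount_features(posted_date: str, amount: float) -> dict:
    """Build date and amount features for one transaction (illustrative names)."""
    d = date.fromisoformat(posted_date)
    day_of_week = d.weekday()  # Monday = 0 ... Sunday = 6
    return {
        "month": d.month,
        "day_of_week": day_of_week,
        # Saturday (5) and Sunday (6) count as the weekend.
        "is_weekend": int(day_of_week >= 5),
        # 1 when the amount has no cents, e.g. 20.00.
        "is_whole_amount": int(float(amount).is_integer()),
    }

print(date_amount_features("2021-04-16", 0.05))
```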
Categorization Models
We tested several models to categorize transactions based on their memos:
Model | Accuracy | Speed |
---|---|---|
FastText | 96.88% | Fastest |
Logistic Regression | 96.45% | Fast |
XGBoost | 91.98% | Medium |
DistilBERT | 89.56% | Slow |
LLMs (Nemotron/Llama) | ~75-78% | Very Slow |
FastText emerged as the best model, offering an excellent balance of accuracy, training time, and inference speed.
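For readers who want to reproduce a FastText classifier, the library's supervised mode expects one example per line, with the label prefixed by `__label__`. The sketch below only prepares that text format; the helper name `to_fasttext_lines` is hypothetical and the two rows are made-up examples.

```python
def to_fasttext_lines(rows):
    """Format (category, cleaned_memo) pairs as fastText supervised training lines."""
    return [f"__label__{category} {memo}" for category, memo in rows]

rows = [
    ("GROCERIES", "trader joes"),
    ("RENT", "monthly rent payment"),
]
for line in to_fasttext_lines(rows):
    print(line)
```

Writing these lines to a text file yields input that can be passed to `fasttext.train_supervised(input=...)`. The training hyperparameters behind the accuracy reported above are not stated here, so none are suggested.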
Feature Engineering
We created 266 features based on attributes in our datasets. Our features fall under 3 types:
- Bank Balance Features:
  - Current balance: Money in a person's account up to the most recent transaction
  - Balance over time: Account balance in 3, 6, 9, and 12-month periods
  - Average account balance: Mean transaction amount per consumer
- Income Features:
  - Time-based average transaction amounts: Monthly, weekly, yearly averages
  - Net monthly cash flow: Difference between inflows and outflows
  - Category statistics: Number, sum, mean, median, and variance of transactions by category
- Spending Features:
  - Outflow statistics: Spending habits over different time periods
  - Outflow over time: Mean and variance of spending over 3, 6, 9, and 12 months
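Two of the features above, net monthly cash flow and per-category statistics, can be illustrated with a short sketch. The function names and the tuple layouts are assumptions made for this example, and population variance is used only because the variance definition is not specified.

```python
from collections import defaultdict
from statistics import mean, median, pvariance

def net_monthly_cash_flow(transactions):
    """transactions: iterable of (posted_date 'YYYY-MM-DD', amount, 'CREDIT' or 'DEBIT').
    Returns month 'YYYY-MM' -> inflows minus outflows."""
    flow = defaultdict(float)
    for posted_date, amount, direction in transactions:
        month = posted_date[:7]
        flow[month] += amount if direction == "CREDIT" else -amount
    return dict(flow)

def category_stats(transactions):
    """transactions: iterable of (category, amount).
    Returns category -> count, sum, mean, median, and (population) variance."""
    by_category = defaultdict(list)
    for category, amount in transactions:
        by_category[category].append(amount)
    return {
        category: {
            "count": len(amounts),
            "sum": sum(amounts),
            "mean": mean(amounts),
            "median": median(amounts),
            "variance": pvariance(amounts),
        }
        for category, amounts in by_category.items()
    }

# Made-up transactions for a single consumer.
print(net_monthly_cash_flow([
    ("2021-03-01", 1000.0, "CREDIT"),
    ("2021-03-11", 250.0, "DEBIT"),
    ("2021-04-02", 100.0, "DEBIT"),
]))
```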
Ethical Considerations
During feature creation, we ensured compliance with fair lending regulations such as the Equal Credit Opportunity Act (ECOA). We removed variables that might inadvertently cause disparate impact to maintain fairness.

Top 15 Features by Mutual Information
Feature Selection
To reduce dimensionality from our initial 266 features, we applied:
- Mutual Information (MI): Calculated MI between each feature and the delinquency target
- Top 50 Features: We retained the top 50 features based on MI for final model training
This helped mitigate overfitting and improved interpretability while retaining most of the predictive signal.
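To make the selection step concrete, here is a minimal sketch of ranking features by mutual information with a binary target. It is not the estimator used in the project: it assumes each numeric feature is first cut into equal-frequency bins (a simplification, with ties split by position), and every name in it is hypothetical.

```python
from collections import Counter
from math import log

def mutual_information(x_bins, y):
    """MI in nats between a discrete feature and a discrete target:
    sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y)))."""
    n = len(y)
    joint = Counter(zip(x_bins, y))
    px, py = Counter(x_bins), Counter(y)
    return sum(
        (c / n) * log(c * n / (px[xv] * py[yv]))
        for (xv, yv), c in joint.items()
    )

def quantile_bins(values, n_bins=4):
    """Assign each value to an equal-frequency bin by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins

def top_k_by_mi(features, target, k=50, n_bins=4):
    """features: dict of name -> list of numeric values. Returns the k best names."""
    scores = {
        name: mutual_information(quantile_bins(column, n_bins), target)
        for name, column in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: the first feature separates the classes, the second does not.
toy = {"min_balance": [5, 10, 900, 1200], "noise": [1, 3, 2, 4]}
print(top_k_by_mi(toy, target=[1, 1, 0, 0], k=1, n_bins=2))
```

Library estimators such as scikit-learn's `mutual_info_classif` handle continuous features without explicit binning and would be the more practical choice for 266 features.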
Feature Analysis
Our feature analysis revealed important patterns that differentiate high-risk and low-risk consumers. Here are two key visualizations that demonstrate these patterns:
Monthly Account Balance Trends

Average Account Balance Distribution

Key Insights from Feature Analysis
Our analysis revealed that balance-related features are among the strongest predictors of default risk. Consumers with consistently low balances, frequent negative balances, or highly volatile account activity show significantly higher default rates. These patterns provide valuable signals that traditional credit scores might miss, especially for consumers with limited credit history.
Results
We compared five models for estimating the probability of loan default; the Sequential Neural Network is omitted from the comparison table below. After hyperparameter tuning, XGBoost emerged as the best performer with an ROC-AUC of 0.81.
Model Performance Comparison

This table compares the balanced accuracy and ROC-AUC scores of our four models. XGBoost and Gradient Boosting achieved the highest ROC-AUC (0.81), with XGBoost providing better balanced accuracy (0.71 vs 0.54).
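Balanced accuracy explains why two models with the same ROC-AUC can differ so much at a fixed decision threshold. The sketch below assumes the standard binary definition, the mean of sensitivity and specificity, and uses made-up labels; it also assumes both classes are present.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on defaulters) and specificity (recall on non-defaulters)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# One defaulter among ten consumers; the model predicts "no default" for everyone.
y_true = [1] + [0] * 9
y_pred = [0] * 10
print(balanced_accuracy(y_true, y_pred))  # 0.5, even though plain accuracy is 0.9
```

A balanced accuracy near 0.5 therefore signals a model that rarely flags defaulters at its chosen threshold, however well it ranks them.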

The ROC curve illustrates the trade-off between sensitivity (true positive rate) and specificity (1 - false positive rate). Our XGBoost model achieved an AUC of 0.81, indicating strong discriminative power between defaulters and non-defaulters.
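One way to read an AUC of 0.81 is as the probability that a randomly chosen defaulter receives a higher predicted risk than a randomly chosen non-defaulter. The sketch below computes AUC directly from that pairwise definition on made-up scores. It is quadratic in the number of consumers, so it is meant for intuition rather than production use.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.
    Tied scores count as half a correct pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

# Toy example: 3 of the 4 defaulter/non-defaulter pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```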
Model Evaluation

The confusion matrix shows our model's prediction accuracy. While we achieve good overall performance, there's a trade-off between identifying true defaulters and minimizing false positives. This balance can be adjusted based on business requirements.
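The threshold trade-off can be illustrated by recomputing the confusion matrix at two cutoffs. The probabilities and labels below are made up for illustration, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(y_true, probs, threshold):
    """Confusion-matrix counts when predicting default for probs >= threshold."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, probs):
        predicted_default = p >= threshold
        if t == 1:
            counts["tp" if predicted_default else "fn"] += 1
        else:
            counts["fp" if predicted_default else "tn"] += 1
    return counts

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
for threshold in (0.5, 0.25):
    print(threshold, confusion_counts(y_true, probs, threshold))
```

Lowering the threshold from 0.5 to 0.25 catches the third defaulter in this toy data but adds one more false positive, which is the kind of balance a lender would tune to its own cost of missed defaults versus declined good applicants.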

This SHAP (SHapley Additive exPlanations) plot shows the most influential features in our model. Balance-related features dominate the top predictors, confirming our feature analysis findings that account balance patterns are key indicators of default risk.
Key Factors Driving Default Risk
For consumers predicted to default, we identified the most common factors that contributed to their high-risk assessment:
- Balance Count: Frequency of balance changes
- Balance Minimum: Extremely low minimum balance
- Average Account Balance: Overall low average balance
- Balance Median: Consistently low median balance
- Refund Amount Mean: Irregular or insufficient financial inflows
Key Insight: How consistently a consumer maintains adequate funds is a key default indicator. Frequent balance fluctuations, especially toward low or negative values, strongly signal potential default risk.

This chart shows the five most common "reason codes" that appeared among the top three SHAP contributors for each consumer predicted to default. For every predicted-default consumer, we extracted their three highest-impact features according to SHAP values, then calculated the proportion of times each feature was flagged as a leading cause of risk.
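The reason-code calculation described above can be sketched as follows. The example makes two assumptions that should be checked against the actual pipeline: features are ranked by signed SHAP value (the largest push toward default), and each proportion is the share of predicted-default consumers for whom the feature appears in the top three. The feature names and SHAP values are made up.

```python
from collections import Counter

def reason_code_shares(shap_rows, top_n=3):
    """shap_rows: one dict (feature -> SHAP value) per predicted-default consumer.
    Returns feature -> share of consumers with that feature among their top_n."""
    counts = Counter()
    for row in shap_rows:
        top_features = sorted(row, key=row.get, reverse=True)[:top_n]
        counts.update(top_features)
    return {feature: c / len(shap_rows) for feature, c in counts.most_common()}

# Two hypothetical consumers predicted to default.
rows = [
    {"balance_min": 0.4, "balance_count": 0.3, "avg_balance": 0.2, "rent_sum": -0.1},
    {"balance_min": 0.5, "rent_sum": 0.3, "balance_count": 0.2, "avg_balance": 0.1},
]
print(reason_code_shares(rows))
```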
Cash Score vs. Traditional Credit Score
We compared our Cash Score against traditional credit scores to evaluate its complementary value:

Key Observations:
- Red points (defaults) concentrate in the lower-left region (low scores on both measures)
- Consumers with mid-range credit scores but high Cash Scores (middle-right area) show lower default rates
- The upper-right quadrant (high scores on both measures) shows nearly zero defaults

Key Insights:
- Default rates reach 30-40% or more in the lower-left corner (low scores on both measures)
- Higher Cash Scores correlate with lower default rates even within the same credit score tier
- The gradient pattern shows how Cash Score provides additional risk differentiation beyond traditional credit scores
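A grid like the one described above can be built by bucketing consumers on both scores and computing the default rate per cell. This sketch is illustrative only: the bucket edges, the 0-1 Cash Score scale, and the records are assumptions, not values from our data.

```python
from bisect import bisect_right
from collections import defaultdict

def default_rate_grid(records, credit_edges, cash_edges):
    """records: iterable of (credit_score, cash_score, defaulted as 0 or 1).
    Returns (credit_bucket, cash_bucket) -> default rate; bucket 0 is the lowest tier."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [defaults, consumers]
    for credit_score, cash_score, defaulted in records:
        cell = (
            bisect_right(credit_edges, credit_score),
            bisect_right(cash_edges, cash_score),
        )
        totals[cell][0] += defaulted
        totals[cell][1] += 1
    return {cell: defaults / n for cell, (defaults, n) in totals.items()}

# Hypothetical consumers: (credit score, Cash Score on an assumed 0-1 scale, default flag).
records = [(580, 0.2, 1), (590, 0.3, 0), (600, 0.8, 0), (720, 0.9, 0)]
print(default_rate_grid(records, credit_edges=[650], cash_edges=[0.5]))
```

Comparing cells that share a credit bucket but differ in cash bucket shows whether the Cash Score separates risk within a credit score tier.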
Value of Cash Score
- Captures day-to-day financial behavior not reflected in credit history
- Identifies financially stable individuals with limited credit history
- Reveals liquidity challenges that may not be apparent in conventional scores
- Provides a more holistic view of consumer financial health
Business Impact
- Expands the pool of creditworthy applicants
- Reduces default rates through more accurate risk assessment
- Enables more precise risk-based pricing
- Creates opportunities for financial inclusion while maintaining profitability
Conclusion
Our results demonstrate that bank transaction data, combined with carefully engineered features, can significantly improve credit risk assessment. Key takeaways include:
- Memo Categorization: FastText achieved excellent accuracy (96.88%) and scalability in labeling transaction memos.
- Predictive Modeling: XGBoost emerged as a strong performer for loan default risk prediction, yielding an ROC-AUC of 0.81.
- Fairness Considerations: We rigorously filtered sensitive or proxy variables to comply with the Equal Credit Opportunity Act, underscoring the ethical dimension of credit scoring.
- FICO Comparisons: The new "Cash Score" can complement traditional credit scores, especially for thin-file or borderline applicants.
Future work includes real-time model updating, extended interpretability analyses, and expanded data sources for even richer behavioral insights.
Impact
The Cash Score has the potential to:
- Expand credit access to millions of underserved individuals
- Reduce bias in lending decisions
- Improve risk assessment accuracy for financial institutions
- Create a more inclusive financial ecosystem