
Title: On-Chain Credit Risk Score in Decentralized Finance

URL Source: https://arxiv.org/html/2412.00710


License: CC BY-NC-SA 4.0

arXiv:2412.00710v2 [q-fin.RM] 25 Mar 2025

Rik Ghosh, Arka Datta, Sudipan Sinha, Vidhi Aggarwal, Rajdeep Sengupta

Chainrisk, UAE

Correspondence to: arka@chainrisk.cloud

Abstract

Decentralized Finance (DeFi), a financial ecosystem without a centralized controlling organization, has introduced a new paradigm for lending and borrowing. However, its capital efficiency remains constrained by the inability to effectively assess the risk associated with each user/wallet. This paper introduces the ‘On-Chain Credit Risk Score (OCCR Score)’ in DeFi, a probabilistic measure designed to quantify the credit risk associated with a wallet. By analyzing historical and real-time on-chain activity as well as predictive scenarios, the OCCR Score may enable DeFi lending protocols to dynamically adjust Loan-to-Value (LTV) ratios and Liquidation Thresholds (LT) based on the risk profile of a wallet. Unlike existing wallet risk scoring models, which rely on heuristic-based evaluations, the OCCR Score offers a more objective and probabilistic approach, aligning more closely with traditional credit risk assessment methodologies. This framework can further enhance DeFi’s capital efficiency by incentivizing responsible borrowing behavior and optimizing risk-adjusted returns for lenders.

1 Introduction

The ongoing digital revolution is driving a profound transformation in the financial sector. A key development in this evolution is decentralized finance (DeFi), which uses blockchain technology to facilitate open and permissionless financial services (Sahu and Kumar, 2024). DeFi signifies a paradigm shift in financial systems that is reshaping the future of global finance. It provides financial services without the intervention of centralized intermediaries; these services operate mainly as automated protocols on a blockchain (Doerr et al., 2021). Although the concept is much newer than Traditional Finance (TradFi), DeFi is a fast-growing market for quick and safe interaction between lenders and borrowers, and it allows verifiable cross-border transactions to be performed much faster than in TradFi. Numerous avenues can be explored to make DeFi more robust and capital-efficient. However, such explorations require an understanding of the creditworthiness of each user. It should be noted that ‘user’ means the specific wallet interacting with the DeFi ecosystem; in this article, user and wallet are used interchangeably.

Unlike conventional credit scoring, which depends on centralized credit bureaus and the financial history of a user, DeFi credit scoring leverages on-chain data and smart contract interactions to evaluate a user’s borrowing and repayment behavior. Credit risk scoring is a systematic process used to assess a user’s creditworthiness by analyzing their on-chain transaction history, repayment behavior, outstanding liabilities, and other relevant economic indicators. This analytical approach enables financial institutions to quantify the probability of default, facilitating data-driven risk assessment and informed lending decisions (Moghe and Johri, n.d.). An ‘On-Chain Credit Risk Score (OCCR Score)’ for a wallet is a candidate answer to quantifying the credit risk of that wallet in the DeFi ecosystem. Through the OCCR Score of a wallet, we estimate the probability that the wallet faces liquidation when a borrow position is opened. Throughout the paper, borrow positions and loans are used interchangeably.

There are different ways to assign credit scores to users in the current TradFi landscape, but credit scoring is not yet prevalent in DeFi. There have, however, been previous attempts to measure the creditworthiness of a wallet, including those by Cred (CRED Protocol, n.d.), Credit Data Alliance (CreDA) (CreDA, 2022), credit scoring of Aave accounts (Wolf et al., 2022), and Levon (Block Analitica, n.d.). The main limitation of these existing credit scoring models is the absence of a comprehensive framework that integrates historical, current, and predictive future scenarios. The Cred Protocol scoring model combines financial metrics such as loan history, account composition, account health, and new credit with qualitative factors such as ecosystem participation and attributes related to trust and transparency, which are only vaguely described (Packin and Lev-Aretz, 2024). The CreDA credit scoring model relies heavily on social activity data, which could obscure more concrete financial indicators such as asset holdings, lending and borrowing behavior, and off-chain data (CreDA, 2022). This emphasis on decentralized social engagement raises concerns about the model’s ability to provide a reliable and objective assessment of creditworthiness (Packin and Lev-Aretz, 2024). In addition, most of the existing scores take natural-number values on a variable scale. Such a scale cannot be directly associated with a probability, which might make the scores subjective in nature. We have therefore tried to develop a score that bridges the historical, current, and predictive future scenarios and is equivalent to the probability of default for that particular wallet.

Section 3 explains the overall formulation of the OCCR Score and of each of its subscores. A simulation study using synthetically generated data is presented in Section 4. Section 5 discusses how the OCCR Score can be used to make the LTV/LT dynamic for each wallet. In addition, the Appendix (Section 7) derives in detail the statistical properties of the estimates (Fisher, 1925), where by estimates we mean the subscores of the OCCR Score.

2 Framework

In this section, we discuss the overall framework of the OCCR Score and the notation used to develop the OCCR Score for the different wallets. The OCCR Score, which is the probability of default of a particular wallet, has a range of $(0, 1)$ and is a quantitative measure of the wallet’s credit default risk. The risk associated with a wallet is its unreliability of repayment when a borrow position is opened. Thus, to understand the OCCR Score of a wallet, we need to look at both the historical behavior of the wallet and its present dynamics, such as the current risk subscore, the credit utilization, and other factors, which are explained in detail in Section 3.

Here, we go through the terms that are used extensively in later sections. $L_{i,j}$ denotes the loan amount taken by the $i^{th}$ wallet for the $j^{th}$ loan/position; this random variable can take any positive real value. Since one or several assets can be provided as collateral when borrowing, we need to associate a score with each asset depending on its riskiness. $r_{i,j}$ denotes the risk associated with the collateral assets of the $j^{th}$ loan of the $i^{th}$ wallet; since this measure is relative in nature, it has a range of $[0, 1]$. Similarly to the loan amount, we define $C_{i,j,k}$ as the amount of the $k^{th}$ asset provided by the $i^{th}$ wallet to maintain the $j^{th}$ position. We denote the total current holding of the $i^{th}$ wallet by $H_i$, and $T_{i,j}$ denotes the amount transacted by the $i^{th}$ wallet in the $j^{th}$ transaction. Like the loan amount, the random variables $C_{i,j,k}$, $H_i$, and $T_{i,j}$ can take any positive real value.

3 On-Chain Credit Risk (OCCR) Score

3.1 Historical Credit Risk subscore ($\hat{s}_{h_i}$)

In this section, we analyze the historical data for each wallet $i$. $X_{i,j}$ is a dichotomous random variable, defined as

$$X_{i,j} = \begin{cases} 1 & \text{if the loan/position is liquidated} \\ 0 & \text{if the loan/position is repaid.} \end{cases}$$

It should be noted that $P(X_{i,j} = 1) = s_{h_i}$. We need to estimate the parameter $s_{h_i}$. Thus, we use the ratio estimate

$$\hat{s}_{h_i} = \frac{\sum_j w_{i,j} X_{i,j}}{\sum_j w_{i,j}}, \tag{1}$$

where $w_{i,j} = L_{i,j} \times (1 - r_{i,j}) \times p_{i,j} \times t_{i,j}$.

The combined riskiness of all collateral assets is calculated as

$$r_{i,j} = \frac{\sum_k C_{i,j,k} \left( \dfrac{\sigma_{C_{i,j,k}}}{\sigma_{Max}} \right)}{\sum_k C_{i,j,k}},$$

where $C_{i,j,k}$ are the collaterals provided for the $j^{th}$ loan. Here, by ‘Maximum Volatility ($\sigma_{Max}$)’ we mean the maximum observed volatility among the collateral assets under consideration, while the ‘Asset Volatility’ $\sigma_{C_{i,j,k}}$ (also written $\sigma_{Asset}$) is the volatility of the corresponding collateral asset $k$ provided for the $j^{th}$ loan. $p_{i,j}$ denotes the proportion of the collateral asset (loan amount) that is liquidated, as this proportion might vary depending on the particular DeFi protocol. The recency of the loan is represented by $t_{i,j}$, which takes the form

$$t_{i,j} = \frac{1}{1 + e^{-(dt_{i,j} - k)}}, \tag{2}$$

where $dt_{i,j}$ is the month in which the loan position was opened, and $k$ is chosen such that $t_{i,j}$ takes the value 0.5 for the month that falls in the middle of the whole observation period.
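The weighted ratio in Equation (1), together with the logistic recency weight in Equation (2), can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `ClosedLoan` container, its field names, and the example values are hypothetical, and the collateral risk $r_{i,j}$ is assumed to have been computed beforehand.

```python
import math
from dataclasses import dataclass


@dataclass
class ClosedLoan:
    """One closed position of a wallet (hypothetical container)."""
    amount: float            # L_ij, loan amount
    collateral_risk: float   # r_ij in [0, 1], assumed precomputed
    liquidated_share: float  # p_ij, protocol-dependent liquidation proportion
    month_index: float       # dt_ij, month in which the loan was opened
    liquidated: bool         # X_ij = 1 if liquidated, 0 if repaid


def recency_weight(month_index: float, midpoint: float) -> float:
    """Logistic recency t_ij = 1 / (1 + exp(-(dt_ij - k)))."""
    return 1.0 / (1.0 + math.exp(-(month_index - midpoint)))


def historical_subscore(loans: list[ClosedLoan], midpoint: float) -> float:
    """Ratio estimate: sum_j w_ij * X_ij / sum_j w_ij."""
    numerator = 0.0
    denominator = 0.0
    for loan in loans:
        weight = (
            loan.amount
            * (1.0 - loan.collateral_risk)
            * loan.liquidated_share
            * recency_weight(loan.month_index, midpoint)
        )
        numerator += weight * (1.0 if loan.liquidated else 0.0)
        denominator += weight
    if denominator == 0.0:
        raise ValueError("all weights are zero; the subscore is undefined")
    return numerator / denominator


# Illustrative history: an old repaid loan and a recent liquidated one.
history = [
    ClosedLoan(100.0, 0.2, 0.5, 2, liquidated=False),
    ClosedLoan(100.0, 0.2, 0.5, 10, liquidated=True),
]
score = historical_subscore(history, midpoint=6)
```

Because the recent liquidation carries a much larger recency weight than the old repayment, the resulting score is close to 1 even though only one of the two loans was liquidated.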

3.2 Current Credit Risk subscore ($\hat{s}_{c_i}$)

This section deals with the current (open) positions associated with the $i^{th}$ wallet. To understand the risk associated with the wallet, we need to understand the possibility that the wallet is unable to repay all its outstanding (open) loans if an unprecedented situation arises. To this end, we observe the Liquidation at Risk (LaR) value for that wallet (Perez et al., 2021). If the total LaR observed across all loans/positions does not exceed the current holding of the wallet, then the wallet is considered safe and credible to receive further loans. To observe the LaR for a particular position, we need to simulate the price paths of the different assets. We simulate price paths in batches of a fixed size, say 2000, and continue until the dispersion $\sigma_{LaR}$ of the LaR values has converged. The convergence check is performed using the condition $|\sigma_{LaR}^{t+1} - \sigma_{LaR}^{t}| \leq \epsilon$, where $\sigma_{LaR}^{t}$ is the value after the $t^{th}$ batch. Let us define the dichotomous random variable $Z_{i,j}$, which takes the value 1 if the total liquidation at risk ($LaR_{total}$) in the $j^{th}$ simulated price path equals or exceeds the current holding of the $i^{th}$ wallet, and 0 otherwise. Thus,

$$Z_{i,j} = \begin{cases} 1 & \text{if } LaR_{total} \geq H_i \\ 0 & \text{if } LaR_{total} < H_i, \end{cases}$$

where $H_i$ is the current holding of the $i^{th}$ wallet, and $E(Z_{i,j}) = s_{c_i}$.

The estimate of $s_{c_i}$ is given by

$$\hat{s}_{c_i} = \frac{\sum_{j=1}^{m} Z_{i,j}}{m}, \tag{3}$$

where $m$ is the total number of simulated price paths.
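The batch-wise simulation with the convergence check on $\sigma_{LaR}$ and the estimate in Equation (3) might be organized as in the following sketch. The price-path model is not specified in this section, so the sampler `sample_total_lar` is a hypothetical callable supplied by the caller, and `toy_lar` is a placeholder used only to make the example runnable; the batch size of 2000 follows the text, while `eps`, `max_batches`, and `seed` are illustrative assumptions.

```python
import math
import random
from typing import Callable


def current_subscore(
    sample_total_lar: Callable[[random.Random], float],
    holding: float,
    batch_size: int = 2000,
    eps: float = 1e-3,
    max_batches: int = 50,
    seed: int = 0,
) -> float:
    """Share of simulated paths whose total LaR is at least the wallet holding H_i."""
    rng = random.Random(seed)
    m = 0            # number of simulated paths so far
    total = 0.0      # running sum of LaR values
    total_sq = 0.0   # running sum of squared LaR values
    exceed = 0       # running count of Z_ij = 1
    prev_sigma = None
    for _batch in range(max_batches):
        for _path in range(batch_size):
            lar = sample_total_lar(rng)
            m += 1
            total += lar
            total_sq += lar * lar
            if lar >= holding:
                exceed += 1
        mean = total / m
        sigma = math.sqrt(max(total_sq / m - mean * mean, 0.0))
        # Stop once the dispersion of LaR changes by at most eps between batches.
        if prev_sigma is not None and abs(sigma - prev_sigma) <= eps:
            break
        prev_sigma = sigma
    return exceed / m


def toy_lar(rng: random.Random) -> float:
    """Placeholder sampler: total LaR uniform on [0, 1000). Not a price model."""
    return 1000.0 * rng.random()


score = current_subscore(toy_lar, holding=750.0, eps=1.0)
```

With the placeholder sampler, roughly a quarter of the draws exceed a holding of 750, so the sketch returns a value near 0.25; a real application would replace `toy_lar` with a simulator of collateral price paths.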

3.3 Credit Utilization ($\hat{s}_{cu_i}$)

In this section, we explain how the subscore depends on the utilization of the available credit limit. We define the credit utilization subscore using three components: $C_{i,j}$, the collateral (in USD) posted when position $j$ was opened; $LTV_{i,j}$, the Loan-to-Value ratio prevailing at that time; and $L_{i,j}$, the loan amount taken for the $j^{th}$ position.

The estimate of $s_{cu_i}$ is given by

$$\hat{s}_{cu_i} = \frac{\sum_j \left( 1 - \dfrac{L_{i,j}}{C_{i,j} \times LTV_{i,j}} \right) \times L_{i,j}}{\sum_j L_{i,j}}. \tag{4}$$
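As a small worked illustration of Equation (4), the following sketch computes the loan-weighted unused share of the credit limit. The tuple layout `(loan, collateral_usd, ltv)` and the example positions are assumptions made for this example.

```python
def credit_utilization_subscore(positions: list[tuple[float, float, float]]) -> float:
    """Loan-weighted average of 1 - L / (C * LTV) over a wallet's positions.

    Each position is a tuple (loan, collateral_usd, ltv).
    """
    numerator = 0.0
    denominator = 0.0
    for loan, collateral_usd, ltv in positions:
        utilization = loan / (collateral_usd * ltv)  # share of the credit limit used
        numerator += (1.0 - utilization) * loan
        denominator += loan
    if denominator == 0.0:
        raise ValueError("no outstanding loan amount; the subscore is undefined")
    return numerator / denominator


# Two positions that each use half of the available credit limit.
half_used = [(40.0, 100.0, 0.8), (80.0, 200.0, 0.8)]
score = credit_utilization_subscore(half_used)
```

A wallet that borrows exactly up to its limit on every position obtains a subscore of 0, while a wallet that borrows very little relative to its limit approaches 1.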

3.4 On-Chain Transaction ($\hat{s}_{ct_i}$)

Here, we characterize the on-chain transactions of a particular wallet. We focus mainly on the number and size of the transactions, with a larger weight given to recent transactions. The on-chain transaction subscore associated with the OCCR Score is given by

$$\hat{s}_{ct_i} = \frac{\sum_l T_{i,l} \, S_{i,l} \, t_{i,l}}{\sum_l T_{i,l}}, \tag{5}$$

where $T_{i,l}$ is the amount of the $l^{th}$ transaction of the $i^{th}$ wallet, $t_{i,l}$ is its recency weight, and $S_{i,l}$ is $+1$ if the transaction is credited to wallet $i$ and $-1$ otherwise.
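Equation (5) can be evaluated directly from a transaction list, as in the sketch below. The tuple layout `(amount, is_credit, recency)` is an assumption for illustration, and the recency weight is taken as given in $[0, 1]$.

```python
def transaction_subscore(transactions: list[tuple[float, bool, float]]) -> float:
    """Signed, recency-weighted transaction volume divided by total volume.

    Each transaction is a tuple (amount, is_credit, recency).
    """
    numerator = 0.0
    denominator = 0.0
    for amount, is_credit, recency in transactions:
        sign = 1.0 if is_credit else -1.0  # S_il
        numerator += amount * sign * recency
        denominator += amount
    if denominator == 0.0:
        raise ValueError("no transaction volume; the subscore is undefined")
    return numerator / denominator


# A large recent credit and a smaller recent debit.
score = transaction_subscore([(300.0, True, 1.0), (100.0, False, 1.0)])
```

Since the sign is $\pm 1$ and the recency weight lies in $[0, 1]$, the subscore lies in $[-1, 1]$: positive when recent credits dominate and negative when recent debits dominate.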

3.5 New Credit ($\hat{s}_{nc_i}$)

In this section, we look at the risk associated with a wallet taking recent loans in bulk (multiple times within a particular time span). We use cluster analysis to study the wallet’s past pattern of taking multiple loans and compare it with the recently opened loans. Depending on how the recently opened positions compare with the earlier cases, a penalty is assigned to the wallet, given by $\hat{s}_{nc_i}$.

We denote the loan amount for the $j^{th}$ loan of the $i^{th}$ wallet as $L_{i,j}$, and the date on which the loan was taken as $D_{i,j}$. The shortest interval between the $j^{th}$ loan and its two adjacent loans is denoted by $\Delta D_{i,j}$, where

$$\Delta D_{i,j} = \min\left( (D_{i,j} - D_{i,j-1}),\ (D_{i,j+1} - D_{i,j}) \right).$$

Let the mean of all loan amounts taken by the wallet in the last month (tentative for now) be denoted by $\mu_{L_i}$, and the mean of the intervals by $\mu_{\Delta D_i}$. The total number of loans taken by the $i^{th}$ wallet in the last month is denoted by $n$. Let $Y_{i,j}$ be a dichotomous random variable given by

$$Y_{i,j} = \begin{cases} 1 & \text{if } L_{i,j} \geq \mu_{L_i} \text{ and } \Delta D_{i,j} \leq \mu_{\Delta D_i} \\ 0 & \text{otherwise,} \end{cases}$$

with

$$P(L_{i,j} \geq \mu_{L_i} \text{ and } \Delta D_{i,j} \leq \mu_{\Delta D_i}) = P(L_{i,j} \geq \mu_{L_i}) \times P(\Delta D_{i,j} \leq \mu_{\Delta D_i}) = s_{nc_i},$$

where the factorization assumes that $L_{i,j}$ and $\Delta D_{i,j}$ are independent.

The new credit risk subscore is then given by

$$\hat{s}_{nc_i} = \frac{\sum_{j=1}^{n} Y_{i,j}}{n}. \tag{6}$$
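The indicator $Y_{i,j}$ and the estimate in Equation (6) can be illustrated as follows. This sketch covers only the indicator-based estimate, not the cluster analysis mentioned above. Two choices are assumptions of the example rather than statements from the text: dates are represented as day numbers, and the first and last loans, which have a single neighbor, use the one available gap.

```python
def new_credit_subscore(loans: list[tuple[float, float]]) -> float:
    """Share of recent loans that are both large and closely spaced.

    Each loan is a tuple (day, amount); at least two loans are required.
    """
    if len(loans) < 2:
        raise ValueError("at least two loans are needed to define a gap")
    ordered = sorted(loans)
    days = [day for day, _ in ordered]
    amounts = [amount for _, amount in ordered]
    n = len(ordered)

    gaps = []
    for j in range(n):
        neighbours = []
        if j > 0:
            neighbours.append(days[j] - days[j - 1])
        if j < n - 1:
            neighbours.append(days[j + 1] - days[j])
        gaps.append(min(neighbours))  # Delta D_ij

    mean_amount = sum(amounts) / n  # mu_L
    mean_gap = sum(gaps) / n        # mu_DeltaD
    flags = [
        1 if amount >= mean_amount and gap <= mean_gap else 0
        for amount, gap in zip(amounts, gaps)
    ]
    return sum(flags) / n


# Two large loans taken on consecutive days, plus two small ones.
recent = [(1, 100.0), (2, 500.0), (3, 500.0), (20, 100.0)]
score = new_credit_subscore(recent)
```

In this example the mean amount is 300 and the mean gap is 5 days, so only the two loans of 500 taken one day apart are flagged, giving a subscore of 0.5.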

3.6 OCCR Score

The OCCR Score for a particular wallet is obtained as a weighted combination of the credit risk subscores above. The weights associated with $\hat{s}_{h_i}$, $\hat{s}_{c_i}$, $\hat{s}_{cu_i}$, $\hat{s}_{ct_i}$, and $\hat{s}_{nc_i}$ are 0.35, 0.25, 0.15, 0.15, and 0.10, respectively. Credit utilization enters as $1 - \hat{s}_{cu_i}$ and the on-chain transaction subscore enters with a negative sign, so the OCCR Score is obtained as

$$\text{OCCR Score} = 0.35 \times \hat{s}_{h_i} + 0.25 \times \hat{s}_{c_i} + 0.15 \times (1 - \hat{s}_{cu_i}) - 0.15 \times \hat{s}_{ct_i} + 0.10 \times \hat{s}_{nc_i}. \tag{7}$$
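Equation (7) is a fixed linear combination, which the following sketch evaluates for one set of subscores. The function and argument names are hypothetical, and the input values are made up purely for illustration.

```python
def occr_score(s_h: float, s_c: float, s_cu: float, s_ct: float, s_nc: float) -> float:
    """Weighted combination of the five subscores as in Equation (7)."""
    return (
        0.35 * s_h            # historical credit risk
        + 0.25 * s_c          # current credit risk
        + 0.15 * (1.0 - s_cu) # credit utilization enters as 1 - s_cu
        - 0.15 * s_ct         # on-chain transactions enter with a negative sign
        + 0.10 * s_nc         # new credit
    )


# Illustrative subscores for a single wallet.
score = occr_score(s_h=0.2, s_c=0.1, s_cu=0.5, s_ct=0.1, s_nc=0.0)
```

For these inputs the individual contributions are 0.07, 0.025, 0.075, -0.015, and 0, which sum to 0.155.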

4 Simulation

Any theoretical construct needs to be backed by a corresponding simulation study. In this section, we synthetically generate data for different wallets to examine the theoretical results claimed in Section 3. For the simulation study, we use different parameter values to generate transactions and borrowing positions for a wallet. The parameters include the loan amount, the collateral amount, the timestamp, the asset provided as collateral, and others.

In our simulation study for the on-chain transaction subscore ($\hat{s}_{ct_i}$), we generated synthetic data for different wallets by varying key parameters such as the probability $p$ that a transaction is a credit to the wallet. Transaction amounts are usually heavy-tailed, so we assume that the transaction amount follows a Pareto distribution with shape parameter $\alpha$ and scale $x_{min}$ (Arnold, 2014). For each transaction, we drew the amount from this Pareto distribution, assigned a sign according to whether it was a credit or a debit (drawn from a Bernoulli distribution with success probability $p$), and weighted it by a recency score drawn uniformly from $[0, 1]$. We then calculated the on-chain transaction subscore by taking the weighted sum of these transactions and normalizing it by the total transaction amount. For each wallet, we assumed a total of 60,000 transactions. The process was repeated 5000 times, and the results are tabulated in Table 1.

Table 1: Estimated On-Chain Transaction subscore ($\hat{s}_{ct_i}$) based on 5000 simulations, with the theoretical value ($s_{ct_i}$) in parentheses, along with the corresponding Asymptotic Standard Error (ASE), Sample Standard Error (SSE), and estimated Coverage Probability ($\widehat{CP}$).

| No. | $(p, \alpha, x_{min})$ | $\hat{s}_{ct_i}$ ($s_{ct_i}$) | ASE | SSE | $\widehat{CP}$ |
|---|---|---|---|---|---|
| 1 | (0.60, 2.10, 300) | 0.0997 (0.10) | 0.000031 | 0.000019 | 0.985 |
| 2 | (0.35, 2.25, 300) | -0.1496 (-0.15) | 0.000014 | 0.000013 | 0.957 |
| 3 | (0.80, 2.10, 320) | 0.2991 (0.30) | 0.000023 | 0.000014 | 0.984 |
| 4 | (0.68, 2.60, 110) | 0.1796 (0.18) | 0.000008 | 0.000009 | 0.946 |
| 5 | (0.42, 2.06, 108) | -0.0798 (-0.08) | 0.000049 | 0.000021 | 0.992 |

In Table 1, we present five scenarios with different parameter values, each of which characterizes a specific wallet. In reality, borrowing and transaction patterns change from wallet to wallet. To mimic different wallets, we chose parameter values such that the scenarios cover both negative and positive values of the on-chain transaction subscore. In all rows of Table 1, the estimated values are nearly equal to the theoretical ones. Also, in all rows the estimated coverage probability is close to or above 0.95, meaning that the confidence intervals constructed from the estimates contain the theoretical value in about 95% or more of the simulation runs. This suggests that the estimate is approximately unbiased and reliable (Voinov and Nikulin, 2012). Moreover, the ASE and SSE values are of similar magnitude and very small, which is in line with the consistency of the estimator.
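The simulation design for the on-chain transaction subscore can be reproduced in outline with the Python standard library. The sketch below follows the description in this section (Pareto amounts, Bernoulli signs, uniform recency) for a single replication; the function names, the seed, and the reduced number of transactions are assumptions of the example. Under independence of amount, sign, and recency, the target value is $E[S] \cdot E[t] = (2p - 1)/2$, which reproduces the theoretical values listed in parentheses in Table 1.

```python
import random


def simulate_transaction_subscore(
    p: float, alpha: float, x_min: float, n_tx: int, rng: random.Random
) -> float:
    """One replication of the on-chain transaction subscore for a synthetic wallet."""
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_tx):
        amount = x_min * rng.paretovariate(alpha)  # Pareto(alpha) with scale x_min
        sign = 1.0 if rng.random() < p else -1.0   # credit with probability p
        recency = rng.random()                     # uniform recency score in [0, 1)
        numerator += amount * sign * recency
        denominator += amount
    return numerator / denominator


def theoretical_transaction_subscore(p: float) -> float:
    """E[S] * E[t] with E[S] = 2p - 1 and E[t] = 1/2."""
    return (2.0 * p - 1.0) / 2.0


rng = random.Random(12345)
estimate = simulate_transaction_subscore(p=0.60, alpha=2.10, x_min=300.0, n_tx=20000, rng=rng)
target = theoretical_transaction_subscore(0.60)
```

Repeating the replication many times and collecting the estimates would give the sample standard error and coverage figures of the kind reported in Table 1.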

Table 2: Estimated Credit Utilization subscore ($\hat{s}_{cu_i}$) based on 5000 simulations, with the theoretical value ($s_{cu_i}$) in parentheses, along with the corresponding Asymptotic Standard Error (ASE), Sample Standard Error (SSE), and estimated Coverage Probability ($\widehat{CP}$).

| No. | $(\alpha, x_{min}, l_{min}, l_{max})$ | $\hat{s}_{cu_i}$ ($s_{cu_i}$) | ASE | SSE | $\widehat{CP}$ |
|---|---|---|---|---|---|
| 1 | (2.10, 300, 0.50, 0.90) | 0.333333 (0.333344) | 0.0000059 | 0.0000035 | 0.986 |
| 2 | (2.30, 210, 0.64, 0.92) | 0.333359 (0.333338) | 0.0000025 | 0.0000022 | 0.965 |
| 3 | (2.80, 50, 0.62, 0.84) | 0.333341 (0.333336) | 0.0000014 | 0.0000014 | 0.952 |
| 4 | (2.45, 680, 0.46, 0.74) | 0.333368 (0.333337) | 0.0000019 | 0.0000019 | 0.957 |
| 5 | (2.45, 680, 0.46, 0.94) | 0.333351 (0.333337) | 0.0000020 | 0.0000018 | 0.958 |

Table 2 reports the simulated results for the credit utilization subscore ($\hat{s}_{cu_i}$) and compares them with the theoretically obtained values. Here, too, the theoretical and simulated values are nearly the same, with very low ASE and SSE values. This suggests that the credit utilization subscore estimator is also approximately unbiased, reliable, and consistent. Simulation studies for the other subscores can be carried out along similar lines.
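A single replication of a credit utilization simulation could look like the sketch below. The data-generating process is not spelled out in this section, so the example borrows the distributional assumptions used in the Appendix: Pareto collateral with shape $\alpha$ and scale $x_{min}$, an LTV drawn uniformly from $[l_{min}, l_{max}]$, and a loan drawn uniformly between 0 and the credit limit. The number of loans and the seed are arbitrary choices. Under these assumptions the utilization $u = L / (C \times LTV)$ is uniform on $[0, 1]$ and independent of the limit, so the target value is $E[(1 - u)u] / E[u] = (1/6)/(1/2) = 1/3$, in line with the values in Table 2.

```python
import random


def simulate_credit_utilization(
    alpha: float, x_min: float, l_min: float, l_max: float, n_loans: int, rng: random.Random
) -> float:
    """One replication of the credit utilization subscore for a synthetic wallet."""
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_loans):
        collateral = x_min * rng.paretovariate(alpha)  # Pareto collateral value
        ltv = rng.uniform(l_min, l_max)                # LTV at position opening
        limit = collateral * ltv                       # available credit limit
        loan = rng.uniform(0.0, limit)                 # loan taken against the limit
        numerator += (1.0 - loan / limit) * loan
        denominator += loan
    return numerator / denominator


rng = random.Random(2024)
estimate = simulate_credit_utilization(
    alpha=2.10, x_min=300.0, l_min=0.50, l_max=0.90, n_loans=20000, rng=rng
)
```

The closeness of `estimate` to 1/3 for a single replication depends on the heavy tail of the Pareto collateral; averaging over many replications, as in Table 2, tightens the comparison.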

5 Dynamic LTV Adjustment

Dynamic LTV adjustment means that the LTV ratio offered to a borrower changes based on their OCCR Score, which quantifies their creditworthiness. A wallet with a lower OCCR Score (indicating a good repayment history and low default risk) could be rewarded with a higher LTV ratio (more borrowing power against their collateral), while a wallet with a high OCCR Score would be offered the LTV ratio prevalent in the market at that time.

5.1 Stochastic Modeling of Dynamic LTV

Dynamic adjustment of LTV can be modeled as a stochastic process, which means that the LTV ratio is not fixed but changes over time, influenced by the borrower’s OCCR score. A stochastic process accounts for randomness and uncertainty in how factors such as market conditions or borrower behavior evolve, making the model adaptable to real-world scenarios.

5.1.1 Basic Model Structure

Let us define a time-dependent LTV ratio, $LTV(t)$, which is adjusted according to the borrower’s OCCR Score at time $t$. We can express this as follows:

$$LTV(t) = LTV_{\text{fixed}} - \min\left( f(OCCR\_Score_t),\ 0 \right) \tag{8}$$

where:

• $LTV(t)$ is the LTV ratio at time $t$,

• $LTV_{\text{fixed}}$ is the base or market-determined LTV ratio that would be applied in the absence of any OCCR Score,

• $f(OCCR\_Score_t)$ is an adjustment function that modifies the LTV based on the OCCR Score at time $t$.

5.1.2 Adjustment Function $f(OCCR\_Score_t)$

The adjustment function, $f(OCCR\_Score_t)$, increases or decreases the LTV ratio depending on the borrower’s risk profile at any given time. It could take several forms, such as a linear or non-linear relationship between the OCCR Score and the LTV ratio. For example:

$$f(OCCR\_Score_t) = \alpha \cdot (OCCR\_Score_t - OCCR_{\text{avg}})$$

Combined with the $\min(\cdot, 0)$ in Equation (8), a wallet whose score is below the market average receives an LTV above $LTV_{\text{fixed}}$, whereas a wallet at or above the average keeps $LTV_{\text{fixed}}$. Here:

• $\alpha$ is a scaling parameter that controls the sensitivity of LTV adjustments to the OCCR Score; past data can be used to backtest how different $\alpha$ values would have affected loan performance and liquidation risk,

• $OCCR\_Score_t$ is the OCCR Score at time $t$,

• $OCCR_{\text{avg}}$ is the average OCCR Score in the market.
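Equation (8) with the linear adjustment function can be written out as a short sketch. The function names and the numerical values (a base LTV of 0.75, a market average score of 0.30, and $\alpha = 0.25$) are hypothetical and chosen only to show the mechanics.

```python
def ltv_adjustment(occr_score: float, occr_avg: float, alpha: float) -> float:
    """Linear adjustment f = alpha * (OCCR_Score_t - OCCR_avg)."""
    return alpha * (occr_score - occr_avg)


def dynamic_ltv(ltv_fixed: float, occr_score: float, occr_avg: float, alpha: float) -> float:
    """LTV(t) = LTV_fixed - min(f(OCCR_Score_t), 0)."""
    return ltv_fixed - min(ltv_adjustment(occr_score, occr_avg, alpha), 0.0)


# A low-risk wallet (score below the market average) gets a higher LTV.
low_risk_ltv = dynamic_ltv(ltv_fixed=0.75, occr_score=0.10, occr_avg=0.30, alpha=0.25)

# A high-risk wallet (score above the market average) keeps the market LTV.
high_risk_ltv = dynamic_ltv(ltv_fixed=0.75, occr_score=0.50, occr_avg=0.30, alpha=0.25)
```

Here the low-risk wallet is offered an LTV of 0.80, five percentage points above the base, while the high-risk wallet stays at 0.75. In practice a protocol would presumably also cap the adjusted LTV below the liquidation threshold; that cap is not part of Equation (8) and is left out of the sketch.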

6 Conclusion

In this paper, we have performed a detailed analysis of the historical credit risk, current credit risk, new credit, credit utilization, and on-chain transaction subscores, providing insight into their expectations, variances, and consistency. Under the assumptions outlined, each estimator is consistent in estimating the respective credit risk subscore for the wallet. In Section 4 and the Appendix (Section 7), we have tried to establish the unbiased (Voinov and Nikulin, 2012) nature of the subscores, using simulation and theoretical studies, respectively. Collectively, these estimators offer a comprehensive approach to assessing wallet risk across different time frames and transaction types, facilitating more accurate and reliable credit risk assessments. In Section 4, we synthetically generated transaction data for different wallets to emulate real-life scenarios and compared the simulated results with the theoretical results. For the two subscores studied, the simulated and theoretical results matched, with high estimated coverage probabilities. This supports the reliability of the estimators for practical applications.

The ‘On-Chain Credit Risk Score (OCCR Score)’ of a wallet will help lending and borrowing protocols and other DeFi institutions to understand the risk involved in allowing the wallet to open a borrow position, and thus to change the Loan-to-Value (LTV) ratio and, subsequently, the Liquidation Threshold (LT) if required. Through the OCCR Score, we are trying to tailor the LT/LTV for particular wallets, hence enabling ‘walletized finance’. If a lower OCCR Score is associated with a wallet, DeFi institutions may be incentivized to offer it borrow positions at a higher LT/LTV ratio than observed in the market, while for wallets with a higher OCCR Score they may decide to keep the LT/LTV ratio the same as the one prevalent in the market. This will encourage wallets to maintain a lower credit risk score in order to obtain loans at a better (higher) LT/LTV ratio than what is prevalent in the market. This might help the DeFi market to become more capital-efficient while maintaining a less risky approach. It should be noted that a wallet that enters the DeFi ecosystem for the very first time will receive the mean OCCR Score, since that wallet has yet to make its first on-chain transaction.

7 Appendix

7.1 Expectation, Variance and Consistency of Historical Sub Score

In this section, we derive the approximate expectation and variance of the historical subscore:

$$\hat{s}_{h_i} = \frac{N_i}{D_i} = \frac{\sum_{j=1}^{n} w_{i,j} X_{i,j}}{\sum_{j=1}^{n} w_{i,j}},$$

where for each loan (or position) $j$:

• $X_{i,j}$ is a Bernoulli variable with

$$P(X_{i,j} = 1) = s_{h_i}, \quad P(X_{i,j} = 0) = 1 - s_{h_i} \implies \mathbb{E}[X_{i,j}] = s_{h_i}.$$

• The weight is given by

$$w_{i,j} = L_{i,j} (1 - r_{i,j}) \, p \, t_{i,j},$$

with:

– Loan Amount: If

$$L_{i,j} \mid (\text{ltv}, \text{collateral}) \sim \operatorname{Uniform}(0, \text{ltv} \times \text{collateral}),$$

then

$$\mathbb{E}[L_{i,j} \mid \text{ltv}, \text{collateral}] = \frac{\text{ltv} \times \text{collateral}}{2}.$$

If $\text{ltv} \sim \operatorname{Uniform}(l_{\min}, l_{\max})$, so that

$$\mathbb{E}[\text{ltv}] = \frac{l_{\min} + l_{\max}}{2},$$

and if the collateral is Pareto distributed with shape parameter $\alpha$ and scale $m$, so that

$$\mathbb{E}[\text{collateral}] = \frac{\alpha m}{\alpha - 1},$$

then, by independence,

$$\mathbb{E}[L_{i,j}] = \frac{(l_{\min} + l_{\max}) \, \alpha m}{4 (\alpha - 1)}.$$

Similarly, using $\mathbb{E}[X^2] = \frac{a^2}{3}$ for a $\operatorname{Uniform}(0, a)$ variable (and $\alpha > 2$, so that the second moment of the collateral is finite),

$$\mathbb{E}[L_{i,j}^2] = \frac{(l_{\min}^2 + l_{\min} l_{\max} + l_{\max}^2) \, \alpha m^2}{9 (\alpha - 2)}.$$

– Risk Factor: If $r_{i,j} \sim \operatorname{Uniform}(0, 1)$, then

$$\mathbb{E}[1 - r_{i,j}] = \frac{1}{2}, \quad \mathbb{E}[(1 - r_{i,j})^2] = \frac{1}{3}.$$

– Recency: Assuming $t_{i,j} \sim \operatorname{Uniform}(0, 1)$,

$$\mathbb{E}[t_{i,j}] = \frac{1}{2}, \quad \mathbb{E}[t_{i,j}^2] = \frac{1}{3}.$$

– Liquidation Probability: Here, $p$ is a constant.

Under the assumption of independence of the components, the first and second moments of the weight are:

$$\mathbb{E}[w_{i,j}] = \mathbb{E}[L_{i,j}] \cdot \frac{1}{2} \cdot p \cdot \frac{1}{2} = \frac{p \, \mathbb{E}[L_{i,j}]}{4},$$

and

$$\mathbb{E}[w_{i,j}^2] = \mathbb{E}[L_{i,j}^2] \cdot \frac{1}{3} \cdot p^2 \cdot \frac{1}{3} = \frac{p^2 \, \mathbb{E}[L_{i,j}^2]}{9}.$$

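7.1.1 Expectation of the Historical Credit Risk Score

Define

$$N_i = \sum_{j=1}^{n} w_{i,j} X_{i,j} \quad \text{and} \quad D_i = \sum_{j=1}^{n} w_{i,j}.$$

Since $w_{i,j}$ and $X_{i,j}$ are independent,

$$\mathbb{E}[w_{i,j} X_{i,j}] = \mathbb{E}[w_{i,j}] \, \mathbb{E}[X_{i,j}] = s_{h_i} \mathbb{E}[w_{i,j}],$$

so that

$$\mathbb{E}[N_i] = s_{h_i} \sum_{j=1}^{n} \mathbb{E}[w_{i,j}], \quad \mathbb{E}[D_i] = \sum_{j=1}^{n} \mathbb{E}[w_{i,j}].$$

Using the approximation

$$\mathbb{E}\left[ \frac{1}{D_i} \right] \approx \frac{1}{\mathbb{E}[D_i]} + \frac{\operatorname{Var}(D_i)}{(\mathbb{E}[D_i])^3},$$

and treating $N_i$ and $1/D_i$ as approximately uncorrelated, we have

$$\begin{aligned}
\mathbb{E}[\hat{s}_{h_i}] &\approx \mathbb{E}[N_i] \cdot \mathbb{E}\left[ \frac{1}{D_i} \right] \\
&\approx \left( s_{h_i} \sum_{j=1}^{n} \mathbb{E}[w_{i,j}] \right) \left( \frac{1}{\sum_{j=1}^{n} \mathbb{E}[w_{i,j}]} + \frac{\operatorname{Var}(D_i)}{\left( \sum_{j=1}^{n} \mathbb{E}[w_{i,j}] \right)^3} \right) \\
&= s_{h_i} \left( 1 + \frac{\operatorname{Var}(D_i)}{\left( \sum_{j=1}^{n} \mathbb{E}[w_{i,j}] \right)^2} \right).
\end{aligned}$$

In the special case where all loans are identically distributed (denoting $\mu_w = \mathbb{E}[w_{i,j}]$ and $\mu_{w^2} = \mathbb{E}[w_{i,j}^2]$), we have

$$\mathbb{E}[D_i] = n \mu_w, \quad \operatorname{Var}(D_i) = n \left( \mu_{w^2} - \mu_w^2 \right),$$

so that

$$\mathbb{E}[\hat{s}_{h_i}] \approx s_{h_i} \left( 1 + \frac{\mu_{w^2} - \mu_w^2}{n \mu_w^2} \right).$$

If the variation in $D_i$ is negligible, then $\mathbb{E}[\hat{s}_{h_i}] \approx s_{h_i}$.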

7.1.2 Variance of the Historical Credit Risk Score

We use the first-order Delta method to approximate the variance of the ratio

$$\hat{s}_{h_i} = \frac{N_i}{D_i}.$$

Step 1. Partial Derivatives: Define

$$f(N_i, D_i) = \frac{N_i}{D_i}.$$

Then

$$\frac{\partial f}{\partial N_i} = \frac{1}{D_i}, \quad \frac{\partial f}{\partial D_i} = -\frac{N_i}{D_i^2}.$$

Evaluating at the mean values $\mathbb{E}[N_i] = s_{h_i} \mathbb{E}[D_i]$ and $\mathbb{E}[D_i]$, we have:

$$\left. \frac{\partial f}{\partial N_i} \right|_{\mathbb{E}} = \frac{1}{\mathbb{E}[D_i]}, \quad \left. \frac{\partial f}{\partial D_i} \right|_{\mathbb{E}} = -\frac{s_{h_i}}{\mathbb{E}[D_i]}.$$

Step 2. Delta Method Formula: The variance is approximated by

$$\operatorname{Var}(\hat{s}_{h_i}) \approx \left( \frac{1}{\mathbb{E}[D_i]} \right)^2 \operatorname{Var}(N_i) + \left( \frac{s_{h_i}}{\mathbb{E}[D_i]} \right)^2 \operatorname{Var}(D_i) - \frac{2 s_{h_i}}{(\mathbb{E}[D_i])^2} \operatorname{Cov}(N_i, D_i).$$

Step 3. Variance of $N_i$: Since

$$N_i = \sum_{j=1}^{n} w_{i,j} X_{i,j},$$

and using the independence of $w_{i,j}$ and $X_{i,j}$ (with $X_{i,j}^2 = X_{i,j}$ for a Bernoulli variable), we have for each $j$:

$$\operatorname{Var}(w_{i,j} X_{i,j}) = \mathbb{E}[(w_{i,j} X_{i,j})^2] - (\mathbb{E}[w_{i,j} X_{i,j}])^2 = s_{h_i} \mathbb{E}[w_{i,j}^2] - s_{h_i}^2 (\mathbb{E}[w_{i,j}])^2.$$

Summing over $j$:

$$\operatorname{Var}(N_i) = \sum_{j=1}^{n} \left[ s_{h_i} \mathbb{E}[w_{i,j}^2] - s_{h_i}^2 (\mathbb{E}[w_{i,j}])^2 \right].$$

Step 4. Variance of $D_i$: Since

$$D_i = \sum_{j=1}^{n} w_{i,j},$$

we have

$$\operatorname{Var}(D_i) = \sum_{j=1}^{n} \left[ \mathbb{E}[w_{i,j}^2] - (\mathbb{E}[w_{i,j}])^2 \right].$$

Step 5. Covariance between $N_i$ and $D_i$: Since only the same index $j$ contributes,

$$\operatorname{Cov}(N_i, D_i) = \sum_{j=1}^{n} \operatorname{Cov}(w_{i,j} X_{i,j}, w_{i,j}).$$

For each $j$:

$$\operatorname{Cov}(w_{i,j} X_{i,j}, w_{i,j}) = s_{h_i} \left( \mathbb{E}[w_{i,j}^2] - (\mathbb{E}[w_{i,j}])^2 \right).$$

Thus,

$$\operatorname{Cov}(N_i, D_i) = s_{h_i} \operatorname{Var}(D_i).$$

Step 6. Combine the Pieces: Substitute into the Delta formula:

$$\begin{aligned}
\operatorname{Var}(\hat{s}_{h_i}) &\approx \frac{\operatorname{Var}(N_i)}{(\mathbb{E}[D_i])^2} + \frac{s_{h_i}^2 \operatorname{Var}(D_i)}{(\mathbb{E}[D_i])^2} - \frac{2 s_{h_i}}{(\mathbb{E}[D_i])^2} \, s_{h_i} \operatorname{Var}(D_i) \\
&= \frac{\operatorname{Var}(N_i)}{(\mathbb{E}[D_i])^2} - \frac{s_{h_i}^2 \operatorname{Var}(D_i)}{(\mathbb{E}[D_i])^2}.
\end{aligned}$$

In the case of $n$ identically distributed loans, with $\mu_w = \mathbb{E}[w_{i,j}]$ and $\mu_{w^2} = \mathbb{E}[w_{i,j}^2]$, we have:

$$\mathbb{E}[D_i] = n \mu_w, \quad \operatorname{Var}(N_i) = n \left[ s_{h_i} \mu_{w^2} - s_{h_i}^2 \mu_w^2 \right], \quad \operatorname{Var}(D_i) = n \left[ \mu_{w^2} - \mu_w^2 \right].$$

Then,

$$\begin{aligned}
\operatorname{Var}(\hat{s}_{h_i}) &\approx \frac{n \left[ s_{h_i} \mu_{w^2} - s_{h_i}^2 \mu_w^2 \right]}{n^2 \mu_w^2} - \frac{s_{h_i}^2 \, n \left[ \mu_{w^2} - \mu_w^2 \right]}{n^2 \mu_w^2} \\
&= \frac{s_{h_i} \mu_{w^2} - s_{h_i}^2 \mu_w^2 - s_{h_i}^2 \mu_{w^2} + s_{h_i}^2 \mu_w^2}{n \mu_w^2} \\
&= \frac{s_{h_i} (1 - s_{h_i}) \, \mu_{w^2}}{n \mu_w^2}.
\end{aligned}$$

Recalling that

$$\mu_w = \frac{p \, \mathbb{E}[L_{i,j}]}{4}, \quad \mu_{w^2} = \frac{p^2 \, \mathbb{E}[L_{i,j}^2]}{9},$$

we have:

$$\frac{\mu_{w^2}}{\mu_w^2} = \frac{\dfrac{p^2 \, \mathbb{E}[L_{i,j}^2]}{9}}{\left( \dfrac{p \, \mathbb{E}[L_{i,j}]}{4} \right)^2} = \frac{16 \, \mathbb{E}[L_{i,j}^2]}{9 \, (\mathbb{E}[L_{i,j}])^2}.$$

Thus, the variance becomes:

$$\operatorname{Var}(\hat{s}_{h_i}) \approx \frac{16 \, s_{h_i} (1 - s_{h_i}) \, \mathbb{E}[L_{i,j}^2]}{9 n \, (\mathbb{E}[L_{i,j}])^2}.$$

This completes the full derivation of the expectation and variance of the historical sub score.
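The closed-form approximation derived above can be evaluated numerically once the loan moments are plugged in. The sketch below codes the expressions for $\mathbb{E}[L_{i,j}]$, $\mathbb{E}[L_{i,j}^2]$, and $\operatorname{Var}(\hat{s}_{h_i})$ exactly as stated in this appendix; the function names and the example parameter values are hypothetical.

```python
def loan_moments(l_min: float, l_max: float, alpha: float, m: float) -> tuple[float, float]:
    """First and second moments of the loan amount under the appendix assumptions."""
    if alpha <= 2.0:
        raise ValueError("alpha must exceed 2 for a finite second moment")
    first = (l_min + l_max) * alpha * m / (4.0 * (alpha - 1.0))
    second = (l_min**2 + l_min * l_max + l_max**2) * alpha * m**2 / (9.0 * (alpha - 2.0))
    return first, second


def historical_subscore_variance(
    s_h: float, n: int, l_min: float, l_max: float, alpha: float, m: float
) -> float:
    """Delta-method approximation 16 s (1 - s) E[L^2] / (9 n E[L]^2)."""
    first, second = loan_moments(l_min, l_max, alpha, m)
    return 16.0 * s_h * (1.0 - s_h) * second / (9.0 * n * first**2)


# Illustrative wallet: 100 loans, liquidation probability 0.5.
variance = historical_subscore_variance(s_h=0.5, n=100, l_min=0.5, l_max=0.5, alpha=3.0, m=100.0)
```

For these inputs the loan moments are 37.5 and 2500, and the approximate variance is $10000 / 1265625 \approx 0.0079$. Doubling the number of loans halves the approximation, which is the $1/n$ behavior used in the consistency argument.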

7.1.3 Consistency of the Historical Credit Risk Score

For the estimator

$$\hat{s}_{h_i} = \frac{N_i}{D_i} = \frac{\sum_{j=1}^{n} w_{i,j} X_{i,j}}{\sum_{j=1}^{n} w_{i,j}},$$

we have shown that

$$\mathbb{E}[\hat{s}_{h_i}] \approx s_{h_i} \quad \text{and} \quad \operatorname{Var}(\hat{s}_{h_i}) \approx \frac{16 \, s_{h_i} (1 - s_{h_i}) \, \mathbb{E}[L_{i,j}^2]}{9 n \, (\mathbb{E}[L_{i,j}])^2}.$$

Since the variance of $\hat{s}_{h_i}$ is proportional to $1/n$, it tends to zero as $n \to \infty$. Thus, the estimator $\hat{s}_{h_i}$ converges in probability to $s_{h_i}$, i.e.,

$$\hat{s}_{h_i} \xrightarrow{p} s_{h_i} \quad \text{as } n \to \infty.$$

This demonstrates the consistency of the historical sub score.

7.2 Expectation, Variance, and Consistency of Current Credit Risk Score ($\hat{s}_{c_i}$)

We define the Current Wallet Risk Score estimator for the $i^{th}$ wallet as:

$$\hat{s}_{c_i} = \frac{\sum_{j=1}^{k} Z_{i,j}}{k},$$

where $Z_{i,j}$ is a Bernoulli random variable with mean $s_{c_i} = \mathbb{E}[Z_{i,j}]$. Therefore, $Z_{i,j} \sim \text{Bernoulli}(s_{c_i})$, and, for independently simulated paths, $\sum_{j=1}^{k} Z_{i,j} \sim \text{Binomial}(k, s_{c_i})$.

Assuming both Liquidation at Risk (LaR) and Holding ($H$) follow Pareto distributions:

$$LaR \sim \text{Pareto}(\alpha_L, x_{m_L}), \tag{9}$$

$$H \sim \text{Pareto}(\alpha_H, x_{m_H}), \tag{10}$$

where $x_{m_L}, x_{m_H}$ are the scale parameters, and $\alpha_L, \alpha_H$ are the shape parameters. The probability of liquidation risk exceeding holdings is given by:

$$P(LaR > H) = \int_{x_{m_H}}^{\infty} P(LaR > h) \, f_H(h) \, dh. \tag{11}$$

Using the survival function (the complement of the cumulative distribution function) of LaR,

$$P(LaR > h) = \left( \frac{x_{m_L}}{h} \right)^{\alpha_L}, \quad h \geq x_{m_L}. \tag{12}$$

Substituting this into the integral (which requires $x_{m_H} \geq x_{m_L}$, so that Equation (12) holds over the whole range of integration),

$$P(LaR > H) = \int_{x_{m_H}}^{\infty} \left( \frac{x_{m_L}}{h} \right)^{\alpha_L} \frac{\alpha_H (x_{m_H})^{\alpha_H}}{h^{\alpha_H + 1}} \, dh. \tag{13}$$

This simplifies to:

$$P(LaR > H) = \alpha_H (x_{m_H})^{\alpha_H} (x_{m_L})^{\alpha_L} \int_{x_{m_H}}^{\infty} h^{-\alpha_L - \alpha_H - 1} \, dh. \tag{14}$$

Evaluating the integral:

$$\int_{x_{m_H}}^{\infty} h^{-(\alpha_L + \alpha_H + 1)} \, dh = \frac{(x_{m_H})^{-(\alpha_L + \alpha_H)}}{\alpha_L + \alpha_H}. \tag{15}$$

Thus, we obtain:

$$P(LaR > H) = \frac{\alpha_H}{\alpha_L + \alpha_H} \left( \frac{x_{m_L}}{x_{m_H}} \right)^{\alpha_L}. \tag{16}$$
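Equation (16) is easy to check numerically. The sketch below implements the closed form and a plain Monte Carlo comparison using the standard library's Pareto sampler; the function names, the seed, and the example parameters ($\alpha_L = 2$, $x_{m_L} = 1$, $\alpha_H = 3$, $x_{m_H} = 2$) are assumptions of the example, and independence of LaR and $H$ is assumed as in the derivation.

```python
import random


def prob_lar_exceeds_holding(alpha_l: float, x_ml: float, alpha_h: float, x_mh: float) -> float:
    """Closed form alpha_H / (alpha_L + alpha_H) * (x_mL / x_mH) ** alpha_L."""
    if x_mh < x_ml:
        raise ValueError("the closed form assumes x_mH >= x_mL")
    return alpha_h / (alpha_l + alpha_h) * (x_ml / x_mh) ** alpha_l


def monte_carlo_prob(
    alpha_l: float, x_ml: float, alpha_h: float, x_mh: float, n: int, rng: random.Random
) -> float:
    """Empirical frequency of LaR > H for independent Pareto draws."""
    hits = 0
    for _ in range(n):
        lar = x_ml * rng.paretovariate(alpha_l)
        holding = x_mh * rng.paretovariate(alpha_h)
        if lar > holding:
            hits += 1
    return hits / n


closed_form = prob_lar_exceeds_holding(2.0, 1.0, 3.0, 2.0)
empirical = monte_carlo_prob(2.0, 1.0, 3.0, 2.0, n=100000, rng=random.Random(7))
```

With these parameters the closed form evaluates to $\tfrac{3}{5} \times \tfrac{1}{4} = 0.15$, and the empirical frequency from the simulation should lie close to that value.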

Therefore, the expected value of $Z_{i,j}$, which determines the current credit risk subscore, is given by:

$$E(Z_{i,j}) = s_{c_i} = P(LaR > H). \tag{17}$$

7.2.1 Expectation of $\hat{s}_{c_i}$

The expectation of $\hat{s}_{c_i}$ is given by:

$$\mathbb{E}[\hat{s}_{c_i}] = \mathbb{E}\left[ \frac{\sum_{j=1}^{k} Z_{i,j}}{k} \right].$$

Using the linearity of expectation:

$$\mathbb{E}[\hat{s}_{c_i}] = \frac{\mathbb{E}\left[ \sum_{j=1}^{k} Z_{i,j} \right]}{k}.$$

Since $\mathbb{E}[Z_{i,j}] = s_{c_i}$:

$$\mathbb{E}\left[ \sum_{j=1}^{k} Z_{i,j} \right] = k \cdot s_{c_i}.$$

Thus:

$$\mathbb{E}[\hat{s}_{c_i}] = \frac{k \cdot s_{c_i}}{k} = s_{c_i}.$$

This shows that $\hat{s}_{c_i}$ is an unbiased estimator for $s_{c_i}$.

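7.2.2 Variance of $\hat{s}_c^i$

The variance of $\hat{s}_c^i$ is given by:

$$\operatorname{Var}(\hat{s}_c^i) = \operatorname{Var}\left(\frac{\sum_{j=1}^{k} Z_{i,j}}{k}\right)$$

Since $\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)$ for a constant $a$ (here $a = 1/k$):

$$\operatorname{Var}(\hat{s}_c^i) = \frac{1}{k^2} \operatorname{Var}\left(\sum_{j=1}^{k} Z_{i,j}\right)$$

For a Binomial random variable $\sum_{j=1}^{k} Z_{i,j} \sim \mathrm{Binomial}(k, s_c^i)$, we know that:

$$\operatorname{Var}\left(\sum_{j=1}^{k} Z_{i,j}\right) = k \cdot s_c^i \cdot (1 - s_c^i)$$

Thus:

$$\operatorname{Var}\left(\frac{\sum_{j=1}^{k} Z_{i,j}}{k}\right) = \frac{k \cdot s_c^i \cdot (1 - s_c^i)}{k^2} = \frac{s_c^i \cdot (1 - s_c^i)}{k}$$

Therefore, the variance of $\hat{s}_c^i$ is:

$$\operatorname{Var}(\hat{s}_c^i) = \frac{s_c^i (1 - s_c^i)}{k}$$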

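7.2.3 Consistency of $\hat{s}_c^i$

As $k \to \infty$, the variance $\operatorname{Var}(\hat{s}_c^i) = \frac{s_c^i (1 - s_c^i)}{k} \to 0$. Since $\hat{s}_c^i$ is also unbiased, Chebyshev's inequality gives convergence in probability. Therefore, $\hat{s}_c^i$ is a consistent estimator of $s_c^i$.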
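The unbiasedness and the $1/k$ variance of the current credit risk subscore can be illustrated with a small simulation. This is a sketch under the stated Bernoulli model only; the values of `s_c`, `k` and the number of simulated wallets are hypothetical.

```python
import random
import statistics


def current_risk_subscore(flags):
    """Sample mean of the Bernoulli indicators Z_{i,j} for one wallet."""
    return sum(flags) / len(flags)


def simulate_subscores(s_c, k, n_wallets=20_000, seed=1):
    """Draw n_wallets independent subscores, each built from k Bernoulli(s_c) flags."""
    rng = random.Random(seed)
    return [
        current_risk_subscore([rng.random() < s_c for _ in range(k)])
        for _ in range(n_wallets)
    ]


s_c, k = 0.15, 40  # illustrative values, not taken from the paper
scores = simulate_subscores(s_c, k)
mean_hat = statistics.fmean(scores)
var_hat = statistics.pvariance(scores)
theory_var = s_c * (1 - s_c) / k
print(f"mean {mean_hat:.4f} vs {s_c}, variance {var_hat:.6f} vs {theory_var:.6f}")
```

Increasing `k` shrinks `theory_var` proportionally, which is the consistency statement in numerical form.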

7.3 Expectation, Variance, and Consistency of New Credit ($\hat{s}_{nc}^i$)

For the new credit risk score, $\hat{s}_{nc}^i$, we compute the expectation, variance, and consistency.

Given:

$$\hat{s}_{nc}^i = \frac{\sum_{j=1}^{n} Y_{i,j}}{n},$$

where $Y_{i,j}$ is a Bernoulli random variable such that

$$Y_{i,j} = \begin{cases} 1, & \text{if } L_{i,j} \ge \mu_{L_i} \text{ and } \Delta D_{i,j} \le \mu_{\Delta D_i}, \\ 0, & \text{otherwise.} \end{cases}$$

In our model, we assume the following:

Loan Amount Distribution ($L_{i,j}$): The loan amounts are modeled as a Pareto random variable with minimum value $x_m$ and shape parameter $\alpha$. That is,

$$P(L_{i,j} \ge u) = \left(\frac{x_m}{u}\right)^{\alpha}, \quad \text{for } u \ge x_m.$$

Setting $u = \mu_{L_i}$ (the mean loan amount threshold) gives

$$P(L_{i,j} \ge \mu_{L_i}) = \left(\frac{x_m}{\mu_{L_i}}\right)^{\alpha}.$$

The loan dates are assumed to be independently drawn from $\mathrm{Uniform}(0, 1)$. When sorted, consider three consecutive order statistics

$$U_{(j-1)},\ U_{(j)},\ U_{(j+1)},$$

with spacings

$$X = U_{(j)} - U_{(j-1)} \quad \text{and} \quad Y = U_{(j+1)} - U_{(j)}.$$

The joint density of $(X, Y)$ is

$$f_{X,Y}(x, y) = n(n-1)(1 - x - y)^{n-2}, \quad x > 0,\ y > 0,\ x + y < 1.$$

Defining

$$\Delta D_{i,j} = \min\{X, Y\},$$

we obtain

$$P(\Delta D_{i,j} \le z) = 1 - (1 - 2z)^{n}, \quad 0 \le z \le \tfrac{1}{2}.$$

Thus, for a threshold $\mu_{\Delta D_i}$ (with $0 \le \mu_{\Delta D_i} \le \tfrac{1}{2}$),

$$P(\Delta D_{i,j} \le \mu_{\Delta D_i}) = 1 - (1 - 2\mu_{\Delta D_i})^{n}.$$

Assuming independence between $L_{i,j}$ and $\Delta D_{i,j}$, the probability of a “risky” event for each loan is

$$s_{nc}^i = P(L_{i,j} \ge \mu_{L_i}) \times P(\Delta D_{i,j} \le \mu_{\Delta D_i}) = \left(\frac{x_m}{\mu_{L_i}}\right)^{\alpha} \left[1 - (1 - 2\mu_{\Delta D_i})^{n}\right].$$
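The spacing result and the resulting per-loan probability can be checked numerically. The sketch below assumes the model exactly as stated (i.i.d. uniform loan dates, an interior index $j$, Pareto loan amounts); all function names and numeric values are illustrative assumptions.

```python
import random


def min_spacing_cdf(z, n):
    """P(min of two adjacent spacings <= z) for n i.i.d. Uniform(0, 1) dates."""
    if not 0.0 <= z <= 0.5:
        raise ValueError("z must lie in [0, 1/2]")
    return 1.0 - (1.0 - 2.0 * z) ** n


def new_credit_probability(x_m, alpha, mu_l, mu_dd, n):
    """Per-loan 'risky' probability: Pareto tail times spacing CDF (needs mu_l >= x_m)."""
    return (x_m / mu_l) ** alpha * min_spacing_cdf(mu_dd, n)


def simulate_min_spacing_cdf(z, n, j, n_draws=100_000, seed=2):
    """Monte Carlo check using the j-th order statistic (1-based, 2 <= j <= n - 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        u = sorted(rng.random() for _ in range(n))
        gap_before = u[j - 1] - u[j - 2]
        gap_after = u[j] - u[j - 1]
        hits += min(gap_before, gap_after) <= z
    return hits / n_draws


theory = min_spacing_cdf(0.05, 10)
simulated = simulate_min_spacing_cdf(0.05, 10, j=5)
s_nc = new_credit_probability(x_m=1.0, alpha=2.0, mu_l=2.0, mu_dd=0.05, n=10)
print(f"spacing CDF: theory {theory:.4f}, simulated {simulated:.4f}, s_nc {s_nc:.4f}")
```

Note that the per-loan probability depends on `n` through the spacing term, so wallets with many loans in the same window are more likely to trigger the date condition.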

7.3.1 Expectation of $\hat{s}_{nc}^i$

By linearity of expectation, the expectation of $\hat{s}_{nc}^i$ is

$$E(\hat{s}_{nc}^i) = E\left(\frac{\sum_{j=1}^{n} Y_{i,j}}{n}\right) = \frac{1}{n} \sum_{j=1}^{n} E(Y_{i,j}) = s_{nc}^i.$$

Therefore, $\hat{s}_{nc}^i$ is an unbiased estimator of $s_{nc}^i$.

7.3.2 Variance of $\hat{s}_{nc}^i$

The variance of $\hat{s}_{nc}^i$ is given by

$$\operatorname{Var}(\hat{s}_{nc}^i) = \operatorname{Var}\left(\frac{\sum_{j=1}^{n} Y_{i,j}}{n}\right) = \frac{1}{n^2} \sum_{j=1}^{n} \operatorname{Var}(Y_{i,j}).$$

Since $Y_{i,j} \sim \mathrm{Bernoulli}(s_{nc}^i)$, it follows that $\operatorname{Var}(Y_{i,j}) = s_{nc}^i (1 - s_{nc}^i)$, hence

$$\operatorname{Var}(\hat{s}_{nc}^i) = \frac{s_{nc}^i (1 - s_{nc}^i)}{n}.$$

7.3.3 Consistency of $\hat{s}_{nc}^i$

For consistency, we need to check if $\hat{s}_{nc}^i$ converges in probability to $s_{nc}^i$ as $n \to \infty$. Since

$$E(\hat{s}_{nc}^i) = s_{nc}^i,$$

and

$$\operatorname{Var}(\hat{s}_{nc}^i) = \frac{s_{nc}^i (1 - s_{nc}^i)}{n} \to 0 \quad \text{as } n \to \infty,$$

by the law of large numbers, $\hat{s}_{nc}^i$ is a consistent estimator of $s_{nc}^i$.

7.4 Expectation, Variance and Consistency of On-Chain Transaction Score

In this section, we analyze the on-chain transaction score $\hat{s}_{ct}^i$, which evaluates the transactional behavior of a wallet based on recent on-chain transactions. We assume that the transaction amounts $T_{i,l}$ follow a Pareto distribution, which is commonly used for modeling heavy-tailed data in financial contexts, while the transaction weights $t_{i,l}$ are fixed non-stochastic values that weight recent transactions more heavily.

The Pareto distribution is a suitable model for transaction data, as it captures the presence of infrequent, high-value transactions within a large number of smaller transactions. We model the transaction amount $T_{i,l}$ for the $i$-th wallet’s $l$-th transaction as:

$$T_{i,l} \sim \mathrm{Pareto}(\alpha, x_{\min})$$

where:

• $\alpha > 1$ is the shape parameter, controlling the "heaviness" of the distribution tail,

• $x_{\min}$ is the minimum transaction amount, such that $T_{i,l} \ge x_{\min}$.

This distribution choice allows for tractable calculations of expectation and variance while realistically modeling the likelihood of large transaction values, which are essential for assessing risk.

7.4.1 Expectation of $\hat{s}_{ct}^i$

The on-chain transaction score $\hat{s}_{ct}^i$ is defined as:

$$\hat{s}_{ct}^i = \frac{\sum_{l} T_{i,l}\, t_{i,l}}{\sum_{l} |T_{i,l}|}$$

where $T_{i,l}$ represents the transaction amount, and $t_{i,l}$ is a weight associated with the $l$-th transaction. Assuming $T_{i,l} \sim \mathrm{Pareto}(\alpha, x_{\min})$, we calculate $E[\hat{s}_{ct}^i]$ as follows.

Let $T_{i,l}$ denote the $l$-th transaction amount (credited or debited) for the $i$-th wallet. Define:

• $T$ as the transaction amount, with $T_{i,l}$ being positive for credits and negative for debits.

• $t$ as the recency score, assumed to be uniformly distributed over $[0, 1]$, i.e., $t \sim \mathrm{Uniform}(0, 1)$.

• $S$ as the sign variable:

$$S = \begin{cases} +1, & \text{if credited} \\ -1, & \text{if debited} \end{cases}$$

with probability $P(S = 1) = p$ and $P(S = -1) = 1 - p$.

A transaction is given by:

$$T = S A$$

where $A$ is the absolute transaction amount.

Define:

$$X = S A t, \qquad Y = A.$$

We approximate the expectation of the ratio $E[X / Y]$ using a second-order Taylor expansion:

$$g(X, Y) = \frac{X}{Y}.$$

Expanding around $(\mu_X, \mu_Y)$:

$$\begin{aligned} g(X, Y) \approx{} & g(\mu_X, \mu_Y) + (X - \mu_X)\, g_X(\mu_X, \mu_Y) + (Y - \mu_Y)\, g_Y(\mu_X, \mu_Y) \\ & + \frac{1}{2} \Big[ g_{XX}(\mu_X, \mu_Y) (X - \mu_X)^2 + 2 g_{XY}(\mu_X, \mu_Y) (X - \mu_X)(Y - \mu_Y) + g_{YY}(\mu_X, \mu_Y) (Y - \mu_Y)^2 \Big]. \end{aligned}$$

Computing derivatives:

$$g_X = \frac{1}{Y}, \qquad g_Y = -\frac{X}{Y^2},$$

$$g_{XX} = 0, \qquad g_{XY} = -\frac{1}{Y^2}, \qquad g_{YY} = \frac{2X}{Y^3}.$$

Evaluating at $(\mu_X, \mu_Y)$:

$$g(\mu_X, \mu_Y) = \frac{\mu_X}{\mu_Y} = \mu_S \mu_t,$$

$$g_X(\mu_X, \mu_Y) = \frac{1}{\mu_Y} = \frac{1}{\mu_A},$$

$$g_Y(\mu_X, \mu_Y) = -\frac{\mu_X}{\mu_Y^2} = -\frac{\mu_S \mu_t}{\mu_A},$$

$$g_{XY}(\mu_X, \mu_Y) = -\frac{1}{\mu_A^2},$$

$$g_{YY}(\mu_X, \mu_Y) = \frac{2\mu_X}{\mu_A^3} = \frac{2 \mu_S \mu_t}{\mu_A^2}.$$

Since $E[X - \mu_X] = 0$ and $E[Y - \mu_Y] = 0$, the first-order terms vanish. The second order correction is:

$$\Delta = \frac{1}{2} \left[ 2 \left(-\frac{1}{\mu_A^2}\right) \operatorname{Cov}(X, Y) + \frac{2 \mu_S \mu_t}{\mu_A^2} \operatorname{Var}(Y) \right].$$

Simplifying:

$$\Delta = -\frac{\operatorname{Cov}(X, Y)}{\mu_A^2} + \frac{\mu_S \mu_t \operatorname{Var}(Y)}{\mu_A^2}.$$

Since:

$$\operatorname{Var}(Y) = \sigma_A^2, \qquad \operatorname{Cov}(X, Y) = \mu_S \mu_t \sigma_A^2,$$

we get:

$$\Delta = -\frac{\mu_S \mu_t \sigma_A^2}{\mu_A^2} + \frac{\mu_S \mu_t \sigma_A^2}{\mu_A^2} = 0.$$

Thus, the expectation simplifies to:

$$E[\hat{s}_{ct}] = \mu_S \mu_t,$$

where

$$\mu_S = 2p - 1, \quad \text{with } p = P(T > 0),$$

and

$$\mu_t = 0.5.$$
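The result $E[\hat{s}_{ct}] = \mu_S \mu_t = (2p - 1) \cdot 0.5$ can be illustrated by simulating the ratio-of-sums score directly. This is a sketch under the assumptions of this subsection (sign, amount and recency drawn independently; recency uniform on $[0, 1]$); the parameter values and function names are hypothetical.

```python
import random
import statistics


def transaction_score(amounts, weights):
    """Recency-weighted signed amounts divided by the total absolute amount."""
    weighted = sum(a * w for a, w in zip(amounts, weights))
    return weighted / sum(abs(a) for a in amounts)


def simulate_scores(p, alpha, x_min, n_tx, n_wallets=5_000, seed=3):
    """Simulate the score for n_wallets wallets with n_tx transactions each."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_wallets):
        amounts, weights = [], []
        for _ in range(n_tx):
            sign = 1.0 if rng.random() < p else -1.0          # S: credit or debit
            amounts.append(sign * x_min * rng.paretovariate(alpha))  # T = S * A
            weights.append(rng.random())                      # t ~ Uniform(0, 1)
        scores.append(transaction_score(amounts, weights))
    return scores


p = 0.7  # illustrative credit probability
scores = simulate_scores(p, alpha=3.0, x_min=1.0, n_tx=50)
expected = (2 * p - 1) * 0.5
print(f"simulated mean {statistics.fmean(scores):.4f}, mu_S * mu_t = {expected:.4f}")
```

Because the sign and recency are independent of the amounts in this setup, the simulated mean should sit close to $\mu_S \mu_t$ regardless of the Pareto parameters.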

7.4.2 Variance of On-Chain Transaction Score $\operatorname{Var}(\hat{s}_{ct}^i)$

To compute the variance of the on-chain transaction score $\hat{s}_{ct}^i$, we use the first-order (delta-method) approximation:

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx g_X^2 \operatorname{Var}(X) + g_Y^2 \operatorname{Var}(Y) + 2 g_X g_Y \operatorname{Cov}(X, Y). \tag{18}$$

From our earlier derivative calculations, evaluated at $(\mu_X, \mu_Y)$:

$$g_X = \frac{1}{\mu_A}, \qquad g_Y = -\frac{\mu_S \mu_t}{\mu_A}.$$

The variance of $X$ is

$$\operatorname{Var}(X) = E[X^2] - (E[X])^2. \tag{19}$$

Since $X = S A t$, we expand:

$$E[X] = E[S A t] = E[S]\, E[A]\, E[t] = \mu_S \mu_A \mu_t,$$

$$E[X^2] = E[S^2 A^2 t^2] = E[S^2]\, E[A^2]\, E[t^2].$$

Since $S$ is a binary variable, we have:

$$E[S^2] = 1. \tag{20}$$

For $A$ (which follows a Pareto distribution),

$$E[A^2] = \frac{\alpha x_{\min}^2}{\alpha - 2}, \quad \text{for } \alpha > 2. \tag{21}$$

For the recency score $t$:

$$E[t^2] = \operatorname{Var}(t) + (E[t])^2 = \sigma_t^2 + \mu_t^2. \tag{22}$$

Thus,

$$E[X^2] = E[A^2]\, E[t^2]. \tag{23}$$

Finally, we compute the variance:

$$\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \left( \frac{\alpha x_{\min}^2}{\alpha - 2} (\sigma_t^2 + \mu_t^2) \right) - \mu_S^2 \mu_A^2 \mu_t^2.$$

Similarly, the variance of $Y$ is given by:

$$\operatorname{Var}(Y) = \sigma_A^2 = \frac{x_{\min}^2\, \alpha}{(\alpha - 1)^2 (\alpha - 2)}, \quad \text{for } \alpha > 2. \tag{24}$$

The covariance term is given by:

$$\operatorname{Cov}(X, Y) = \mu_S \mu_t \sigma_A^2. \tag{25}$$

Since $\mu_S = 2p - 1$, we substitute:

$$\mu_S^2 = (2p - 1)^2. \tag{26}$$

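Now, the full variance expression becomes:

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{1}{\left(\frac{\alpha x_{\min}}{\alpha - 1}\right)^2} \left[ \frac{\alpha x_{\min}^2}{\alpha - 2} (\sigma_t^2 + \mu_t^2) - \left( (2p - 1) \mu_t \frac{\alpha x_{\min}}{\alpha - 1} \right)^2 \right] + \left[ \frac{(2p - 1)^2 \mu_t^2}{\left(\frac{\alpha x_{\min}}{\alpha - 1}\right)^2} - \frac{2 (2p - 1)^2 \mu_t^2}{\left(\frac{\alpha x_{\min}}{\alpha - 1}\right)^2} \right] \frac{\alpha x_{\min}^2}{(\alpha - 2)(\alpha - 1)^2}.$$

Writing $\mu_A = \frac{\alpha x_{\min}}{\alpha - 1}$, the variance becomes

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{1}{\mu_A^2} \left[ \frac{\alpha x_{\min}^2}{\alpha - 2} (\sigma_t^2 + \mu_t^2) - \big( (2p - 1) \mu_t \mu_A \big)^2 \right] - \frac{(2p - 1)^2 \mu_t^2}{\mu_A^2} \cdot \frac{\alpha x_{\min}^2}{(\alpha - 2)(\alpha - 1)^2}. \tag{27}$$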

Let us simplify the first term:

$$\frac{1}{\mu_A^2} \cdot \frac{\alpha x_{\min}^2}{\alpha - 2} = \frac{\alpha x_{\min}^2}{\mu_A^2 (\alpha - 2)} = \frac{\alpha x_{\min}^2}{\frac{\alpha^2 x_{\min}^2}{(\alpha - 1)^2} (\alpha - 2)} = \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)},$$

and note that

$$\frac{1}{\mu_A^2} \big( (2p - 1) \mu_t \mu_A \big)^2 = (2p - 1)^2 \mu_t^2.$$

Thus, the first term simplifies to

$$\frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2.$$

Next, simplify the second term:

$$\frac{1}{\mu_A^2} \cdot \frac{\alpha x_{\min}^2}{(\alpha - 2)(\alpha - 1)^2} = \frac{1}{\alpha (\alpha - 2)},$$

so that the second term becomes

$$-\frac{(2p - 1)^2 \mu_t^2}{\alpha (\alpha - 2)}.$$

Combining both parts, we obtain:

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \left( 1 + \frac{1}{\alpha (\alpha - 2)} \right).$$

Since $1 + \frac{1}{\alpha (\alpha - 2)} = \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)}$, the common factor can be pulled out, which gives the variance for a single transaction:

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} \left[ (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \right].$$

Since $\hat{s}_{ct}^i$ is computed from $n$ independent transactions, by the properties of i.i.d. random variables, the variance of the estimator decreases as $1/n$. Therefore, the final corrected variance is

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{1}{n} \cdot \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} \left[ (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \right].$$
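As a quick check on the algebra, the two-term and the factored forms of the approximate variance can be evaluated side by side. The sketch below is illustrative only: the function names are hypothetical, and the defaults $\mu_t = 0.5$, $\sigma_t^2 = 1/12$ are the moments of the Uniform(0, 1) recency score assumed above.

```python
def ct_variance_factored(alpha, p, n, mu_t=0.5, var_t=1 / 12):
    """Factored form: (1/n) * (a-1)^2 / (a(a-2)) * [(var_t + mu_t^2) - (2p-1)^2 mu_t^2]."""
    if alpha <= 2:
        raise ValueError("the Pareto second moment requires alpha > 2")
    scale = (alpha - 1) ** 2 / (alpha * (alpha - 2))
    return scale * ((var_t + mu_t**2) - (2 * p - 1) ** 2 * mu_t**2) / n


def ct_variance_two_term(alpha, p, n, mu_t=0.5, var_t=1 / 12):
    """Unfactored form, before the common factor is pulled out."""
    if alpha <= 2:
        raise ValueError("the Pareto second moment requires alpha > 2")
    scale = (alpha - 1) ** 2 / (alpha * (alpha - 2))
    first = scale * (var_t + mu_t**2)
    second = (2 * p - 1) ** 2 * mu_t**2 * (1 + 1 / (alpha * (alpha - 2)))
    return (first - second) / n


# Hypothetical inputs: shape 3, 70% credits, 50 transactions.
factored = ct_variance_factored(3.0, 0.7, 50)
two_term = ct_variance_two_term(3.0, 0.7, 50)
print(f"factored {factored:.6f}, two-term {two_term:.6f}")
```

Both functions reject $\alpha \le 2$, since the approximation relies on a finite second moment of the transaction amounts.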

7.4.3 Consistency of $\hat{s}_{ct}^i$

Recall that the variance of the on-chain transaction score is given by

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{1}{n} \cdot \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} \left[ (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \right].$$

As $n \to \infty$, the factor $\frac{1}{n}$ drives the variance to zero:

$$\lim_{n \to \infty} \operatorname{Var}(\hat{s}_{ct}^i) = 0.$$

This implies that the estimator $\hat{s}_{ct}^i$ converges in probability to its expected value $E[\hat{s}_{ct}^i] = (2p - 1) \mu_t$. Hence, $\hat{s}_{ct}^i$ is a consistent estimator of the on-chain transaction score.

7.5 Expectation, Variance and Consistency of the Credit Utilization Score

7.5.1 Derivation of the Expectation of the Credit Utilization Score

Recall that the credit utilization score is defined as

$$\hat{s}_{cu}^i = \frac{N_i}{D_i} = \frac{\sum_{j=1}^{n} \left( L_{ij} - \frac{L_{ij}^2}{Y_{ij}} \right)}{\sum_{j=1}^{n} L_{ij}},$$

with

$$Y_{ij} = C_{ij} \times LTV_{ij},$$

and we assume that conditionally

$$L_{ij} \mid Y_{ij} \sim \mathrm{Uniform}(0, Y_{ij}).$$

For a single loan, the conditional moments are

$$\mathbb{E}[L_{ij} \mid Y_{ij}] = \frac{Y_{ij}}{2}, \qquad \mathbb{E}[L_{ij}^2 \mid Y_{ij}] = \frac{Y_{ij}^2}{3}.$$

Defining

$$N_{ij} = L_{ij} - \frac{L_{ij}^2}{Y_{ij}},$$

we have

$$\mathbb{E}[N_{ij} \mid Y_{ij}] = \frac{Y_{ij}}{2} - \frac{1}{Y_{ij}} \cdot \frac{Y_{ij}^2}{3} = \frac{Y_{ij}}{6}.$$

Taking the unconditional expectation (via the law of iterated expectation) gives

$$\mathbb{E}[N_{ij}] = \frac{\mathbb{E}[Y_{ij}]}{6}, \qquad \mathbb{E}[L_{ij}] = \frac{\mathbb{E}[Y_{ij}]}{2}.$$

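For a borrower with $n$ independent loans, define

$$N_i = \sum_{j=1}^{n} N_{ij} \quad \text{and} \quad D_i = \sum_{j=1}^{n} L_{ij}.$$

Then,

$$\mathbb{E}[N_i] = \frac{n\, \mathbb{E}[Y]}{6}, \qquad \mathbb{E}[D_i] = \frac{n\, \mathbb{E}[Y]}{2}.$$

Thus, the first-order (plug-in) approximation of the expected credit utilization score is

$$\frac{\mathbb{E}[N_i]}{\mathbb{E}[D_i]} = \frac{n\, \mathbb{E}[Y] / 6}{n\, \mathbb{E}[Y] / 2} = \frac{1}{3}.$$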

Although conditionally

$$\operatorname{Cov}(N_{ij}, L_{ij} \mid Y_{ij}) = 0,$$

the law of total covariance yields

$$\operatorname{Cov}(N_{ij}, L_{ij}) = \operatorname{Cov}\left( \frac{Y_{ij}}{6}, \frac{Y_{ij}}{2} \right) = \frac{1}{12} \operatorname{Var}(Y_{ij}).$$

Assuming all $Y_{ij}$ share the same variance $\operatorname{Var}(Y)$, we have for $n$ loans:

$$\operatorname{Cov}(N_i, D_i) = \frac{n \operatorname{Var}(Y)}{12}.$$

Also, for a single loan,

$$\operatorname{Var}(L_{ij}) = \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4},$$

so that

$$\operatorname{Var}(D_i) = n \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right).$$

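For a function $g(N_i, D_i) = N_i / D_i$, a second-order Taylor expansion about

$$(\mu_X, \mu_Y) = (\mathbb{E}[N_i], \mathbb{E}[D_i])$$

gives

$$\mathbb{E}\left[ \frac{N_i}{D_i} \right] \approx \frac{\mathbb{E}[N_i]}{\mathbb{E}[D_i]} - \frac{\operatorname{Cov}(N_i, D_i)}{\mathbb{E}[D_i]^2} + \frac{\mathbb{E}[N_i] \operatorname{Var}(D_i)}{\mathbb{E}[D_i]^3}.$$

Substitute the expressions obtained above:

$$\frac{\mathbb{E}[N_i]}{\mathbb{E}[D_i]} = \frac{1}{3}.$$

$$-\frac{\operatorname{Cov}(N_i, D_i)}{\mathbb{E}[D_i]^2} = -\frac{\frac{n \operatorname{Var}(Y)}{12}}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^2} = -\frac{\operatorname{Var}(Y)}{12} \cdot \frac{4}{n\, \mathbb{E}[Y]^2} = -\frac{\operatorname{Var}(Y)}{3 n\, \mathbb{E}[Y]^2}.$$

$$\frac{\mathbb{E}[N_i] \operatorname{Var}(D_i)}{\mathbb{E}[D_i]^3} = \frac{\left( \frac{n\, \mathbb{E}[Y]}{6} \right) n \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right)}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^3}.$$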

Simplify as follows:

$$\text{Numerator} = \frac{n^2\, \mathbb{E}[Y]}{6} \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right),$$

$$\text{Denominator} = \frac{n^3\, \mathbb{E}[Y]^3}{8},$$

so that

$$\frac{\mathbb{E}[N_i] \operatorname{Var}(D_i)}{\mathbb{E}[D_i]^3} = \frac{8}{6 n\, \mathbb{E}[Y]^2} \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right) = \frac{4}{3 n\, \mathbb{E}[Y]^2} \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right).$$

Combining the three terms, we obtain

$$\mathbb{E}[\hat{s}_{cu}^i] \approx \frac{1}{3} - \frac{\operatorname{Var}(Y)}{3 n\, \mathbb{E}[Y]^2} + \frac{4}{3 n\, \mathbb{E}[Y]^2} \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right).$$

Noting that

$$\operatorname{Var}(Y) = \mathbb{E}[Y^2] - \mathbb{E}[Y]^2,$$

and combining the correction terms over a common denominator, one obtains

$$\mathbb{E}[\hat{s}_{cu}^i] \approx \frac{1}{3} + \frac{\mathbb{E}[Y^2]}{9 n\, \mathbb{E}[Y]^2}.$$

This completes the derivation of the expectation of the credit utilization score using the delta method.
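The approximation $\mathbb{E}[\hat{s}_{cu}^i] \approx \tfrac{1}{3} + \tfrac{\mathbb{E}[Y^2]}{9 n \mathbb{E}[Y]^2}$ can be compared against a direct simulation. The sketch below is not from the paper: it assumes, purely for illustration, that the borrowing limit $Y$ is Uniform(1, 3), so that $\mathbb{E}[Y] = 2$ and $\mathbb{E}[Y^2] = 13/3$, and it keeps the conditional model $L \mid Y \sim \mathrm{Uniform}(0, Y)$.

```python
import random
import statistics


def credit_utilization_score(loans, limits):
    """Ratio N_i / D_i with N_ij = L_ij - L_ij^2 / Y_ij and D_i the sum of loans."""
    numerator = sum(l - l * l / y for l, y in zip(loans, limits))
    return numerator / sum(loans)


def simulate_mean_score(n_loans, y_low, y_high, n_wallets=4_000, seed=4):
    """Average score over simulated wallets; the limit distribution is an assumption."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_wallets):
        limits = [rng.uniform(y_low, y_high) for _ in range(n_loans)]
        loans = [rng.uniform(0.0, y) for y in limits]  # L | Y ~ Uniform(0, Y)
        scores.append(credit_utilization_score(loans, limits))
    return statistics.fmean(scores)


def approx_expectation(n_loans, mean_y, mean_y2):
    """Second-order approximation 1/3 + E[Y^2] / (9 n E[Y]^2)."""
    return 1 / 3 + mean_y2 / (9 * n_loans * mean_y**2)


simulated = simulate_mean_score(n_loans=50, y_low=1.0, y_high=3.0)
approx = approx_expectation(50, mean_y=2.0, mean_y2=13 / 3)
print(f"simulated {simulated:.4f}, approximation {approx:.4f}")
```

The expansion drops terms of order $1/n^2$, so agreement should be expected only for a reasonably large number of loans; for very small $n$ the approximation can be visibly off.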

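7.5.2 Derivation of the Variance of the Credit Utilization subscore

Using the delta method for the function

$$g(N_i, D_i) = \frac{N_i}{D_i},$$

its first-order Taylor expansion yields the approximate variance

\begin{align}
\operatorname{Var}\left( \frac{N_i}{D_i} \right) \approx{} & \left( \frac{\partial g}{\partial N_i} \bigg|_{(\mu_X, \mu_Y)} \right)^2 \operatorname{Var}(N_i) + \left( \frac{\partial g}{\partial D_i} \bigg|_{(\mu_X, \mu_Y)} \right)^2 \operatorname{Var}(D_i) \tag{28} \\
& + 2\, \frac{\partial g}{\partial N_i} \bigg|_{(\mu_X, \mu_Y)} \frac{\partial g}{\partial D_i} \bigg|_{(\mu_X, \mu_Y)} \operatorname{Cov}(N_i, D_i) \tag{29}
\end{align}

where

$$\frac{\partial g}{\partial N_i} = \frac{1}{D_i}, \qquad \frac{\partial g}{\partial D_i} = -\frac{N_i}{D_i^2}.$$

Evaluated at $(\mu_X, \mu_Y)$, this becomes

$$\operatorname{Var}(\hat{s}_{cu}^i) \approx \frac{\operatorname{Var}(N_i)}{\mu_Y^2} + \frac{\mu_X^2 \operatorname{Var}(D_i)}{\mu_Y^4} - \frac{2 \mu_X \operatorname{Cov}(N_i, D_i)}{\mu_Y^3}.$$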

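For a single loan, recall the following conditional calculations given $L \mid Y \sim \mathrm{Uniform}(0, Y)$:

$$\mathbb{E}[L \mid Y] = \frac{Y}{2}, \quad \mathbb{E}[L^2 \mid Y] = \frac{Y^2}{3}, \quad \mathbb{E}[L^3 \mid Y] = \frac{Y^3}{4}, \quad \mathbb{E}[L^4 \mid Y] = \frac{Y^4}{5},$$

$$\mathbb{E}[N \mid Y] = \frac{Y}{6}, \quad \text{with } N = L - \frac{L^2}{Y}.$$

A direct calculation shows:

$$\mathbb{E}[N^2 \mid Y] = \frac{Y^2}{30},$$

so that

$$\operatorname{Var}(N \mid Y) = \frac{Y^2}{30} - \left( \frac{Y}{6} \right)^2 = \frac{Y^2}{180}.$$

Unconditionally, using the law of total variance,

$$\operatorname{Var}(N) = \mathbb{E}[\operatorname{Var}(N \mid Y)] + \operatorname{Var}(\mathbb{E}[N \mid Y]) = \frac{\mathbb{E}[Y^2]}{180} + \frac{\operatorname{Var}(Y)}{36}.$$

Similarly, for $L$ we have

$$\operatorname{Var}(L) = \frac{\mathbb{E}[Y^2]}{3} - \left( \frac{\mathbb{E}[Y]}{2} \right)^2.$$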

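For a borrower with $n$ independent loans:

$$\mu_X = \mathbb{E}[N_i] = \frac{n\, \mathbb{E}[Y]}{6}, \qquad \mu_Y = \mathbb{E}[D_i] = \frac{n\, \mathbb{E}[Y]}{2},$$

$$\operatorname{Var}(D_i) = n \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right),$$

$$\operatorname{Var}(N_i) = n \left( \frac{\mathbb{E}[Y^2]}{180} + \frac{\operatorname{Var}(Y)}{36} \right),$$

$$\operatorname{Cov}(N_i, D_i) = \frac{n \operatorname{Var}(Y)}{12}.$$

Thus, the delta method approximation for the variance of the credit utilization subscore becomes

$$\operatorname{Var}(\hat{s}_{cu}^i) \approx \frac{n \left( \frac{\mathbb{E}[Y^2]}{180} + \frac{\operatorname{Var}(Y)}{36} \right)}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^2} + \frac{\left( \frac{n\, \mathbb{E}[Y]}{6} \right)^2 n \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right)}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^4} - \frac{2 \left( \frac{n\, \mathbb{E}[Y]}{6} \right) \frac{n \operatorname{Var}(Y)}{12}}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^3}.$$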

This expression can be further simplified if desired.
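One possible simplification, worked out here rather than stated in the source, is that the three delta-method terms collapse to $\frac{8\, \mathbb{E}[Y^2]}{135\, n\, \mathbb{E}[Y]^2}$ once $\operatorname{Var}(Y) = \mathbb{E}[Y^2] - \mathbb{E}[Y]^2$ is substituted. The sketch below checks that identity with exact rational arithmetic; the function names and the sample moments are illustrative assumptions.

```python
from fractions import Fraction


def delta_variance(n, mean_y, mean_y2):
    """Three-term delta-method variance of N_i / D_i, evaluated term by term."""
    var_y = mean_y2 - mean_y**2
    mu_n = n * mean_y / 6                      # E[N_i]
    mu_d = n * mean_y / 2                      # E[D_i]
    var_n = n * (mean_y2 / 180 + var_y / 36)   # Var(N_i)
    var_d = n * (mean_y2 / 3 - mean_y**2 / 4)  # Var(D_i)
    cov_nd = n * var_y / 12                    # Cov(N_i, D_i)
    return (
        var_n / mu_d**2
        + mu_n**2 * var_d / mu_d**4
        - 2 * mu_n * cov_nd / mu_d**3
    )


def simplified_variance(n, mean_y, mean_y2):
    """Candidate closed form 8 E[Y^2] / (135 n E[Y]^2)."""
    return 8 * mean_y2 / (135 * n * mean_y**2)


# Example moments: Y ~ Uniform(1, 3) gives E[Y] = 2 and E[Y^2] = 13/3 (an assumption).
n, mean_y, mean_y2 = Fraction(50), Fraction(2), Fraction(13, 3)
full = delta_variance(n, mean_y, mean_y2)
short = simplified_variance(n, mean_y, mean_y2)
print(full, short, full == short)
```

Using `Fraction` avoids floating-point noise, so equality of the two expressions can be tested exactly for any chosen moments.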

In summary, we have derived the following approximate formulas using the delta method:

$$\mathbb{E}[\hat{s}_{cu}^i] \approx \frac{\mathbb{E}[N_i]}{\mathbb{E}[D_i]} - \frac{\operatorname{Cov}(N_i, D_i)}{\mathbb{E}[D_i]^2} + \frac{\mathbb{E}[N_i] \operatorname{Var}(D_i)}{\mathbb{E}[D_i]^3},$$

and

$$\operatorname{Var}(\hat{s}_{cu}^i) \approx \frac{\operatorname{Var}(N_i)}{\mathbb{E}[D_i]^2} + \frac{\mathbb{E}[N_i]^2 \operatorname{Var}(D_i)}{\mathbb{E}[D_i]^4} - \frac{2\, \mathbb{E}[N_i] \operatorname{Cov}(N_i, D_i)}{\mathbb{E}[D_i]^3}.$$

These derivations assume that the loans are i.i.d. and that the conditional distribution $L_{ij} \mid Y_{ij}$ is uniform on $[0, Y_{ij}]$.

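7.5.3 Consistency of credit utilization subscore

To show that the estimator $\hat{s}_{cu}^i$ is consistent, we need to verify that its variance vanishes as $n \to \infty$ and that it converges in probability to the true parameter.

From the variance expression:

$$\operatorname{Var}(\hat{s}_{cu}^i) \approx \frac{n \left( \frac{\mathbb{E}[Y^2]}{180} + \frac{\operatorname{Var}(Y)}{36} \right)}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^2} + \frac{\left( \frac{n\, \mathbb{E}[Y]}{6} \right)^2 n \left( \frac{\mathbb{E}[Y^2]}{3} - \frac{\mathbb{E}[Y]^2}{4} \right)}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^4} - \frac{2 \left( \frac{n\, \mathbb{E}[Y]}{6} \right) \frac{n \operatorname{Var}(Y)}{12}}{\left( \frac{n\, \mathbb{E}[Y]}{2} \right)^3},$$

we analyze the asymptotic behavior as $n \to \infty$. After cancelling powers of $n$, each term in the variance expression is of order $\frac{1}{n}$. Specifically, the leading term behaves as:

$$\operatorname{Var}(\hat{s}_{cu}^i) = \mathcal{O}\left( \frac{1}{n} \right).$$

The bias term $\frac{\mathbb{E}[Y^2]}{9 n\, \mathbb{E}[Y]^2}$ in the expectation also vanishes as $n \to \infty$, so the estimator is asymptotically unbiased and its variance vanishes in the limit. By Chebyshev’s inequality (the asymptotically negligible bias does not affect the limit),

$$P\left( |\hat{s}_{cu}^i - s_{cu}^i| \ge \epsilon \right) \le \frac{\operatorname{Var}(\hat{s}_{cu}^i)}{\epsilon^2} \to 0 \quad \text{as } n \to \infty.$$

This implies $\hat{s}_{cu}^i \xrightarrow{p} s_{cu}^i$, meaning that $\hat{s}_{cu}^i$ is a consistent estimator of $s_{cu}^i$.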

8 Asymptotic Normality of OCCR Score

Our objective is to prove that the OCCR score is asymptotically normal, i.e.,

$$\sqrt{N} \left( \text{OCCR Score} - \mu \right) \xrightarrow{d} \mathcal{N}(0, \sigma^2), \tag{30}$$

where $\mu$ is the expected value and $\sigma^2$ is the variance of the OCCR score.

We analyze the asymptotic properties of each component individually.

8.1 Historical Credit Risk subscore ($\hat{s}_h^i$)

The historical credit risk subscore is defined as

$$\hat{s}_h^i = \frac{\sum_{j=1}^{n} w_{i,j} X_{i,j}}{\sum_{j=1}^{n} w_{i,j}}. \tag{31}$$

Under the assumption of i.i.d. loans, by the Central Limit Theorem (CLT) we have:

$$\sqrt{n} \left( \hat{s}_h^i - s_h^i \right) \xrightarrow{d} \mathcal{N}\left( 0,\ \frac{s_h^i (1 - s_h^i)\, \mu_{w^2}}{\mu_w^2} \right),$$

where $\mu_w = \mathbb{E}[w_{i,j}]$ and $\mu_{w^2} = \mathbb{E}[w_{i,j}^2]$.

The expectation and variance can be expressed as:

$$\mathbb{E}[\hat{s}_h^i] \approx s_h^i \left( 1 + \frac{\mu_{w^2} - \mu_w^2}{n \mu_w^2} \right).$$

$$\operatorname{Var}(\hat{s}_h^i) \approx \frac{16\, s_h^i (1 - s_h^i)\, \mathbb{E}[L_{i,j}^2]}{9 n \left( \mathbb{E}[L_{i,j}] \right)^2}.$$

8.2 Current Credit Risk subscore ($\hat{s}_c^i$)

The current credit risk subscore is defined as

$$\hat{s}_c^i = \frac{1}{k} \sum_{j=1}^{k} Z_{i,j}, \quad \text{with } Z_{i,j} \sim \mathrm{Bernoulli}(s_c^i). \tag{32}$$

Then, by the CLT:

$$\mathbb{E}[\hat{s}_c^i] = s_c^i,$$

$$\operatorname{Var}(\hat{s}_c^i) = \frac{s_c^i (1 - s_c^i)}{k}.$$

8.3 New Credit subscore ($\hat{s}_{nc}^i$)

The new credit subscore is given by

$$\hat{s}_{nc}^i = \frac{1}{n} \sum_{j=1}^{n} Y_{i,j}, \quad \text{with } Y_{i,j} \sim \mathrm{Bernoulli}(s_{nc}^i). \tag{33}$$

Thus, we have

$$E(\hat{s}_{nc}^i) = s_{nc}^i,$$

and since $\operatorname{Var}(Y_{i,j}) = s_{nc}^i (1 - s_{nc}^i)$, it follows that

$$\operatorname{Var}(\hat{s}_{nc}^i) = \frac{s_{nc}^i (1 - s_{nc}^i)}{n}.$$

8.4 On-Chain Transaction subscore ($\hat{s}_{ct}^i$)

The on-chain transaction subscore is defined as

$$\hat{s}_{ct}^i = \frac{\sum_{l=1}^{n} T_{i,l}\, t_{i,l}}{\sum_{l=1}^{n} |T_{i,l}|}, \tag{34}$$

where $T_{i,l} \sim \mathrm{Pareto}(\alpha, x_{\min})$. Assuming appropriate moment conditions and applying the CLT for weighted sums, we get an expectation of

$$E[\hat{s}_{ct}^i] = \mu_S \mu_t,$$

and the variance is given by

$$\operatorname{Var}(\hat{s}_{ct}^i) \approx \frac{1}{n} \cdot \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} \left[ (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \right].$$

8.5 Credit Utilization subscore ($\hat{s}_{cu}^i$)

The credit utilization subscore is defined as

$$\hat{s}_{cu}^i = \frac{\sum_{j=1}^{n} \left( 1 - \frac{L_{i,j}}{C_{i,j} \cdot LTV_{i,j}} \right) L_{i,j}}{\sum_{j=1}^{n} L_{i,j}}. \tag{35}$$

The expectation is approximated by

$$\mathbb{E}[\hat{s}_{cu}^i] \approx \frac{1}{3} + \frac{\mathbb{E}[Y^2]}{9 n\, \mathbb{E}[Y]^2},$$

and the variance by the delta-method expression derived in Section 7.5.2.

The overall OCCR Score is constructed as a weighted sum of the independent component subscores:

$$\text{OCCR Score} = \sum_{k=1}^{5} w_k \hat{s}_k, \tag{36}$$

with weights $w_k$ and component subscores $\hat{s}_k$. By the continuous mapping theorem, since each $\hat{s}_k$ is asymptotically normal, the OCCR Score is also asymptotically normal:

$$\text{OCCR Score} \sim \mathcal{N}\left( \sum_{k=1}^{5} w_k \mu_k,\ \sum_{k=1}^{5} w_k^2 \sigma_k^2 \right), \tag{37}$$

where for each component $k$ we define:

$$\mu_k = \mathbb{E}[\hat{s}_k], \tag{38}$$

$$\sigma_k^2 = \operatorname{Var}(\hat{s}_k). \tag{39}$$

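In our case, the component subscores and weights are:

$$\begin{aligned}
\hat{s}_1 &= \hat{s}_h^i, & w_1 &= 0.35, \\
\hat{s}_2 &= \hat{s}_c^i, & w_2 &= 0.25, \\
\hat{s}_3 &= 1 - \hat{s}_{cu}^i, & w_3 &= 0.15, \\
\hat{s}_4 &= \hat{s}_{ct}^i, & w_4 &= -0.15, \\
\hat{s}_5 &= \hat{s}_{nc}^i, & w_5 &= 0.10.
\end{aligned}$$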

Thus, the overall expectation and variance of the OCCR Score are given by

$$\mathbb{E}[\text{OCCR Score}] = \sum_{k=1}^{5} w_k \mu_k \quad \text{and} \quad \operatorname{Var}(\text{OCCR Score}) = \sum_{k=1}^{5} w_k^2 \sigma_k^2. \tag{40}$$
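The aggregation of component moments into the OCCR Score moments can be sketched in a few lines. The weights below are the ones listed above; the component means and variances are placeholder inputs chosen for illustration, and the interval helper simply applies the normal approximation under the independence assumption.

```python
import math

# Component weights as listed in the text; the dictionary keys are illustrative labels.
WEIGHTS = {
    "historical": 0.35,
    "current": 0.25,
    "utilization_complement": 0.15,  # applies to 1 - s_cu
    "transaction": -0.15,
    "new_credit": 0.10,
}


def occr_moments(means, variances, weights=WEIGHTS):
    """Mean and variance of the weighted sum, assuming independent subscores."""
    mean = sum(weights[k] * means[k] for k in weights)
    variance = sum(weights[k] ** 2 * variances[k] for k in weights)
    return mean, variance


def normal_interval(mean, variance, z=1.96):
    """Approximate central interval under the asymptotic normal approximation."""
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width


# Placeholder component moments, not estimates from real wallet data.
means = {"historical": 0.2, "current": 0.15, "utilization_complement": 2 / 3,
         "transaction": 0.2, "new_credit": 0.1}
variances = {name: 0.01 for name in WEIGHTS}

mu_occr, var_occr = occr_moments(means, variances)
low, high = normal_interval(mu_occr, var_occr)
print(f"mean {mu_occr:.4f}, variance {var_occr:.4f}, interval ({low:.4f}, {high:.4f})")
```

Because the transaction weight is negative, a higher transaction subscore lowers the aggregate mean, while its variance still enters with the squared weight.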

Expectation of the OCCR Score

Since the expectation operator is linear, we have:

$$\begin{aligned}
\mathbb{E}[\text{OCCR Score}] ={} & 0.35\, \mathbb{E}[\hat{s}_h^i] + 0.25\, \mathbb{E}[\hat{s}_c^i] + 0.15\, \mathbb{E}[1 - \hat{s}_{cu}^i] \\
& - 0.15\, \mathbb{E}[\hat{s}_{ct}^i] + 0.10\, \mathbb{E}[\hat{s}_{nc}^i] \\
\approx{} & 0.35\, s_h^i \left( 1 + \frac{\mu_{w^2} - \mu_w^2}{n \mu_w^2} \right) + 0.25\, s_c^i \\
& + 0.15 \left( 1 - \frac{1}{3} - \frac{\mathbb{E}[Y^2]}{9 n\, \mathbb{E}[Y]^2} \right) - 0.15\, (\mu_S \cdot \mu_t) + 0.10\, s_{nc}^i.
\end{aligned} \tag{41}$$

Here, the overall OCCR expectation, $\mu_{\text{OCCR}}$, is given by:

$$\mu_{\text{OCCR}} = \sum_{k=1}^{5} w_k \mu_k. \tag{42}$$

Variance of the OCCR Score

Assuming that the subscores are estimated independently, the variance of the OCCR Score is the sum of the variances of the weighted components:

$$\begin{aligned}
\operatorname{Var}(\text{OCCR Score}) ={} & (0.35)^2 \operatorname{Var}(\hat{s}_h^i) + (0.25)^2 \operatorname{Var}(\hat{s}_c^i) + (0.15)^2 \operatorname{Var}(1 - \hat{s}_{cu}^i) \\
& + (0.15)^2 \operatorname{Var}(\hat{s}_{ct}^i) + (0.10)^2 \operatorname{Var}(\hat{s}_{nc}^i) \\
\approx{} & (0.35)^2 \cdot \frac{16\, s_h^i (1 - s_h^i)\, \mathbb{E}[L_{i,j}^2]}{9 n \left( \mathbb{E}[L_{i,j}] \right)^2} + (0.25)^2 \cdot \frac{s_c^i (1 - s_c^i)}{k} + (0.15)^2 \cdot \operatorname{Var}(\hat{s}_{cu}^i) \\
& + (0.15)^2 \cdot \frac{1}{n} \cdot \frac{(\alpha - 1)^2}{\alpha (\alpha - 2)} \left[ (\sigma_t^2 + \mu_t^2) - (2p - 1)^2 \mu_t^2 \right] + (0.10)^2 \cdot \frac{s_{nc}^i (1 - s_{nc}^i)}{n}.
\end{aligned} \tag{43}$$

Thus, the overall variance of the OCCR Score, $\sigma_{\text{OCCR}}^2$, is given by:

$$\sigma_{\text{OCCR}}^2 = \sum_{k=1}^{5} w_k^2 \sigma_k^2. \tag{44}$$

9 Consistency of the OCCR Score

Each individual estimator $\hat{s}_h^i$, $\hat{s}_c^i$, $\hat{s}_{cu}^i$, $\hat{s}_{ct}^i$, and $\hat{s}_{nc}^i$ is assumed to be consistent for its corresponding true parameter as $n \to \infty$ (or $k \to \infty$ where applicable). Therefore, the OCCR Score, being a weighted linear combination of these consistent estimators, is itself a consistent estimator for the weighted combination of the true parameters:

$$0.35\, s_h^i + 0.25\, s_c^i + 0.15 \left( 1 - s_{cu}^i \right) - 0.15\, (\mu_S \mu_t) + 0.10\, s_{nc}^i, \tag{45}$$

where, under the uniform model of Section 7.5, $s_{cu}^i = \tfrac{1}{3}$. This is the limit of the expectation of the OCCR Score as the $\mathcal{O}(1/n)$ correction terms vanish.

Thus, as the sample size increases, the OCCR Score converges in probability to the true weighted combination of the subscores.

References

Sahu, K. and Kumar, R. (2024). “A secure decentralised finance framework.” Computer Fraud & Security, vol. 2024, no. 3.

Moghe, M. S. and Johri, S. (n.d.). “The Role of Credit Scoring in Modern Banking–An Overview of Methodology & Implementation.”

Voinov, V. G. and Nikulin, M. S. (2012). Unbiased estimators and their applications: volume 1: univariate case. Springer Science & Business Media, vol. 263.

CreDA (2022). “What Is a Crypto Credit Score?” Medium, Apr. 4. [Online]. Available: https://creda-app.medium.com/what-is-a-crypto-credit-score-a34228685a5f.

Arnold, B. C. (2014). Pareto distribution. Wiley StatsRef: Statistics Reference Online, pp. 1–10. Wiley Online Library.

CRED Protocol (n.d.). “CRED Protocol.” [Online]. Available: https://www.credprotocol.com/.

Fisher, R. A. (1925). “Theory of statistical estimation.” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 22, no. 5, pp. 700–725. Cambridge University Press.

Doerr, J. F., Kosse, A., Khan, A., Lewrick, U., Mojon, B., Nolens, B., and Rice, T. (2021). “DeFi risks and the decentralisation illusion.” BIS Quarterly Review, vol. 21.

Wolf, W., Henry, A., Fadel, H. A., Quintuna, X., and Gay, J. (2022). “Scoring Aave accounts for creditworthiness.” arXiv preprint arXiv:2207.07008.

Packin, N. G. and Lev-Aretz, Y. (2024). “Decentralized credit scoring: Black box 3.0.” American Business Law Journal, vol. 61, no. 2, pp. 91–111.

Perez, D., Werner, S. M., Xu, J., and Livshits, B. (2021). “Liquidations: DeFi on a Knife-edge.” In Financial Cryptography and Data Security: 25th International Conference, FC 2021, Virtual Event, March 1–5, 2021, Revised Selected Papers, Part II, Springer, pp. 457–476.

Block Analitica (n.d.). “Introducing Project Levon.” Medium. [Online]. Available: https://medium.com/block-analitica/introducing-project-levon-e1444bd888d1.

