Count Scoring de la Fer or a study on credit scoring as part of broadening one's horizons. Part 3

Part three, in which Athos precipitated, and the Count de la Fer makes wise with algorithms.







UPD Part One is here

UPD part two here







AntipovSN and MihhaCF







Introduction from the authors:







Good afternoon! Today we continue the series of articles devoted to scoring and the use of graph theory in it. You can get acquainted with the first and second articles here and here, respectively. We strongly recommend, otherwise, this article may seem like a meaningless experiment with algorithms.







All comic allegories, inserts, etc., are designed to slightly relieve the narrative and not allow it to fall into a boring lecture. We apologize to everyone who doesn’t get into our humor







The purpose of this article: in no more than 30 minutes, to describe the graph construction algorithm and calculate the scoring score for NPO “One for All”.







Terms and Definitions:









In our first article ( here ), we determined the graph model with which we will experiment, and gave it a brief description.







In our second article ( here ) we described the basic data structures that can be used to store the graph and described in more detail the graph model and how to work with it.







It's time to pull the sword out of its scabbard and start stabbing everyone, more precisely, create an algorithm for calculating the scoring score for NPO “One for All”.







Let's go!







The algorithm will be executed in Python . As the saying goes, pull the snake on the sword!







The graph will be stored in the dictionary, as follows:







graph = { 'Odin_za_vseh' : {'goody': 10, 'rank': 4, 'fund':4, 'relations':{'Bonasye':3, 'Trevi': 1, 'Gusak':2,'Suchka':5, 'Roshfor':1, 'Cardi':1}}, 'Cardi' : {'goody': -8, 'rank': 9, 'fund':8, 'relations':{'Korol':2,'Odin_za_vseh':3,'Gvardia':5, 'Roshfor':5 }}, 'Roshfor' : {'goody': -7, 'rank': 7, 'fund':5,'relations':{'Cardi':1, 'Suchka': 3,'Odin_za_vseh':2, 'Bonasye':5 }}, 'Suchka' : {'goody': -10, 'rank': 4, 'fund':4, 'relations':{'Odin_za_vseh':5,'Roshfor':3 }}, 'Gvardia' : {'goody': -5, 'rank': 3, 'fund':4, 'relations':{'Cardi':1,'Gusak':1 }}, 'Gusak' : {'goody': -7, 'rank': 4, 'fund':4, 'relations':{'Gvardia':5,'Odin_za_vseh':2 }}, 'Trevi' : {'goody': 7, 'rank': 5, 'fund':5, 'relations':{'Odin_za_vseh':4,'Mushketery':5 }}, 'Bonasye': {'goody': -2, 'rank': 2, 'fund':3, 'relations':{'Odin_za_vseh':1,'Roshfor':1,'Konsta':2 }}, 'Konsta' : {'goody': 10, 'rank': 5, 'fund':3, 'relations':{'Koroleva':2,'Bonasye':1 }}, 'Mushketery' : {'goody': 7, 'rank': 5, 'fund':4, 'relations':{'Korol':1, 'Trevi':1}}, 'Koroleva' : {'goody': 6, 'rank': 8, 'fund':7, 'relations':{'Korol':2,'Konsta':5, 'EnglishMan':3 }}, 'Korol' : {'goody': 5, 'rank': 10, 'fund':10, 'relations':{'Cardi':3, 'Koroleva':5,'Mushketery':5 }}, 'EnglishMan' : {'goody': 5, 'rank': 8, 'fund':10, 'relations':{'Koroleva':2}} }
      
      





'goody' - node rating - good or bad and the degree of "priority" of the node. For example, the Cardinal is a negative hero of the film, so his rating is negative. The Cardinal is one of the key enemies of our heroes, but not the most important (we decided so)

'rank' - score of the node in the ranking table

'fund' - node viability

'relations' - with whom is connected and how much can influence the node connected with it







So far, everything is simple, but an attentive reader could already notice some of the features, yes, content questions arise already. Let's try to predict and answer them. The right of the first injection is ours!







  1. I think it’s not worth explaining what we used for the name of the key of the first level dictionary. You already perfectly guessed, although some names have undergone changes due to our perception of the work.







  2. Where did these parameters come from ('goody', 'rank', etc.), why are they and where did their rating come from?

    In order:







    • In real life, these parameters will characterize a particular node-company. For everyone who will conduct the assessment and depending on the type of this assessment, these parameters will be different. For example, such parameters for evaluating the scoring of a borrower may be - turnover, amount of payables / receivables, writ of execution, etc.
    • Why exactly they - the author sees it this way))) these parameters reflect the essence of what the Count describes, in our opinion
    • Assessment is, as is now customary to say, expert. In real life, this estimate will be obtained through mining. We will write more about this in the next article.


  3. Why do some nodes have different reciprocal links (for example, Odin_za_vseh - Bonasye = 3, and Bonasye - Odin_za_vseh = 1)? This is because Odin_za_vseh has a much greater effect on Bonasye than vice versa. And this is important to understand. When scoring Odin_za_vseh, we will need to consider the effect of Bonasye on Odin_za_vseh.







  4. What is a bond score and how is it calculated?







    • Communication assessment is a measure of the influence of one node on another
    • Each connection in our Graph is two-way, but at the same time, the assessment of communication in different directions may vary
    • Communication score is a value from 1 to 5 (from 1 to 5 just for example). Negative connection can not be, because a node is either connected to another, and we evaluate how much it affects this node, or is not connected. In this case, of course, the connection with an unreliable node will ultimately be negative for the node we are evaluating, because this unreliable node will have a negative internal rating.
    • An internal evaluation of a node is an aggregated value that consists of the internal parameters of a node. In our example, this is 'goody', 'rank', 'fund'. Internal assessment may be negative.


  5. How will a scoring ball be considered?







    • Each company that will use this approach can lay down its scoring score calculation based on its requirements and its business experience.
    • In our case, we chose a simple and intricate way:

      • Scoring score = Internal score of the node we are evaluating + Sum (Internal assessment of a level 1 child node evaluating the impact of this node on the node being evaluated) + (Sum (Internal rating of a level 2 child node assessing the effect of this node on a level 1 parent node) / 2)
      • If you get a negative scoring score, then we do not give a loan
    • The algorithm presented below is basic and can be easily modified to fit any scoring score calculation technique.
    • All the difficulty will be in the “right” mining, but it will be described in the next article




And here is the algorithm itself:







 searched=[] #   def search_outer(name,n): return graph[name]['goody']*graph[name]['rank']*graph[name]['fund']+search(name,n) # search_outer()    ,       #,       search() #name –   ,     #n –   –       def search(name, n, z=1): ball=0 #,      global searched #    if z <= n and name not in searched: #        searched.append(name) for x in graph[name]['relations']: #   ball += graph[x]['goody'] * graph[x]['rank'] * graph[x]['fund'] * graph[x]['relations'][name] / z + search(x,n,z+1) #      #   ,     return ball #, ,     def main(): ball = search_outer ('Odin_za_vseh', 2) print(ball) if __name__ == '__main__': main()
      
      





To summarize:









image







Thanks to sobolevn and his article







You can go to the most interesting and difficult - data mining!







PS Regarding links to other articles with theory (in the last article we made a remark):










All Articles