一种新的银行客户分类的大数据方法——结合猪、R和Hadoop的解决方案(IJITCS-V8-N9-10)

I.J. Information Technology and Computer Science, 2016, 9, 81-90

Published Online September 2016 in MECS (https://www.360docs.net/doc/a02022841.html,/) DOI: 10.5815/ijitcs.2016.09.10

A Novel Big Data Approach to Classify Bank Customers - Solution by Combining PIG, R and

Hadoop

Lija Mohan

Division of Computer Science, Cochin University of Science & Technology, Kochi, Kerala, India.

E-mail: lija@cusat.ac.in

Sudheep Elayidom M.

Division of Computer Science, Cochin University of Science & Technology, Kochi, Kerala, India.

E-mail: sudheepelayidom@cusat.ac.in

Abstract —Large amount of data that is characterized by

its volume, velocity, veracity, value and variety is termed Big Data. Extracting hidden patterns, customer preferences, market trends, unknown correlations, or any other useful business information from large collection of structured or unstructured data set is called Big Data analysis. This article explores the scope of analyzing bank transaction data to categorize customers which could help the bank in efficient marketing, improved customer service, better operational efficiency, increased profit and many other hidden benefits. Instead of relying on a single technology to process large scale data, we make use of a combination of strategies like Hadoop, PIG, R etc for efficient analysis. RHadoop is an upcoming research trend for Big Data analysis, as R is a very efficient and easy to code, data analysis and visualization tool compared to traditional MapReduce program. K-Means is chosen as the clustering algorithm for classification. Index Terms —BigData Analysis, Bank customer

classification, Hadoop, PIG, R. I. I NTRODUCTION Emergence of Internet and World Wide Web (WWW)

caused the flooding of large amount of data. This data

characterized by its volume, velocity, veracity, value and

variety is termed BigData [41]. Big Data contains terra

bytes or peta bytes of data and it can be structured or

unstructured. Big Data Analysis is very much crucial as it

leads to more accurate analysis. Accurate analysis is the

back bone of accurate decision making. Accurate

decision making leads to operational efficiency and cost

and risk reduction in industries.

Traditional client server architecture seems inefficient to tackle big data. Hadoop comes to rescue here. Hadoop

is a distributed processing framework from Apache which

can store and process data in a parallel processing manner.

A. Motivation

This article identifies an application of BigData analytics to classify bank customers. Every bank will maintain a secret, customer profile to cross check the risk perception of customer. This customer classification is necessary for many reasons. ? To identify if the customer furnishes wrong information about himself. ? If the customer does not re-pay the loans on time. ? To restrict loans and credit card issue to high risky customers ? To correctly predict apt customers for their products and policies, thus leading to excellent target based marketing. ? Productive marketing thus leads to the profit of the

banks etc. Classification done based on the current salary, assets he own etc may not be an actual reflection of the

customer behavior. People who possess valuable assets may show reluctance to pay back the loans on time.

Hence this article tries to classify the customers based on

their previous transactions. Here we include information

such as how many loans he already possesses, how many have re-payed etc for better classification. The challenge here is to mine valuable information from transactions which ranges from terra bytes or peta bytes of information. In this article, we consider bank transactions as Big Data and provide a Big Data approach to solve the problem of customer classification. As per our knowledge, it is the first time in literature to solve the problem of customer classification using a Big Data approach. We combine PIG and R on Hadoop platform to do the

classification. Advantages of solving the problem in a Big Data environment:

?All the hardware related tasks will be handled by the Hadoop framework itself. Eg: Checking

whether all nodes are up, transferring a job in case

of failure, selecting a node with less overhead etc. ?Programmer can concentrate more on programming.

?Task can be completed in minimum time.

?Scalability is ensured

?Accuracy of classification is increased as training data is large

?Efficient system development

?PIG, R etc can be programmed by a person who do not even know map reduce programming

principles

II.B ACKGROUND AND L ITERATURE R EVIEW There has been several classification models developed both in literature and academia to categorize banking customers based on their transactions performed. Several techniques like Support Vector machines [15,16], Decision trees [30, 31], Neural network based models [17, 18], Abnormal detection [26, 27], Optimizations [28], Bayesian models [24], Rule based models [22, 23], Nearest neighbor approaches [21], Pattern recognition algorithms [25], Hybrid approaches [29] etc were proposed, but all of these methods depends on the actual input data that it could receive to classify the customers accurately. As the amount of training data increases, the accuracy of classification also increases. But if the amount of data increases beyond a limit, training and optimizations becomes very difficult and time consuming. Apart from traditional client server architecture, a distributed processing system which is capable of processing large amount of data can come to rescue here. Hadoop [32] is a distributed processing frame work provided as open source by Apache. Data processing applications like Mahout, PIG, HIVE, R etc can be integrated with Hadoop to handle Big Data. Map Reduce [33] programming is followed in Hadoop.

Hadoop is widely used in applications where large scale data processing is necessary. Customer feedback analysis [36], social media analysis [38], ad-targeting [39], data warehouse modernization [34], operation analysis [35], fraud detection [37] etc are the wide spread applications of big data analysis. Apart from literature, hadoop is gaining so much popularity in industries for Market Basket Analysis and Pricing Optimization, Credit risk, Compliance and regulatory reporting, Merchandizing and market analysis, Behaviour-based targeting, Supply-chain management and analytics, Market and consumer segmentations, Fraud detection and security analytics, scoring and analysis, Customer Segmentation, Risk analysis and management, Medical insurance fraud, CRM, Trade surveillance and abnormal trading pattern analysis, Games Development, Clinical trials, Drug development analysis, Patient care quality analysis, Disease pattern analysis etc.

This article proposes a method to classify the bank customers using Hadoop framework. Classification is done based on the transactions performed by the customers. Transactions analysis is tedious as the size of input data is very huge. According to the RBI statistics [40], in a single month, a bank will have to handle approximately 8,00,00,000 transactions which is not possible using traditional client server processing. Thus we approach this problem as a BigData problem and propose a solution based on Hadoop, PIG and R. The next section gives an overview of Hadoop, Pig and R. Clustering method adopted is K-Means clustering. But based on the application requirement it can be changed to SVMs, Neural Networks or Decision Trees. But for simplicity here we adopt K Means clustering.

A. Hadoop – An Architecture to process BigData Hadoop [13, 14] is an open source distributed processing framework from Apache, for storing large amount of data on huge number of commodity clusters. Key features that make hadoop suitable for processing large scale data is its flexibility, cost effectiveness, fault tolerance, reliability, scalability, robustness and real time processing. Major components of Hadoop are Hadoop Distributed File System, Namenode and DataNodes. Hadoop Distributed File System (HDFS):HDFS is a distributed file system which allows data to be stored in different blocks of pre determined sizes. Since the data is split across multiple blocks, which spread across multiple machines, data access and operations can be performed in a parallel fashion thus enabling faster processing. Data replication is done automatically to allow fault tolerance.HDFS maintains a master/ slave architecture where the master is named as Name node and slave is named as Data node in Hadoop.

Name Node:Name node functions as a master node. Name node should be a high end system which could survive hardware faults. The daemon that performs the functions of a name node is called Job Tracker. Different functions of name node include:

?Maintaining the namespace of files stored. i.e.

keeping the metadata of file blocks and their

locations.

?Maintains an index of cluster configuration

?Directs data nodes to execute low level operations ?Records the changes that take place in a cluster

?Replication of data blocks is taken care by Name node

?Receives the heart beat of each data node to check whether it is alive, in case of a data node failure,

name node will assign the task to another data

node depending on the data block availability,

location, overhead etc.

Data Node: Data node serves as the slave node. Data node can be any commodity hardware which will not create any problem even if the node crashes. Replication will avoid any damage associated with data node failure.

The daemon that performs the operation of data node is called Task Tracker. The functions of data node include:

? Performs low level read/write operations on blocks ? Replication is implemented by Data node.

? Forward data to other data nodes, pipeline the data

and send heart beats to name node.

Fig.1. Nodes in Hadoop

B. PIG for BigData PIG [1, 2] was developed at Yahoo to process large

data sets stored in Hadoop distributed file system. It was

observed that some programmers felt difficulty in transforming to map reduce way of programming. Hence more time was spent on writing programs than analysis. PIG was developed to solve these issues. PIG is designed to handle any type of data.

PIG is built on two components.

PIGLatin

PIG Latin is a data flow language unlike declarative or procedural language. Program operates on files in HDFS. User code and binaries can be included anywhere. Hence frequently used operations like join, group, filter, sort etc can be directly re-used. A programmer who is not familiar with Java can also write PIG programs. Programmers can develop their own functions and they are named PIG UDFs (User Defined Functions).

PIG Run Time Environment/ PIG Engine

PIG Engine runs over Hadoop. Hence programmers are completely isolated from Hadoop ecosystem. Even the changes in Hadoop cluster are handled by PIG Engine. It is the responsibility of PIG Engine to parse, optimize,

generate execution plan and job monitoring.

Figure below illustrates the working of PIG with a

sample usecase.

Fig.2. Working of PIG

C. R for Data Analysis

R [3, 4] is an open -source data analys is software widely used by statisticians, data scientists, analysts etc for analysis, visualization and predictive modeling. R allows to write scripts and functions in a complete, interact ive, and object or iented way. Graphica l Visualization is the unique capability provided by R. Scatter plots, box-plots, lattice charts etc are all supported by R. A majority of the data mining algorithms like regression analysis, classification, clustering, association

rule mining, text mining, etc can make use of R for efficient and easy implementation.

R and Hadoop can go hand in hand to analyze and visualize large amount of data. Several projects have emerged by combining R and Hadoop for handling different applications. RHadoop, RHIPE, ORCH, Hadoop Streaming with R etc are different ways by which R and Hadoop are combined. In RHadoop [5] the hadoop distributed file system will store the large amount of data and analytic code is written using R and the data will be processed in a map reduce manner.

III. R ESEARCH D ESIGN AND M ETHODOLOGY

A. Problem Statement

Categorize the customers in a bank into low, medium and high risk based on their previous transactions. This could help the banks to do marketing by understanding the customers. Bank should analyze terra bytes or peta bytes of transaction data for accurate classification. Hence we make use of Hadoop distributed processing framework to store and process this large amount of data. Apart from relying on a single tool like R to do the analysis, we are making use of PIG also to ease the preprocessing step. First we propose a solution solely based on PIG. Then we will illustrate how we can improve the classification using K-Means and implement and visualize it using R.

B. Data Pre-Processing using PIG

Assume that the bank possess a large dataset containing the customer details, transactions, loan, deposits and credit card details. Clean the data by removing not applicable and null values and save the files in comma separated format to hdfs. Now we can use PIG to pre-process the data. Extract all useful information from different tables and group then based on the customer id. Join the tables to extract loan amount taken, duration to repay the loan, status of loan, type of loan, amount involved in a transaction by customer, date, type of owner, name of the district to which he belongs, average salary, rate of unemployment in that district, number of entrepreneurs in that district, type of card he holds and birth date.

C. PIG Algorithm to classify customers

PIG can be used to easily classify the data based on some conditions. Here the number of codes needed is only 3 to categorize into 3 groups. Hence saves a lot of coding time. We categorize the customers based on the loan they have already taken, how much they have re-payed, unemployment rate in the district to which he belongs etc. The criterions are actually chosen in a random manner. These criteria can be changed according to the Bank?s requirement.

The above obtained data can be used to classify the bank customers by following Algorithm 1.

Algorithm 1: Classification of Bank Customers using PIG

Input: loan amount taken, duration to repay the loan, status of loan, type of loan, amount involved in a transaction by customer, date, type of owner, name of the district to which he belongs, average salary, rate of unemployment in that district, number of entrepreneurs in that district, type of card he holds and birth date.

Output: Category of Each User (Low Risk / Medium Risk / High Risk)

1.Extract all transaction details within 1 year.

2.If( transaction_amount > 10 lakh) and (a vg_sal > 10k) and loan_status==?A? and (age between 25 and 65) then

categorize the customer as …low risk?. Card Status is …card can be upgraded?

3.If( transaction_amount > 10 lakh) and (avg_sal > 6k) and loan_status==?A? and loan_status==?C? and (age between 25

and 55) and unemployment_rate < 0.80 then categorize him to …low risk? but card status is …card can be upgraded only after loan repayment.?

4.If (avg_sal > 6k) and loan_status==?B? and loan_status==?D? and (age >35) and no_of_entrepreneur>100 then

catego rize him to …risky? and his card status can be …down graded?

5.Write the category of customers and card status to seperate files.

A lgorithm 1 shows how to classify bank customer?s based on several randomly chosen criteria like loans taken, re-payment status etc. Implementation of above algorithm using PIG will be discussed in section 4.

PIG allows us to implement all SQL like operations; little bit of programming capabilities can be brought by writing PIG UDFs. But PIG is not suitable for applications which need complex algorithms, optimizations, machine learning etc. Hence to classify the data more effectively without writing complicated map reduce codes, we can use another data analytic language R. Thus in the proposed method we use PIG for doing pre-processing (cleaning data and extracting useful information from original data) and R for classification.

IV. C LASSIFICATION BY R ON H ADOOP

A clustering algorithm [6] should be devised to obtain a more accurate classification. Here we use the simple and elegant K-means [7, 8] unsupervised clustering algorithm to classify the customers. R has inbuilt capabilities to implement K means clustering. Before feeding data to K-Means we check for outliers. Outliers [10] are eliminated from salary, unemployment rate etc but no change made to loan amount sanctioned as it is more sensitive one compared to other fields. We can make use of box plotting [11] to detect outliers. In R there exist built in functions to implement all these features. Since different data fields maintain different value range, we perform a scaling operation to uniformly scale the attributes. Then we calculate the variance of each attributes and feed this as input to K-Means clustering.

A. K-Means Clustering for Classification

Bas ic goal of k-mea ns clustering is to classify …n? objects to …k? clusters based on their similarity. Lloyd?s algorithm [9] is widely accepted for implementing k-means clustering. Lloyd?s algorithm is particularly suited

for Big Data applications.

In the context of banking customer classification, we set k=3 as we need to classify the customers into low/ medium/ high risk. After executing the K-Means algorithm in R we get a set of parameters like clusters (1to k), cluster centroids, difference in each cluster, difference between other clusters, number of iterations taken to converge, etc.

Algorithm 2: Lloyd’s Algorithm for implementing K -Means Clustering

Output: K clusters

1. Initialize the center of the clusters (μ) with 0; μi = 0 for i = 1 to k

2. Assign the closest cluster to each data point. c i = { j:d(x j ,μi ) ≤ d(x j ,μl ), l≠i, j=1,...,n } where d(x,μi )=∥x?μi ∥2.

3. Reset the position of each cluster to the mean of all data points currently belonging to that cluster. μi=1/|c i | ∑j ∈ci x j ,

4. Repeat steps 2-3 until the cluster center remains same for all successive iterations.

V. R ESULTS AND D ISCUSSION

This section gives an overview of the implementation constructs adapted to our approach. A. Dataset

We use Berka Data set [12] created by Petr Berka and Marta Sochorova as the source to obtain banking details. Berka Data set contains the datils of about 5300 bank customers with approximately 1,000,000 transactions, 700 loan details and 900 credit card details. Data is collected fro m Czech bank. Different tables in the data set are:

ACCOUNTS: containing details of each account CLIENTS: personal information of each customer ORDER: details of each payment order

TRANSACTIONS: details of each transaction made by the customer

LOANS: details of each loan sanctioned by bank.

CREDIT CA RDS: details of credit cards issued by the bank

DEMOGRAPHIC DATA: details about the districts

Entity relationships between the tables are as shown in

figure:

Fig.3. Entity Relationship in Berka Dataset

B. Setting Up RHadoop

After installing the Hadoop set environmental variables for integrating R. Load RHadoop packages and read input data from hadoop distributed file system (For details refer Appendix 1).

C. Data Preprocessing

After data cleaning, the structure of the fetched data will be as shown in figure 4.

Fig.4. Structure of data after cleaning stage.

Fig.5. Summary of fetched data

Fig.6. Box Plot of all attributes

Fig.7. Boxplot on average salary, unemployment rate and loan amount

Fig.8. Scatterplots of a) average salary and b) loan amount

The box plot of all attributes is as shown in figure 6. But from the boxplot it is clear that there exist outliers in avg_sal, loan_amount and unemp_rate. The box plot applied on each attribute separately is given in figure 7. Scatter plots of average salary and loan amount is as shown in figure 8. Scatter plots will give more insight to the data distribution of sensitive attributes like avg_sal and loan_amt.

From fig 8.a it is clear that a lot of pixels lie in the outlier range for avg_sal and loan_amt, hence it is better to maintain the values as such without any change. K-Means Clustering

After the data pre-processing stage, the cleaned data is fed as input to K Means clustering algorithm. Elbow graph obtained by plotting the variance obtained in 15 successive iterations are as shown in figure 9.From this figure it is clear that the slope changes in every 3 successive iterations, hence the optimized number of cluster can be deduced as 3.

Fig.9. Elbow Graph of 15 successive iterations

Fig.10. K - Means Summary

K-Means summary of the input data set is given in figure 10.

K Means plot of age vs. average salary is illustrated in fig 11.a. Three co lors denote 3 different cluster data points. Fig 11.b. plots data points with cluster names 1, 2, 3 so that the spread of data points is clearly reflected. Fig 12 illustrates by how much, each data points vary within each cluster or it projects the variation of each cluster point from its mean or centroid.

Fig.11.a. K means plot for age vs. avg_sal

Fig.11.b. Plotting data points with cluster numbers 1,2 and 3.

Fig.12. Variation of data points from cluster mean

Fig.13. Mean of each cluster.

The cluster mean obtained from the Berka data set for each cluster low, medium and high risk customers is given in fig 13. (The details of how to obtain the graphs and plots in R are given in appendix 2).

VI. C ONCLUSION

Analysis of Big Data helps organization to get valuable insights from the structured or unstructured data they possess. This article identifies an application of Big Data in banking domain to categorize the customers based on their transactions. We use Hadoop as the software platform for storing the data. Data analysis tools like R and PIG are used to extract useful information from the data. To classify the customers, K Means clustering is used. But this could be replaced with SVMs or Neural Networks for better accuracy. Since we have addressed the problem in a BigData perspective, the learning time or heavy processing requirement will not be a problem.

A CKNOWLEDGEMENT

I sincerely thank Inspire Fellowship provided by Department of Science & Technology, India for supporting my research in the form of fellowship.

R EFERENCES

[1]Alan Gates (2011), Programming PIG, O?Reilly Media,

New York.

[2]Christopher Olston, Benjamin Reed, Utkarsh Srivastava,

Ravi Kumar and Andrew Tomkins, “Pig Latin: A Not-So-

Foreign Language for Data Processing”, SIGMOD?08, June 9–12, 2008, Vancouver, BC, Canada.

[3]Emmanual Paradis, “R for Beginners”, http://cran.r-

https://www.360docs.net/doc/a02022841.html,/doc/contrib/Paradis-rdebuts_en.pdf

[4]W. N. Venables, D. M. Smith and the R Core Team, “An

Introduction to R”, http://cran.r-

https://www.360docs.net/doc/a02022841.html,/doc/manuals/R-intro.pdf

[5]Vignesh Prajapati (2013), “Big Data Analytics with R and

Hadoop”, Packt Publishing, UK.

[6]Jiawei Han, Micheline Kamber (2006), “Data Mining

Concepts & Techniques”, Morgan Kaufmann Publishers, Canada.

[7]J. B. MacQueen (1967): "Some Methods for classification

and Analysis of Multivariate Observations, Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability", Berkeley, University of California Press, 1:281-297.

[8]Brian T. Luke: “K-Means Clustering”,

https://www.360docs.net/doc/a02022841.html,/~lukeb/kmeans.html

[9]S. Lloyd, Least square quantization in PCM, IEEE Trans.

Infor. Theory, 28, 1982, pp. 129– 137.

[10]Charu C. Aggarwal (2013), “Outlier Analysis”, Kluwer

Academic Publishers, Boston.

[11]Purple Math: “Box Plot and 5 number summary”,

https://www.360docs.net/doc/a02022841.html,/modules/boxwhisk.htm [12]Berka, P. (2000). Guide to the financial data set. The

ECML/PKDD 2000 Discovery Challenge.

[13]Robert Chansler, Hairong Kuang, Sanjay Radia,

Konstantin Shvachko, and Suresh Srinivas,“Hadoop Distributed File System”, https://www.360docs.net/doc/a02022841.html,/en/hdfs.html

[14]Jeffrey Shafer, Scott Rixner, and Alan L. Cox, “The

Hadoop Distributed Filesystem: Balancing Portability and Performance”, ISPASS 2010, New York, USA, Pages 122-133.

[15]Y. Wang, S. Wang, and K.K. Lai, “A New Fuzzy Support

Vector Machine to Evalu ate Credit Risk,” IEEE Trans.

Fuzzy Systems, vol. 13, no. 6, pp. 820-831, Dec. 2005. [16]T. Joachims, “Making Large-Scale SVM Learning

Practical,” Advances in Kernel Methods: Support Vector Learning, B. ScholLkopf, C. Burges, and A. Smola, eds., MIT-Press, 1999.

[17]L. Yu, S. Wang, and K. Lai, “Credit Risk Assessment

with a Multistage Neural Network Ensemble Learning Approach,” Expert Systems with Applications, vol. 34, no.

2, pp. 1434-1444, 2008.

[18]H. Guo and S.B. Gelfand, “Classification Trees with

Neural Network F eature Extraction,” IEEE Trans. Neural Networks, vol. 3, pp. 923-933, 1992.

[19]R. Rymon, “An SE-Tree Based Characterization of the

Induction Problem,” Proc. Int?l Conf. Machine Learning, 1993.

[20]J.R. Quilan, “Induction of Decision Trees,” Machine

Learning, vol. 1, pp. 81-106, 1986.

[21]Nearest Neighbor (NN) Norms: NN Pattern Classification

Techniques, B.V. Dasarathy, ed. IEEE Computer Society Press, 1991.

[22] D. Martens, M. De Backer, R. Haesen, J. Vanthienen, M.

Snoeck, and B. Baesens, “Classification with Ant Colony O ptimization,” IEEE Trans. Evolutionary Computation, vol. 11, no. 5, pp. 651-665, Oct. 2007.

[23] D. Martens, B.B. Baesens, and T. Van Gestel,

“Decomposition Rule Extraction from Support Vector Machines by Active Learning,” IEEE Trans. Knowledge and Data Eng., vol. 21, no. 2, pp. 178-191,Dec. 2008. [24] D. Pedro and M. Pazzani, “On the Optimality of the

Simple Bayesian Classifier under Zero-One Loss,”

Machine Learning, vol. 29, pp. 103-137, 1997.

[25]G. Guo, “CR Dyer, Learning from Examples in the

Sample Case: Face Expres sion Recognition,” IEEE Trans.

Systems, Man, and Cybernetics, Part B, vol. 35, no. 3, pp.

477-488, June 2005.

[26]W.A. Chaovalitwon Se, Y.-J. Fan, and R.C. Sachdeo,

“Support Feature Machine for Classification of Abnormal Brain Activity,” Proc. ACM SIGMOD, pp. 113-122, 2007.

[27] B.P. Rachel, T. Shlomo, R. Alex, L. Anna, and K. Patrick,

“Multiplex Assessment of Serum Biomarker Concentrations in Well-Appearing Children with Inflicted Traumatic Brain Injury,” Pediatric Research, vol. 65, no. 1, pp. 97-102, 2009.

[28]S. Ola fsson, X. Li, and S. Wu, “Operations Research and

Data Mining,” European J. Operational Research, vol. 187, no. 3, pp. 1429-1448, 2008.

[29] A. Benos and G. Papanastasopoulos, “Extending the

Merton Model: A Hybrid Approach to Assessing Credit Quality,” Math and Computer Modelling, vol. 48, pp. 47-

68, 2007.

[30]R. Rymon, “An SE-Tree Based Characterization of the

Induction Problem,” Proc. Int?l Conf. Machine Learning, 1993.

[31]J.R. Quilan, “Induction of Decision Trees,” Machine

Learning, vol. 1, pp. 81-106, 1986.

[32]Apache Hadoop Architecture :

https://https://www.360docs.net/doc/a02022841.html,/docs/r1.2.1/hdfs_design.html [33]Google?s Technical Paper on Map Reduce,

https://www.360docs.net/doc/a02022841.html,/archive/mapreduce.html [34]Kumagai, J., “Mission impossible? [FBI computer

network]”, IEEE Spectrum, Volume:40, Issue: 4, April 2003.

[35] J. Cohen ,“Graph Twiddling in a MapReduce World.”,

Computing in Science & Engineering (Volume:11 , Issue: 4 ), June 2009.

[36] Shunmei Meng; Wanchun Dou ; Xuyun Zhang ; Jinjun

Chen, “KASR: A Keyword-Aware Service Recommendation Method on MapReduce for Big Data Applications ”, IEEE Transactions on Parallel and Distributed Systems, Volume:25 , Issue: 12, Dec. 2014 [37] Hormozi, H, Akbari, M.K., Hormozi, E.., Javan, M.S,

“Credit cards fraud detection by negative selection algorithm on hadoop (To reduce the training time)”, IEEE 5th Conference on Information and Knowledge Technology (IKT), May, 2013.

[38] Conejero, J.; Burnap, P.; Rana, O.; Morgan, J., “Scaling

Archived Social Media Data Analysis Using a Hadoop Cloud”, IEEE Sixth International Conference o n Cloud Computing (CLOUD), June, 2013.

[39] Xu, J.; Yu, Y.; Chen, Z.; Cao, B.; Dong, W.; Guo, Y.; Cao,

J.,”MobSafe: cloud computing based forensic analysis for massive mobile applications using data mining”, Tsinghua Science and Technology, Volume: 18, Issue: 4 , June 2013. [40] RBI Bank Transaction Statistics,

https://www.360docs.net/doc/a02022841.html,.in/scripts/NEFTUserView.aspx?Id=82. [41] What is BigData?, https://www.360docs.net/doc/a02022841.html,/en_us/insights/big-data/what-is-big-data.html.

Authors ’ Profiles

Lija Mohan (born April 4, 1988) is an Academician and Researcher who is very much interested in High Performance Computing Domain. She is more interested in Big Data Analytics and has done several projects based on that. She is a University rank holder for both Masters in Computer Science and Bachelors in Computer Science

Engineering. She has published several research works and handles sessions during Big Data workshops. She has got sufficient experience as a software engineer and teacher. She has won several fellowships and research grants including Windows Azure and AWS Research Grant.

Dr. Sudheep Elayidom M. is working as Associate Professor in Computer Science Division of School of Engineering for the past fifteen years. He secured 1st Rank in the University for both his Masters Degree and Bache lor?s Degree. He has several journal and conference publications to his credit and he is the author of the book titled …Data

Mining and Warehousing? published by Cengage publishers. He obtained Ph.D degree from CUSAT and he is a pioneer in Data Mining. He is actively involved in latest research trends including Cloud Computing, Big Data Analytics, Sentiment Analytics, etc.

How to cite this paper: Lija Mohan, Sudheep Elayidom M., "A Novel Big Data Approach to Classify Bank Customers - Solution by Combining PIG, R and Hadoop", International Journal of Information Technology and Computer Science (IJITCS), Vol.8, No.9, pp.81-90, 2016. DOI: 10.5815/ijitcs.2016.09.10

中国银行业分类

中国银行业分类(Jave) 一.中央银行 1.中国人民银行二.政策性银行 1.国家开发银行 2.中国进出口银行 3.中国农业发展银行三.国有商业银行 1.中国工商银行 2.中国农业银行 3.中国银行 4.中国建设银行 5.交通银行四.邮政银行 1.中国邮政储蓄银行五.全国性股份制银行 1.银行 2.中国光大银行 3.华夏银行 4.中国民生银行 5.广发银行 6.招商银行 7.兴业银行 8.浦东发展银行 9.平安银行 10.恒丰银行 11.浙商银行 12.渤海银行六.合资银行

1.中德住房储蓄银行 2.国际银行 3.华一银行 4.华商银行 5.嘉华银行（中国） 6.国际银行 7.浦发硅谷银行七.外资银行花旗银行（中国）、汇丰银行（中国）、渣打银行（中国）、瑞穗实业银行（中国）、三井住友银行（中国）、星展银行（中国）、三菱东京日联银行（中国）、格兰皇家银行（原荷兰银行）（中国）、华侨银行（中国）、摩根士丹利国际银行（中国）、摩根大通银行（中国）、国友利银行（中国）、大华银行（中国）、亚银行（中国）、国企业银行（中国）、德意志银行（中国）、汇理银行（中国）、华美银行（中国）、法国巴黎银行（中国）、汇理银行（中国）、新银行（中国）、国外换银行（中国）、泰国盘谷银行（中国）、菲律宾首都银行（中国）、正信银行、法国兴业银行（中国）、澳新银行（中国）、山口银行、横滨银行、名古屋银行、瑞士宝盛银行、南洋银行、阿联酋阿布扎比银行、KB国民银行（中国）、国釜山银行、瑞士银行（中国）、印度国家银行八.港资银行东亚银行（中国）、恒生银行（中国）、永亨银行（中国）、南洋商业银行（中国）、协和银行、大新银行（中国）九.台资银行永丰银行、土地银行、国泰世华银行、彰化商业银行、第一银行、合作金库银行、工业银行、台北富邦银行、玉山银行、中国信托商业银行、兆丰商业银行、中小企业银行十.城市商业银行银行天津天津银行银行银行市商业银行市商业银行银行银行银行

金融机构划分

关于金融统计中各类金融机构划分的最新通知境内银行业存款类金融机构：是指由中国银行业监督管理委员会监督管理的，吸收公众存款的金融机构。包括银行、城市信用社（含联社）、农村信用合作社（含联社）、农村资金互助社和财务公司以及三大国家政策性银行（国家开发银行、中国农业发展银行和中国进出口银行）。境内银行业非存款类金融机构：是指由中国银行业监督管理委员会监督管理的，不吸收公众存款的金融机构。包括信托公司、金融资产管理公司、金融租赁公司、汽车金融公司、贷款公司和货币经纪公司。境内证券业金融机构：是指由中国证券业监督管理委员会监督管理的，具备从事证券业合法资格的金融机构。包括证券公司、证券投资基金管理公司、期货公司和投资咨询公司。境内保险业金融机构：是指由中国保险业业监督管理委员会监督管理的，具备从事保险业合法资格的金融机构。包括财产保险公司、人身保险公司、再保险公司、保险资产管理公司、保险经纪公司、保险代理公司、保险公估公司和企业年金。境内交易及结算类金融机构：是指为金融交易提供交易场所、服务以及安全环境等活动，并具备一定监督管理职能的金融机构。包括交易所和登记结算公司。

境内金融控股公司：是指依据《中华人民共和国公司法》设立，拥有或控制一个或多个金融性公司，并且这些金融性公司净资产占全部控股公司合并净资产的50%以上，所属的受监管实体至少明显地在从事两种以上的银行、证券和保险业务独立企业法人。包括中央金融控股公司和其他金融控股公司。境内特殊目的载体（SPV）：是指为持有特定资产而对外发行新金融工具，并合法拥有该特定资产，具备完整独立账户的金融实体。包括证券投资基金、资金信托计划、代客理财项目、资产证券化项目。境内其他金融机构：除上述机构之外的其他金融机构。包括小额贷款公司等金融机构。

银行主要业务分类和简介

1负债业务存款业务、借款业务 2资产业务贷款业务、债券投资业务、现金资产业务 3中间业务交易业务、清算业务、支付结算业务、银行卡业务、代理业务、托管业务、担保业务、承诺业务理财业务、电子银行业务本章通过负债业务、资产业务和中间业务三大类对银行的主要业务进行介绍。负债业务是商业银行形成资金来源的业务，是商业银行资产业务和中间业务的重要基础。商业银行负债主要由存款和借款构成。存款包括人民币存款和外币存款两类，而借款包括短期借款和长期借款两大类。存款是商业银行最主要的资金来源，存款业务也是商业银行的传统业务。本章对银行负债业务的介绍主要针对存款业务。资产是银行过去的交易或事项形成的、由银行拥有或控制、预期会给银行带来经济利益的资源。商业银行的资产主要包括贷款、债券投资和现金资产三大类。贷款是商业银行最主要的资产，也是最主要的资金运用。本章对银行资产业务的介绍主要针对贷款业务。中间业务是指不构成商业银行表内资产、表内负债，形成银行非利息收入的业务，包括交易业务、清算业务、支付结算业务、银行卡业务、代理业务、托管业务、担保业务、承诺业务、理财业务和电子银行业务等。本章包括负债、资产、中间业务三节内容。 3.1负债业务商业银行的负债主要由存款和借款构成。存款包括人民币存款和外币存款两大类。人民币存款：又分为个人存款、单位存款和同业存款，外币存款又分为个人外汇存款和机构外汇存款。借款：包括短期借款和长期借款两大类。短期借款是指期限在一年或一年以下的借款。主要包括同业拆借、证券回购协议和向中央银行借款等。长期借款是指期限在一年以上的借款，一般采用发行金融债券

的银行性质分类

银行的分类我国银行分为中央银行、政策性银行、国有商业银行、股份制商业银行、地方性商业银行等几大类。 1、中央银行中央银行是国家中居主导地位的金融中心结构，负责制定并执行国家货币信用政策，干预和调控国家经济发展，中央银行独具货币发行权，实行金融监管，不对个人和企业办理银行业务。我国的中央银行仅有一家，即中国人民银行。 2、政策性银行政策性银行是由政府创立，专门为贯彻、配合政府社会经济政策或意图，在特定业务领域开展直接或间接的金融业务，非盈利目的专业性金融机构，充当政府发展经济、促进社会进步、进行宏观经济管理的工具。包括：国家开发银行、中国进出口银行、中国农业发展银行。国家开发银行是全球最大的开发性金融机构，中国最大的对外投融资合作银行、中国长期信贷银行和债券银行（2015年3月从政策性银行剥离，成为开发性金融机构） 3、国有商业银行国有商业银行，是由国家（财政部、中央汇金公司）直接管控的大型商业银行。包括：中国工商银行、中国农业银行、中国银行、中国建设银行（四大银行）、交通银行、中国邮政储蓄银行。 4、全国性质的股份制商业银行股份制商业银行是相对国有商业银行公有制性质的,股份制商业银行属于股份制,有许多投资资金在里面，换句话说，就是非国有资本参股银行。包括：招商银行、浦发银行、中信银行、中国光大银行、华夏银行、中国民生银行、广发银行、兴业银行、平安银行、浙商银行、恒丰银行、渤海银行（十二大股份制商业银行） 5、城市商业银行城市商业银行是中国银行业的特殊群体，前身是20世纪80年代设立的城市信用社，业务定位为中小企业提供金融支持，为地方经济搭桥铺路。严格来说，城市商业银行属于商业银行的一种，我国共有144 家城市商业银行。包括：北京银行、上海银行、江苏银行、南京银行、宁波银行等。 6、农村信用社农村信用合作社是由中国人民银行批准、社员入股组成的农村合作金融机构、主要任务是筹集农村闲散资金，为农业、农民和农村经济发展提供金融服务。我国共有2265家农村信用社。 7、农村商业银行农村商业银行，是由地方农民、农村工商户、企业法人和其他经济组织共同入股组成的股份制的地方性金融机构。我国共有212家农村商业银行和190家农村合作银行。 8、村镇银行村镇银行是由中国银行业监督管理委员会批准，由境内外金融机构、境内非金融机构企业法人、境内自然人出资，在农村地区设立的银行业金融机构，主要职责为当地农民、农业和农村经济发展提供金融服务我国共有635家村镇银行。

商业银行业务分类讲解

商业银行业务分类大全来看看银行都干了些什么，也就是他们的业务细分。最通用的分类是：负债业务（商业银行形成资金来源的业务），资产业务（商业银行运用资金的业务），中间业务（银行不需运用自己的资金，代客户承办支付和其他委托事项而收取手续费的业务）一、资产业务资产业务，是商业银行的主要收入来源。 1、贷款(放款)业务--商业银行最主要的资产业务 1）信用贷款：信用贷款，指单凭借款人的信誉，而不需提供任何抵押品的贷款，是一种资本贷款。（1）普通借款限额：企业与银行订立一种非正式协议，以确定一个贷款，在限额，企业可随时得到银行的贷款支持，限额的有效期一般不超过90天。普通贷款限额的贷款，利率是浮动的，与银行的优惠利率挂钩。（2）透支贷款：银行通过允许客户在其上透支的方式向客户提供贷款。提供这种便利被视为银行对客户所承担的合同之外的“附加义务”。（3）备用贷款承诺：备用贷款承诺，是一种比较正式和具有法律约束的协议。银行与企业签订正式合同，在合同中银行承诺在指定期限和限额向企业提供相应贷款，企业要为银行的承诺提供费用。（4）消费者贷款：消费者贷款是对消费个人发放的用于购买耐用消费品或支付其他费用的贷款，商业银行向客户提供这种贷款时，要进行多方面的审查。（5）票据贴现贷款：票据贴现贷款，是顾客将未到期的票据提交银行，由银行扣除自贴现日起至到期日止的利息而取得现款。 2）抵押贷款：抵押贷款有以下几种类型（1）存货贷款。存货贷款也称商品贷款，是一种以企业的存贷或商品作为抵押品的短期贷款。（2）客帐贷款。银行发放的以应收帐款作为抵押的短期贷款，称为“客帐贷款”。这种贷款一般都为一种持续性的信贷协定。（3）证券贷款。银行发放的企业借款，除以应收款和存货作为抵押外，也有不少是用各种证券特别是公司企业发行的股票和债券作押的。这类贷款称为“证券贷款”。（4）不动产抵押贷款。通常是指以房地产或企业设备抵押品的贷款。 3）保证书担保贷款：保证书担保贷款，是指由经第三者出具保证书担保的贷款。保证书是保证为借款人作贷款担保，与银行的契约性文件，其中规定了银行和保证人的权利和义务。银行只要取得经保证人签字的银行拟定的标准格式保证书，即可向借款人发放贷款。所以，保证书是银行可以接受的最简单的担保形式。 4）贷款证券化：贷款证券化是指商业银行通过一定程序将贷款转化为证券发行的总理资过程。具体做法是：商业银行将所持有的各种流动性较差的贷款，组合成若干个资产库（Assets Pool），出售给专业性的融资公司（Special Purpose Corporation），

中国银行板块分类

中国银行主要业务中国银行拥有一个独特的全方位金融服务平台，提供商业银行、投资银行、保险、资产管理、飞机租赁和其他金融服务，能够满足不同客户的复杂业务需求。商业银行业务商业银行业务是中国银行的传统主营业务，包括公司金融业务、个人金融业务及金融市场业务（主要指资金业务）。公司金融业务公司金融业务为中国银行业务利润的主要来源。2007年，公司金融继续以完善客户服务体系、促进业务整体联动、加强产品创新及实施管理转型为重点，组建公司金融板块，加强条线管理。中国银行实行服务重点大型优质公司客户的发展战略，关注于与大型优质客户的长期合

作关系，同时明确中小企业业务是公司金融业务的重要组成部分，致力成为中小企业高效、专业、能够满足全面需求的合作伙伴。 ?存款业务中国银行积极应对资本市场快速发展对人民币公司存款业务的冲击，大力发展人民币公司存款业务。 ?贷款业务中国银行继续强化贷款结构调整，加大对重点支持类行业的投入，实现信贷资源优化配置。 ?金融机构业务中国银行注重与金融机构的全面合作，通过互荐客户、资源共享和共同开发新产品，为客户提供更加全面的服务。中国银行亦通过纽约、法兰克福和东京分行进行美元、欧元和日元清算，上述分行和新加坡分行均为当地一级清算银行。 ?国际结算及贸易融资业务国际结算业务是中国银行优势业务。中国银行加强境内外机构联动，实现国际结算及贸易融资业务快速发展。 ?其他公司金融业务中国银行提供支付结算业务，主要包括银行汇票、本票、支票、汇兑、银行承兑汇票、委托收款、托收承付、集中支付、支票圈存及票据托管等。产品服务创新中国银行配合公司客户最新业务需求，组合和创新公司金融产品；加大与金融同业的产品合作，积极开展同业间公司信贷资产的转让业务；推出融易达（基于应收账款的融资服务）、通易达（应收账款质押开证）、融信达（基于投保出口信用险的应收账款的融资服务）和融货达（货物质押融资）等产品，进一步丰富了“达”系列贸易融资产品种类；推出隐蔽型出口保理、D/A银行保付票据项下福费廷等新产品；顺应全球贸易主流结算方式的变化，在中国内地同业中首批加入SWIFT组织服务设施平台（TSU），实现国内首笔TSU真实交易。中国银行汲取国内外同业成功做法，借助战略投资者的经验，改进中小企业业务模式；修订中小企业授信政策制度，简化中小企业信贷业务操作流程；根据中小企业融资需求的特点，推出中小企业融资产品“快富易”，为中小企业客户提供短期融资支持。中国银行整合清算资源，在中国内地首家推出融海外分行与代理行服务于一体的系列支付产品“全额到账”、“台湾汇款”、“优先汇款”、“特殊汇款服务”，实现海外行与代理行业务共同发展，最大限度扩展了产品的覆盖面，填补了市场空白。其中，“台湾汇款”产品改变了该项业务一直由代理行包揽的局面，拓展了业务市场。个人金融业务个人金融业务为中国银行战略发展重点之一。中国银行继续完善个人金融业务的经营管理体制和运营机制，组建个人金融板块，加强个人业务条线管理；重点推进网点经营方式转型、

银行客户分类方案

银行客户分类方案根据总行下发的《2015年公司业务指导意见》，银行将坚定不移的推进以客户为中心的营销战略，进一步优化客户结构，实现客户分层分类营销管理工作。一、客户分类方案（一）分类思路根据总行《2015年公司业务指导意见》、《公司客户管理指导意见》以及工信部下发的《统计上大中小微型企业划分办法》，综合考虑企业类型、行业状况、市场地位、信用状况、财务状况、发展潜力、综合贡献度等多种因素，将客户划分为战略客户、优质客户、持续贡献客户以及潜在合作客户，并逐步实现分行管战略客户、优质客户（即分行负责客户高级管理层营销，支行负责日常服务以及基层员工的对接工作），支行管持续贡献客户、潜在合作客户（即支行负责企业整体的营销工作，分行对支行进行辅助营销）的服务营销范围体制。（二）分类标准 1.战略客户是指垄断能力强、品牌价值高、市场覆盖面广、风险较小、与我行具有长期高端战略合作的伙伴关系、对我行业务发展和功能完善具有较强的支撑和拉动效应的客户。

主要分类条件为：（1）国务院国资委直属的中央大型、特大型企业集团，省国资委控股龙头企业；具有垄断性质的大型优质企事业单位；国内500强企业，行业100强企业；大型总部型、集团型、连锁型客户；优良上市公司。（2）与我行签订战略合作协议，并能开展全面排他性战略合作。（3）能利用客户自身的行业优势、区域垄断性、品牌效应、资源优势及产品销售网络等资源，加速我行多元化营销渠道建设。（4）资产规模较大，财务结构合理，现金流充足，主营业务突出，市场拓展能力强，利润率不低于同业平均水平。（5）信誉良好，无不良信用记录，企业或其法定代表人无涉诉、履约纠纷等情况，信用等级评定均为A级以上（含）。（6）原则上60%结算业务在我行办理，50%关联客户或上下游客户与我行开展业务合作。包括但不限于拉动我行存款业务、理财业务、代理业务、POS机收单业务、代发工资业务、国际业务、票据业务、投资业务等，在我行相比其他金融机构的匹配量和占比量较大。 2.优质客户是指具有一定品牌效应，经营持续成功，信用情况良好，与我行业务合作时间相对较长，合作关系相对

中国银行分类_各银行区别

安徽银行招聘网：https://www.360docs.net/doc/a02022841.html,/ (1)银行业金融机构包括3家政策性银行(国家开发银行、中国进出口银行、中国农业发展银行)，5家大型商业银行(中、农、工、建、交)，12家股份制商业银行(中信、华夏、招商、深发、光大、民生、浦发、渤海、广发、兴业、恒丰、浙商)，144家城市商业银行，212家农村商业银行，190家农村合作银行，2265家农村信用社，1家邮政储蓄银行，4家金融资产管理公司，40家外资法人金融机构，66家信托公司，127家企业集团财务公司，18家金融租赁公司。4家货币经纪公司，14家汽车金融公司，4家消费金融公司，635家村镇银行，10家贷款公司以及46家农村资金互助社。我国银行业金融机构共有法人机构3800家，从业人员319.8万人。 (2)中公金融人就几家规模比较大的银行详细讲述一下它们的不同点：国有四大银行是中国银行、中国建设银行、中国工商银行和中国农业银行。这四个银行总体功能上有很多的相似之处。例如个人金融服务、企业金融服务电子银行等。虽然有很多相似处，但还有很多不同的地方。它们各自负责的区域还是有所不同的。中国银行，其网络银行具有账户查询、网上支付、境外账户管理、银企对接、提供资金汇划、批量委托、授权模式定制、预约付款、定向支付、历史交易查询、代发工资、代理报销、网上支付通关税费、异地报关等功能。中国建设银行，其网络银行具有账户查询、网上缴费、网上支付、账户转账、网上速汇通、银证业务、证券业务、外汇买卖、批量业务等功能。中国工商银行，其网络银行具有账户信息查询、转账付款、在线支付、代理业务、网上支付等功能，提供了动态的演示,使人感到很方便，安全性能是最高的，另外还划分出了电话银行、企业银行、手机银行。中国农业银行，其网站功能相对较简单。其网络银行具有账户查询、账户密码修改、网上临时挂失、转账、网上缴费、网上支付等功能。网上支付分为B2B、B2C。工商银行职能比较键全,因为它不仅包含了三大银行的基本功能,而且还提供了动态的演示,使人感到很方便，安全性能是最高的，要安装补丁和了解安全迷钥，而且工商银行另外划分出了电话银行、企业银行、手机银行!中国银行的网上银行提供的功能也不错，其提供的功能主要是为企业服务的，像代发工资、银企对接等功能;中国建设银行的网上银行提供的功能在个人和企业方面比较平衡，不像中国银行那样偏重其中一方;而中国农业银行的网上银行的功能最少，只开设了一些基本功能，对其它一些功能，如银证转账、网上汇款、网上质押贷款、国债买卖等，尚未开设。

(完整word版)银行客户分类标准

××银行企业金融客户分层分类标准企业金融客户分层分类管理是企业金融客户基础建设的重要组成部分。随着客户和产品向多层次、多元化发展，为更客观有效地评价客户与本行合作的成效，增强客户评定标准的直观性，以客户的经济资本收益这一指标为核心，进一步完善本行企业金融客户评价体系，现将本行企业金融客户分层分类标准明确如下：一、企业金融客户分层 ××银行企业金融客户分为企业客户和机构客户（党政机关、事业单位及社会团体客户）。公立医院、学校和设计院纳入机构客户管理；个体工商户纳入零售业务条线管理。（一）企业客户分层标准企业客户区分信用客户和非信用客户，实行不同分层标准。 1、信用客户按照客户总资产指标分层：（1）特大型：总资产≥300亿；（2）大型：6亿≤总资产＜300亿；（3）中型：0.6亿≤总资产＜6亿；（4）小型：0.1亿≤总资产＜0.6亿；（5）小微型：总资产＜0.1亿。总资产以客户最近一期的年度财务报表数据为准，本年新成立的客户以最新一期报表数据为准。 2、非信用客户按照客户注册资本指标分层： —1—

（1）特大型：注册资本≥50亿；（2）大型：1亿≤注册资本＜50亿；（3）中型：0.1亿≤注册资本＜1亿；（4）小型：0.05亿≤注册资本＜0.1亿；（5）小微型：注册资本＜0.05亿。（二）机构客户分层标准按照行政隶属关系分为大型、中型、小型： 1、大型:省、部级、直辖市党政机关、行政事业、社会团体及所属单位和工会； 2、中型：地、市、区（县）级党政机关、行政事业、社会团体及所属单位和工会； 3、小型：乡、镇、村级组织所属单位和工会。二、企业金融客户分类（一）分类定义客户分类指在客户分层的基础上，根据客户对本行的贡献度和与本行合作的紧密度将企业金融客户由高到低分为：战略基础客户、优质基础客户、有效基础客户、培育型客户、调整型客户。各类客户均无包含关系。 1、战略基础客户是对本行的贡献度和与本行合作的紧密度最高的客户群体。 2、优质基础客户是对本行的贡献度较高的客户群体。 3、有效基础客户是对本行贡献度达到一定水平，达到本行—2—

中国银行业绿色信贷体系

中国银行业绿色信贷体系中国金融信息网2016年12月26日10:38分类：分析报告 ?打印 ?大中小 ?我要评论核心提示：目前中国已基本建立以《绿色信贷指引》为核心的绿色信贷制度框架，对银行业金融机构开展节能环保授信和绿色信贷的政策界限、管理方式、考核政策等做出明确规定，确保信贷资金投向低碳、循环、生态领域。本文将回顾中国绿色信贷体系的发展历程，探析绿色信贷体系构成，以及展望未来发展的趋势。钱立华兴业银行环境金融部 2016年8月中国人民银行、财政部、银监会等7部委联合印发的《关于构建绿色金融体系的指导意见》（银发【2016】228号）对绿色金融、绿色金融体系都做了明确的定义，并提出大力发展绿色信贷。目前全球只有中国、巴西和孟加拉国三个国家有正式的绿色信贷统计，同时中国的绿色信贷框架体系处于国际相对领先的位置。据2016年中国银监会绿色信贷新闻发布会公布，2007年以来，银监会陆续出台《节能减排授信工作指导意见》（2007年）、《绿色信贷指引》（2012年）、《能效信贷指引》（2015年），相关政策在国际上均是首创。目前中国已基本建立以《绿色信贷指引》为核心的绿色信贷制度框架，对银行业金融机构开展节能环保授信和绿色信贷的政策界限、管理方式、考核政策等做出明确规定，确保信贷资金投向低碳、循环、生态领域。本文将回顾中国绿色信贷体系的发展历程，探析绿色信贷体系构成，以及展望未来发展的趋势。一、中国绿色信贷体系发展历程近年来，生态文明建设、可持续发展的重大政策，连续出台，绿色发展已经成为国家战略。2012年党的“十八大报告”突出生态文明建设，将生态文明建设纳入建设中国特色社

中国银行分类

1民营资本逐利是资本的天性。民营资本渴望进入银行业的主要原因在于：一是，进入银行业后能为企业搭建一个平台，使得企业在更大的范围内利用金融资源；二是，银行业盈利水平虽然收窄，但前景依然可期；三是，民营银行将来可以上市，进而套现。毋庸赘言，民营资本进入银行业的意义十分重大。一方面，能缓解实体经济融资成本高、金融资源结构性短缺的局面。另一方面，可以改善中小企业融资难的现状，进而促进民营企业的发展，也有利于金融业形成更合理的竞争格局。民间资本进入银行业，成为金融改革的重要力量，将会使我国金融资源配置上存在的问题得到一定程度的缓解，更将产生鲶鱼效应，为现有银行业引入竞争机制。会使金融业竞争格局更合理我国经济的发展和城镇化逐步深入，需要更多金融机构深入基层搞服务。所以，探索设立民间资本发起的自担风险的民营银行和金融租赁公司、消费金融公司等，对解决小微企业融资难、农民贷款难问题、促进农村经济发展很重要。某种意义上，小银行必须选择众多小客户，才能生存下去。未来互联网金融与传统金融业将逐渐融合从去年开始，互联网金融发展迅猛，各种“宝”让老百姓的理财生活变得更加丰富。互联网金融的发展，激发传统金融业加速了互联网化进程，也促使互联网企业不断在金融领域寻找突破口。熊津成认为，未来互联网金融与传统金融业将逐渐趋于融合，金融业的分工体系也会进一步完善。由此看来，民营银行已经蓄势待发，他们将给现有的金融格局带来哪些变化，谁也无法准确预测。不过可以预见的是，民营银行的开设必将为百姓金融基本诉求带来更好的服务，为广大中小微企业带来更多的融资途径。虽然刚刚起步的民营银行规模较小，但我们相信民营银行准入有利于以小促大，刺激更大商业银行从内至外进行深刻的变革。一系列的变革最终将是百姓受益。期待第一批民营银行获准后，第二批中能够看到我们本地民营企业发起成立的民营银行，让本地市民切实体验到银行业新格局带来的实惠。民营资本进入银行业的意义 1.1 弥补我国银行资本充足率不足商业银行应具有适当数量的自有资本金。但是，我国商业银行的资本金的充实和积累显得跟不上经营规模的快速增长。民营资本进人银行业是提高银行资本充足率积极、有效的手段。有利于银行自身资本金的充足和规模的扩展。由于中小银行在业务扩展方面往往受到资本充足率限制，而银行负担较高税率和相对较低的利润率，使得其自身积累无法满足发展的需要，

中国银行分类

第一类：国有商业银行中国工商银行" 中国农业银行" 中国银行" 中国建设银行" 第二类：政策性银行国家开发银行中国进出口银行" 中国农业发展银行" 第三类：股份制商业银行交通银行" 中信实业银行" 中国光大银行华夏银行" 深圳发展银行" 广东发展银行" 招商银行" 上海浦东发展银行" 中国民生银行福建兴业银行" 恒丰银行" 第四类：合资银行福建亚洲银行" 华商银行" 华一银行" "青岛国际银行上海巴黎国际银行" 厦门国际银行" 浙江商业银行" 第五类：城市商业银行（1）直辖市和省会城市： "上海银行北京银行（原为北京市商业银行）" 天津市商业银行"

石家庄市商业银行" "太原市商业银行呼和浩特市商业银行"沈阳市商业银行" 长春市商业银行" 哈尔滨市商业银行"南京市商业银行" "杭州市商业银行合肥市商业银行" 福州市商业银行" 南昌市商业银行" 济南市商业银行" 郑州市商业银行" "武汉市商业银行长沙市商业银行" 广州市商业银行" 南宁市商业银行" 成都市商业银行" 贵阳市商业银行" "昆明市商业银行西安市商业银行" 兰州市商业银行" 西宁市商业银行" 银川市商业银行" "乌鲁木齐市商业银行（2）地级市：淄博市商业银行" 深圳市商业银行" 无锡市商业银行" 大连市商业银行" "青岛市商业银行厦门市商业银行" 东莞市商业银行" 锦州市商业银行" 开封市商业银行" 镇江市商业银行" "烟台市商业银行台州市商业银行" 德阳市商业银行" 衡阳市商业银行" 吉林市商业银行" 张家口市商业银行"

温州市商业银行" 南阳市商业银行" 珠海市商业银行" 廊坊市商业银行" 万州市商业银行" "泉州市商业银行洛阳市商业银行" 湛江市商业银行" 桂林市商业银行" 包头市商业银行" 宁波市商业银行" "汕头市商业银行新乡市商业银行" 营口市商业银行" 泰安市商业银行" 绍兴市商业银行" 大庆市商业银行" "齐齐哈尔市商业银行鞍山市商业银行" 丹东市商业银行" 抚顺市商业银行" 阜新市商业银行" 葫芦岛市商业银行" "辽阳市商业银行沧州市商业银行" 秦皇岛市商业银行"唐山市商业银行" 盘锦市商业银行" 大同市商业银行" "安庆市商业银行蚌埠市商业银行" 淮北市商业银行" 马鞍山市商业银行"芜湖市商业银行" 常州市商业银行" "淮安市商业银行连云港市商业银行"苏州市商业银行" 徐州市商业银行" 盐城市商业银行" 扬州市商业银行" "赣州市商业银行九江市商业银行"

银行客户分类问题

评阅编号（由校组委会评阅前进行编号）：

编号专用页评阅编号(由校组委会评阅前进行编号): 评阅记录: 评奖结果:

银行信贷业务问题摘要随着经济的快速发展，银行越来越重视客户的分类，对于银行来说，一个新客户的到来，银行应该针对该客户的信息，判断客户可能的类别，然后采用针对性较强的销售策略，以获得最高的效益。本文就是一个典型的银行客户分类问题，第一问我们运用支持向量机模型把银行客户分成有贷款和无贷款的，把附件bank1中的数据作为训练集，将其中的客户资料进行量化，构造出分类函数) x g f y+ = = x =，把数据 )) wx ( sgn( sgn( (b ) 带进去当1 y时此客户是无贷款的，运用支持向 = = - y时此客户为有贷款的，当1 量机计算出参数w和b，再从附件bank-full中随机抽取10%的数据作为检测集进行检验得到准确率为97.1688%。第二问我们构造决策树模型对有贷款和无贷款的客户进行细分，我们把附件bank1中数据分为有贷款和无贷款的,分别建立决策树。我们只选取年龄、工作、婚姻状况、教育程度、信贷违约、年平均余额这六个属性，把是否信贷违约看做分类标识，先对数据进行量化分类，再分别算出它们的信息增益，根据算出的信息增益值的大小，对属性进行排序确定叶节点画出决策树，把决策树的每一个从根到叶节点的路径作为一个分类，由此我们把有贷款的无贷款的都细分为六类。第三问分为两小问来解答：（1）判断此客户是否可能购买贷款产品，我们任意给出一个客户资料，把客户资料量化后代入第一问中的模型得出1 y，因此 = 我们判断此客户有可能购买贷款产品。（2）建议其购买哪种贷款产品，我们再把客户资料代入第二问中的模型判断出此客户属于有贷款中的第二类，由每类客户的购买建议，我们推荐他购买短期的担保贷款。关键词：分类问题支持向量机决策树信息增益

客户分级分类体系

客户分级分类体系在个人理财版块，汇丰银行以客户的“全面理财总值”为依据，也就是说，我们依据客户和银行之间现金流的总数为标准进行分割。并通过对CRM系统中庞大的客户信息和销售数据进行分析，引入客户的活跃度、客户资产和客户生命周期等维度对他们进行分类，最终得到精准的客户分级分类。汇丰银行通过对客户的持续研究和跟踪调查，把她的客户主要分为以下七类：顶级客户这个类别中的客户的全面理财总值超过10亿港元。他们是汇丰银行卓越理财客户，也是占汇丰银行个人理财部门客户总数5%的那部分价值最高的客户。他们一般有汇丰银行有许多活跃的帐户（使用频率高，近期使用频繁）；使用了汇丰银行一系列的产品和服务；愿意支付贴水费, 把产品推荐给其他人；正处在职业生涯的上升期，他们为银行创造的收入要大于银行为此付出的成本。大型客户这个类别中的客户的全面理财总值超过100万港元。他们也是汇丰银行的卓越理财客户，客户价值位于顶级客户之后，占总人数的15%。他们在汇丰银行有一些活跃的帐户（使用频率低，近期使用不频繁），使用了汇丰银行一些的产品和服务，他们愿意支付的价格极富弹性，正处在职业生涯的上升期，他们为银行创造的收入要大于银行为此付出的成本。中型客户这个类别中的客户的全面理财总值超过2万港元。他们是汇丰银行的运筹理财客户，占总人数的60%，是个人理财部门比例最大的一个客户群。他们在汇丰银行有一些活跃的帐户（使用频率低，近期使用不频繁），他们使用了汇丰银行一系列的产品和服务，他们创造的边际利润不尽如人意，他们不会和汇丰银行发生大笔交易。小型客户这个类别中的客户的全面理财总值在2万港元以下。他们是个人理财部门的常规客户，占总人数的20%。他们在汇丰银行有一些活跃的帐户（使用频率低，近期使用不频繁），他们使用了汇丰银行一系列的产品和服务，他们仅和汇丰银行做小笔交易，他们未来很难创造更高的价值。非活跃客户这些客户的帐户处于“休眠”或者“结清”状态。休眠帐户指那些比如说2年，甚至更长时间都没有使用过的帐户。结清帐户是指客户申请结清的帐户。潜在客户这些客户使用汇丰银行别的部门的产品，比如公司理财。银行内有他们的一些数据，并且和他们保持着联络。怀疑对象这是其他银行的客户。汇丰银行搜集到他们有关数据，但还没有和他们进行联系过。

中国银行业绿色信贷体系

核心提示：目前中国已基本建立以《绿色信贷指引》为核心的绿色信贷制度框架，对银行业金融机构开展节能环保授信和绿色信贷的政策界限、管理方式、考核政策等做出明确规定，确保信贷资金投向低碳、循环、生态领域。本文将回顾中国绿色信贷体系的发展历程，探析绿色信贷体系构成，以及展望未来发展的趋势。钱立华兴业银行环境金融部 2016年8月中国人民银行、财政部、银监会等7部委联合印发的《关于构建绿色金融体系的指导意见》（银发【2016】228号）对绿色金融、绿色金融体系都做了明确的定义，并提出大力发展绿色信贷。目前全球只有中国、巴西和孟加拉国三个国家有正式的绿色信贷统计，同时中国的绿色信贷框架体系处于国际相对领先的位置。据2016年中国银监会绿色信贷新闻发布会公布，2007年以来，银监会陆续出台《节能减排授信工作指导意见》（2007年）、《绿色信贷指引》（2012年）、《能效信贷指引》（2015年），相关政策在国际上均是首创。目前中国已基本建立以《绿色信贷指引》为核心的绿色信贷制度框架，对银行业金融机构开展节能环保授信和绿色信贷的政策界限、管理方式、考核政策等做出明确规定，确保信贷资金投向低碳、循环、生态领域。本文将回顾中国绿色信贷体系的发展历程，探析绿色信贷体系构成，以及展望未来发展的趋势。一、中国绿色信贷体系发展历程近年来，生态文明建设、可持续发展的重大政策，连续出台，绿色发展已经成为国家战略。2012年党的“十八大报告”突出生态文明建设，将生态文明建设纳入建设中国特色社会主义“五位一体”的总体布局；2014年4月，新《环境保护法》颁布，环境立法修法进程加快；2015年5月，中共中央政治局审议通过了《关于加快推进生态文明建设的意见》，将“绿色化”与新型工业化、城镇化、信息化、农业现代化并列，生态文明建设的力度空前；2015年9月，召开的中共中央政治局会议，审议通过了《生态文明体制改革总体方案》，成为生态文明领域改革的顶层设计和部署；2015年10月，十八届五中全会提出“绿色发展”，与创新发展、协调发展、开放发展、共享发展一道，成为指导我国“十三五”时期发展甚至是更为长远发展的科学发展理念和发展方式；2016年3月，两会审查和批准《国民经济和社会发展第十三个五年规划纲要(草案)》，增加了环境质量的考核指标，并首次将PM2.5(细颗粒物)写入指标。这些年，国家还连续出台了《水污染防治行动计划》（俗称“水十条”）、《大气污染防治行动计划》（俗称“气十条”）、《中华人民国大气污染防治法》、《土壤污染防治行动计划》（俗称“土十条”）等众多环境领域的政策、法规、制度、规划等。

中国银行分类-各银行区别

中国银行分类-各银行区别 (1)银行业金融机构包括3家政策性银行(国家开发银行、中国进出口银行、中国农业发展银行)，5家大型商业银行(中、农、工、建、交)，12家股份制商业银行(中信、华夏、招商、深发、光大、民生、浦发、渤海、广发、兴业、恒丰、浙商)，144家城市商业银行，212家农村商业银行，190家农村合作银行，2265家农村信用社，1家邮政储蓄银行，4家金融资产管理公司，40家外资法人金融机构，66家信托公司，127家企业集团财务公司，18家金融租赁公司。4家货币经纪公司，14家汽车金融公司，4家消费金融公司，635家村镇银行，10家贷款公司以及46家农村资金互助社。我国银行业金融机构共有法人机构3800家，从业人员319.8万人。 (2)甘肃银行招聘网就几家规模比较大的银行详细讲述一下它们的不同点：国有四大银行是中国银行、中国建设银行、中国工商银行和中国农业银行。这四个银行总体功能上有很多的相似之处。例如个人金融服务、企业金融服务电子银行等。虽然有很多相似处，但还有很多不同的地方。它们各自负责的区域还是有所不同的。中国银行，其网络银行具有账户查询、网上支付、境外账户管理、银企对接、提供资金汇划、批量委托、授权模式定制、预约付款、定向支付、历史交易查询、代发工资、代理报销、网上支付通关税费、异地报关等功能。中国建设银行，其网络银行具有账户查询、网上缴费、网上支付、账户转账、网上速汇通、银证业务、证券业务、外汇买卖、批量业务等功能。中国工商银行，其网络银行具有账户信息查询、转账付款、在线支付、代理业务、网上支付等功能，提供了动态的演示,使人感到很方便，安全性能是最高的，另外还划分出了电话银行、企业银行、手机银行。中国农业银行，其网站功能相对较简单。其网络银行具有账户查询、账户密码修改、网上临时挂失、转账、网上缴费、网上支付等功能。网上支付分为B2B、B2C。工商银行职能比较键全,因为它不仅包含了三大银行的基本功能,而且还提供了动态的演示,使人感到很方便，安全性能是最高的，要安装补丁和了解安全迷钥，而且工商银行另外划分出了电话银行、企业银行、手机银行!中国银行的网上银行提供的功能也不错，其提供的功能主要是为企业服务的，像代发工资、银企对接等功能;中国建设银行的网上银行提供的功能在个人和企业方面比较平衡，不像中国银行那样偏重其中一方;而中国农业银行的网上银行的功能最少，只开设了一些基本功能，对其它一些功能，如银证转账、网上汇款、网上质押贷款、国债买卖等，尚未开设。点此查看>>

商业银行业务分类大全

商业银行业务分类大全标准化管理部编码-[99968T-6889628-J68568-1689N]

商业银行业务分类大全最通用的分类是：负债业务（商业银行形成资金来源的业务），资产业务（商业银行运用资金的业务），中间业务（银行不需运用自己的资金，代客户承办支付和其他委托事项而收取手续费的业务）一、资产业务资产业务，是商业银行的主要收入来源。 1、贷款(放款)业务--商业银行最主要的资产业务 1）信用贷款：信用贷款，指单凭借款人的信誉，而不需提供任何抵押品的贷款，是一种资本贷款。（1）普通借款限额：企业与银行订立一种非正式协议，以确定一个贷款，在限额内，企业可随时得到银行的贷款支持，限额的有效期一般不超过90天。普通贷款限额内的贷款，利率是浮动的，与银行的优惠利率挂钩。（2）透支贷款：银行通过允许客户在其帐户上透支的方式向客户提供贷款。提供这种便利被视为银行对客户所承担的合同之外的“附加义务”。（3）备用贷款承诺：备用贷款承诺，是一种比较正式和具有法律约束的协议。银行与企业签订正式合同，在合同中银行承诺在指定期限和限额内向企业提供相应贷款，企业要为银行的承诺提供费用。（4）消费者贷款：消费者贷款是对消费个人发放的用于购买耐用消费品或支付其他费用的贷款，商业银行向客户提供这种贷款时，要进行多方面的审查。（5）票据贴现贷款：票据贴现贷款，是顾客将未到期的票据提交银行，由银行扣除自贴现日起至到期日止的利息而取得现款。 2）抵押贷款：抵押贷款有以下几种类型（1）存货贷款。存货贷款也称商品贷款，是一种以企业的存贷或商品作为抵押品的短期贷款。（2）客帐贷款。银行发放的以应收帐款作为抵押的短期贷款，称为“客帐贷款”。这种贷款一般都为一种持续性的信贷协定。（3）证券贷款。银行发放的企业借款，除以应收款和存货作为抵押外，也有不少是用各种证券特别是公司企业发行的股票和债券作押的。这类贷款称为“证券贷款”。（4）不动产抵押贷款。通常是指以房地产或企业设备抵押品的贷款。 3）保证书担保贷款：保证书担保贷款，是指由经第三者出具保证书担保的贷款。保证书是保证为借款人作贷款担保，与银行的契约性文件，其中规定了银行和保证人的权利和义务。

商业银行业务分类大全

银行客户分类方案-银行客户活动方案

银行客户分类方案根据总行下发的《ⅩⅩ年公司业务指导意见》，银行将坚定不移的推进以客户为中心的营销战略，进一步优化客户结构，实现客户分层分类营销管理工作。一、客户分类方案（一）分类思路根据总行《ⅩⅩ年公司业务指导意见》、《公司客户管理指导意见》以及工信部下发的《统计上大中小微型企业划分办法》，综合考虑企业类型、行业状况、市场地位、信用状况、财务状况、发展潜力、综合贡献度等多种因素，将客户划分为战略客户、优质客户、持续贡献客户以及潜在合作客户，并逐步实现分行管战略客户、优质客户（即分行负责客户高级管理层营销，支行负责日常服务以及基层员工的对接工作），支行管持续贡献客户、潜在合作客户（即支行负责企业整体的营销工作，分行对支行进行辅助营销）的服务营销范围体制。（二）分类标准 1.战略客户是指垄断能力强、品牌价值高、市场覆盖面广、风险较小、与我行具有长期高端战略合作的伙伴关系、对我行业务发展和功能完善具有较强的支撑和拉动效应的客户。