Basic Principles of Learning and Data Mining (UCI CS273a)
Source: Professor Dr. Hans Hofmann, Institut für Statistik und Ökonometrie, Universität Hamburg, FB Wirtschaftswissenschaften, Von-Melle-Park 5, 2000 Hamburg 13.
Two datasets are provided. The original dataset, in the form provided by Professor Hofmann, contains categorical (symbolic) attributes and is distributed as a separate file.
For algorithms that require numeric attributes, Strathclyde University produced a second, numeric file. This file has been edited and several indicator variables added to make it suitable for algorithms that cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integers. This was the form used by StatLog.
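The two conversions mentioned above (indicator variables for unordered codes, integers for ordered codes) can be sketched in plain Python. This is only an illustration, not the procedure Strathclyde University actually used: the helper names `indicator_encode` and `ordinal_encode` are hypothetical, the sample values are made up, and the ordering A171 < A172 < A173 < A174 for attribute 17 is an assumption.

```python
# Sketch: turn symbolic attribute codes into numeric features.
# Helper names and sample values are illustrative assumptions.

def indicator_encode(values):
    """Return (categories, rows) with one 0/1 column per distinct code."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def ordinal_encode(values, order):
    """Map ordered codes to integers 1..len(order), following `order`."""
    rank = {code: i + 1 for i, code in enumerate(order)}
    return [rank[v] for v in values]

# Attribute 15 (housing) has no natural order, so use indicator columns.
housing = ["A152", "A151", "A153", "A152"]  # made-up sample values
cats, rows = indicator_encode(housing)

# Attribute 17 (job) is treated as ordered here, so use integer codes.
job = ["A173", "A171", "A174", "A172"]  # made-up sample values
job_int = ordinal_encode(job, ["A171", "A172", "A173", "A174"])

print(cats)
print(rows)
print(job_int)
```

Indicator columns avoid imposing an artificial order on codes such as housing type, while integer coding keeps the rank information of an ordered attribute in a single column.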
This dataset requires the use of a cost matrix (see below).

                  Predicted
                  1     2
    Actual   1    0     1
             2    5     0

    (1 = Good, 2 = Bad)

The rows represent the actual classification and the columns the predicted classification.
It is worse to class a customer as good when they are bad (5) than it is to class a customer as bad when they are good (1).
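A short sketch of how such a cost matrix might be applied is given here. It assumes the costs are 0 for a correct classification, 1 for classing a good customer as bad, and 5 for classing a bad customer as good, with labels 1 = good and 2 = bad. The function names `total_cost` and `min_cost_label` are hypothetical.

```python
# Sketch: apply a cost matrix indexed as COST[actual][predicted].
# Assumed labels: 1 = good, 2 = bad. Assumed costs: 0 / 1 / 5 / 0.
COST = {
    1: {1: 0, 2: 1},  # actual good: predicted good costs 0, predicted bad costs 1
    2: {1: 5, 2: 0},  # actual bad: predicted good costs 5, predicted bad costs 0
}

def total_cost(actual, predicted):
    """Sum the misclassification cost over paired label sequences."""
    return sum(COST[a][p] for a, p in zip(actual, predicted))

def min_cost_label(p_bad):
    """Pick the label with the lower expected cost, given P(customer is bad)."""
    expected = {
        label: (1 - p_bad) * COST[1][label] + p_bad * COST[2][label]
        for label in (1, 2)
    }
    return min(expected, key=expected.get)

print(total_cost([1, 1, 2, 2], [1, 2, 1, 2]))
print(min_cost_label(0.1), min_cost_label(0.2))
```

Under these assumed costs, predicting "good" has expected cost 5p and predicting "bad" has expected cost 1 - p, where p is the probability that the customer is bad. A cost-minimising classifier therefore predicts "good" only when p is below 1/6.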
Attribute 1: (qualitative) Status of existing checking account
    A11: ... < 0 DM
    A12: 0 <= ... < 200 DM
    A13: ... >= 200 DM / salary assignments for at least 1 year
    A14: no checking account
Attribute 3: (qualitative) Credit history
    A30: no credits taken / all credits paid back duly
    A31: all credits at this bank paid back duly
    A32: existing credits paid back duly till now
    A33: delay in paying off in the past
    A34: critical account / other credits existing (not at this bank)
Attribute 4: (qualitative) Purpose
    A40: car (new)
    A41: car (used)
    A42: furniture/equipment
    A43: radio/television
    A44: domestic appliances
    A45: repairs
    A46: education
    A47: vacation (does not exist?)
    A48: retraining
    A49: business
    A410: others
Attribute 6: (qualitative) Savings account/bonds
    A61: ... < 100 DM
    A62: 100 <= ... < 500 DM
    A63: 500 <= ... < 1000 DM
    A64: ... >= 1000 DM
    A65: unknown / no savings account
Attribute 7: (qualitative) Present employment since
    A71: unemployed
    A72: ... < 1 year
    A73: 1 <= ... < 4 years
    A74: 4 <= ... < 7 years
    A75: ... >= 7 years
Attribute 8: (numerical) Installment rate in percentage of disposable income
Attribute 9: (qualitative) Personal status and sex
    A91: male, divorced/separated
    A92: female, divorced/separated/married
    A93: male, single
    A94: male, married/widowed
    A95: female, single
Attribute 10: (qualitative) Other debtors / guarantors
    A101: none
    A102: co-applicant
    A103: guarantor
Attribute 11: (numerical) Present residence since
Attribute 12: (qualitative) Property
    A121: real estate
    A122: if not A121: building society savings agreement / life insurance
    A123: if not A121/A122: car or other, not in attribute 6
    A124: unknown / no property
Attribute 14: (qualitative) Other installment plans
    A141: bank
    A142: stores
    A143: none
Attribute 15: (qualitative) Housing
    A151: rent
    A152: own
    A153: for free
Attribute 16: (numerical) Number of existing credits at this bank
Attribute 17: (qualitative) Job
    A171: unemployed / unskilled, non-resident
    A172: unskilled, resident
    A173: skilled employee / official
    A174: management / self-employed / highly qualified employee / officer
Attribute 18: (numerical) Number of people being liable to provide maintenance for
Attribute 19: (qualitative) Telephone
    A191: none
    A192: yes, registered under the customer's name
Attribute 20: (qualitative) Foreign worker
    A201: yes
    A202: no
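As an illustration of how the symbolic codes listed above might be turned back into readable labels, the sketch below decodes a few fields of one record. It assumes each record is a whitespace-separated line in which the k-th field holds attribute k, followed by a final class label. The `CODES` and `NAMES` tables cover only three attributes, the function name `decode_record` is hypothetical, and the sample record is made up for illustration rather than taken from the data.

```python
# Sketch: decode selected symbolic codes in one record.
# Assumption: field k (1-based) holds attribute k; the last field is the class.
CODES = {
    15: {"A151": "rent", "A152": "own", "A153": "for free"},
    19: {"A191": "none", "A192": "yes, registered under the customer's name"},
    20: {"A201": "yes", "A202": "no"},
}
NAMES = {15: "housing", 19: "telephone", 20: "foreign worker"}

def decode_record(line):
    """Return readable labels for the attributes covered by CODES."""
    fields = line.split()
    decoded = {}
    for attr, table in CODES.items():
        raw = fields[attr - 1]                       # attributes are numbered from 1
        decoded[NAMES[attr]] = table.get(raw, raw)   # unknown codes pass through
    return decoded

# Made-up record: 20 attribute fields plus a class label.
record = ("A12 12 A32 A43 2500 A61 A73 2 A93 A101 3 "
          "A121 35 A143 A152 1 A173 1 A192 A201 1")
decoded = decode_record(record)
print(decoded)
```

Keeping the lookup tables separate from the parsing logic means further attributes can be added by extending `CODES` and `NAMES` without changing `decode_record`.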
UCI Machine Learning Repository: Statlog (German Credit Data) Data Set.