Data Mining for Business Intelligence: Multiple Linear Regression

Chapter 6: Multiple Linear Regression

Data Mining for Business Intelligence
Shmueli, Patel & Bruce
© Galit Shmueli and Peter Bruce 2010

Topics
• Explanatory vs. predictive modeling with regression
• Example: prices of Toyota Corollas
• Fitting a predictive model
• Assessing predictive accuracy
• Selecting a subset of predictors (variable selection)

Explanatory Modeling
Goal: Explain the relationship between predictors (explanatory variables) and target
• Familiar use of regression in data analysis
• Multiple linear regression: a linear relationship between a dependent variable Y (response) and a set of predictors X1,…,Xp
• Model goal: fit the data well and understand the contribution of explanatory variables to the model; model performance is assessed by residual analysis
• Model fitted to the entire dataset
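
As a rough illustration of residual analysis on a model fitted to the entire dataset, the sketch below fits a one-predictor least-squares line and summarizes its residuals. The function names (`fit_line`, `residuals`, `r_squared`) and the data values are made up for illustration and do not come from the slides; a single predictor is used only to keep the example short.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(intercept, slope, x, y):
    """Observed value minus fitted value for every record."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def r_squared(resid, y):
    """Share of the variation in y accounted for by the fitted line."""
    mean_y = sum(y) / len(y)
    sse = sum(e * e for e in resid)
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Illustrative (made-up) data; the whole dataset is used for fitting.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
resid = residuals(intercept, slope, x, y)
r2 = r_squared(resid, y)
```

With an intercept in the model, the residuals sum to zero up to rounding, so what matters in practice is their pattern (for example, curvature or changing spread) rather than their average.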

Predictive Modeling
Goal: Predict target values in new data where we have predictor values, but not target values
• Classic data mining context
• Model goal: optimize predictive accuracy, i.e., how accurately the fitted model can predict new cases
• Model trained on training data; performance assessed on validation or test data
• Explaining the role of predictors is not the primary purpose (although useful)
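
The training/validation idea can be sketched as a random partition of the records. This is a minimal sketch, not the book's procedure: the helper name `split_train_validation`, the 60/40 split, and the seed are all assumptions chosen for illustration.

```python
import random


def split_train_validation(rows, train_fraction=0.6, seed=1):
    """Randomly partition rows into a training list and a validation list."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder record IDs standing in for rows of a real dataset.
records = list(range(10))
train, valid = split_train_validation(records)
```

The model would then be fitted on `train` only, and its errors measured on `valid`, which the model has not seen.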

Regression Method
• Predict the value of the dependent variable Y based on predictors X1,…,Xp
• Regression coefficients β0, β1, β2,…, βp in the equation:
  Y = β0 + β1X1 + β2X2 + … + βpXp + ε
  where β0 is the intercept and ε is the noise (the part of Y not explained by the predictors)
• Coefficients estimated via the ordinary least squares (OLS) method
• Estimated using the training sample
• Predictive capacity assessed by prediction results on the validation set (average squared error)
• Assumptions: normality, independence, linearity
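
A minimal sketch of this workflow follows: estimate the coefficients by OLS on a training sample, then compute the average squared error on a validation set. It assumes the model includes an intercept term and solves the normal equations directly in plain Python. The function names and the tiny made-up dataset are illustrative only; a real analysis would normally rely on a statistics library.

```python
def fit_ols(X, y):
    """Return [b0, b1, ..., bp] minimising the sum of squared errors."""
    rows = [[1.0] + [float(v) for v in r] for r in X]   # prepend intercept column
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for i in range(k):
            if i != c:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    return [M[i][k] for i in range(k)]


def predict(coefs, x):
    """Intercept plus the weighted sum of the predictor values."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


def average_squared_error(coefs, X, y):
    """Mean of squared prediction errors over a set of records."""
    return sum((t - predict(coefs, x)) ** 2 for x, t in zip(X, y)) / len(y)


# Made-up, noise-free data generated from Y = 1 + 2*X1 + 3*X2.
X_train = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_train = [9, 8, 19, 18, 26]
X_valid = [[6, 2], [0, 1]]
y_valid = [19, 4]

coefs = fit_ols(X_train, y_train)
ase = average_squared_error(coefs, X_valid, y_valid)
```

Because the illustrative data contain no noise, the fitted coefficients recover the generating values and the validation error is essentially zero; with real data the validation error would be positive and is the quantity to compare across candidate models.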

Example: Prices of Toyota Corolla
ToyotaCorolla.xls
• Goal: Predict sale prices of used Toyota Corollas based on their specification
• Data: Prices of 1442 used Toyota Corollas, with their specification information: age, mileage, fuel type, engine size
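
Fuel type is a categorical predictor, so it has to be turned into numeric 0/1 indicator (dummy) columns before it can enter a linear regression. The sketch below shows one way to do this. The helper name `dummy_code`, the category labels, and the choice of "Petrol" as the baseline are assumptions for illustration, not details taken from the slides.

```python
def dummy_code(values, baseline):
    """Encode a categorical column as 0/1 columns, omitting the baseline level."""
    levels = sorted(set(values) - {baseline})
    coded = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, coded


# Hypothetical fuel-type values for four cars.
fuel = ["Petrol", "Diesel", "CNG", "Petrol"]
levels, coded = dummy_code(fuel, baseline="Petrol")
```

One level is left out on purpose: with an intercept in the model, including an indicator for every level would make the predictors perfectly collinear.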

Data Sample
(showing only the variables to be used in analysis)

Variables Used
Price in
