Data Mining of Chemical Analysis for White Wine Quality

Background
Wine was once viewed as a luxury good, but it is now enjoyed by a wider range of consumers. Wine prices vary considerably with quality, so when wine sellers buy from wine makers they need to understand wine quality, which is affected to some degree by chemical attributes. Sellers who can accurately classify or predict the quality of the samples they receive can select wines better, and this affects their profits. Our goal is therefore to model wine quality based on physicochemical tests and to give wine sellers a reference for selecting wines of high, moderate and low quality.
We downloaded the wine quality data set, which consists of white vinho verde wine samples from the north of Portugal, from the UC Irvine Machine Learning Repository. This white wine data set includes 4898 observations and 12 variables: quality is the dependent variable, and the other 11 attributes (fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, and alcohol) are independent variables.
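As a rough illustration of the layout just described, the sketch below reads a delimited text file into rows and separates the 11 independent variables from quality. It assumes a semicolon-delimited file with a header row that uses the attribute names listed above. The function names and the two sample rows are hypothetical placeholders, not values or code from this report.

```python
import csv
import io

TARGET = "quality"  # the dependent variable named in the report

# Two placeholder rows in the assumed file layout; the numbers are illustrative only.
SAMPLE = (
    "fixed acidity;volatile acidity;citric acid;residual sugar;chlorides;"
    "free sulfur dioxide;total sulfur dioxide;density;pH;sulphates;alcohol;quality\n"
    "7.0;0.27;0.36;20.7;0.045;45;170;1.001;3.0;0.45;8.8;6\n"
    "6.3;0.30;0.34;1.6;0.049;14;132;0.994;3.3;0.49;9.5;5\n"
)


def load_wine_rows(handle, delimiter=";"):
    """Read delimited text with a header row into a list of {column: float} dicts."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [{name: float(value) for name, value in record.items()} for record in reader]


def split_predictors_target(rows, target=TARGET):
    """Separate the independent variables from the dependent variable."""
    predictors = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [int(row[target]) for row in rows]
    return predictors, targets


rows = load_wine_rows(io.StringIO(SAMPLE))
predictors, targets = split_predictors_target(rows)
print(len(rows), "rows,", len(predictors[0]), "predictors, targets:", targets)
```

With the real file, `load_wine_rows(open(path, newline=""))` would be used instead of the in-memory sample, where `path` is wherever the downloaded file was saved.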
Technical summary
1. Data pre-process
The first step in analyzing the data is to pre-process it. First, inspecting the data, we found several outliers and eliminated them. We then found that the independent variables are numerical and that some take values in a narrow range (density, for example, ranges from 0.98 to 1.02), so in the initial analysis we decided not to bin them. We also examined the correlations between variables; since our main aim is prediction, we did not eliminate any variables even though some are correlated. Overall, the only pre-processing we applied was to eliminate several outliers from the data set.
2. Preliminary Models
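Before any model is fit, the pre-processing from step 1 can be sketched as follows. The report does not say how outliers were identified, so the 1.5 x IQR rule used here is an assumption, as are the function names and the made-up sample values. The Pearson coefficient stands in for the correlation check. No binning or variable removal is performed, matching the decisions described above.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_outliers(rows, columns, k=1.5):
    """Keep only rows whose value lies inside the fences for every listed column."""
    fences = {c: iqr_bounds([row[c] for row in rows], k) for c in columns}
    return [
        row for row in rows
        if all(fences[c][0] <= row[c] <= fences[c][1] for c in columns)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    norm_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (norm_x * norm_y)


# Made-up samples: the last row has an implausible density and should be screened out.
densities = [0.990, 0.991, 0.992, 0.993, 0.994, 0.995, 0.996, 0.997, 1.200]
alcohols = [12.5, 12.2, 11.9, 11.5, 11.0, 10.6, 10.1, 9.7, 9.0]
samples = [{"density": d, "alcohol": a} for d, a in zip(densities, alcohols)]

cleaned = drop_outliers(samples, ["density", "alcohol"])
r = pearson([s["density"] for s in cleaned], [s["alcohol"] for s in cleaned])
print(len(samples) - len(cleaned), "row(s) removed; density/alcohol correlation:", round(r, 3))
```

Correlated columns are only reported here, not dropped, because the stated aim is prediction rather than interpretation of individual coefficients.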
We use three models for classification and prediction: multiple linear regression, a classification tree and a neural network.
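To make the first of these models concrete, here is a minimal sketch of multiple linear regression fitted by ordinary least squares through the normal equations. It is written in plain Python for illustration and is not the tool or procedure used in this report. The function names, the choice of two predictors (alcohol and volatile acidity) and the sample values are all assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Return [intercept, coef_1, ..., coef_p] minimising squared error."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply the fitted intercept and slopes to one observation."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Made-up (alcohol, volatile acidity) pairs with made-up quality scores.
toy_rows = [(8.8, 0.27), (9.5, 0.30), (10.1, 0.28), (11.0, 0.23), (12.4, 0.21), (9.9, 0.32)]
toy_quality = [5, 5, 6, 6, 7, 5]

coefs = fit_linear_regression(toy_rows, toy_quality)
print("coefficients:", [round(c, 3) for c in coefs])
print("predicted quality for (10.5, 0.25):", round(predict(coefs, (10.5, 0.25)), 2))
```

The regression output is a continuous score; turning it into high, moderate and low classes would need cut-off values, which are not specified in this part of the report.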
2.1 Multiple linear regression
Based on