Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud

Aditya Jadhav, Mahesh Kukreja
E-mail: aditya.jadhav27@gmail.com & mr_mahesh_in@yahoo.co.in

Abstract : In the information industry, huge amounts of data are widely available, and there is a pressing need to turn such data into useful information. Data mining meets this need: it is the exploration and analysis, by automatic or semi-automatic means, of large quantities of data. On a single system with few processors, both the processing speed and the size of the data that can be processed at one time are limited. Both limits can be raised if data mining is carried out in parallel on coordinated systems connected over a LAN. The problem with this solution is that a LAN is not elastic: the number of systems across which the work is distributed cannot be changed to match the size of the data to be processed. Our main aim is to distribute the data to be analyzed across various nodes in the cloud. For optimum data distribution and efficient data mining as per the user's requirements, various algorithms must be implemented.
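The idea of splitting a data set across several nodes and mining the pieces in parallel can be illustrated with a minimal sketch. It is not the authors' implementation: threads stand in for cloud nodes, simple item counting stands in for a data mining algorithm, and the names `split_into_partitions`, `mine_partition`, `parallel_item_counts`, and the sample `transactions` are all hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_nodes):
    """Deal records round-robin into num_nodes partitions (one per node)."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def mine_partition(partition):
    """Map step: count item occurrences within a single partition."""
    counts = Counter()
    for transaction in partition:
        counts.update(transaction)
    return counts


def parallel_item_counts(records, num_nodes):
    """Run the map step on every partition concurrently, then merge the results."""
    partitions = split_into_partitions(records, num_nodes)
    # Assumption: worker threads simulate the nodes; a real deployment
    # would hand each partition to a separate machine.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_counts = list(pool.map(mine_partition, partitions))
    # Reduce step: combine the partial counts from all nodes.
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


# Hypothetical sample data: each record is one market-basket transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk", "butter", "bread"],
    ["milk"],
]
item_counts = parallel_item_counts(transactions, num_nodes=2)
```

Because the merge step only adds up partial counts, the result does not depend on how many nodes the data was split across, which is what makes changing the node count safe.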

3. Elasticity: Computing resources can be rapidly increased or decreased as needed, and released for other uses when they are no longer required.

4. Pay as you go: Users pay only for the resources actually used and only for the time they are used.
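As a rough illustration of elasticity and pay-as-you-go billing, the sketch below sizes a cluster from the amount of data and charges only for the node-hours consumed. The function names, the per-node capacity of 50 GB, and the rate of 0.5 per node-hour are assumptions made for this example and do not come from any particular cloud provider.

```python
import math


def nodes_needed(data_size_gb, gb_per_node):
    """Elasticity: number of nodes to provision for the current data size (at least one)."""
    return max(1, math.ceil(data_size_gb / gb_per_node))


def usage_cost(node_count, hours_used, rate_per_node_hour):
    """Pay as you go: charge only for the nodes and hours actually used."""
    return node_count * hours_used * rate_per_node_hour


# Hypothetical workloads: the cluster grows with the data and shrinks again afterwards.
small_job = nodes_needed(data_size_gb=40, gb_per_node=50)
large_job = nodes_needed(data_size_gb=420, gb_per_node=50)
large_job_cost = usage_cost(large_job, hours_used=2, rate_per_node_hour=0.5)
```

A fixed LAN cluster would have to be sized for the largest job at all times, whereas here the small job is billed for a single node.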

1.2 Virtualization

In computing, virtualization is the creation of a virtual (rather than actual) version of something, such as a hardware platform, an operating system, a storage device, or network resources. Virtualization can be viewed as part of an overall trend in enterprise IT that includes autonomic computing, a scenario in which the IT environment will be able to manage itself based on perceived activity, and utility computing, in which computer processing power is seen as



ISSN (Print): 2278-5140, Volume-1, Issue-2, 2012
