Big Data

Introduction to Basic Statistics

Pat Hammett, Ph.D.

2005

Instructor Comments: This document contains an overview of basic probability and statistics. It also includes a practice test at the end of the document. Note: answers to the practice test questions are included in an appendix.

Pat Hammett, University of Michigan

Table of Contents

1. VARIABLES - QUALITATIVE AND QUANTITATIVE
   1.1 Qualitative Data (Categorical Variables or Attributes)
   1.2 Quantitative Data

2. DESCRIPTIVE STATISTICS
   2.1 Sample Data versus Population Data
   2.2 Parameters and Statistics
   2.3 Location Statistics (measures of central tendency)
   2.4 Dispersion Statistics (measures of variability)

3. FREQUENCY DISTRIBUTIONS
   3.1 Frequency Measures
   3.2 Histogram
   3.3 Discrete Histogram
   3.4 Continuous Data Histogram

4. NORMAL DISTRIBUTION
   4.1 Properties of the Normal Distribution
   4.2 Estimating Probabilities Using Normal Distribution
   4.3 Calculating Parts Per Million
