Advanced Census Data Analysis in Excel

Advanced Census Data Analysis in Excel


Welcome to the video tutorial on using Census
data in Excel. I am Alexandra Barker, Data Dissemination
Specialist with the Census Bureau. During this tutorial, you will learn about
how to perform advanced analysis of Census data using Excel. The vast majority of time Excel as it is provides
you with what you need to access, combine, query, transform, visualize, and otherwise
play with your Census data using the visualizations, formulas, and functions. There are however our advanced tools for statistical
analysis that you can turn on as add-ins. You do not need to download anything! You just need to turn it on. Doing so will give you tools for regression
testing, histograms, ANOVA, correlation and covariance, exponential smoothing, F-Tests,
Z-tests, and T-tests, Fourier analysis, moving averages, rank and percentile summaries, sampling,
and a more sophisticated random number generator than a standard function. But going through all of these statistical
functions and methods is beyond the scope of this training, so I will just give you
a basic idea of how it can be used with a few examples. Will we need both the add-on and do some basic
analysis on income table downloaded from the American FactFinder. So let’s start. The table that you see right now I downloaded
through the “Community Facts” option in American FactFinder. It’s a table that compares the median household
income by state. Once it’s downloaded in Excel, all I have
to remember is to always format as “Table.” So when you click here you can always format
as table and you’ll be able to unlock great features in Excel. To enable now the add-in, we’re gonna click
on “File,” then we’re gonna go to options and choose “Add-ins.” If you do not see the Analysis ToolPak, you
can always go to “Manage Add-ins” and click “Go.” And then you can check the Analysis ToolPak
and click “OK.” So now when you click in “Data” you’ll see
that’s now available for you. We will first look at descriptive statistics
which describe the main features of a data set. They are the basic statistics you would generally
want to start with even if you were doing more sophisticated analysis of data. Descriptive statistics provide, for example,
information about the central tendency, variability, and overall shape of your data in just a few
clicks. First, select your income table. Next, from the “Data” tab, click in “Data
Analysis,” select “Descriptive Statistics,” and click “OK.” You need to select an “Input Range” and for
this example we want it to be all the income from the column “Median Household Income.” Make sure it’s grouped by column for your
output options. You want it to be a new worksheet and I’m
naming it “Descriptive Stats.” It’s important to check “Summary statistics.” After you do this, click “OK.” You will now have a new tab called “Descriptive
Stats.” Again, these are useful for your first glance
at the data to initiate your statistical analysis. Now if you want to tell the number of occurrences
of a value or a range in a data set, your best option is to use the histogram chart. You may recall from school that sometimes
letter grades were distributed in odd boundaries. Chances are good that your teacher used a
histogram to determine those letter grade boundaries and the number of scores that fell
in the range. In this case, we are going to look at how
the median income data is distributed. We’ll create a column of bins where we want
the data to fit. So somewhere in this spreadsheet we need to
enter in separate rows on a column the bin numbers. So let’s do that. Let’s go back to the income table and create
a bin. And our ranges will be 40,000, 50,000, 60,000,
and my last one will be 70,000. I’m just making up these ranges just for this
example. Now let’s go back to the “Data” tab and click
on the “Data Analysis” tool. We’re gonna choose the histogram here in the
bottom and then click OK. Let’s select income data for the “Input Range.” Now let’s select our “Bin Range.” Now I have to select some empty space on the
spreadsheet for the “Output Range.” Now let’s click the “Chart Output” box right
here on the bottom – and don’t forget this option – and just click “OK.” So what you see now is a histogram that we
just created using the Data Analysis tool. So now let’s move on to sampling. The Data Analysis tool makes it super simple
to get samples from a population. Although this is a very small amount of data,
within 51 data points, you can imagine if you are doing research on a data set with
hundreds, thousands, tens of thousands of data points, getting a random sample from
such a large data set – it’s not that easy. So even though this is a trivial example it
is good to know how it’s done. So let’s go to the Data Analysis tool. Let’s choose “Sampling” and click “OK.” Our Input Range – it’s just gonna be the income
again. I want a “Random” sample, so let’s indicate
the number of samples and for that will be 5. And then you have to select an Output Range,
so I need 5 spaces for my output and I need to select 5 different cells. Then click “OK.” Now you have your sample with no complex formula,
no macros, and no repetitive operations required. This concludes our overview on how to conduct
advanced analysis of Census data in Excel. For more video tutorials visit Census.gov
and subscribe to our YouTube channel. And don’t forget to follow us on social media! Thank you!

Leave a Reply

Leave a Reply

Your email address will not be published. Required fields are marked *