Using simple language and illustrative examples, this book comprehensively covers data management tasks that bridge the gap between raw data and statistical analysis. Rather than focus on clusters of commands, the author takes a modular approach that enables readers to quickly identify and implement the necessary task without having to access background information first. Each section in the chapters presents a self-contained lesson that illustrates a particular data management task via examples, such as creating data variables and automating error checking. The text also discusses common pitfalls and how to avoid them and provides strategic data management advice. Ideal for both beginning statisticians and experienced users, this handy book helps readers solve problems and learn comprehensive data management skills.
Reviews
The author uses a "learning by example" approach in the book. Overall this works well …
—Morteza Marzjarani, The American Statistician, November 2011
Contents
Introduction
Using this book
Overview of this book
Listing observations in this book
Reading and Writing Datasets
Introduction
Reading Stata datasets
Saving Stata datasets
Reading comma-separated and tab-separated files
Reading space-separated files
Reading fixed-column files
Reading fixed-column files with multiple lines of raw data per observation
Reading SAS XPORT files
Common errors reading files
Entering data directly into the Stata Data Editor
Saving comma-separated and tab-separated files
Saving space-separated files
Saving SAS XPORT files
Data Cleaning
Introduction
Double data entry
Checking individual variables
Checking categorical by categorical variables
Checking categorical by continuous variables
Checking continuous by continuous variables
Correcting errors in data
Identifying duplicates
Final thoughts on data cleaning
Labeling Datasets
Introduction
Describing datasets
Labeling variables
Labeling values
Labeling utilities
Labeling variables and values in different languages
Adding comments to your dataset using notes
Formatting the display of variables
Changing the order of variables in a dataset
Creating Variables
Introduction
Creating and changing variables
Numeric expressions and functions
String expressions and functions
Recoding
Coding missing values
Dummy variables
Date variables
Date-and-time variables
Computations across variables
Computations across observations
More examples using the egen command
Converting string variables to numeric variables
Converting numeric variables to string variables
Renaming and ordering variables
Combining Datasets
Introduction
Appending: Append...