Published: O'Reilly & Assoc, January 2013
ISBN: 9781449321888
Format: Softcover, 250 pages
Dimensions: 23.3cm × 17.8cm × 1.4cm

Bad Data Handbook: Cleaning Up the Data So You Can Get Back to Work

Welcome to data science's dirty secret: real-world data is messy. Data scientists must spend a good deal of time playing software developer, writing code to clean up data before they can actually do anything constructive with it. It's a necessary evil, but you can still make the most of it.

This practical book walks you through several real-world examples to demonstrate the theory and practice behind working with and cleaning up dirty data. No one tool solves all of the problems well. Wise data scientists learn many tools and learn where each one shines. To that end, this book takes a polyglot approach: most examples will involve R and Python, but expect the occasional smattering of Groovy and sed/awk fun.
