10 free (and awesome) data science eBooks Python: Think Python; Python Data Science Handbook. Statistics: Think Stats; Think Bayes; Probabilistic Programming & Bayesian Methods for Hackers. General data science: The following two books are available at a price of your
Using Python to get data from your Clairy natural air purifier (Part 2) In part 1 of this tutorial, we saw how to set up Charles to intercept data from apps. In this part, let's concentrate on what information the Clairy app requests from their
Clairy Using Python to get data from your Clairy natural air purifier (Part 1) When you live in a big city, air pollution can have quite an impact on your health. About 1.5 years ago, the folks at Clairy launched a Kickstarter campaign that targeted this
Spyder-style variable explorer in Jupyter The Spyder IDE has a really nice feature called the variable explorer, where you can inspect all your variables and their contents. This is something I always missed in Jupyter notebooks, but turns
Data Science Book recommendations The signal and the noise In The Signal and the Noise, the New York Times political forecaster Nate Silver, who accurately predicted the results of every single state in the 2012 US election,
Jupyter notebook Jupyter notebook hacks Run the following code blocks in a Jupyter notebook cell to get awesome keyboard shortcuts. Use Ctrl+⇧+⏎ for a new line and unindent. Gives you a fast option to move out of a
How to become a Data Scientist A great talk by Renee Teate, creator of the blog "Becoming a Data Scientist", about the decisions to make on the way to becoming a data scientist. Header photo by Rob Bye / Unsplash
Data viz Visualization gallery Have you ever been sitting in front of an analysis and wondered how to visualize the data in a comprehensible way? The web is here to help. Data Viz Project is a really
CSV csvkit - Command-line tools to handle CSV files If you are a fan of working on the command line, csvkit is a great companion for basic operations on CSV files. This includes converting files to CSV (csv, dbf, fixed, geojson, json,
Jupyter notebook Jupyter Notebook Tip: Multiple Outputs Jupyter notebook (formerly IPython Notebook) is the one tool I use every single day. What's really nice about it is the perfectly formatted and easily comprehensible output it creates: Unfortunately this only works
Podcasts Great Data Science Podcasts A big chunk of your effort in becoming a (better) Data Scientist will always involve sitting in front of your monitor and actually working on projects and training courses. But apart from that, there
Basics Data Science Cheat Sheets (updated) Learning how to code in Python can be overwhelming at the beginning. There are a lot of important packages, like NumPy, pandas and of course Matplotlib, that need to become part of your
Development Workflows for Data Scientists GitHub teamed up with O'Reilly to answer the question "What can data science learn from software development?". To do so, the author, Ciara Byrne, leads us through the practices and
Clean Data Scientist == Data Janitor Each variable is a column, each observation is a row. A tremendous part of the work of a data scientist will always be the process of cleaning and preparing data. Mostly that data
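The "each variable is a column, each observation is a row" rule can be illustrated with a minimal sketch in plain Python. The table, the `to_tidy` helper, and all readings below are hypothetical and made up purely for illustration; they are not taken from the post.

```python
# Hypothetical "wide" table: one row per station, one column per day.
# All values are invented for illustration.
wide = [
    {"station": "A", "mon": 10, "tue": 12},
    {"station": "B", "mon": 7, "tue": 9},
]


def to_tidy(rows, id_key, var_name, value_name):
    """Reshape wide rows so that each output row holds one observation."""
    tidy_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every output row
            tidy_rows.append({id_key: row[id_key], var_name: key, value_name: value})
    return tidy_rows


tidy = to_tidy(wide, id_key="station", var_name="day", value_name="reading")
for record in tidy:
    print(record)
```

In the reshaped result, `station`, `day`, and `reading` are each a column, and every row is a single measurement, which is usually the easier shape to filter, group, and plot.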
Octave Installing Octave 4.2.1 on OS X Sierra While installing Octave 3.8 on a Mac is quite straightforward (see the Octave 3.8.0 installer), newer versions can definitely cause some headaches. The following tips helped
Machine Learning Basic neural network architecture When designing a neural network, there are some basic rules that help you choose a layout. Example classification problem: fruits = {apple, pear, orange, lemon} features = {color, shape} Input nodes The number of
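As a rough sketch of how the fruit example could translate into layer sizes, the snippet below assumes a common convention: one input node per feature (each feature encoded as a single number) and one output node per class. That convention is an assumption for illustration, not necessarily the exact rule the post goes on to state.

```python
# Sets taken from the example classification problem.
fruits = {"apple", "pear", "orange", "lemon"}  # classes to predict
features = {"color", "shape"}                  # inputs describing each fruit

# Assumed convention: one input node per feature, one output node per class.
n_input_nodes = len(features)
n_output_nodes = len(fruits)

print(f"input nodes: {n_input_nodes}, output nodes: {n_output_nodes}")
```

If a feature such as color were one-hot encoded instead of mapped to a single number, the input layer would grow accordingly.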
Whitepaper Statistical Modeling: The Two Cultures Algorithmic modeling vs data modeling In this whitepaper from 2001, Dr. Leo Breiman, the creator of the random forest algorithm, talks about the benefits of a machine learning ("algorithmic modeling") versus
Jupyter notebook Reusing Code in Jupyter Notebooks Magic commands in Jupyter notebooks are a great addition to my workflow. One really useful magic command comes with the ipyext package. "Writeandexecute" allows you to quickly save classes or functions
Cheat Sheet: Probability William Chen's comprehensive 10-page cheat sheet on probability definitely deserves a place on your computer, or even on your desk as a printed version. Based on Harvard's Introduction to Probability course (corresponding
Shortcuts Handy Shortcuts for Programmers Personally, I am a big fan of keyboard shortcuts, and you can find me using many of them in my routine. For advanced users, the following keyboard shortcuts probably aren't anything new. Nevertheless, I