Your First Machine Learning Project in Python Step-By-Step


The text that follows is owned by the site above referred.

Here is only a small part of the article, for more please follow the link


Do you want to do machine learning using Python, but you’re having trouble getting started?

In this post you will complete your first machine learning project using Python.

In this step-by-step tutorial you will:

  1. Download and install Python SciPy and get the most useful package for machine learning in Python.
  2. Load a dataset and understand it’s structure using statistical summaries and data visualization.
  3. Create 6 machine learning models, pick the best and build confidence that the accuracy is reliable.

If you are a machine learning beginner and looking to finally get started using Python, this tutorial was designed for you.

Let’s get started!

  • Update Jan/2017: Updated to reflect changes to the scikit-learn API in version 0.18.

Your First Machine Learning Project in Python Step-By-Step

How Do You Start Machine Learning in Python?

The best way to learn machine learning is by designing and completing small projects.

Python Can Be Intimidating When Getting Started

Python is a popular and powerful interpreted language. Unlike R, Python is a complete language and platform that you can use for both research and development and developing production systems.

There are also a lot of modules and libraries to choose from, providing multiple ways to do each task. It can feel overwhelming.

The best way to get started using Python for machine learning is to complete a project.

  • It will force you to install and start the Python interpreter (at the very least).
  • It will given you a bird’s eye view of how to step through a small project.
  • It will give you confidence, maybe to go on to your own small projects.

Beginners Need A Small End-to-End Project

Books and courses are frustrating. They give you lots of recipes and snippets, but you never get to see how they all fit together.

When you are applying machine learning to your own datasets, you are working on a project.

A machine learning project may not be linear, but it has a number of well known steps:

  1. Define Problem.
  2. Prepare Data.
  3. Evaluate Algorithms.
  4. Improve Results.
  5. Present Results.

The best way to really come to terms with a new platform or tool is to work through a machine learning project end-to-end and cover the key steps. Namely, from loading data, summarizing data, evaluating algorithms and making some predictions.

If you can do that, you have a template that you can use on dataset after dataset. You can fill in the gaps such as further data preparation and improving result tasks later, once you have more confidence.

Hello World of Machine Learning

The best small project to start with on a new tool is the classification of iris flowers (e.g. the iris dataset).

This is a good project because it is so well understood.

  • Attributes are numeric so you have to figure out how to load and handle data.
  • It is a classification problem, allowing you to practice with perhaps an easier type of supervised learning algorithm.
  • It is a multi-class classification problem (multi-nominal) that may require some specialized handling.
  • It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a screen or A4 page).
  • All of the numeric attributes are in the same units and the same scale, not requiring any special scaling or transforms to get started.

Let’s get started with your hello world machine learning project in Python.

Leave a Reply

Your email address will not be published. Required fields are marked *