Chapter 4: Getting Started with Python & Turi Create

In this chapter, you’ll get a quick primer on Python. You’ll learn how to set up your Python environment using Conda, and how to install external libraries. You’ll also learn how to run and use Jupyter notebooks to iterate quickly with Python.

Outline

Python

  • Python is the dominant programming language used for data science and machine learning.

  • The Python community has built many libraries and tools to support data science and machine learning development. These include:

    • Data science libraries: Matplotlib, NumPy, Pandas, SciPy

    • Machine learning libraries: Caffe2, Keras, Microsoft Cognitive Toolkit, TensorFlow, Theano, scikit-learn

    • ML-as-a-Service: Amazon Machine Learning, Google ML Kit, IBM Watson, Microsoft Azure Machine Learning Studio, Turi Create

    • Tools: coremltools, virtualenv, pip, Anaconda, Docker, Jupyter notebooks, Google Colaboratory

Packages and environments

  • Working on machine learning projects requires integrating the correct versions of numerous software libraries, also known as “packages”.

  • Most people create environments where they install specific versions of Python and the packages they need.

    • the environment manager virtualenv

    • the package manager pip

Conda

  • Conda handles Python language versions, Python packages, and associated native libraries.

    • Anaconda: Includes all of the standard packages needed for machine learning

Installing Anaconda

  • Go to https://www.anaconda.com/download/#macos and download the Python 3.7 version.

  • If prompted with Change Install Location..., select Install for me only.

    • If it says you can’t install it there, click the Install on a specific disk... button, then click back to the Home button

Using Anaconda navigator

  • Anaconda comes with a desktop GUI that you can use to create environments and install packages in an environment

  • From within Finder, locate and start ~/anaconda3/Anaconda Navigator.

  • Select the Environments tab to see the base (root) environment.

  • There are three ML packages needed that aren’t in the base environment:

    • Keras: A high-level toolkit for building neural networks that works with TensorFlow, Theano, and Microsoft Cognitive Toolkit.

    • TensorFlow: Google’s library for building computational graphs.

    • Turi Create: Apple’s ML-as-a-Service framework.

Setting up a base ML environment

  • For a quicker start, import mlenv.yaml into the Navigator.

Python libraries for data science

  • Begin by creating a custom base environment for ML, with NumPy, Pandas, Matplotlib, SciPy and scikit-learn.

    • NumPy: Functions for working with multi-dimensional arrays.

    • Pandas: Data structures and data analysis tools.

    • Matplotlib: 2D plotting library.

    • Seaborn: Statistical data visualization library.

    • SciPy: Modules for statistics, optimization, integration, linear algebra, Fourier transforms, and more, using NumPy arrays.

    • scikit-learn: Machine learning library.

  • Create a new environment named mlenv, with Python 3.6

  • Add the scikit ML libraries:

    • change Installed to Not installed, search for scikit, and check the checkboxes next to scikit-image and scikit-learn

  • Add Seaborn library

Adding Jupyter to base ML environment

  • Select the Home Tab. Notice the Applications on field contains mlenv, and every app displays an Install button

  • Click the Jupyter Notebook Install button

An important note about package versions

  • There is no need to blindly update to the latest version.

    • If your code works fine and you don’t need any of the new features or essential bug fixes, then keep your Python installation stable and only update your packages when you have a good reason.

Jupyter notebooks

  • With Jupyter notebooks, which are a lot like Swift Playgrounds, you can write and run code, and you can write and render markdown to explain

Starting Jupyter

  • In Anaconda Navigator’s Home tab, with mlenv selected, click the Jupyter Launch button.
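To confirm that a notebook is really running inside mlenv, a cell along these lines can help. This is a minimal sketch using only the standard library. The function name describe_environment is hypothetical, and the sketch assumes Conda sets the CONDA_DEFAULT_ENV variable when an environment is activated, which may not hold if the kernel was started some other way.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter running this code."""
    return {
        # Name of the active Conda environment, or None if the variable is unset.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Path of the Python executable; inside mlenv it should point into that environment.
        "executable": sys.executable,
        # Python version as a "major.minor.micro" string.
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If conda_env prints None or a different name, the notebook was probably launched from another environment.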

Pandas and Matplotlib

  • Keyboard shortcuts

    • Help -> Edit Keyboard Shortcuts

    • Shift-Tab-Tab: see documentation

Differences between Python and Swift

  • A major syntax difference between Python and most other programming languages is the importance of indentation. With Python, indentation replaces {} to define blocks.
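A small illustration of indentation defining blocks. The function name and the values are made up for this sketch.

```python
def classify(number):
    # The indented lines below form the function body; no braces are needed.
    if number % 2 == 0:
        # This block runs only when the condition is true.
        label = "even"
    else:
        label = "odd"
    # Back at the function's indentation level, so this runs in both cases.
    return label


for n in range(3):
    # The loop body is whatever is indented under the "for" line.
    print(n, classify(n))
```

Changing the indentation of a line changes which block it belongs to, so inconsistent indentation is a syntax error rather than a style issue.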

Transfer learning with Turi Create

  • In this section, you’ll create the same HealthySnacks model as the previous chapter, except this time, you’ll use Turi Create

Creating a Turi Create environment

  • Clone the mlenv environment to create turienv, then install turicreate in the new environment.

  • Anaconda doesn’t know about turicreate, so you’ll have to pip install it from within Terminal.

List pip-installed packages

  • In Terminal, use this command to list all of the packages in the active environment or a specific package:
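From inside Python, for example in a notebook cell, a similar listing can be sketched with the standard library's importlib.metadata module. This assumes Python 3.8 or later, which is newer than the Python 3.6 environment created earlier. The helper names installed_packages and package_version are hypothetical, and this is a Python-side approximation rather than the Terminal command itself.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_packages():
    """Return a sorted list of (name, version) pairs for the current environment."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


def package_version(name):
    """Return the installed version of one package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


for name, version in installed_packages():
    print(f"{name}=={version}")

# Look up a single package; prints None if it is not installed here.
print("turicreate:", package_version("turicreate"))
```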

Turi Create notebook

  • Launch the notebook.

Let’s do some training

Validation

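  • At around iteration 20, the gap between train_accuracy and validation_accuracy keeps growing, which indicates that the model has started overfitting. From observation, the two values are closest at around iteration 15, so the model from iteration 15 is the one to use. However, Turi Create doesn't keep the model from each iteration, so the only option is to set the number of iterations to 15 and train again. This is where Keras has an advantage: it can save the model at each iteration, so there is no need to retrain.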
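Choosing a stopping point from the training log can be sketched in plain Python. This sketch uses a common heuristic, picking the iteration with the highest validation accuracy, which is a slightly different rule from eyeballing the train/validation gap. The accuracy values are made-up numbers for illustration, not Turi Create output, and both function names are hypothetical.

```python
def accuracy_gaps(train_accuracy, validation_accuracy):
    """Return train minus validation accuracy for each iteration."""
    return [t - v for t, v in zip(train_accuracy, validation_accuracy)]


def best_iteration(validation_accuracy):
    """Return the 1-based iteration with the highest validation accuracy.

    Ties go to the earliest iteration, since further training did not help.
    """
    best_index = max(
        range(len(validation_accuracy)),
        key=lambda i: (validation_accuracy[i], -i),
    )
    return best_index + 1


# Hypothetical numbers for illustration only; they are not real training output.
train = [0.70, 0.80, 0.86, 0.90, 0.94, 0.97]
valid = [0.68, 0.77, 0.82, 0.84, 0.84, 0.83]

# A steadily growing gap is the overfitting signal described in the note.
print("gaps:", [round(g, 2) for g in accuracy_gaps(train, valid)])
print("best iteration:", best_iteration(valid))
```

The returned iteration number is the value you would then use as the iteration limit when retraining.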

Testing

  • Accuracy, precision and recall are all similar to the final validation accuracy of the model.

    • Unlike Create ML, Turi Create gives only overall values for precision and recall, and you need some code to get precision and recall for each class

  • The confusion matrix shows only the first 10 rows: the model mistook one “ice cream” image for “candy,” three “apple” images for “banana,” etc. Presented this way, it doesn’t look much like a matrix.
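Both points can be sketched in plain Python: pivoting confusion rows into a grid, and deriving precision and recall for each class from it. The sketch assumes the rows are available as (target, predicted, count) tuples. That layout, the function names, and the counts are assumptions made for illustration; nothing here calls the Turi Create API.

```python
def build_matrix(rows):
    """Turn (target, predicted, count) rows into a nested dict matrix[target][predicted]."""
    labels = sorted({label for target, predicted, _ in rows for label in (target, predicted)})
    matrix = {t: {p: 0 for p in labels} for t in labels}
    for target, predicted, count in rows:
        matrix[target][predicted] += count
    return labels, matrix


def per_class_metrics(labels, matrix):
    """Return {label: (precision, recall)} computed from the confusion matrix."""
    metrics = {}
    for label in labels:
        true_positive = matrix[label][label]
        predicted_total = sum(matrix[t][label] for t in labels)  # column sum
        actual_total = sum(matrix[label].values())               # row sum
        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / actual_total if actual_total else 0.0
        metrics[label] = (precision, recall)
    return metrics


# Hypothetical counts for illustration only.
rows = [
    ("apple", "apple", 8), ("apple", "banana", 2),
    ("banana", "banana", 9), ("banana", "apple", 1),
]
labels, matrix = build_matrix(rows)

# Print the counts as a grid: one row per true label, one column per predicted label.
print("".ljust(8) + "".join(label.ljust(8) for label in labels))
for target in labels:
    print(target.ljust(8) + "".join(str(matrix[target][p]).ljust(8) for p in labels))

for label, (precision, recall) in per_class_metrics(labels, matrix).items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Precision divides the diagonal entry by its column total (everything predicted as that class), while recall divides it by its row total (everything that truly belongs to that class).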

Exporting to Core ML

Shutting down Jupyter

  • In the Terminal window that ran jupyter_mac.command ; exit;, press Control-C-C to stop the server.

Deactivating the active environment

Useful Conda commands

Docker and Colab

  • There are two other high-level tools for supporting machine learning in Python: Docker and Google Colaboratory.

Docker

  • Docker images can be useful for sharing pre-defined environments with colleagues or peers, but at some point they require an understanding of how to write Docker images.

Google Colaboratory

Key points

  • Get familiar with Python. Its widespread adoption with academics in the machine learning field means if you want to keep up to date with machine learning, you’ll have to get on board.

  • Get familiar with Conda. It will make working with Python significantly more pleasant. It allows you to try Python libraries in a controlled environment without damaging any existing environment.

  • Get familiar with Jupyter notebooks. Like Swift playgrounds, they provide a means to quickly test all things Python, especially when used in combination with Conda.
