Commit 16e5e0e8 authored by Sylvain Marchienne
Notebook python, numpy, matplotlib

parent 81d1a4e8
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Api H19: TP1 - Instructions\n",
"This notebook tutorial is adapted (and shortened) from a very popular Data Science course: http://cs231n.github.io/. Its goal is to teach you the basics of Python and Numpy. Some explanations may be difficult; feel free to ask your tutors for help. Try to understand the code cells thoroughly with the help of the comments. This notebook should take you about 1.5 hours to complete. If it takes longer, stop and move on to the next lab (TP).\n",
"\n",
"\n",
"Numpy (scientific computing) and Matplotlib (plotting) are the main tools for doing Data Science in a Python environment. You will hear about them everywhere!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CS228 Python Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python is a great general-purpose programming language on its own, but with the help of a few popular libraries (numpy, scipy, matplotlib) it becomes a powerful environment for scientific computing.\n",
"\n",
"We expect that many of you will have some experience with Python and numpy; for the rest of you, this section will serve as a quick crash course both on the Python programming language and on the use of Python for scientific computing.\n",
"\n",
"Some of you may have previous knowledge in Matlab, in which case we also recommend the numpy for Matlab users page (https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we will cover:\n",
"\n",
"* Basic Python: Basic data types, Containers (Lists, Dictionaries, Sets, Tuples), Functions, Classes\n",
"* Numpy: Arrays, Array indexing, Datatypes, Array math, Broadcasting"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Basics of Python\n",
"If you already know Python well, you can skip this part and go straight to Numpy, although refreshing your memory never hurts!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Python versions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are currently two different supported versions of Python, 2.7 and 3. Somewhat confusingly, Python 3.0 introduced many backwards-incompatible changes to the language, so code written for 2.7 may not work under 3.x and vice versa. For this class all code will use Python 3."
]
},
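{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustration of these incompatibilities, here is a minimal sketch (the two examples and the names `true_div` and `floor_div` are chosen for illustration and are not an exhaustive list). In Python 3, `print` is a function and `/` always performs true division, whereas Python 2.7 treated `print` as a statement and truncated `3 / 2` to `1`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"print(sys.version_info.major) # Major version of the running interpreter (3 for this class)\n",
"true_div = 3 / 2 # True division in Python 3\n",
"floor_div = 3 // 2 # Floor division discards the fractional part\n",
"print(true_div) # Prints \"1.5\"\n",
"print(floor_div) # Prints \"1\""
]
},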
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Basic data types"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Numbers"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Integers and floats work as you would expect from other languages:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n"
]
},
{
"data": {
"text/plain": [
"int"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = 3\n",
"print(x)\n",
"type(x)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n",
"2\n",
"6\n",
"9\n"
]
}
],
"source": [
"print(x + 1) # Addition; prints \"4\"\n",
"print(x - 1) # Subtraction; prints \"2\"\n",
"print(x * 2) # Multiplication; prints \"6\"\n",
"print(x ** 2) # Exponentiation; prints \"9\""
]
},
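{
"cell_type": "markdown",
"metadata": {},
"source": [
"Floats support the same operators. A minimal sketch, where the variable `y` and its value are chosen only for illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y = 2.5\n",
"print(type(y)) # Prints \"<class 'float'>\"\n",
"print(y + 1, y * 2, y ** 2) # Prints \"3.5 5.0 6.25\""
]
},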
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Booleans"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols (`&&`, `||`, etc.):"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"False\n",
"<class 'bool'>\n"
]
}
],
"source": [
"t, f = True, False # Assign two variables at once\n",
"print(t)\n",
"print(f)\n",
"print(type(t)) # Prints \"<class 'bool'>\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's look at the operations:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"False\n",
"True\n",
"False\n",
"True\n"
]
}
],
"source": [
"print(t and f) # Logical AND; prints \"False\"\n",
"print(t or f) # Logical OR; prints \"True\"\n",
"print(not t) # Logical NOT; prints \"False\"\n",
"print(t != f) # Logical XOR; prints \"True\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Strings"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hello\n"
]
},
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hello = 'hello' # String literals can use single quotes\n",
"world = \"world\" # or double quotes; it does not matter.\n",
"print(hello)\n",
"len(hello)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hello world 12\n"
]
}
],
"source": [
"hw12 = '%s %s %d' % (hello, world, 12) # sprintf style string formatting\n",
"print(hw12) # prints \"hello world 12\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Containers"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python includes several built-in container types: lists, dictionaries, sets, and tuples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Lists"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A list is the Python equivalent of an array, but is resizeable and can contain elements of different types:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3, 1, 2]\n",
"2\n",
"2\n"
]
}
],
"source": [
"xs = [3, 1, 2] # Create a list\n",
"print(xs)\n",
"print(xs[2])\n",
"print(xs[-1]) # Negative indices count from the end of the list; prints \"2\""
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3, 1, 'foo']\n"
]
}
],
"source": [
"xs[2] = 'foo' # Lists can contain elements of different types\n",
"print(xs)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3, 1, 'foo', 'bar']\n"
]
}
],
"source": [
"xs.append('bar') # Add a new element to the end of the list\n",
"print(xs)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bar\n",
"[3, 1, 'foo']\n"
]
}
],
"source": [
"x = xs.pop() # Remove and return the last element of the list\n",
"print(x)\n",
"print(xs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Slicing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as slicing:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1, 2, 3, 4]\n",
"[2, 3]\n",
"[2, 3, 4]\n",
"[0, 1]\n",
"[0, 1, 2, 3, 4]\n",
"[0, 1, 2, 3]\n"
]
}
],
"source": [
"nums = list(range(5)) # range(5) yields the integers 0 to 4; list() collects them into a list\n",
"print(nums) # Prints \"[0, 1, 2, 3, 4]\"\n",
"print(nums[2:4]) # Get a slice from index 2 to 4 (exclusive); prints \"[2, 3]\"\n",
"print(nums[2:]) # Get a slice from index 2 to the end; prints \"[2, 3, 4]\"\n",
"print(nums[:2]) # Get a slice from the start to index 2 (exclusive); prints \"[0, 1]\"\n",
"print(nums[:]) # Get a slice of the whole list; prints \"[0, 1, 2, 3, 4]\"\n",
"print(nums[:-1]) # Slice indices can be negative; prints \"[0, 1, 2, 3]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can loop over the elements of a list like this:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cat\n",
"dog\n",
"monkey\n"
]
}
],
"source": [
"animals = ['cat', 'dog', 'monkey']\n",
"for animal in animals:\n",
" print(animal)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### List comprehensions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When programming, frequently we want to transform one type of data into another. As a simple example, consider the following code that computes square numbers:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1, 4, 9, 16]\n"
]
}
],
"source": [
"nums = [0, 1, 2, 3, 4]\n",
"squares = []\n",
"for x in nums:\n",
" squares.append(x ** 2)\n",
"print(squares)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can make this code simpler using a list comprehension:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0, 1, 4, 9, 16]\n"
]
}
],
"source": [
"nums = [0, 1, 2, 3, 4]\n",
"squares = [x ** 2 for x in nums]\n",
"print(squares)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"List comprehensions can also contain conditions:"
]
},
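{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of a conditional comprehension, assuming we want to keep only the squares of the even numbers (the name `even_squares` is illustrative). The `if` clause filters the elements before the expression on the left is applied:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"nums = [0, 1, 2, 3, 4]\n",
"even_squares = [x ** 2 for x in nums if x % 2 == 0]\n",
"print(even_squares) # Prints \"[0, 4, 16]\""
]
},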
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Dictionaries (important !)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A dictionary stores (key, value) pairs, similar to a `Map` in Java or an object in JavaScript. You can use it like this:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cute\n",
"True\n"
]
}
],
"source": [
"d = {'cat': 'cute', 'dog': 'furry'} # Create a new dictionary with some data\n",
"print(d['cat']) # Get an entry from a dictionary; prints \"cute\"\n",
"print('cat' in d) # Check if a dictionary has a given key; prints \"True\""
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"wet\n"
]
}
],
"source": [
"d['fish'] = 'wet' # Set an entry in a dictionary\n",
"print(d['fish']) # Prints \"wet\""
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"ename": "KeyError",
"evalue": "'monkey'",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-37-78fc9745d9cf>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0md\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'monkey'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# KeyError: 'monkey' not a key of d\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mKeyError\u001b[0m: 'monkey'"
]
}
],
"source": [
"print(d['monkey']) # KeyError: 'monkey' not a key of d"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"N/A\n",
"wet\n"
]
}
],
"source": [
"print(d.get('monkey', 'N/A')) # Get an element with a default; prints \"N/A\"\n",
"print(d.get('fish', 'N/A')) # Get an element with a default; prints \"wet\""
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"N/A\n"
]
}
],
"source": [
"del d['fish'] # Remove an element from a dictionary\n",
"print(d.get('fish', 'N/A')) # \"fish\" is no longer a key; prints \"N/A\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can find all you need to know about dictionaries in the [documentation](https://docs.python.org/2/library/stdtypes.html#dict)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is easy to iterate over the keys in a dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"name": "stdout",