{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5c8980bd",
   "metadata": {},
   "source": [
    "# AOS1 - Assignment\n",
    "## Improving the accuracy and speed of support vector machines\n",
    "\n",
    "Authors: Mathilde Rineau, Rémy Huet\n",
    "\n",
    "### Abstract\n",
    "\n",
    "The paper \"Improving the Accuracy and Speed of Support Vector Machines\" by Burges and Schölkopf investigates methods to improve the speed and accuracy of support vector machines (SVMs).\n",
    "\n",
    "As the authors point out, SVMs are widely used in many applications.\n",
    "They distinguish two kinds of improvement:\n",
    "- improving the generalization performance;\n",
    "- improving the speed in the test phase.\n",
    "\n",
    "The authors propose and combine two methods to improve SVM performance: the \"virtual support vector\" method and the \"reduced set\" method.\n",
    "With these two improvements, they report a machine that is both much faster (22 times) and more accurate (1.1% vs 1.4% error) than the original one.\n",
    "\n",
    "In this assignment, we will implement the first method to test it."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12aaeba6",
   "metadata": {},
   "source": [
    "### First part: tests with a vanilla SVM\n",
    "\n",
    "In this first part, we will use a vanilla SVM on the MNIST dataset with the provided parameters.\n",
    "We will measure the error of the SVM and the duration of the test phase so that we can compare them with the improved version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9f152334",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We will work on the mnist data set\n",
    "# We load it from fetch_openml\n",
    "from sklearn.datasets import fetch_openml\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "855cdb06",
   "metadata": {},
   "source": [
    "We inspect the dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "708c8ea1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We print the characteristics of X and y\n",
    "print(X.shape)\n",
    "print(y.shape)\n",
    "# Values taken by y\n",
    "print(np.unique(y))\n",
    "\n",
    "image = np.reshape(X[0], (28, 28))\n",
    "plt.imshow(image, cmap='gray')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e49be54",
   "metadata": {},
   "source": [
    "The dataset contains 70k samples of 784 features.\n",
    "The classes are 0 to 9 (the digits on the images).\n",
    "\n",
    "The features are the pixels of a 28 x 28 image that we can retrieve using numpy's reshape function.\n",
    "\n",
    "For example, the 1st image is a 5."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60f31892",
   "metadata": {},
   "source": [
    "With our dataset, we can generate a training dataset and a testing dataset.\n",
    "As in the article, we will use 60k samples as training samples and 10k as testing.\n",
    "\n",
    "We split the dataset using the `train_test_split` function from `sklearn`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d3fa1c7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We divide the data set in two parts: train set and test set\n",
    "# According to the recommended values the train set's size is 60000 and the test set's size is 10000\n",
    "from sklearn.model_selection import train_test_split\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    X, y, train_size=60000, test_size=10000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0532cc1",
   "metadata": {},
   "source": [
    "From the article, we retrieve the parameters of the SVM used.\n",
    "We get C = 10, and a polynomial kernel of degree 5.\n",
    "The coefficients `gamma` and `coef0` are set to 1 and 0 respectively.\n",
    "\n",
    "We can now train an SVM with these parameters on the training dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d809fc87",
   "metadata": {},
   "outputs": [],
   "source": [
    "# First, we perform a SVC without preprocessing or improving in terms of accuracy or speed\n",
    "from sklearn.svm import SVC\n",
    "# We use an SVC with the hyperparameter C=10 and a polynomial kernel of degree 5,\n",
    "# according to the article\n",
    "svc = SVC(C=10, kernel='poly', degree=5, gamma=1, coef0=0)\n",
    "svc.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8cf4850",
   "metadata": {},
   "source": [
    "Using the previously trained SVM, we make a prediction on the test dataset.\n",
    "\n",
    "One of the performance measures reported in the article is the speed of the test phase.\n",
    "We therefore measure it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8cb28178",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "# perf_counter is a monotonic clock intended for measuring elapsed time\n",
    "start = time.perf_counter()\n",
    "# We predict the values for our test set\n",
    "y_pred = svc.predict(X_test)\n",
    "end = time.perf_counter()\n",
    "\n",
    "print(f'Elapsed time : {end - start:.1f} s')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90f08e8b",
   "metadata": {},
   "source": [
    "Of course, the prediction time varies between splits of the dataset, computers and executions, but we note that it is close to 70 s.\n",
    "\n",
    "Using `y_test`, the true classes of the `X_test` samples, and `y_pred`, the classes predicted by the SVM, we can compute the confusion matrix and the error to see how good the predictions are."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1248238",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We compute the confusion matrix\n",
    "from sklearn.metrics import ConfusionMatrixDisplay, classification_report, accuracy_score\n",
    "\n",
    "disp = ConfusionMatrixDisplay.from_predictions(y_test, y_pred)\n",
    "disp.figure_.suptitle('Confusion matrix for the vanilla SVM')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba4e38ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We print the classification report\n",
    "print(classification_report(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "947b0895",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We print the accuracy of the SVC and the error rate\n",
    "acc = accuracy_score(y_test, y_pred)\n",
    "\n",
    "print(\"Accuracy: \", acc)\n",
    "print(\"Error rate: \", (1-acc) * 100, \"%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f780139",
   "metadata": {},
   "source": [
    "As before, the values depend on how the data is split into training and testing datasets. We will retain a value of 3.4 % for the error.\n",
    "\n",
    "The method described by the authors relies on the support vectors of the previously trained SVM.\n",
    "We will thus do some inspection on them before going further."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81b09df7",
   "metadata": {},
   "outputs": [],
   "source": [
    "s_vects = svc.support_vectors_\n",
    "\n",
    "print(s_vects.shape)\n",
    "\n",
    "v = s_vects[0]\n",
    "v_index = svc.support_[0]\n",
    "v_class = y_train[v_index]\n",
    "print(v_index)\n",
    "print(v_class)\n",
    "\n",
    "print(f'Class of the first support vector : {v_class}')\n",
    "img = np.reshape(v, (28, 28))\n",
    "plt.imshow(img, cmap='gray')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14cb2622",
   "metadata": {},
   "source": [
    "There are around 7300 support vectors in the SVM.\n",
    "\n",
    "Each support vector is a sample of `X_train`. We can thus retrieve its class using its index on the train dataset, and display it as an image as above."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa1d441f",
   "metadata": {},
   "source": [
    "### Implementing the \"Virtual Support Vectors\" method\n",
    "\n",
    "We will now implement the \"Virtual Support Vectors\" as proposed by the authors.\n",
    "The aim of this method is to add some invariance to the data to make the predictions more robust.\n",
    "\n",
    "For a given trained SVM, we know that the only data relevant for classification are the support vectors.\n",
    "We will thus re-train an SVM, but with different data created from the support vectors.\n",
    "\n",
    "Here, the proposed invariance is a one-pixel shift of the image in one of the four directions.\n",
    "\n",
    "For each support vector, we will shift the image in each of the four directions, and use those images as a new dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9efe219",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Retrieve the indexes of the support vectors\n",
    "sv_indexes = svc.support_\n",
    "\n",
    "# Arrays for storing the data\n",
    "X_vsv = []\n",
    "y_vsv = []\n",
    "\n",
    "for i in sv_indexes:\n",
    "    # Get the support vector and reshape it as image\n",
    "    sv = X_train[i].reshape((28, 28))\n",
    "    sv_class = y_train[i]\n",
    "    # Generate the four one-pixel shifts and flatten them back to 784 features.\n",
    "    # Note: np.roll wraps pixels around the border rather than padding with zeros.\n",
    "    sv_1 = np.roll(sv, 1, axis=0).reshape(784)\n",
    "    sv_2 = np.roll(sv, -1, axis=0).reshape(784)\n",
    "    sv_3 = np.roll(sv, 1, axis=1).reshape(784)\n",
    "    sv_4 = np.roll(sv, -1, axis=1).reshape(784)\n",
    "\n",
    "    # Add them to the dataset\n",
    "    X_vsv.append(sv_1)\n",
    "    X_vsv.append(sv_2)\n",
    "    X_vsv.append(sv_3)\n",
    "    X_vsv.append(sv_4)\n",
    "\n",
    "    # Add the corresponding classes\n",
    "    y_vsv.append(sv_class)\n",
    "    y_vsv.append(sv_class)\n",
    "    y_vsv.append(sv_class)\n",
    "    y_vsv.append(sv_class)\n",
    "\n",
    "X_vsv = np.array(X_vsv)\n",
    "y_vsv = np.array(y_vsv)\n",
    "\n",
    "print(X_vsv.shape)\n",
    "print(y_vsv.shape)\n",
    "\n",
    "im0 = X_vsv[0].reshape((28, 28))\n",
    "im1 = X_vsv[1].reshape((28, 28))\n",
    "im2 = X_vsv[2].reshape((28, 28))\n",
    "im3 = X_vsv[3].reshape((28, 28))\n",
    "\n",
    "print(f'classes : {y_vsv[0]} {y_vsv[1]} {y_vsv[2]} {y_vsv[3]}')\n",
    "\n",
    "# plt.subplots creates its own figure, so no separate plt.figure() call is needed\n",
    "_, axis = plt.subplots(1, 4)\n",
    "axis[0].imshow(im0, cmap='gray')\n",
    "axis[1].imshow(im1, cmap='gray')\n",
    "axis[2].imshow(im2, cmap='gray')\n",
    "axis[3].imshow(im3, cmap='gray')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38b7c7cc",
   "metadata": {},
   "source": [
    "With this new dataset, we can now train an SVM with the same parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab220d66",
   "metadata": {},
   "outputs": [],
   "source": [
    "# note that we can re-fit the same SVC object with the new dataset\n",
    "svc.fit(X_vsv, y_vsv)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "044b30ea",
   "metadata": {},
   "source": [
    "Let's inspect the results of the training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20cd715f",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(svc.support_.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82b7f205",
   "metadata": {},
   "source": [
    "With the \"vanilla SVM\", we got ~7300 support vectors for a training set of 60000 samples, so ~12 % of the dataset.\n",
    "With this training, we notice that most of the data was selected (~67 %).\n",
    "\n",
    "We can now use the trained SVM on the test data to measure the error and the duration of the test phase."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "23b80cd6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# perf_counter is a monotonic clock intended for measuring elapsed time\n",
    "start = time.perf_counter()\n",
    "# We predict the values for our test set\n",
    "y_pred = svc.predict(X_test)\n",
    "end = time.perf_counter()\n",
    "\n",
    "print(f'Elapsed time : {end - start:.1f} s')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "efad7eb6",
   "metadata": {},
   "source": [
    "The time for the SVM to predict the test dataset is about 200 s, which is much longer than with the vanilla SVM.\n",
    "This was expected, because the number of support vectors is larger than for the vanilla SVM.\n",
    "\n",
    "Let's now see the error of the predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c9bcd87",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We compute the confusion matrix\n",
    "from sklearn.metrics import ConfusionMatrixDisplay, classification_report, accuracy_score\n",
    "\n",
    "disp = ConfusionMatrixDisplay.from_predictions(y_test, y_pred)\n",
    "disp.figure_.suptitle('Confusion matrix for the SVM trained on virtual support vectors')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe0437c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We print the classification report\n",
    "print(classification_report(y_test, y_pred))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5de4d361",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We print the accuracy of the SVC and the error rate\n",
    "acc = accuracy_score(y_test, y_pred)\n",
    "\n",
    "print(\"Accuracy: \", acc)\n",
    "print(\"Error rate: \", (1-acc) * 100, \"%\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7cee3dd",
   "metadata": {},
   "source": [
    "We can see that the error is smaller than the error of the vanilla SVM (2.2 % vs 3.4 %)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ec4101a",
   "metadata": {},
   "source": [
    "### Conclusion\n",
    "\n",
    "By implementing the \"virtual support vectors\" technique, we were able to add some invariance to the data.\n",
    "This modification improved the accuracy of the SVM.\n",
    "\n",
    "However, most of the new data generated from the support vectors were selected as support vectors for the second machine.\n",
    "The increase in the number of support vectors led to an increase in computation time during the test phase.\n",
    "\n",
    "That is why the authors suggest in the article to use a second technique to create a reduced set of vectors to reduce the computation time in the test phase."
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "78ff2a7d75990e26f7862f23aec114522929670ec71bbfd9a70bdb18a9100993"
  },
  "kernelspec": {
   "display_name": "Python 3.8.10 64-bit ('AOS1-3HAiNONq': pipenv)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}