mirror of https://app-learninglab.inria.fr/moocrr/gitlab/da84ababf0696af51bddad556af86353/mooc-rr.git (synced 2026-06-17 09:35:24 +02:00)
English version of M2/exo5 for Python/Orgmode
parent 9a8977a299
commit 14453e5ee7
1 changed file with 217 additions and 0 deletions
module2/exo5/exo5_python-en.org (Normal file, 217 additions)
@@ -0,0 +1,217 @@
#+TITLE: Analysis of the risk of failure of the O-rings on the Challenger shuttle
#+AUTHOR: Arnaud Legrand
#+LANGUAGE: en

#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/htmlize.css"/>
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/readtheorg.css"/>
#+HTML_HEAD: <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
#+HTML_HEAD: <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
#+HTML_HEAD: <script type="text/javascript" src="http://www.pirilampo.org/styles/lib/js/jquery.stickytableheaders.js"></script>
#+HTML_HEAD: <script type="text/javascript" src="http://www.pirilampo.org/styles/readtheorg/js/readtheorg.js"></script>

#+LATEX_HEADER: \usepackage{a4}
#+LATEX_HEADER: \usepackage[english]{babel}

# #+PROPERTY: header-args :session :exports both

On January 27, 1986, the day before the takeoff of the shuttle /Challenger/,
a three-hour teleconference was held between Morton Thiokol (the
manufacturer of one of the engines) and NASA. The discussion focused on
the consequences of the temperature at take-off of 31°F (just below 0°C)
for the success of the flight, and in particular on the performance of
the O-rings used in the engines. Indeed, no test had been performed at
this temperature.

The following study takes up some of the analyses carried out that
night with the objective of assessing the potential influence of the
temperature and pressure to which the O-rings are subjected on their
probability of malfunction. Our starting point is the results of the
experiments carried out by NASA engineers during the six years
preceding the launch of the shuttle /Challenger/.

* Loading the data
We start by loading this data:
#+begin_src python :results value :session *python* :exports both
import numpy as np
import pandas as pd

# Load the O-ring records from the local CSV file, one row per test.
data = pd.read_csv("shuttle.csv")
data
#+end_src

#+RESULTS:
#+begin_example
         Date  Count  Temperature  Pressure  Malfunction
0     4/12/81      6           66        50            0
1    11/12/81      6           70        50            1
2     3/22/82      6           69        50            0
3    11/11/82      6           68        50            0
4     4/04/83      6           67        50            0
5     6/18/82      6           72        50            0
6     8/30/83      6           73       100            0
7    11/28/83      6           70       100            0
8     2/03/84      6           57       200            1
9     4/06/84      6           63       200            1
10    8/30/84      6           70       200            1
11   10/05/84      6           78       200            0
12   11/08/84      6           67       200            0
13    1/24/85      6           53       200            2
14    4/12/85      6           67       200            0
15    4/29/85      6           75       200            0
16    6/17/85      6           70       200            0
17  7/2903/85      6           81       200            0
18    8/27/85      6           76       200            0
19   10/03/85      6           79       200            0
20   10/30/85      6           75       200            2
21   11/26/85      6           76       200            0
22    1/12/86      6           58       200            1
#+end_example

The data set gives, for each test, its date (=Date=), the number of
O-rings (=Count=; there are 6 on the main launcher), the temperature in
degrees Fahrenheit (=Temperature=), the pressure in psi (=Pressure=),
and finally the number of identified malfunctions (=Malfunction=).
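
As a quick sanity check on these columns, the sketch below rebuilds a
few rows of the table by hand and derives two helper columns. The names
=sample=, =ParsedDate= and =TemperatureC= are hypothetical and not part
of the original analysis, and the date format =%m/%d/%y= is an
assumption based on how the dates are printed above. Entries that do
not match this format become =NaT= instead of raising an error.

#+begin_src python :results output :exports both
import pandas as pd

# A few rows copied by hand from the table above (not the full data set).
sample = pd.DataFrame({
    "Date": ["4/12/81", "1/24/85", "7/2903/85", "1/12/86"],
    "Count": [6, 6, 6, 6],
    "Temperature": [66, 53, 81, 58],
    "Pressure": [50, 200, 200, 200],
    "Malfunction": [0, 2, 0, 1],
})

# Assumed format: month/day/two-digit year. Unparseable entries become NaT.
sample["ParsedDate"] = pd.to_datetime(
    sample["Date"], format="%m/%d/%y", errors="coerce"
)

# Convert Fahrenheit to Celsius for readers used to metric units.
sample["TemperatureC"] = (sample["Temperature"] - 32) * 5 / 9

print(sample[["Date", "ParsedDate", "Temperature", "TemperatureC"]])
#+end_src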

* Graphical inspection
Flights without incidents do not provide any information
on the influence of temperature or pressure on malfunction.
We thus focus on the experiments in which at least one O-ring was defective.

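#+begin_src python :results value :session *python* :exports both
# Keep only the tests with at least one malfunction; copy() so that
# adding columns later does not trigger a pandas SettingWithCopyWarning.
data = data[data.Malfunction > 0].copy()
data
#+end_src

#+RESULTS: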
:         Date  Count  Temperature  Pressure  Malfunction
: 1   11/12/81      6           70        50            1
: 8    2/03/84      6           57       200            1
: 9    4/06/84      6           63       200            1
: 10   8/30/84      6           70       200            1
: 13   1/24/85      6           53       200            2
: 20  10/30/85      6           75       200            2
: 22   1/12/86      6           58       200            1

The temperature varies widely, but the pressure is almost always
200 psi, which should simplify the analysis.

How does the frequency of failure vary with temperature?
#+begin_src python :results output file :var matplot_lib_filename="freq_temp_python.png" :exports both :session *python*
import matplotlib.pyplot as plt

plt.clf()
# Observed failure frequency: malfunctioning O-rings over O-rings on board.
data["Frequency"] = data.Malfunction / data.Count
data.plot(x="Temperature", y="Frequency", kind="scatter", ylim=[0, 1])
plt.grid(True)

plt.savefig(matplot_lib_filename)
print(matplot_lib_filename)
#+end_src

#+RESULTS:
[[file:freq_temp_python.png]]

At first glance, the dependence does not look very strong, but let's try to
estimate the impact of temperature $t$ on the probability of O-ring malfunction.

* Estimation of the temperature influence
Suppose that each of the six O-rings is damaged with the same
probability and independently of the others, and that this probability
depends only on the temperature. If $p(t)$ is this probability, the
number $D$ of malfunctioning O-rings during a flight at
temperature $t$ follows a binomial distribution with parameters $n=6$
and $p=p(t)$. To link $p(t)$ to $t$, we will therefore perform a
logistic regression.
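
To make the model concrete, a logistic regression assumes
$p(t) = 1/(1+e^{-(\alpha+\beta t)})$, and the binomial assumption then
gives $P(D=k) = \binom{6}{k} p(t)^k (1-p(t))^{6-k}$. The sketch below
evaluates both formulas with the standard library only. The symbols
$\alpha$ and $\beta$, the function names, and the coefficient values
are assumptions chosen for illustration; they are not fitted values.

#+begin_src python :results output :exports both
import math


def logistic(alpha, beta, t):
    """Inverse logit: probability that one O-ring fails at temperature t."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * t)))


def binomial_pmf(k, n, p):
    """Probability of exactly k failures among n independent O-rings."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical coefficients, for illustration only (not estimated from data).
alpha, beta = 5.0, -0.1
p = logistic(alpha, beta, 60.0)  # here alpha + beta * t = -1

# Distribution of the number D of failing O-rings out of n = 6.
probs = [binomial_pmf(k, 6, p) for k in range(7)]
for k, prob in enumerate(probs):
    print(f"P(D={k}) = {prob:.4f}")
print(f"p(60) = {p:.4f}, total probability = {sum(probs):.4f}")
#+end_src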

#+begin_src python :results value :session *python* :exports both
import statsmodels.api as sm

data["Success"] = data.Count - data.Malfunction
data["Intercept"] = 1

# logit_model=sm.Logit(data["Frequency"],data[["Intercept","Temperature"]]).fit()
# Binomial GLM; the logit link is the default for the Binomial family.
logmodel = sm.GLM(data['Frequency'], data[['Intercept', 'Temperature']],
                  family=sm.families.Binomial()).fit()

logmodel.summary()
#+end_src

#+RESULTS:
#+begin_example
                 Generalized Linear Model Regression Results
==============================================================================
Dep. Variable:              Frequency   No. Observations:                    7
Model:                            GLM   Df Residuals:                        5
Model Family:                Binomial   Df Model:                            1
Link Function:                  logit   Scale:                             1.0
Method:                          IRLS   Log-Likelihood:                -3.6370
Date:                Fri, 20 Jul 2018   Deviance:                       3.3763
Time:                        16:56:08   Pearson chi2:                    0.236
No. Iterations:                     5
===============================================================================
                  coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------
Intercept      -1.3895      7.828     -0.178      0.859     -16.732      13.953
Temperature     0.0014      0.122      0.012      0.991      -0.238       0.240
===============================================================================
#+end_example

The maximum-likelihood estimate of the temperature parameter is 0.0014
and its standard error is 0.122. In other words, we cannot distinguish
any particular impact and we must take our estimates with caution.
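
The last three columns of the summary can be recomputed by hand from
the estimate and its standard error. The sketch below does this with a
normal (Wald) approximation; the variable names are hypothetical, and
because the inputs are the rounded values printed above, the results
only agree with the table up to rounding.

#+begin_src python :results output :exports both
from statistics import NormalDist

# Rounded values read from the Temperature row of the summary table.
coef, std_err = 0.0014, 0.122

# Wald statistic: how many standard errors the estimate is away from 0.
z = coef / std_err

# Two-sided p-value under the standard normal approximation.
normal = NormalDist()
p_value = 2 * (1 - normal.cdf(abs(z)))

# 95% confidence interval: estimate +/- 97.5% normal quantile * std error.
z_crit = normal.inv_cdf(0.975)
lower = coef - z_crit * std_err
upper = coef + z_crit * std_err

print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print(f"95% interval = [{lower:.3f}, {upper:.3f}]")
#+end_src

The interval contains zero and is much wider than the estimate itself,
which is what the sentence above means by "cannot distinguish any
particular impact".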

* Estimation of the probability of O-ring malfunction
The expected temperature on the day of take-off is 31°F. Let's try to
estimate the probability of O-ring malfunction at this temperature from
the model we just built:

#+begin_src python :results output file :var matplot_lib_filename="proba_estimate_python.png" :exports both :session *python*
import matplotlib.pyplot as plt

data_pred = pd.DataFrame({'Temperature': np.linspace(start=30, stop=90, num=121), 'Intercept': 1})
# Pass the columns in the order used for fitting (Intercept, then Temperature).
data_pred['Frequency'] = logmodel.predict(data_pred[['Intercept', 'Temperature']])
data_pred.plot(x="Temperature", y="Frequency", kind="line", ylim=[0, 1])
plt.scatter(x=data["Temperature"], y=data["Frequency"])
plt.grid(True)

plt.savefig(matplot_lib_filename)
print(matplot_lib_filename)
#+end_src

#+RESULTS:
[[file:proba_estimate_python.png]]
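
The predicted curve can also be checked by hand from the two
coefficients printed in the regression summary. The sketch below is
only an illustration: it uses the rounded coefficients from the table
rather than the fitted model object, and the function name
=predicted_probability= is hypothetical.

#+begin_src python :results output :exports both
import math

# Rounded coefficients read from the regression summary above.
intercept, slope = -1.3895, 0.0014


def predicted_probability(temperature):
    """Inverse logit of the linear predictor intercept + slope * temperature."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * temperature)))


# 31°F is the temperature expected at take-off; 53°F and 81°F are the
# coldest and warmest temperatures in the full table.
for temperature in (31, 53, 81):
    print(f"{temperature}°F -> {predicted_probability(temperature):.3f}")

p_31 = predicted_probability(31)
p_81 = predicted_probability(81)
#+end_src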

As expected from the initial data, the temperature has no significant
impact on the probability of failure of the O-rings. It will be about
0.2, as in the tests where at least one O-ring failed. Let's get back
to the initial dataset to estimate the probability of failure:

#+begin_src python :results output :session *python* :exports both
# Reload the full data set, including the tests without any malfunction.
data = pd.read_csv("shuttle.csv")
print(np.sum(data.Malfunction)/np.sum(data.Count))
#+end_src

#+RESULTS:
: 0.06521739130434782

This probability is thus about $p=0.065$. Knowing that there is
a primary and a secondary O-ring on each of the three parts of the
launcher, the probability that both O-rings of a given part fail
is $p^2 \approx 0.00425$. The probability that this happens on at least
one of the three parts is $1-(1-p^2)^3 \approx 1.3\%$. That would
really be bad luck... Everything is under control, so the takeoff can
happen tomorrow as planned.
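
The arithmetic in the paragraph above can be reproduced as follows.
This is a minimal sketch that only restates the calculation under the
independence assumptions made in the text; the totals 9 and 138 are the
sums of the =Malfunction= and =Count= columns of the full table, and
the variable names are hypothetical.

#+begin_src python :results output :exports both
# Totals over the 23 tests of the full table.
malfunctions = 9       # sum of the Malfunction column
o_rings = 23 * 6       # sum of the Count column

# Estimated failure probability of a single O-ring.
p = malfunctions / o_rings

# Both O-rings (primary and secondary) of one part fail, assuming independence.
p_pair = p ** 2

# At least one of the three parts loses both of its O-rings.
p_any = 1 - (1 - p_pair) ** 3

print(f"p = {p:.4f}")
print(f"p^2 = {p_pair:.5f}")
print(f"1 - (1 - p^2)^3 = {p_any:.4f}")
#+end_src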

But the next day, the Challenger shuttle exploded, killing its seven
crew members. The public was shocked and, in the subsequent
investigation, the reliability of the O-rings was questioned. Beyond
the internal communication problems of NASA, which have a lot to do
with this fiasco, the previous analysis includes (at least) a small
problem... Can you find it? You are free to modify this analysis and to
look at this dataset from all angles in order to explain what's wrong.