Probability Exercises With ArviZ: Solutions & Discussion
Let's dive into some probability exercises using Python, ArviZ, and other helpful libraries! This article provides detailed solutions and explanations, making it easier to understand the concepts and code. We'll be using libraries such as math, numpy, pandas, matplotlib, and arviz. So, let's get started!
Prerequisites
Before we begin, make sure you have the necessary libraries installed. You can install them using pip:
pip install arviz matplotlib numpy pandas
Exercise 1: Student Selection
Problem Statement
Imagine we have a group of students from different fields: 8 from Electronics (E), 3 from Systems (S), and 9 from Industrial (I), making a total of 20 students. We want to calculate probabilities for selecting students with and without replacement.
Defining Variables
First, let's define our variables:
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import arviz as az
from math import comb
E, S, I = 8, 3, 9
N = E + S + I # Total of 20 students
Calculating Probabilities
Next, we'll calculate the probabilities for different scenarios, both with and without replacement. Without replacement means once a student is selected, they aren't put back into the pool. With replacement means the selected student is returned to the pool, and can be selected again.
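To make the distinction concrete, here is a minimal sketch using only the standard library. The roster labels ("E1", "S3", and so on) and the seed are illustrative assumptions, not part of the exercise; `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random

# Hypothetical roster: the labels are illustrative only
roster = (
    [f"E{k}" for k in range(1, 9)]      # 8 Electronics students
    + [f"S{k}" for k in range(1, 4)]    # 3 Systems students
    + [f"I{k}" for k in range(1, 10)]   # 9 Industrial students
)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Without replacement: random.sample never returns the same student twice
picked_without = rng.sample(roster, k=3)

# With replacement: random.choices draws independently, so repeats are possible
picked_with = rng.choices(roster, k=3)

print("without replacement:", picked_without)
print("with replacement:   ", picked_with)
```

The first list always contains three distinct students, while the second may contain the same student more than once.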
Without Replacement
In scenarios without replacement, the pool shrinks after each selection. For events where the order of selection does not matter, we count unordered groups with combinations: there are C(20, 3) = 1140 equally likely groups of 3 students.
def ejercicio1():
    resultados = []
    # --- WITHOUT replacement ---
    denom_sin = comb(N, 3)  # number of unordered groups of 3 students
    p_3E_sin = comb(E, 3) / denom_sin if E >= 3 else 0.0
    p_3S_sin = comb(S, 3) / denom_sin if S >= 3 else 0.0
    p_2E_1S_sin = (comb(E, 2) * comb(S, 1)) / denom_sin
    p_al_menos_1S_sin = 1 - comb(N - S, 3) / denom_sin  # complement of "no Systems student"
    p_1cada_sin = (comb(E, 1) * comb(S, 1) * comb(I, 1)) / denom_sin
    # Ordered event: the pool shrinks by one student after each draw
    p_orden_ESI_sin = (E / N) * (S / (N - 1)) * (I / (N - 2))
Explanation of Probabilities (Without Replacement):
- P(3E): Probability that all 3 selected are from Electronics: This is calculated by dividing the number of ways to choose 3 students from the Electronics group by the total number of ways to choose 3 students from all groups.
- P(3S): Probability that all 3 selected are from Systems: Similar to the electronics case, but for the Systems group.
- P(2E, 1S): Probability of selecting 2 from Electronics and 1 from Systems: Calculated by multiplying the combinations of selecting 2 from Electronics and 1 from Systems, then dividing by the total possible combinations.
- P(at least 1S): Probability of selecting at least 1 student from Systems: This is the complement of not selecting any Systems students. We calculate the probability of not selecting any Systems students and subtract it from 1.
- P(1 each): Probability of selecting one student from each field: Calculated by multiplying the combinations of selecting one from each field, and dividing by the total possible combinations.
- P(E, S, I in order): Probability of selecting one from each field in the specific order Electronics, Systems, Industrial: This is calculated by multiplying the individual probabilities of selecting from each group in the specified order.
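As a sanity check on the formulas above, the following sketch enumerates every possible group of 3 students and counts the favorable ones directly. It is a brute-force illustration rather than the article's method; the helper `prob` and the label list `students` are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb

E, S, I = 8, 3, 9
students = ["E"] * E + ["S"] * S + ["I"] * I  # one field label per student
N = len(students)

# All unordered groups of 3 distinct students (selection without replacement)
groups = list(combinations(range(N), 3))

def prob(event):
    """Exact probability of `event` over all equally likely groups of 3."""
    hits = sum(1 for g in groups if event([students[i] for i in g]))
    return Fraction(hits, len(groups))

p_3E = prob(lambda f: f.count("E") == 3)
p_2E_1S = prob(lambda f: f.count("E") == 2 and f.count("S") == 1)
p_at_least_1S = prob(lambda f: "S" in f)
p_one_each = prob(lambda f: set(f) == {"E", "S", "I"})

# The ordered event needs ordered draws, so enumerate permutations instead
ordered = list(permutations(range(N), 3))
hits = sum(1 for t in ordered if [students[i] for i in t] == ["E", "S", "I"])
p_order_ESI = Fraction(hits, len(ordered))

# Each enumerated value should match the corresponding closed-form expression
print(p_3E == Fraction(comb(E, 3), comb(N, 3)))
print(p_at_least_1S == 1 - Fraction(comb(N - S, 3), comb(N, 3)))
print(p_order_ESI * 6 == p_one_each)  # 3! orders of one student per field
```

Using `Fraction` keeps the comparison exact, so any mismatch would indicate a counting error rather than floating-point noise.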
With Replacement
In scenarios with replacement, the pool is the same for every selection, so the draws are independent and the probability of a sequence is the product of the per-draw probabilities.
    # --- WITH replacement ---
    pE, pS, pI = E / N, S / N, I / N  # per-draw probabilities, identical on every draw
    p_3E_con = pE ** 3
    p_3S_con = pS ** 3
    p_2E_1S_con = comb(3, 2) * (pE ** 2) * pS  # 3 positions for the two E draws
    p_al_menos_1S_con = 1 - (1 - pS) ** 3
    p_1cada_con = math.factorial(3) * pE * pS * pI  # 3! orders of E, S, I
    p_orden_ESI_con = pE * pS * pI
Explanation of Probabilities (With Replacement):
- P(3E): Probability that all 3 selected are from Electronics: Calculated by raising the probability of selecting an Electronics student to the power of 3.
- P(3S): Probability that all 3 selected are from Systems: Similar to the electronics case, but for the Systems group.
- P(2E, 1S): Probability of selecting 2 from Electronics and 1 from Systems: There are C(3, 2) = 3 positions in which the two Electronics draws can fall, and each such sequence has probability pE² × pS, giving 3 × pE² × pS.
- P(at least 1S): Probability of selecting at least 1 student from Systems: This is calculated as the complement of not selecting any students from the Systems field.
- P(1 each): Probability of selecting one student from each field: The three fields can appear in 3! = 6 orders, each with probability pE × pS × pI, giving 6 × pE × pS × pI.
- P(E, S, I in order): Probability of selecting one from each field in the specific order Electronics, Systems, Industrial: Calculated by multiplying the probability of selecting each field in the specified order.
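The with-replacement formulas can be checked the same way by enumerating all 20³ = 8000 ordered triples of draws. This is a brute-force sketch under the same setup; `prob` and `students` are illustrative names, not part of the original solution.

```python
from fractions import Fraction
from itertools import product

E, S, I = 8, 3, 9
students = ["E"] * E + ["S"] * S + ["I"] * I  # one field label per student
N = len(students)

# Every ordered triple of draws, repeats allowed: N**3 equally likely outcomes
draws = list(product(students, repeat=3))

def prob(event):
    """Exact probability of `event` over all equally likely ordered triples."""
    hits = sum(1 for d in draws if event(d))
    return Fraction(hits, len(draws))

pE, pS, pI = Fraction(E, N), Fraction(S, N), Fraction(I, N)

p_3E = prob(lambda d: d.count("E") == 3)
p_2E_1S = prob(lambda d: d.count("E") == 2 and d.count("S") == 1)
p_at_least_1S = prob(lambda d: "S" in d)
p_one_each = prob(lambda d: set(d) == {"E", "S", "I"})
p_order_ESI = prob(lambda d: d == ("E", "S", "I"))

# Compare the enumeration with the closed-form expressions
print(p_3E == pE ** 3)
print(p_2E_1S == 3 * pE ** 2 * pS)
print(p_at_least_1S == 1 - (1 - pS) ** 3)
print(p_one_each == 6 * pE * pS * pI)
print(p_order_ESI == pE * pS * pI)
```

Because the draws are independent, the ordered event is exactly one sixth of the "one from each field" event here as well.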
Displaying Results with Pandas and ArviZ
To present our results, we will use Pandas to create a DataFrame and ArviZ to provide a summary. ArviZ helps in exploratory analysis of Bayesian models.
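    # Collect both scenarios under the same event labels
    resultados.append(("Without replacement", {
        "All 3 from Electronics": p_3E_sin,
        "All 3 from Systems": p_3S_sin,
        "2 Electronics and 1 Systems": p_2E_1S_sin,
        "At least 1 Systems": p_al_menos_1S_sin,
        "1 from each field": p_1cada_sin,
        "In order E, S, I": p_orden_ESI_sin
    }))
    resultados.append(("With replacement", {
        "All 3 from Electronics": p_3E_con,
        "All 3 from Systems": p_3S_con,
        "2 Electronics and 1 Systems": p_2E_1S_con,
        "At least 1 Systems": p_al_menos_1S_con,
        "1 from each field": p_1cada_con,
        "In order E, S, I": p_orden_ESI_con
    }))
    return resultados

res1 = ejercicio1()
# Convert the results to a DataFrame
rows = []
for tipo, dic in res1:
    for evento, p in dic.items():
        rows.append({"Type": tipo, "Event": evento, "Probability": p})
df1 = pd.DataFrame(rows)
# Wrap the probability column in an InferenceData object.
# Assumes the ArviZ 0.x signature az.from_dict(posterior=...); arrays are shaped (chain, draw).
idata1 = az.from_dict(posterior={"Probability": df1["Probability"].to_numpy()[None, :]})
print("\n=== EXERCISE 1: Student selection (using ArviZ) ===")
print(df1.to_string(index=False, formatters={"Probability": "{:.6f}".format}))
print("\n📊 ArviZ summary:")
# kind="stats" skips sampling diagnostics, which are not meaningful for these values
print(az.summary(idata1, var_names=["Probability"], kind="stats"))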
The code above first calculates the probabilities for each scenario. Then, it structures the data into a Pandas DataFrame for easy viewing. Finally, it converts the results into an ArviZ InferenceData object so that az.summary can report descriptive statistics. Keep in mind that these values are exact probabilities rather than posterior draws, so the summary is descriptive only.
Exercise 2: Arrangement of Subjects
Problem Statement
Next, consider arranging books on a shelf. Suppose we have 4 Engineering, 6 English, and 2 Physics books, all distinct. We want to find the number of ways to arrange these books under two conditions:
- Each subject's books are grouped together.
- No restrictions.
Calculating Arrangements
To solve this, we'll calculate the arrangements for both conditions.
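def ejercicio2():
    n_ing, n_ingl, n_fis = 4, 6, 2
    # Condition A: 3! orders of the subject blocks times the orders within each block
    formas_a = math.factorial(3) * math.factorial(n_ing) * math.factorial(n_ingl) * math.factorial(n_fis)
    # Condition B: all 12 distinct books in any order
    formas_b = math.factorial(n_ing + n_ingl + n_fis)
    return formas_a, formas_b

formas2_a, formas2_b = ejercicio2()
datos2 = {"Case": ["Each subject together", "No restrictions"],
          "Ways": [formas2_a, formas2_b]}
df2 = pd.DataFrame(datos2)
# Assumes the ArviZ 0.x signature az.from_dict(posterior=...); arrays are shaped (chain, draw).
idata2 = az.from_dict(posterior={"Ways": df2["Ways"].to_numpy(dtype=float)[None, :]})
print("\n=== EXERCISE 2: Book arrangement ===")
print(df2.to_string(index=False))
print("\n📊 ArviZ summary:")
print(az.summary(idata2, var_names=["Ways"], kind="stats"))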
Explanation:
- Condition A (Each subject grouped): We treat each subject as a block. There are 3! ways to arrange the blocks. Within each block, there are n! ways to arrange the books, where n is the number of books for that subject. The total is 3! × 4! × 6! × 2! = 207,360.
- Condition B (No restrictions): We arrange all 12 books in a single row without any restrictions. Since the books are distinct, the number of arrangements is the factorial of the total number of books: 12! = 479,001,600.
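The block argument for Condition A can be verified by brute force on a smaller shelf, where listing every arrangement is cheap. The sketch below assumes a hypothetical shelf of 2 Engineering, 2 English, and 1 Physics book; the helper `count_grouped` is introduced here only for illustration.

```python
from itertools import groupby, permutations
from math import factorial

def count_grouped(counts):
    """Return (arrangements with each subject contiguous, all arrangements)."""
    # Distinct books are modelled as (subject, index) pairs
    books = [(subject, k) for subject, n in counts.items() for k in range(n)]
    grouped = 0
    for order in permutations(books):
        # Collapse the shelf into consecutive runs of the same subject
        runs = [subject for subject, _ in groupby(order, key=lambda b: b[0])]
        # Each subject is contiguous exactly when no subject starts two runs
        if len(runs) == len(set(runs)):
            grouped += 1
    return grouped, factorial(len(books))

small_shelf = {"Engineering": 2, "English": 2, "Physics": 1}  # hypothetical sizes
grouped, total = count_grouped(small_shelf)

# Block formula: 3! orders of blocks times the orders within each block
formula = factorial(3) * factorial(2) * factorial(2) * factorial(1)
print(grouped, formula, total)
```

The same reasoning scales to the 4, 6, and 2 books of the exercise, where enumerating all 12! arrangements would be impractical and the formula is the sensible route.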
Conclusion
These exercises demonstrate how to use Python, Pandas, and ArviZ to solve and analyze probability problems. By structuring the data into DataFrames and using ArviZ for summaries, we can gain better insights into the results. Remember, probability is a powerful tool, and mastering these concepts can be highly beneficial in various fields. Keep practicing, and you'll become a pro in no time! Happy coding! 🚀