My Machine Learning Journey: From Python Basics to Building Neural Networks
Table of contents
- The Beginning: Learning Python and NumPy
- Why NumPy?
- Example:
- Deep Diving into Pandas for Data Analysis
- Key Concepts I Learned:
- Example Project:
- Mathematics for Deep Learning
- 1. Linear Algebra:
- 2. Calculus:
- 3. Probability and Statistics:
- 4. Optimization:
- 5. Information Theory:
- The First Big Leap: Implementing a Neural Network in NumPy
- What I Built:
- Key Challenges:
- Code Snippet:
- Results of My First Model
- Next Steps: Mastering Frameworks
- Exploring Datasets and Classification Problems
- Example with make_circles:
- Input Data
- Classification Model Implementation
- Multi-class Classification Problem
- Lessons Learned
- External Resources That Helped Me Learn
- Final Words
Machine Learning (ML) has always fascinated me. The idea of enabling a machine to "learn" and make decisions intrigued me enough to embark on a journey to understand and implement it. Here, I’ll share my experiences, challenges, and what I’ve achieved so far, hoping to inspire and guide those who wish to start their own ML journey.
The Beginning: Learning Python and NumPy
My journey began with mastering Python. It’s the go-to language for ML due to its simplicity and extensive libraries. After grasping Python basics like loops, functions, and object-oriented programming, I moved to NumPy, a powerful library for numerical computing.
Why NumPy?
Machine learning heavily relies on matrix operations and numerical computations. NumPy makes these tasks efficient and straightforward. Some tasks I practiced:
- Creating arrays for dataset representation.
- Performing element-wise operations.
- Manipulating matrices for linear algebra.
Example:
import numpy as np

# Creating a simple dataset
data = np.array([[1, 2], [3, 4], [5, 6]])

# Calculating the column-wise mean
mean = np.mean(data, axis=0)
print("Mean:", mean)  # Mean: [3. 4.]
Deep Diving into Pandas for Data Analysis
Once I became comfortable with NumPy, I explored Pandas. Understanding and preparing data is a significant part of any ML project.
Key Concepts I Learned:
- Handling missing data (see the sketch after the example project).
- Filtering and sorting datasets.
- Grouping and aggregating data.
Example Project:
I worked on a mock sales dataset to analyze trends:
- Found the top-performing products.
- Identified seasonal sales patterns.
import pandas as pd
# Loading dataset
sales_data = pd.read_csv('sales.csv')
# Top-selling products
top_products = sales_data.groupby('Product')['Sales'].sum().sort_values(ascending=False)
print(top_products.head())
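The grouping example above covers aggregation; for the missing-data handling mentioned in the key concepts, here is a short sketch (the DataFrame and its column names are hypothetical):

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Product': ['A', 'B', None, 'C'],
    'Sales': [100, np.nan, 250, 300],
})

# Drop rows where the product name is missing
df = df.dropna(subset=['Product'])

# Fill missing sales values with the column mean
df['Sales'] = df['Sales'].fillna(df['Sales'].mean())
print(df)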
Mathematics for Deep Learning
As I progressed, I realized the importance of mathematics in understanding and building ML models effectively. Here are the key areas I focused on:
1. Linear Algebra:
- Matrix operations, eigenvalues, and eigenvectors.
- Essential for understanding weights, activations, and dimensionality reduction.
2. Calculus:
- Gradients and partial derivatives for backpropagation.
- The chain rule to compute gradients across layers.
3. Probability and Statistics:
- Probability distributions, Bayes' Theorem, and random variables.
- Key for interpreting model outputs and understanding data.
4. Optimization:
- Techniques like Gradient Descent and its variants (Adam, RMSProp); a minimal sketch follows below.
- Learning rate tuning for efficient model training.
5. Information Theory:
- Concepts like entropy and KL-divergence for regularization.
These mathematical tools enabled me to grasp the underlying mechanics of ML models, especially when implementing concepts like CNNs and RNNs.
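As promised above, here is a minimal NumPy sketch of plain gradient descent minimizing the convex function f(w) = (w - 3)^2, whose gradient is 2(w - 3). The starting point, learning rate, and iteration count are arbitrary choices for illustration:

import numpy as np

def f(w):
    return (w - 3) ** 2  # function to minimize; minimum at w = 3

def grad_f(w):
    return 2 * (w - 3)   # analytical gradient

w = 0.0      # arbitrary starting point
alpha = 0.1  # learning rate (illustrative choice)

for step in range(50):
    w -= alpha * grad_f(w)  # step against the gradient

print(w)  # converges towards 3.0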
The First Big Leap: Implementing a Neural Network in NumPy
After mastering data handling, I decided to understand the core of ML—Neural Networks (NNs). Instead of relying on high-level libraries, I built one from scratch using NumPy to appreciate the fundamentals.
What I Built:
- A simple feedforward neural network with two hidden layers.
- ReLU activations for non-linearity and a softmax output layer.
- Gradient descent for optimization.
Key Challenges:
- Understanding backpropagation.
- Debugging errors in matrix dimensions.
Code Snippet:
def init_params():
    W1 = np.random.rand(10, 784) - 0.5  # Input layer to 1st hidden layer
    b1 = np.random.rand(10, 1) - 0.5
    W2 = np.random.rand(10, 10) - 0.5   # 1st hidden layer to 2nd hidden layer
    b2 = np.random.rand(10, 1) - 0.5
    W3 = np.random.rand(10, 10) - 0.5   # 2nd hidden layer to output layer
    b3 = np.random.rand(10, 1) - 0.5
    return W1, b1, W2, b2, W3, b3

def ReLU(Z):
    return np.maximum(Z, 0)

def softmax(Z):
    # Subtract the column max for numerical stability before exponentiating
    expZ = np.exp(Z - np.max(Z, axis=0, keepdims=True))
    return expZ / np.sum(expZ, axis=0, keepdims=True)

def forward_prop(W1, b1, W2, b2, W3, b3, X):
    Z1 = W1.dot(X) + b1
    A1 = ReLU(Z1)     # Activation of 1st hidden layer
    Z2 = W2.dot(A1) + b2
    A2 = ReLU(Z2)     # Activation of 2nd hidden layer
    Z3 = W3.dot(A2) + b3
    A3 = softmax(Z3)  # Output layer
    return Z1, A1, Z2, A2, Z3, A3

def ReLU_deriv(Z):
    return Z > 0  # Boolean mask acts as the 0/1 derivative under multiplication

def one_hot(Y):
    one_hot_Y = np.zeros((Y.size, Y.max() + 1))
    one_hot_Y[np.arange(Y.size), Y] = 1
    return one_hot_Y.T

def backward_prop(Z1, A1, Z2, A2, Z3, A3, W1, W2, W3, X, Y):
    m = Y.size  # number of training examples
    one_hot_Y = one_hot(Y)
    dZ3 = A3 - one_hot_Y
    dW3 = 1 / m * dZ3.dot(A2.T)
    db3 = 1 / m * np.sum(dZ3, axis=1, keepdims=True)
    dZ2 = W3.T.dot(dZ3) * ReLU_deriv(Z2)
    dW2 = 1 / m * dZ2.dot(A1.T)
    db2 = 1 / m * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = W2.T.dot(dZ2) * ReLU_deriv(Z1)
    dW1 = 1 / m * dZ1.dot(X.T)
    db1 = 1 / m * np.sum(dZ1, axis=1, keepdims=True)
    return dW1, db1, dW2, db2, dW3, db3

def update_params(W1, b1, W2, b2, W3, b3, dW1, db1, dW2, db2, dW3, db3, alpha):
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
    W3 = W3 - alpha * dW3
    b3 = b3 - alpha * db3
    return W1, b1, W2, b2, W3, b3
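To tie these functions together, here is a minimal training-loop sketch. It assumes X has shape (784, m) with one column per example and Y holds integer class labels; get_predictions and get_accuracy are small helpers I've added here for illustration:

def get_predictions(A3):
    return np.argmax(A3, axis=0)  # class with the highest softmax probability

def get_accuracy(predictions, Y):
    return np.mean(predictions == Y)

def gradient_descent(X, Y, alpha, iterations):
    W1, b1, W2, b2, W3, b3 = init_params()
    for i in range(iterations):
        Z1, A1, Z2, A2, Z3, A3 = forward_prop(W1, b1, W2, b2, W3, b3, X)
        grads = backward_prop(Z1, A1, Z2, A2, Z3, A3, W1, W2, W3, X, Y)
        W1, b1, W2, b2, W3, b3 = update_params(W1, b1, W2, b2, W3, b3, *grads, alpha)
        if i % 50 == 0:
            print(f"Iteration {i}, accuracy: {get_accuracy(get_predictions(A3), Y):.3f}")
    return W1, b1, W2, b2, W3, b3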
Results of My First Model
A few sample predictions against their true labels:
Prediction: [0] | Label: [0]
Prediction: [3] | Label: [3]
Prediction: [2] | Label: [2]
Next Steps: Mastering Frameworks
Recently, I transitioned to PyTorch for advanced ML. Its flexibility and dynamic computation graphs make it ideal for building custom models. By the end of December 2024, I had:
- Implemented CNNs for image recognition.
- Explored RNNs for sequential data.
Exploring Datasets and Classification Problems
To practice real-world applications, I explored classification problems using synthetic datasets:
Circles Dataset:
- Visualized complex patterns.
- Trained models to classify non-linear data.
Moons Dataset:
- Analyzed how ML handles intertwined data points.
Example with make_circles:
from sklearn.datasets import make_circles

# Make 1000 samples
n_samples = 1000

# Create circles
X, Y = make_circles(n_samples,
                    noise=0.05,
                    shuffle=True,
                    random_state=42)  # Equivalent to setting a random seed

# Visualize the data
import matplotlib.pyplot as plt
plt.scatter(x=X[:, 0],
            y=X[:, 1],
            c=Y,
            cmap=plt.cm.RdYlBu)
plt.show()
Input Data
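Before training on this data, it needs to be turned into tensors and split; here is a small sketch mirroring the blob-data preparation shown later (the 80/20 split is an arbitrary choice):

import torch
from sklearn.model_selection import train_test_split

# Turn the NumPy arrays into float tensors
X_tensor = torch.from_numpy(X).type(torch.float)
Y_tensor = torch.from_numpy(Y).type(torch.float)

# 80/20 train/test split
X_train, X_test, Y_train, Y_test = train_test_split(
    X_tensor, Y_tensor, test_size=0.2, random_state=42)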
Classification Model Implementation
To build the model, I followed the standard PyTorch workflow:
1. Subclass nn.Module (almost every model in PyTorch does this).
2. Create nn.Linear() layers that are capable of handling the shapes of our data.
3. Define a forward() method.
4. Instantiate an instance of our model class and send it to the target device.
Note: the training loop below uses L1 loss and *_regression tensors; before tackling the circles data, I used this architecture as a sanity check on a simple regression dataset (assumed prepared earlier).
import torch
from torch import nn

class ClassificationModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(in_features=1, out_features=512),
            nn.ReLU(),   # Non-linear activation after layer 1
            nn.Linear(in_features=512, out_features=256),
            nn.ReLU6(),  # Non-linear activation after layer 2
            nn.Linear(in_features=256, out_features=512),
            nn.ReLU(),   # Non-linear activation after layer 3
            nn.Linear(in_features=512, out_features=1),
        )

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        return self.model(X)

torch.manual_seed(42)
model_2 = ClassificationModel()

# Loss and optimizer
loss_fn_01 = nn.L1Loss()
optimizer_01 = torch.optim.SGD(model_2.parameters(), lr=0.001)

# Training and testing loops
epochs = 1000
for epoch in range(epochs):
    # Training
    model_2.train()
    y_pred_training = model_2(X_train_regression)
    loss_training = loss_fn_01(y_pred_training, Y_train_regression)
    optimizer_01.zero_grad()
    loss_training.backward()
    optimizer_01.step()

    # Testing
    model_2.eval()
    with torch.inference_mode():
        y_preds_testing = model_2(X_test_regression)
        loss_testing = loss_fn_01(y_preds_testing, Y_test_regression)

    if epoch % 100 == 0:
        print(f"Epoch {epoch} | Training Loss: {loss_training} | Testing Loss: {loss_testing}")
Result After Training the Model
Multi-class Classification Problem
I've also worked with sklearn.datasets.make_blobs, sklearn.datasets.make_moons, and the spiral data creation function from CS231n.
make_blobs
Input data
import torch
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Set hyperparameters for the data
NUM_CLASSES = 4
NUM_FEATURES = 2
RANDOM_SEED = 43

# 1. Create multi-class data with sklearn.datasets.make_blobs()
X_blob, Y_blob = make_blobs(n_samples=1000,
                            n_features=NUM_FEATURES,
                            centers=NUM_CLASSES,
                            cluster_std=1.7,
                            random_state=RANDOM_SEED)

# 2. Turn this data into tensors
X_blob = torch.from_numpy(X_blob).type(torch.float)
Y_blob = torch.from_numpy(Y_blob).type(torch.LongTensor)  # Because CrossEntropyLoss() takes long targets

# 3. Split into training and test data sets
X_blob_train, X_blob_test, Y_blob_train, Y_blob_test = train_test_split(X_blob, Y_blob, test_size=0.2)

# 4. Plot the data
plt.figure(figsize=(10, 7))
plt.scatter(X_blob[:, 0], X_blob[:, 1], c=Y_blob, cmap=plt.cm.RdYlBu)
Output results
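Since CrossEntropyLoss() expects raw logits and long targets (as noted in the code above), a minimal multi-class model for this data could look like the sketch below; the hidden size of 8 and the layer count are arbitrary illustrative choices:

from torch import nn

# Hypothetical multi-class model: 2 input features -> 4 class logits
blob_model = nn.Sequential(
    nn.Linear(in_features=NUM_FEATURES, out_features=8),
    nn.ReLU(),
    nn.Linear(in_features=8, out_features=NUM_CLASSES),  # one logit per class
)

loss_fn = nn.CrossEntropyLoss()       # combines log-softmax and negative log-likelihood
logits = blob_model(X_blob_train)     # shape: (n_samples, NUM_CLASSES)
loss = loss_fn(logits, Y_blob_train)  # targets are long class indices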
make_moons
Input data
from sklearn.datasets import make_moons

N_SAMPLES = 3000
X, Y = make_moons(n_samples=N_SAMPLES,
                  noise=0.5,
                  random_state=RANDOM_SEED)

# Inspect the data
X[:5], Y[:5], X.shape, Y.shape, torch.from_numpy(Y).unique()
Output
Spiral data creation function from CS231n
Input
# Code for creating a spiral dataset, from CS231n
import numpy as np
import matplotlib.pyplot as plt

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

N = 1000  # number of points per class
D = 2     # dimensionality
K = 6     # number of classes
X = np.zeros((N * K, D))            # data matrix (each row = single example)
y = np.zeros(N * K, dtype='uint8')  # class labels
for j in range(K):
    ix = range(N * j, N * (j + 1))
    r = np.linspace(0.0, 1, N)  # radius
    t = np.linspace(j * 4, (j + 1) * 4, N) + np.random.randn(N) * 0.2  # theta
    X[ix] = np.c_[r * np.sin(t), r * np.cos(t)]
    y[ix] = j

# Visualize the data
plt.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap=plt.cm.RdYlBu)
plt.show()
Output
Lessons Learned
- Start Small: Begin with Python and simple libraries like NumPy.
- Visualize Data: Always plot datasets to understand them better.
- Learn Fundamentals: Implement core ML concepts manually before relying on libraries.
- Stay Curious: ML is vast, but tackling one small concept at a time makes it manageable.
Machine Learning (ML) captivated my curiosity, leading me to dive into Python and libraries like NumPy and Pandas for data handling and analysis. This journey involved mastering mathematical foundations crucial for ML, such as Linear Algebra and Calculus, and building a neural network from scratch to grasp core concepts like backpropagation. Transitioning to PyTorch, I explored various datasets, classification problems, and advanced model implementations, continually iterating and experimenting. My path underscores the importance of starting small, visualizing data, understanding ML fundamentals, and staying consistently curious while leveraging external resources for deeper learning.
External Resources That Helped Me Learn
Throughout my journey, I relied on various resources to understand and implement concepts. Here are the ones that guided me:
- Neural Networks Basics: 3Blue1Brown's "Neural Networks" series
- Deep Learning Basics and Implementation: Daniel Bourke's deep learning course
- Mathematics for Machine Learning: Mathematics for Machine Learning by Imperial College London
- PyTorch Tutorials: PyTorch Official Documentation
- Extras: Wikipedia
These resources provided a mix of theory, hands-on practice, and insights into best practices in ML.
Final Words
From understanding data to building neural networks, this journey has been transformative. If you’re just starting, take it step by step. Focus on learning, experimenting, and building projects. Remember, the key is consistency. ML is not just about coding—it’s about thinking, solving problems, and staying curious.
Happy learning!