Welcome & The Titanic Story
15 minutes🚢 Setting the Scene
It's April 15, 1912. The "unsinkable" RMS Titanic has just struck an iceberg. In the chaos that follows, some passengers will survive while others won't. But who? And why?
You've been hired as a data analyst to investigate the factors that influenced survival. Your findings will be used to improve maritime safety regulations.
📊 Data Preview
Here's a sample of the Titanic dataset you'll be working with:
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | S | |
| 2 | 1 | 1 | Cumings, Mrs. John Bradley | female | 38 | 1 | 0 | PC 17599 | 71.28 | C85 | C |
| 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.93 | S | |
| 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 5 | 0 | 3 | Allen, Mr. William Henry | male | 0 | 0 | 373450 | 8.05 | S |
📌 Note: Notice the missing values in Age (row 5) and Cabin (multiple rows) - you'll need to handle these!
🔍 Live Data Explorer NEW
Filter and sort the full 891-passenger dataset directly in your browser. This is what your Excel sheet will look like once cleaned — use it to spot patterns before you start.
📊 Understanding the Columns
Each column in our dataset tells part of the story:
PassengerId
Unique identifier
Survived
0 = No, 1 = Yes
Pclass
Ticket class (1, 2, 3)
Name
Passenger name
Sex
Gender
Age
Age in years
SibSp
Siblings/Spouses aboard
Parch
Parents/Children aboard
Fare
Ticket price
Embarked
Port (C, Q, S)
💼 Business Context
This dataset structure mirrors real business data:
- Sales Data: CustomerID, Purchase, Product Class, Demographics
- HR Data: EmployeeID, Retained, Department, Age, Salary
- Marketing Data: CampaignID, Converted, Channel, Budget
✅ Checkpoint 1: Module Complete!
Data Detective Challenge
40 minutes🔍 Your First Investigation
Before we clean the data, let's learn to spot issues independently. You have 5 minutes to explore the raw data.
⚠️ IMPORTANT FIRST STEP: Convert CSV to Excel
- Open the downloaded titanic_raw.csv file:
- Double-click the CSV file (it will open in Excel)
- Or right-click → Open with → Excel
- Immediately save as Excel format:
- Go to File → Save As
- Choose location (your workshop folder)
- Change "Save as type" to Excel Workbook (*.xlsx)
- Name it: titanic_workshop.xlsx
- Click Save
- Why this matters:
- CSV files don't support multiple sheets (needed for pivot tables)
- Excel features like slicers require .xlsx format
- Formatting and formulas are better preserved in Excel format
Data Quality Checklist
Missing Values Found In:
Inconsistent Formats In:
Outliers or Odd Values:
Columns We Might Not Need:
🧹 Data Cleaning Workshop
Remove Unnecessary Columns
Delete these columns that won't help our survival analysis:
Name- Column D - Too unique, no analytical valueTicket- Column I (will be H after Name is deleted)Cabin- Column K (will be I after previous deletions)
📋 Detailed Excel Steps:
- Delete Name column first:
- Click on column header D (Name)
- Right-click → Select "Delete"
- Delete Ticket column (now in position H):
- After deleting Name, Ticket shifts from column I to H
- Click column header H
- Right-click → Delete
- Delete Cabin column (now in position I):
- After previous deletions, Cabin shifts to column I
- Click column header I
- Right-click → Delete
- Verify your columns after deletion:
- You should now have: A(PassengerId), B(Survived), C(Pclass), D(Sex), E(Age), F(SibSp), G(Parch), H(Fare), I(Embarked)
- Total of 9 columns remaining
Handle Missing Ages
Replace missing ages with the average age:
📋 Detailed Excel Steps:
- First, calculate the average age:
- Click on any empty cell (e.g., cell K2)
- Type:
=AVERAGE(E:E)(Age is now in column E after deletions) - Press Enter - note this number (≈29.7)
- Create a helper column with filled ages:
- Click on cell J1 (first empty column after Embarked)
- Type:
Age_Filledas header - Click on cell J2
- Enter formula:
=IF(ISBLANK(E2),29.7,E2) - Or use dynamic average:
=IF(ISBLANK(E2),AVERAGE(E:E),E2)
- Copy formula down:
- Click cell J2
- Copy (Ctrl+C)
- Select range J3:J891
- Paste (Ctrl+V)
- Shortcut: Double-click the fill handle (small square at bottom-right of J2)
- Replace original Age column:
- Select column J (Age_Filled)
- Copy (Ctrl+C)
- Click column E (Age)
- Paste Special → Values (Alt+E+S+V)
- Delete column J
Create Age Groups
Add a new column "Age_Group" with categories:
📋 Detailed Excel Steps:
- Add column header:
- Click cell J1 (first empty column after the 9 data columns)
- Type:
Age_Group - Press Enter
- Enter the IF formula:
- Click cell J2
- Type exactly:
=IF(E2<18,"Child",IF(E2<60,"Adult","Senior")) - Note: Using E2 because Age is now in column E
- Press Enter
- Understanding the formula:
- If Age (E2) < 18 → "Child"
- Else if Age (E2) < 60 → "Adult"
- Else → "Senior"
- Apply to all rows:
- Click cell J2
- Move cursor to bottom-right corner (fill handle)
- When cursor becomes a + sign, double-click
- Formula auto-fills to last row with data
Create Survival Labels
Transform 0/1 into readable text:
📋 Detailed Excel Steps:
- Add new column:
- Click cell K1 (next to Age_Group)
- Type:
Survival_Status
- Create formula:
- Click cell K2
- Type:
=IF(B2=1,"Survived","Did Not Survive") - Note: Survived column remains in B
- Press Enter
- Fill down:
- Double-click the fill handle in K2
- Or select K2:K891 and press Ctrl+D
Create Ports of Embarkation
Transform C/S/Q into readable text:
📋 Detailed Excel Steps:
- Add new column:
- Click cell L1 (next to Survival_Status)
- Type:
Port
- Create formula:
- Click cell L2
- Type:
=IF(I2="C","Cherbourg",IF(I2="S","Southampton","Queenstown")) - Note: Embarked column remains in I
- Press Enter
- Fill down:
- Double-click the fill handle in L2
- Or select L2:L891 and press Ctrl+D
Remove Empty Rows (Embarked)
Find and remove rows with missing Embarked values:
📋 Detailed Excel Steps:
- Apply filter:
- Select all data (Ctrl+A)
- Data tab → Filter (or Ctrl+Shift+L)
- Filter for blanks in Embarked column:
- Click dropdown arrow in column I (Embarked)
- Note: Embarked is now in column I after deletions
- Uncheck "Select All"
- Check only "(Blanks)"
- Click OK
- Delete blank rows:
- Select all visible rows (the filtered blanks)
- Right-click on row numbers → Delete Row
- Remove filter:
- Data → Clear (or click Filter button again)
Verify Your Column Structure
After all cleaning steps, your columns should be:
| Column | Content | Data Type |
|---|---|---|
| A | PassengerId | Number |
| B | Survived | 0 or 1 |
| C | Pclass | 1, 2, or 3 |
| D | Sex | male/female |
| E | Age | Number (no blanks) |
| F | SibSp | Number |
| G | Parch | Number |
| H | Fare | Decimal |
| I | Embarked | C, Q, or S |
| J | Age_Group | Child/Adult/Senior |
| K | Survival_Status | Survived/Did Not Survive |
| L | Port | Cherbourg/Southampton/Queenstown |
✅ Checkpoint 2: Data Cleaned!
🧪 Excel Formula Sandbox NEW
Test what your Excel formulas will do before you type them in Excel. Try AVERAGE, COUNTIF, IF, ISBLANK and more — all against the real 891-row Titanic dataset.
💼 Business Context
In Sales Analytics: You'd clean product names, standardize categories, handle missing prices
In HR Analytics: You'd categorize experience levels, standardize job titles, fill missing departments
Myth Busters: Exploratory Analysis
30 minutes🎭 Time to Bust Some Myths!
Remember those assumptions from the beginning? Let's use pivot tables to find the truth.
Myth #1: "More men survived than women"
Your Investigation:
📋 Detailed Steps to Create Your First Pivot Table:
- Select your data:
- Click anywhere in your cleaned data
- Press Ctrl+A to select all data
- Or manually select range A1:K889
- Insert Pivot Table:
- Go to Insert tab
- Click PivotTable (far left)
- In dialog box, choose "New Worksheet"
- Click OK
- Rename it "Gender"
- Build the Pivot:
- In PivotTable Fields panel (right side):
- Drag
Sexto Rows area - Drag
Survival_Statusto Columns area - Drag
PassengerIdto Values area - Make sure you are using
Count of PassengerIdin the field settings - It should show "Count of PassengerId"
- Add Survival Rate Calculation:
- Right-click any value in the pivot table
- Select "Show Values As"
- Choose "% of Row Total"
- This shows survival percentage by gender
- Alternative: Manual Calculation:
- Note the counts for each gender
- In a cell beside the pivot, calculate:
=Survived_Count/(Survived_Count+Not_Survived_Count)*100
Myth #2: "First-class passengers had the highest survival rate"
Your Investigation:
📋 Create Your Second Pivot Table:
- Create new pivot:
- Go back to your cleaned data sheet
- Select your data range again
- Insert → PivotTable → New Worksheet
- Click OK
- Rename it "Class"
- Configure for Class Analysis:
- Drag
Pclassto Rows - Drag
Survival_Statusto Columns - Drag
PassengerIdto Values - Make sure you are using
Count of PassengerIdin the field settings
- Drag
- Format the Results:
- Right-click on values → Show Values As → % of Row Total
- Or create a calculated field for survival rate
Myth #3: "Age was the strongest predictor of survival"
Your Investigation:
📋 Create Age Group Analysis:
- New pivot table:
- Insert → PivotTable → New Worksheet
- Click OK
- Rename it "Age Group"
- Use Age Groups:
- Drag
Age_Groupto Rows - Drag
Survival_Statusto Columns - Drag
PassengerIdto Values - Make sure you are using
Count of PassengerIdin the field settings
- Drag
- Calculate percentages:
- Show Values As → % of Row Total
- Compare survival rates across age groups
Myth #4: "Passengers from Cherbourg had better chances of survival"
Your Investigation:
📋 Create Port Origin Analysis:
- New pivot table:
- Insert → PivotTable → New Worksheet
- Click OK
- Rename it "Origin"
- Use Port:
- Drag
Portto Rows - Drag
Survival_Statusto Columns - Drag
PassengerIdto Values - Make sure you are using
Count of PassengerIdin the field settings
- Drag
- Calculate percentages:
- Show Values As → % of Row Total
- Compare survival rates across ports
✅ Checkpoint 3: Pivot Tables Mastered!
Data Visualization: Chart Selection Challenge
40 minutes📈 The Ship Owner's Request
The White Star Line's board of directors wants to understand what happened. They've asked specific questions. Your job: Choose the RIGHT chart for each question.
Question 1: "Show me survival rates by gender"
Why Column Chart? Comparing two distinct categories with percentages.
Build It:
- Select your Gender pivot table
- Select data range, including titles
- Do not select the Grand Total Column/Row
- Insert → Column Chart
- Format: Add data labels, title, remove gridlines
- Color (Optional): Blue for Male, Pink for Female (or your preference)
- Style (Optional): Change Chart Style
Question 2: "How did class affect survival?"
Why Stacked Column? Comparing survival vs death within each class.
Build It:
- Select your Class pivot table
- Select data range, including titles
- Do not select the Grand Total Column/Row
- Insert → Column Chart → Stacked
- Add percentage labels
- Format: Add data labels, title, remove gridlines
- Color (Optional): Different colors for Survived/Did Not Survive
- Style (Optional): Change Chart Style
Question 3: "What was the age distribution?"
Why Stacked Column? Comparing survival vs death within each age group.
Build It:
- Select your Age Group pivot table
- Select data range, including titles
- Do not select the Grand Total Column/Row
- Insert → Column Chart → Stacked
- Add percentage labels
- Format: Add data labels, title, remove gridlines
- Color (Optional): Different colors for different age groups
- Style (Optional): Change Chart Style
Question 4: "What was the embarkation port distribution?"
Why Stacked Bar Chart? Comparing survival vs death within each origin.
Build It:
- Select your Origin/Port pivot table
- Select data range, including titles
- Do not select the Grand Total Column/Row
- Insert → Bar → Stacked Bar
- Add percentage labels
- Format: Add data labels, title, remove gridlines
- Color (Optional): Different colors for ports
- Style (Optional): Change Chart Style
Extra Data for Dashboard
Total Survival Rate
📋 Create Pivot Table:
- Create new pivot:
- Go back to your cleaned data sheet
- Select your data range again
- Insert → PivotTable → New Worksheet
- Click OK
- Rename it "Rates"
- Configure for Rate Analysis:
- Drag
Survival_Statusto Rows - Drag
PassengerIdto Values - Make sure you are using
Count of PassengerIdin the field settings
- Drag
- Format the Results:
- Right-click on values → Show Values As → % of Column Total
✅ Checkpoint 4: Charts Created!
Dashboard Assembly Workshop
60 minutes🎨 Creating Your Professional Dashboard
Time to combine everything into an interactive story!
Create Dashboard Sheet
- Create new sheet named "Dashboard"
- Hide gridlines: View → Uncheck Gridlines
- Set zoom to 80% for more space
Design Your Layout
Suggested zones:
- Top: Title and Key Metrics
- Left: Slicers
- Center: Main charts
- Bottom: Supporting tables/insights (optional)
Add Interactive Filters
Make your dashboard interactive:
- Click on the Gender pivot table/chart
- PivotTable Analyze → Insert Slicer
- Add slicers for:
- Gender
- Class
- Age Group
- Embarked Port
- Format slicers: Right-click → Size and Properties
- On each slicer connect to all pivot tables: Right-click → Report Connections
- Select all slicers Ctrl+Click
- Copy Ctrl+C
- Go to your Dashboard
- Paste the slicers Ctrl+V
- Select each one of the charts created, copy and paste on your dashboard
- Change size, format, position, etc.
- On the left panel insert a TextBox for the Mortality Rate
Did Not Survive - Link it to your Rates Pivot Table
=Rates!$B$4 - On the left panel insert a TextBox for the Survival Rate
Survived - Link it to your Rates Pivot Table
=Rates!$B$5
Professional Touches
Consistent Color Scheme:
- Add title: "Titanic Survival Analysis Dashboard"
- Include subtitle with your name and date
- Add key metrics boxes (Total Passengers, Survival Rate, etc.)
- Insert text box with 3 key insights
- Add data source citation
🚀 Advanced Challenges (Optional)
Dynamic Title
Create a cell formula that changes based on slicer selection
Hint
=CONCATENATE("Analysis for ", GETPIVOTDATA(...))
Survival Predictor
Build a formula that predicts survival based on multiple inputs
Hint
=IF(AND(Gender="Female", Class=1), "95% chance", ...)
Sparklines
Add mini trend charts showing survival by age
Hint
Insert → Sparklines → Line✅ Checkpoint 5: Dashboard Complete!
Data Storytelling & Business Applications
20 minutes📖 Crafting Your Data Story
Transform your dashboard into a compelling narrative using the Story Arc:
1. Setup
"On that fateful night, 2,224 souls were aboard..."
2. Conflict
"Only enough lifeboats for 1,178 people..."
3. Investigation
"Our analysis reveals three critical factors..."
4. Climax
"The data shows systematic inequality..."
5. Resolution
"Recommendations for future safety..."
🎯 Your Final Data Story
Compile Your Headlines:
💼 Apply This to Your Business
Sales Dashboard
- Replace "Survived" with "Converted"
- Replace "Class" with "Product Tier"
- Replace "Gender" with "Customer Segment"
Find which customer segments convert best
HR Dashboard
- Replace "Survived" with "Retained"
- Replace "Class" with "Department"
- Replace "Age" with "Tenure"
Identify retention patterns
Marketing Dashboard
- Replace "Survived" with "Engaged"
- Replace "Fare" with "Campaign Spend"
- Replace "Embarked" with "Channel"
Optimize campaign targeting
✅ Checkpoint 6: Workshop Complete!
📊 Your Workshop Analytics
Your personal performance dashboard — quiz scores, time per module, and earned achievements. All data stays in your browser.