Assignment Overview
In this assignment, you will complete your first end-to-end data science project. You will work through each phase of the data science lifecycle we covered in Topic 1.1, using the Python environment and Jupyter skills from Topic 1.2.
Project Setup
Create a proper folder structure with a virtual environment
Jupyter Notebook
Document your analysis with markdown and code
Data Analysis
Answer business questions using the dataset
The Scenario
WeatherWatch Analytics
You have just been hired as a Junior Data Analyst at WeatherWatch Analytics, a startup that provides weather insights to local businesses. Your manager has given you your first assignment:
"We have temperature data from the past year. The local ice cream shop chain wants to know when they should increase inventory and staffing. Can you analyze the data and give them actionable recommendations?"
Your Task
Create a complete Jupyter Notebook that analyzes the temperature data and answers the business questions below. Your notebook should be professional enough to share with the client (the ice cream shop).
The Dataset
You will work with a simple temperature dataset. Copy this data into a CSV file called temperatures.csv in your project folder:
month,avg_temp_f,rainy_days,sunny_days
January,35,12,8
February,38,10,9
March,48,11,10
April,58,9,12
May,68,8,15
June,78,6,18
July,85,4,22
August,83,5,20
September,74,7,16
October,62,9,13
November,48,11,10
December,38,13,7
Dataset Columns
- month: Name of the month
- avg_temp_f: Average temperature in Fahrenheit
- rainy_days: Number of rainy days in that month
- sunny_days: Number of sunny days in that month
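If you would rather create the file from code than copy and paste it, the following is a minimal sketch of one way to do that. The helper name `write_dataset` and the constant `CSV_TEXT` are illustrative choices, not part of the assignment; the data is the same as the listing above.

```python
from pathlib import Path

# The dataset exactly as listed in the assignment.
CSV_TEXT = """\
month,avg_temp_f,rainy_days,sunny_days
January,35,12,8
February,38,10,9
March,48,11,10
April,58,9,12
May,68,8,15
June,78,6,18
July,85,4,22
August,83,5,20
September,74,7,16
October,62,9,13
November,48,11,10
December,38,13,7
"""


def write_dataset(folder="."):
    """Write temperatures.csv into `folder` and return the file's path."""
    path = Path(folder) / "temperatures.csv"
    path.write_text(CSV_TEXT, encoding="utf-8")
    return path


if __name__ == "__main__":
    print(f"Wrote {write_dataset()}")
```

Run it once from your project folder. After that, the file should have one header line followed by twelve data rows, one per month.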
Requirements
Your notebook must include all of the following sections. Follow the data science lifecycle from Topic 1.1!
Title and Introduction (Markdown)
Start with a title, your name, date, and a brief introduction explaining what this analysis is about and who it is for (the ice cream shop client).
Problem Statement (Markdown)
Clearly define the business problem. What questions are you trying to answer? Why does this matter to the client?
Setup and Imports (Code)
Import pandas and any other libraries. Load the CSV file into a DataFrame. Display the first few rows.
import pandas as pd
# Load the data
df = pd.read_csv('temperatures.csv')
df.head()
Data Exploration (Code + Markdown)
Explore the dataset. How many rows? What are the data types? Any missing values? Basic statistics? Add markdown cells explaining what you find.
# Check the shape, column types, missing values, and basic stats
print(df.shape)
df.info()
print(df.isna().sum())  # number of missing values in each column
df.describe()
Analysis and Answers (Code + Markdown)
Answer these specific questions with code. Explain each answer in markdown:
- Q1: What is the hottest month? The coldest?
- Q2: Which months have average temperature above 70 degrees F? (Peak ice cream season)
- Q3: What is the average temperature across all months?
- Q4: Which month has the most sunny days? (Best for outdoor sales)
- Q5: Is there a relationship between temperature and sunny days?
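Each question above maps onto a common pandas pattern. The sketch below shows one possible approach, not the only acceptable one. To keep it runnable on its own, it rebuilds the dataset inline; in your notebook, use the `df` you loaded from temperatures.csv instead. All variable names are illustrative.

```python
import pandas as pd

# Inline copy of the assignment's dataset so this sketch runs on its own.
# In the notebook, reuse the `df` loaded with pd.read_csv instead.
df = pd.DataFrame({
    "month": ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"],
    "avg_temp_f": [35, 38, 48, 58, 68, 78, 85, 83, 74, 62, 48, 38],
    "rainy_days": [12, 10, 11, 9, 8, 6, 4, 5, 7, 9, 11, 13],
    "sunny_days": [8, 9, 10, 12, 15, 18, 22, 20, 16, 13, 10, 7],
})

# Q1: idxmax/idxmin give the row label of the largest/smallest value,
# and .loc turns that label into the matching month name.
hottest = df.loc[df["avg_temp_f"].idxmax(), "month"]
coldest = df.loc[df["avg_temp_f"].idxmin(), "month"]

# Q2: a boolean mask keeps only the rows where the condition is true.
peak_months = df.loc[df["avg_temp_f"] > 70, "month"].tolist()

# Q3: the mean of a numeric column.
mean_temp = df["avg_temp_f"].mean()

# Q4: the same idxmax pattern, applied to a different column.
sunniest = df.loc[df["sunny_days"].idxmax(), "month"]

# Q5: Pearson correlation. Values near +1 or -1 suggest a strong linear
# relationship; values near 0 suggest little or none.
temp_sun_corr = df["avg_temp_f"].corr(df["sunny_days"])

print("Hottest:", hottest, "| Coldest:", coldest)
print("Months above 70 F:", peak_months)
print("Mean temperature:", round(mean_temp, 1))
print("Sunniest month:", sunniest)
print("Temperature vs. sunny days correlation:", round(temp_sun_corr, 2))
```

Printing the numbers is only half of each answer. In your markdown cells, explain what each result means for the ice cream shop. Note that a correlation describes how two columns move together; it does not show that one causes the other.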
Conclusions and Recommendations (Markdown)
Summarize your findings in plain English. Give the ice cream shop 3-5 specific, actionable recommendations based on your analysis.
Submission
Create a public GitHub repository with the exact name shown below:
Required Repository Name
weather-analysis-project
Required Files
weather-analysis-project/
├── temperatures.csv # The dataset
├── weather_analysis.ipynb # Your Jupyter notebook
├── requirements.txt # List of packages used (pandas, etc.)
└── README.md # REQUIRED - see contents below
README.md Must Include:
- Your full name and submission date
- Brief description of your analysis approach
- Key findings from your weather data analysis
- Any challenges faced and how you solved them
Do Include
- Clear markdown documentation
- All code cells executed (show outputs)
- requirements.txt with pandas listed
- Proper file naming
- README.md with all required sections
Do Not Include
- The venv folder (use .gitignore)
- Any .pyc or cache files
- Unexecuted notebooks
- Excessive/messy output
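Before you push, you can check the lists above against your local folder. The following is a minimal sketch of such a check; the function name `check_submission` and its messages are hypothetical, and it is not the course's automatic verifier. It only inspects files on disk, so it cannot tell you what has actually been committed to GitHub.

```python
from pathlib import Path

# File names taken from the "Required Files" listing.
REQUIRED_FILES = [
    "temperatures.csv",
    "weather_analysis.ipynb",
    "requirements.txt",
    "README.md",
]


def check_submission(repo_dir="."):
    """Return a list of problems found in repo_dir (empty list = all good)."""
    repo = Path(repo_dir)
    problems = []

    # Every required file must exist.
    for name in REQUIRED_FILES:
        if not (repo / name).is_file():
            problems.append(f"missing required file: {name}")

    # requirements.txt must mention pandas.
    requirements = repo / "requirements.txt"
    if requirements.is_file():
        if "pandas" not in requirements.read_text(encoding="utf-8").lower():
            problems.append("requirements.txt does not list pandas")

    # If a venv folder exists, .gitignore should exclude it.
    gitignore = repo / ".gitignore"
    ignored = gitignore.read_text(encoding="utf-8") if gitignore.is_file() else ""
    if (repo / "venv").is_dir() and "venv" not in ignored:
        problems.append("venv/ exists but is not listed in .gitignore")

    return problems


if __name__ == "__main__":
    issues = check_submission(".")
    print("Looks good!" if not issues else "\n".join(issues))
```

The venv check assumes your virtual environment folder is named `venv`, as in the list above. If you used a different name, adjust the check to match.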
To submit, enter your GitHub username. Your repository will be verified automatically.
Grading Rubric
Your assignment will be graded on the following criteria:
| Criteria | Points | Description |
|---|---|---|
| Project Structure | 15 | Proper folder organization, all required files present |
| Documentation | 25 | Clear markdown headers, explanations, and introduction |
| Code Quality | 20 | Clean code, proper imports, no errors when running |
| Analysis | 25 | All 5 questions answered correctly with appropriate code |
| Conclusions | 15 | Clear, actionable recommendations for the client |
| Total | 100 | |
Ready to Submit?
Make sure you have completed all requirements and reviewed the grading rubric above.
What You Will Practice
Data Science Lifecycle
Following the complete process from problem definition through data collection, analysis, and communication of insights
Project Organization
Setting up a professional project structure with proper file organization and dependency management
Documentation Skills
Using markdown effectively to explain your analysis so non-technical stakeholders can understand
Pandas Basics
Loading CSV files, exploring DataFrames, and using basic pandas operations to answer questions
Pro Tips
Getting Started
- Read through all requirements before coding
- Set up your environment first (Python, Jupyter)
- Create your project folder structure early
- Test each section as you complete it
Documentation
- Use markdown headers to organize your notebook
- Explain your thinking, not just the code
- Add comments for complex operations
- Include a summary of findings at the end
Time Management
- Don't spend too long on any single question
- Start with the easier tasks to build momentum
- Save your work frequently
- Leave time for review and README writing
Common Mistakes
- Forgetting to include the dataset file
- Not running all cells before submission
- Missing README.md or incomplete sections
- Leaving the repository private
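One of these mistakes, submitting a notebook with cells that were never run, can be caught with a few lines of code. A notebook file is JSON, and each code cell records an `execution_count` that stays empty until the cell is executed. The sketch below assumes the standard notebook format version 4, where cells sit in a top-level `cells` list; the function name `unexecuted_code_cells` is hypothetical.

```python
import json
from pathlib import Path


def unexecuted_code_cells(notebook_path):
    """Return the positions of non-empty code cells that were never run."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    unexecuted = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown cells have nothing to execute
        # "source" may be stored as one string or as a list of lines.
        source = "".join(cell.get("source", []))
        if source.strip() and cell.get("execution_count") is None:
            unexecuted.append(index)
    return unexecuted


if __name__ == "__main__":
    notebook_file = Path("weather_analysis.ipynb")
    if notebook_file.is_file():
        print("Unexecuted code cells:", unexecuted_code_cells(notebook_file))
```

An empty list means every non-empty code cell has been run at least once. It does not guarantee the cells were run in order, so restarting the kernel and running all cells before you commit is still the safest habit.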