Assignment 1: Your First Data Science Project | Data Science Course

Assignment Overview

In this assignment, you will complete your first end-to-end data science project. You will work through each phase of the data science lifecycle we covered in Topic 1.1, using the Python environment and Jupyter skills from Topic 1.2.

Skills Applied: This assignment tests your understanding of the data science lifecycle, environment setup, and Jupyter Notebook proficiency from Module 1.

Project Setup

Create a proper folder structure with virtual environment

Jupyter Notebook

Document your analysis with markdown and code

Data Analysis

Answer business questions using the dataset

The Scenario

WeatherWatch Analytics

You have just been hired as a Junior Data Analyst at WeatherWatch Analytics, a startup that provides weather insights to local businesses. Your manager has given you your first assignment:

"We have temperature data from the past year. The local ice cream shop chain wants to know when they should increase inventory and staffing. Can you analyze the data and give them actionable recommendations?"

Your Task

Create a complete Jupyter Notebook that analyzes the temperature data and answers the business questions below. Your notebook should be professional enough to share with the client (the ice cream shop).

Remember: This is a real-world simulation! Your notebook should be clear enough that someone with no coding knowledge can understand your findings and recommendations.

The Dataset

You will work with a simple temperature dataset. Copy this data into a CSV file called temperatures.csv in your project folder:

month,avg_temp_f,rainy_days,sunny_days
January,35,12,8
February,38,10,9
March,48,11,10
April,58,9,12
May,68,8,15
June,78,6,18
July,85,4,22
August,83,5,20
September,74,7,16
October,62,9,13
November,48,11,10
December,38,13,7

Dataset Columns

month - Name of the month
avg_temp_f - Average temperature in Fahrenheit
rainy_days - Number of rainy days in that month
sunny_days - Number of sunny days in that month

Requirements

Your notebook must include all of the following sections. Follow the data science lifecycle from Topic 1.1!

Title and Introduction (Markdown)

Start with a title, your name, date, and a brief introduction explaining what this analysis is about and who it is for (the ice cream shop client).

Problem Statement (Markdown)

Clearly define the business problem. What questions are you trying to answer? Why does this matter to the client?

Setup and Imports (Code)

Import pandas and any other libraries. Load the CSV file into a DataFrame. Display the first few rows.

import pandas as pd

# Load the data
df = pd.read_csv('temperatures.csv')
df.head()

Data Exploration (Code + Markdown)

Explore the dataset. How many rows? What are the data types? Any missing values? Basic statistics? Add markdown cells explaining what you find.

# Check the shape, info, and basic stats
print(df.shape)
df.info()
df.describe()
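
The row-count, data-type, and missing-value questions above can each be checked with a single pandas call. The following is a minimal sketch, not a required solution. It embeds the dataset as a string (an assumption made only so the example runs on its own); your notebook would keep loading the file with pd.read_csv('temperatures.csv').

```python
import io

import pandas as pd

# Assumption: the dataset is embedded here so the sketch runs without the CSV file.
CSV_TEXT = """month,avg_temp_f,rainy_days,sunny_days
January,35,12,8
February,38,10,9
March,48,11,10
April,58,9,12
May,68,8,15
June,78,6,18
July,85,4,22
August,83,5,20
September,74,7,16
October,62,9,13
November,48,11,10
December,38,13,7
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Number of rows and columns
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# Data type of each column
print(df.dtypes)

# Count of missing values per column (all zeros means nothing is missing)
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Printing each result explicitly matters in a notebook because only the last expression in a cell is displayed automatically.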

Analysis and Answers (Code + Markdown)

Answer these specific questions with code. Explain each answer in markdown:

Q1: What is the hottest month? The coldest?
Q2: Which months have average temperature above 70 degrees F? (Peak ice cream season)
Q3: What is the average temperature across all months?
Q4: Which month has the most sunny days? (Best for outdoor sales)
Q5: Is there a relationship between temperature and sunny days?
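
These five questions rely on a small set of pandas operations: finding the row with the largest or smallest value, filtering rows by a condition, averaging a column, and measuring correlation. The sketch below demonstrates each one on the rainy_days column rather than on the columns the questions ask about, so it shows the pattern without answering the assignment. The threshold of 10 rainy days is an arbitrary value chosen for illustration, and the data is embedded as a string only to keep the example self-contained.

```python
import io

import pandas as pd

# Assumption: the dataset is embedded here so the sketch runs without the CSV file.
CSV_TEXT = """month,avg_temp_f,rainy_days,sunny_days
January,35,12,8
February,38,10,9
March,48,11,10
April,58,9,12
May,68,8,15
June,78,6,18
July,85,4,22
August,83,5,20
September,74,7,16
October,62,9,13
November,48,11,10
December,38,13,7
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Largest / smallest value: idxmax() and idxmin() return the row label,
# which df.loc then uses to look up the matching month name.
rainiest_month = df.loc[df["rainy_days"].idxmax(), "month"]
driest_month = df.loc[df["rainy_days"].idxmin(), "month"]

# Filtering: a boolean condition keeps only the rows where it is True.
# The threshold of 10 is an illustrative value, not part of the assignment.
very_rainy_months = df[df["rainy_days"] > 10]["month"].tolist()

# Averaging: mean() of a single column.
mean_rainy_days = df["rainy_days"].mean()

# Relationship between two columns: Pearson correlation, from -1 to 1.
rain_temp_corr = df["rainy_days"].corr(df["avg_temp_f"])

print("Rainiest month:", rainiest_month)
print("Driest month:", driest_month)
print("Months with more than 10 rainy days:", very_rainy_months)
print("Mean rainy days per month:", mean_rainy_days)
print("Correlation between rainy days and temperature:", round(rain_temp_corr, 2))
```

A correlation near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none. With only 12 rows, treat the number as a rough guide and explain in markdown what it means for the client.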

Conclusions and Recommendations (Markdown)

Summarize your findings in plain English. Give the ice cream shop 3-5 specific, actionable recommendations based on your analysis.

Pro Tip: Use the Jupyter shortcuts you learned! Shift+Enter runs the current cell. In command mode (press Esc first), M converts a cell to markdown and B adds a cell below.

Submission

Create a public GitHub repository with the exact name shown below:

Required Repository Name

weather-analysis-project

github.com/<your-username>/weather-analysis-project

Required Files

weather-analysis-project/
├── temperatures.csv        # The dataset
├── weather_analysis.ipynb  # Your Jupyter notebook
├── requirements.txt        # List of packages used (pandas, etc.)
└── README.md               # REQUIRED - see contents below

README.md Must Include:

Your full name and submission date
Brief description of your analysis approach
Key findings from your weather data analysis
Any challenges faced and how you solved them

Do Include

Clear markdown documentation
All code cells executed (show outputs)
requirements.txt with pandas listed
Proper file naming
README.md with all required sections

Do Not Include

The venv folder (use .gitignore)
Any .pyc or cache files
Unexecuted notebooks
Excessive/messy output
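
Some of the items above can be checked mechanically before you push. The following is a hypothetical helper, not part of the course's grading tooling. The function names missing_files and unexecuted_code_cells are invented for illustration, and the second function assumes the standard .ipynb JSON layout, in which each code cell carries an execution_count that stays null until the cell has been run.

```python
import json
from pathlib import Path

# File names taken from the "Required Files" listing.
REQUIRED_FILES = [
    "temperatures.csv",
    "weather_analysis.ipynb",
    "requirements.txt",
    "README.md",
]


def missing_files(project_dir="."):
    """Return the required file names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def unexecuted_code_cells(notebook_path):
    """Return the 0-based positions of code cells with no execution count."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    return [
        position
        for position, cell in enumerate(notebook.get("cells", []))
        if cell.get("cell_type") == "code" and cell.get("execution_count") is None
    ]


if __name__ == "__main__":
    print("Missing files:", missing_files() or "none")
    notebook = Path("weather_analysis.ipynb")
    if notebook.is_file():
        print("Unexecuted code cells:", unexecuted_code_cells(notebook) or "none")
```

Run it from inside the project folder. Empty code cells are also reported as unexecuted, so delete any leftover blank cells before checking.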

Important: Before submitting, select Kernel > Restart and Run All to confirm that your notebook runs from top to bottom without errors!

Submit Your Assignment

Enter your GitHub username - we'll verify your repository automatically

Grading Rubric

Your assignment will be graded on the following criteria:

Criteria            Points   Description
Project Structure   15       Proper folder organization, all required files present
Documentation       25       Clear markdown headers, explanations, and introduction
Code Quality        20       Clean code, proper imports, no errors when running
Analysis            25       All 5 questions answered correctly with appropriate code
Conclusions         15       Clear, actionable recommendations for the client
Total               100

What You Will Practice

Data Science Lifecycle

Following the complete process from problem definition through data collection, analysis, and communication of insights

Project Organization

Setting up a professional project structure with proper file organization and dependency management

Documentation Skills

Using markdown effectively to explain your analysis so non-technical stakeholders can understand

Pandas Basics

Loading CSV files, exploring DataFrames, and using basic pandas operations to answer questions

Pro Tips

Getting Started

Read through all requirements before coding
Set up your environment first (Python, Jupyter)
Create your project folder structure early
Test each section as you complete it

Documentation

Use markdown headers to organize your notebook
Explain your thinking, not just the code
Add comments for complex operations
Include a summary of findings at the end

Time Management

Don't spend too long on any single question
Start with the easier tasks to build momentum
Save your work frequently
Leave time for review and README writing

Common Mistakes

Forgetting to include the dataset file
Not running all cells before submission
Missing README.md or incomplete sections
Leaving the repository private
