## Data Science - Beginner Level

Difficulty: ⭐️☆☆☆☆

## Description

This course introduces the fundamentals of data science, a field of computer science and engineering that has grown rapidly in recent years. Through lectures and hands-on practice, students learn how to scrape data from public sources, visualize and analyze data, and apply statistical modeling and machine learning algorithms to predict outcomes from academic and business data.

## Learning outcomes

By the end of this course, students will have:

• Basic programming skills in Python / R.
• An understanding of data structures.
• The skills to read, write, and analyze a simple dataset.
• The skills to detect and handle missing data and outliers.
• The skills to visualize a dataset for reports, analysis, and communication.

## Requirements

• Basic computer skills

## Curriculum For This Course

#### Lesson 1:

1. Introduction to data science
2. Basic programming in Python
   1. Comment
   2. print() command
   3. Basic data types
   4. Basic operators
   5. Basic control flow: if statement
   6. Variables and Variable Types
   7. Basic String manipulation

#### Lesson 2:

1. Basic control flow: for loop
2. Complex data structures in Python:
   1. List
   2. Dictionary
   3. Tuple
3. Practice exercises
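
As a rough taste of this lesson, the sketch below loops over a list and a dictionary and unpacks a tuple. The sample values and variable names are invented for illustration and are not taken from the course material.

```python
# Hypothetical sample data, invented for illustration.
scores = [72, 85, 90]                    # list: ordered and mutable
student = {"name": "Aiko", "grade": 3}   # dictionary: key-value pairs
point = (35.68, 139.69)                  # tuple: ordered and immutable

# A for loop visits each element of the list in order.
total = 0
for score in scores:
    total += score
average = total / len(scores)

# Looping over a dictionary's items yields (key, value) pairs.
for key, value in student.items():
    print(key, "->", value)

# A tuple can be unpacked into separate variables.
latitude, longitude = point
print("average score:", average)
```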

#### Lesson 3:

1. Functions
   1. Definition syntax
   2. Keyword parameters
   3. Default parameters
2. Advanced data structures in Python:
   1. NumPy Array
   2. Pandas DataFrame
3. Practice exercises
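
To give a feel for the function topics above, here is a minimal sketch of a definition with default parameters, called with and without keyword arguments. The function `describe` and its parameters are hypothetical names chosen for this example.

```python
def describe(values, label="data", digits=2):
    """Return a one-line summary of a list of numbers.

    `label` and `digits` have default values, so callers may omit them.
    """
    mean = sum(values) / len(values)
    return f"{label}: n={len(values)}, mean={round(mean, digits)}"


# Only the required argument: both defaults apply.
print(describe([1, 2, 3, 4]))

# A keyword argument overrides one default by name.
print(describe([1, 2, 3, 4], label="scores"))

# Keyword arguments can be given in any order.
print(describe([1, 2, 4], digits=1, label="x"))
```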

#### Lesson 4:

1. Basic Data Visualization
2. Matplotlib library
   1. Figure function
   2. Plot function
   3. Show function
3. Scatter Plot, Line Plot, Bar Plot
4. Implementation of those plots with Python/R
5. Practice exercises

#### Lesson 5:

1. Intermediate Data Visualization
2. Bubble Plot, Box plot, Histogram, Multiple Plots
3. Implementation of those plots with Python/R
4. Practice exercises

#### Mini project:

1. Pick a dataset from one of 3 themes: Sport, Real Estate, Academics
2. Explore and analyze the dataset using data visualization techniques:
   1. Scatter Plot
   2. Line Plot
   3. Bar Plot
   4. Bubble Plot
   5. Box Plot
   6. Histogram
3. Summarize insights from the dataset

#### Lesson 7:

1. Introduction to Data Preprocessing
2. Detection of Missing Data, Outliers
3. How to treat missing data and outliers
4. Implementation of data preprocessing with Python/R
5. Practice exercises
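
The preprocessing steps listed above might look roughly like the sketch below. It assumes missing values are marked with `None`, fills them with the median, and flags outliers with the common 1.5 × IQR rule. These are illustrative choices rather than the specific methods taught in class, and the readings are made up. Only the Python standard library is used so the example runs anywhere.

```python
import statistics

# Hypothetical readings; None marks a missing value.
readings = [12.0, 11.5, None, 12.3, 45.0, 11.8, None, 12.1]

# 1. Detect missing data: record the positions of the gaps.
missing_positions = [i for i, v in enumerate(readings) if v is None]

# 2. Treat missing data: here, fill gaps with the median of observed values.
observed = [v for v in readings if v is not None]
median = statistics.median(observed)
filled = [median if v is None else v for v in readings]

# 3. Detect outliers with the 1.5 * IQR rule (one common convention).
q1, _, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in filled if v < low or v > high]

# 4. Treat outliers: here, simply drop them.
cleaned = [v for v in filled if low <= v <= high]

print("missing at positions:", missing_positions)
print("outliers:", outliers)
print("cleaned:", cleaned)
```

Filling with the median and dropping outliers are only two of several possible treatments; which one is appropriate depends on the dataset.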

#### Lesson 8:

1. Introduction to Data Collection
2. Examples of well-known datasets used for study
3. Introduction to basic data scraping
4. Implementation of data scraping with Python
5. Practice exercises

#### Lesson 9:

1. Introduction to Text Mining
2. Using APIs to download data from social media
4. Practice exercises

#### Lesson 10:

1. Introduction to Sentiment Analysis
2. Processing text data to predict Sentiment
3. Implementation of sentiment analysis with Python/R
4. Practice exercises
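
As a very small illustration of turning text into a sentiment prediction, the sketch below counts words from a toy lexicon. This is a deliberately simple stand-in, not the method taught in the course; the word lists and the names `tokenize` and `predict_sentiment` are hypothetical.

```python
import re

# Hypothetical toy lexicon; practical systems use far larger word lists
# or trained models.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def predict_sentiment(text):
    """Label text by counting positive versus negative words."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(predict_sentiment("I love this great course!"))
print(predict_sentiment("Terrible service, bad food."))
```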

#### Lesson 11:

1. Introduction to Association Rule – Market Basket Analysis
2. Metrics in Association Rule:
   1. Lift
   2. Support
   3. Confidence
3. Implementation of Association Rule with Python/R
4. Practice exercises
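
The three metrics above can be sketched in a few lines of plain Python using their standard definitions: support is the fraction of baskets containing an itemset, confidence is support(A and B) / support(A), and lift is confidence / support(B). The baskets below are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)


# Evaluate the rule {bread} -> {milk}.
print("support:", round(support({"bread", "milk"}), 3))
print("confidence:", round(confidence({"bread"}, {"milk"}), 3))
print("lift:", round(lift({"bread"}, {"milk"}), 3))
```

A lift above 1 suggests the two items appear together more often than if they were independent, while a lift below 1 suggests the opposite.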

#### Completion project

1. Pick a dataset from our dataset bank
2. Use what you have learned to analyze the dataset.
3. Write a report on the insights found.

## UPCOMING COURSES/WORKSHOPS

| Group name | Start date | Session duration | Number of sessions | Standard price |
| --- | --- | --- | --- | --- |
| Data Science Beginner Level (Group Class) - Limited Time Offer! | Apr 7 (Sun) 15:00 - 17:00 | 2h | 12 sessions (weekly) | 72,000 JPY |

## FEATURED MENTOR

Phong Nguyen

Head of Artificial Intelligence and Data Science Department

• Data Science and Research Mentor at Tokyo Techies
• Master's graduate of Carnegie Mellon University.
• Researcher in an AI lab at a large global corporation.
• Global working experience in Singapore, Australia, US, Vietnam and Japan.
• A results-oriented Researcher and Data Guru with vast managerial and technical experience in financial management, marketing, and business planning.
• Adept at programming in multiple languages, such as Java, Python and R.