Data science - Beginner Level

In light of the global pandemic we are experiencing, we will not be accepting any new students for now.

We will post updates on the class schedule once we can be assured of everyone's safety.

Difficulty: ⭐️☆☆☆☆


This course introduces learners to the fundamentals of data science, a field of computer science and engineering that has grown rapidly in recent years. Through lectures and hands-on practice, students will learn how to scrape data from public sources, visualize and analyze data, and apply statistical modeling and machine learning algorithms to predict future outcomes from academic and business data.

Learning outcomes:

By the end of this course, students will have:

  • Basic programming skills in Python / R.
  • An understanding of data structures.
  • Skills to read, write, and analyze a simple dataset.
  • Skills to detect and handle missing data and outliers.
  • Skills to visualize a dataset for reporting, analysis, and communication.


Prerequisites:

  • Basic computer skills

Curriculum For This Course

      1. Introduction to data science
      2. Basic programming in Python.
        1. Comments
        2. print() function
        3. Basic data types
        4. Basic operators
        5. Basic control flow: if statement
        6. Variables and Variable Types
        7. Basic String manipulation
    1. Basic control flow: for loop
    2. Complex data structures in Python:
      1. List
      2. Dictionary
      3. Tuple
    3. Practice exercises
    1. Function
      1. Definition syntax
      2. Keyword parameters
      3. Default parameters
    2. Advanced data structures in Python:
      1. NumPy array
      2. pandas DataFrame
    3. Practice exercises
    1. Basic Data Visualization
    2. Matplotlib library
      1. Figure function
      2. Plot function
      3. Show function
    3. Scatter Plot, Line Plot, Bar Plot
    4. Implementation of those plots with Python/R
    5. Practice exercises
    1. Intermediate Data Visualization
    2. Bubble Plot, Box plot, Histogram, Multiple Plots
    3. Implementation of those plots with Python/R
    4. Practice exercises
    1. Pick a dataset out of 3 themes: Sport, Real Estate, Academics
    2. Explore and analyze the dataset using data visualization techniques:
      1. Scatter Plot
      2. Line Plot
      3. Bar Plot
      4. Bubble Plot
      5. Box Plot
      6. Histogram
    3. Summarize insights from the dataset
    1. Introduction to Data Preprocessing
    2. Detection of missing data and outliers
    3. How to treat missing data and outliers
    4. Implementation of data preprocessing with Python/R
    5. Practice exercises
    1. Introduction to Data Collection
    2. Examples of well-known datasets used for study
    3. Introduction to basic data scraping
    4. Implementation of data scraping with Python
    5. Practice exercises
    1. Introduction to Text Mining
    2. APIs for downloading data from social media
    3. Implementation of downloading social media data with Python/R
    4. Practice exercises
    1. Introduction to Sentiment Analysis
    2. Processing text data to predict sentiment
    3. Implementation of sentiment analysis with Python/R
    4. Practice exercises
    1. Introduction to Association Rule – Market Basket Analysis
    2. Metrics in Association Rule:
      1. Lift
      2. Support
      3. Confidence
    3. Implementation of Association Rule with Python/R
    4. Practice exercises
      1. Pick a dataset from our dataset bank
      2. Use what you have learned to analyze the dataset.
      3. Write a report on the insights found.
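
To give a flavor of the association rule metrics named in the curriculum (support, confidence, and lift), here is a minimal sketch in plain Python. The toy transactions and the helper names `support`, `confidence`, and `lift` are assumptions made for illustration only; they are not taken from the course materials, and the course itself may use a dedicated library instead.

```python
# Toy market-basket data, invented purely for illustration (not course data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions containing `antecedent` that also contain `consequent`.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support.

    Values above 1 suggest the items co-occur more often than if independent.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


if __name__ == "__main__":
    print("support(bread)       =", support({"bread"}, transactions))
    print("confidence(bread→milk) =", confidence({"bread"}, {"milk"}, transactions))
    print("lift(bread→milk)       =", lift({"bread"}, {"milk"}, transactions))
```

In this toy data, bread appears in three of four baskets and bread with milk in two of four, so the rule "bread → milk" has a lift below 1: buying bread makes milk slightly less likely here than its overall frequency would suggest.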


Group name | Start date | Session duration | Number of sessions | Standard price




Tokyo Techies Lecturer

Phong Nguyen

Head of Artificial Intelligence and Data Science Department

  • Data Science and Research Mentor at Tokyo Techies
  • Master's graduate of Carnegie Mellon University.
  • Researcher in an AI lab at a big global corporation.
  • Global working experience in Singapore, Australia, US, Vietnam and Japan.
  • A results-oriented researcher and data guru with extensive managerial and technical experience in financial management, marketing, and business planning.
  • Adept at programming in multiple languages, such as Java, Python and R.