Photo Credit © Warren Poh

Sign up for our flagship course and learn about data science in real estate. This course will equip participants with all the prerequisites to be a Data Scientist (in any field), plus real-estate-specific techniques and knowledge. Or, sign up for one of our short or medium courses to gain a working understanding of a specific topic, e.g., Geographic Information Systems or Python.

- Real Estate Data Science Syllabus (flagship course)
- Intro to Data Science (short course)
- Intro to Python (medium course)
- Intro to Geographic Information Systems (short course)

This course equips participants with the skills to apply data science approaches to the real estate markets. Graduates will gain a skillset far superior to industry norms, and will be sought after for research, investment analysis, and consultancy roles. Individual investors in real estate may also be interested in this course to empower independent data-driven decision making. In addition, this course covers all the fundamentals of data science, so participants will be able to apply their new knowledge across a range of industries; they will have the necessary skills to undertake a new profession as Data Scientists in virtually any field.

- a. Introduction to Supervised vs. unsupervised learning
- b. Introduction to Classification vs. regression
- c. Introduction to Dimensionality reduction
- d. Introduction to Artificial Neural Networks

- a. Installation and set-up (hands-on)
- b. Variables, strings, printing
- c. Functions; Reading and writing files
- d. Control flow (if-elif-else, loops, etc.)
- e. Data structures (lists, dictionaries)
- f. Packages (finding, installing, importing, commonly used packages)
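To give a flavour of the Python topics listed above, here is a minimal sketch that touches variables, strings, a function, control flow, data structures, and file reading and writing. It is an illustration only, not course material: the listings, the price thresholds, and the file name `listing_report.txt` are all made up for the example.

```python
# Illustrative only: the file name and listing data below are made up.
import os
import tempfile


def price_per_sqft(price, area_sqft):
    """Return the price per square foot, rounded to 2 decimals."""
    return round(price / area_sqft, 2)


# Data structures: a list of dictionaries, one per (hypothetical) listing.
listings = [
    {"name": "Unit A", "price": 500_000, "area_sqft": 1_000},
    {"name": "Unit B", "price": 900_000, "area_sqft": 1_200},
    {"name": "Unit C", "price": 300_000, "area_sqft": 800},
]

# Control flow: loop over the listings and label each one.
# The thresholds are arbitrary values chosen for this example.
lines = []
for listing in listings:
    psf = price_per_sqft(listing["price"], listing["area_sqft"])
    if psf >= 700:
        label = "premium"
    elif psf >= 450:
        label = "mid-range"
    else:
        label = "budget"
    lines.append(f"{listing['name']}: {psf} per sq ft ({label})")

# Writing and reading files: save the report, then read it back.
path = os.path.join(tempfile.gettempdir(), "listing_report.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(lines))

with open(path, encoding="utf-8") as f:
    report = f.read()

print(report)
```

The report is written to the system's temporary directory so the sketch does not leave files in the working folder.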

- a. Overview; introduction to series and dataframes
- b. Loading and saving data
- c. Viewing and inspecting data; basic analyses
- d. Filter, Sort, Groupby
- e. Data Cleaning
- f. Join / Combine
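The dataframe operations listed above (cleaning, filter, sort, groupby, join) can be sketched in a few lines of pandas. This is a hedged illustration rather than course material: both tables, their column names, and the 5,000 per square metre cut-off are invented for the example.

```python
# Illustrative only: both tables are tiny made-up examples.
import pandas as pd

sales = pd.DataFrame({
    "district": ["North", "North", "South", "South", "East"],
    "price": [500_000, 650_000, 300_000, None, 420_000],
    "area_sqm": [80, 100, 60, 70, 75],
})
districts = pd.DataFrame({
    "district": ["North", "South", "East"],
    "median_income": [90_000, 60_000, 75_000],
})

# Data cleaning: drop rows with a missing price, then derive a new column.
clean = sales.dropna(subset=["price"]).copy()
clean["price_per_sqm"] = clean["price"] / clean["area_sqm"]

# Filter and sort: rows above 5,000 per sq m, most expensive first.
expensive = clean[clean["price_per_sqm"] > 5_000].sort_values(
    "price_per_sqm", ascending=False
)

# Groupby: average price per sq m in each district.
by_district = clean.groupby("district", as_index=False)["price_per_sqm"].mean()

# Join: attach district-level data to the grouped result.
merged = by_district.merge(districts, on="district", how="left")
print(merged)
```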

- a. Overview
  - i. Types of data (categorical, quantitative, etc.)
  - ii. Variables, values, observations
  - iii. Variation, distribution, averages, variance, skewness, kurtosis, covariance
- b. Visualizations (matplotlib, seaborn)
  - i. Line Plot
  - ii. Histogram
  - iii. Boxplot, violinplot
  - iv. Scatterplot (and scatterplot matrix)
- c. Hypothesis testing
  - i. t-test
  - ii. P-values
  - iii. Nonparametric statistical significance tests
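As a small illustration of the statistics topics above, the sketch below computes an average, a variance, and a skewness, then runs a permutation test, which is one example of a nonparametric significance test. It uses only the Python standard library. The two rent samples are invented numbers, and the helper names `skewness` and `permutation_test` are hypothetical, written for this example.

```python
# Illustrative only: both samples are invented numbers, not market data.
import random
import statistics

# Hypothetical monthly rents for two neighbourhoods.
rents_a = [2100, 2250, 2300, 2400, 2500, 2550, 2700, 3900]
rents_b = [1800, 1900, 1950, 2000, 2100, 2150, 2200, 2300]


def skewness(xs):
    """Population skewness: the mean of the cubed z-scores."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return sum(((x - mu) / sigma) ** 3 for x in xs) / len(xs)


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Reassign the pooled values to two groups at random.
        rng.shuffle(pooled)
        mean_first = statistics.fmean(pooled[:len(a)])
        mean_second = statistics.fmean(pooled[len(a):])
        if abs(mean_first - mean_second) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


print("mean A:", statistics.fmean(rents_a))
print("sample variance A:", statistics.variance(rents_a))
print("skewness A:", round(skewness(rents_a), 3))
p_value = permutation_test(rents_a, rents_b)
print("permutation p-value:", p_value)
```

The permutation test makes no assumption that rents are normally distributed, which is why it suits a sample like `rents_a` with one very high value. A small p-value suggests the difference in average rent is unlikely to arise from random group assignment alone.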

- a. Overview
  - i. Supervised vs. Unsupervised Learning
  - ii. Classification vs. Regression
  - iii. Features and responses
  - iv. Categorical variables; feature engineering
  - v. Evaluating model fit
  - vi. Model selection; cross validation
  - vii. Dimensionality reduction
  - viii. Typical flow of Scikit-learn analyses
- b. Linear Regression
- c. Logistic Regression
- d. Nearest Neighbors
- e. Decision Trees and Random Forests
- f. SVMs
- g. Clustering
- h. Dimensionality Reduction
- i. Artificial Neural Networks
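Two of the ideas above, nearest-neighbor classification and cross validation, fit in a short sketch. It is written in plain Python rather than Scikit-learn so that it has no dependencies, and it is an illustration under stated assumptions, not the course's own code: the data points, the labels, and the helper names `standardize`, `knn_predict`, and `leave_one_out_accuracy` are all made up for this example.

```python
# Illustrative only: the data points and labels are made up.
import math
from collections import Counter

# Features: (floor area in sq m, distance to station in km); response: a label.
X = [(45, 0.3), (50, 0.5), (55, 0.4), (60, 0.6),
     (110, 2.5), (120, 3.0), (130, 2.8), (140, 3.5)]
y = ["flat", "flat", "flat", "flat", "house", "house", "house", "house"]


def standardize(rows):
    """Rescale each feature to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds))
            for row in rows]


def knn_predict(train_X, train_y, point, k=3):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each observation in turn and score the prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) == y[i]:
            correct += 1
    return correct / len(X)


# For simplicity the scaling uses all rows; a stricter workflow would
# compute the means and standard deviations on the training rows only.
X_scaled = standardize(X)
accuracy = leave_one_out_accuracy(X_scaled, y, k=3)
print("leave-one-out accuracy:", accuracy)
```

Standardizing first matters because floor area and distance are on very different scales, and an unscaled distance calculation would be dominated by floor area.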

- a. Basic Real Estate Concepts
- b. Introduction to Property Price Indices
- c. Hedonic Regression
- d. Geographic Information Systems
  - i. Introduction to QGIS
  - ii. Vectors & Rasters, file formats
  - iii. Map visualizations
  - iv. Finding and displaying data
- e. Real Estate Data Sources
This short course equips participants with a basic understanding of Data Science concepts and techniques through examples and explanations of the various methodologies. Participants will gain an understanding of the fundamental underpinnings of Data Science, when various methods are useful, and what can be achieved with the latest technologies.

- a. Types of data (categorical, quantitative, etc.)
- b. Variables, values, observations
- c. Variation, distribution, averages, variance, skewness, kurtosis, covariance
- d. Hypothesis testing

- a. Line Plot
- b. Histogram
- c. Boxplot, violinplot
- d. Scatterplot (and scatterplot matrix)

- a. Feature engineering
- b. Evaluating model fit
- c. Model selection
- d. Cross validation

- a. Clustering

- a. Linear Regression
- b. Logistic Regression
- c. Nearest Neighbors
- d. Decision Trees and Random Forests
- e. SVMs

This medium-length course equips participants with basic hands-on capabilities in Python 3. Upon completion of the course, participants will be able to write and execute simple but useful programs. Participants will also gain knowledge of fundamental programming concepts, and be introduced to the resources for further self-learning.

This short course equips participants with the basic skills to use QGIS (formerly Quantum GIS) to load and present map-based data and to work on real-world projects.

Visualizing gross rental yields in London, UK:

Location Scoring in Manhattan, New York:

Nelson Lau has a PhD in Decision Sciences from INSEAD, is a CFA Charterholder, and completed his undergraduate work at Columbia University, double majoring in Economics and Mathematics-Statistics. He has published papers in Management Science, Decision Support Systems, and Decision Analysis, one of which received a special recognition award. Nelson started his career as a trader/researcher at an award-winning hedge fund deploying medium- and low-frequency strategies in global markets, and also spent significant time as lead trader at a leading global high-frequency trading firm focusing on Asian markets. He has studied data science formally and applied it to building quantitative trading models in futures contracts across global markets in all major asset classes.

Cheng Xingzhi has a PhD in Statistical Physics from the National University of Singapore (NUS) and a B.S. in Computer Science from Peking University, with papers published in Physical Review Letters and elsewhere. He was a postdoctoral research fellow at the Santa Fe Institute and NUS before moving to quantitative trading, where he has 5 years of experience as a researcher, trader, and quantitative developer. Xingzhi has deep experience building frameworks that allow data science techniques to be applied to extremely large financial market datasets, enabling the production of profitable automated trading algorithms. One project he worked on was a system that autonomously conducted research on financial market data and then optimized, selected, and updated trading systems that were deployed directly into the marketplace. In addition, he is a proficient mobile and web developer and has authored several iOS apps and websites in his free time.