about.

about me

Hi, welcome to my website, and nice to e-meet you! My name is Jiaying Wu, and I go by Claire.

I have always been passionate about how scientific reasoning and business reasoning converge to drive core decisions. With over 8 years of professional experience, I bring a unique combination of skills in decision intelligence, marketing management, and cross-cultural communication, which enables me to translate data insights into executable business pivots.

Currently, I'm a Data Scientist at Deloitte Consulting, working with the Decisioning team of HUX (Human Experience). With my specialties in machine learning, feature engineering, and model interpretation, I answer key business questions in the retail domain, such as propensity, segmentation, LTV, price optimization, product returns, and more. My daily tech stack includes Python, Spark, SQL, Bash, Git, Luigi, Kubeflow, Docker, AWS, PyCharm, etc.

In my personal time, I also mentor students and career changers who are interested in dedicating themselves to data science. I thrive on creating a fulfilling life for myself and making a positive impact on the lives of others. I now live on the Upper West Side of Manhattan, New York. I would love to hear from you and connect with you through the channels below.

Email
LinkedIn
GitHub
projects.

Note: The projects below are mostly side and course projects from my time at Columbia. For the data science focus areas of my current work, please see here.

Machine Learning / Statistical Modeling

Improved the MAE by 5.67% using XGBoost together with imputation, model selection, and hyperparameter tuning (grid search with cross-validation)

Home Value Prediction Optimization

Problem: Further improve the accuracy of home value estimates
Approach: missing value imputation, feature engineering, model selection (XGBoost), hyperparameter tuning (grid search with cross-validation)
Outcome: Improved the MAE by 5.67%

Python, R
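
As a point of reference for the outcome above, a relative MAE improvement is usually the drop in mean absolute error divided by the baseline MAE. The Python sketch below only illustrates that arithmetic. The function names and the toy prices are hypothetical and are not the project's data, code, or results.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae: float, new_mae: float) -> float:
    """Percentage reduction in MAE relative to the baseline model."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - new_mae) / baseline_mae * 100.0


# Toy home prices in thousands of dollars (made up for illustration only).
actual = [300.0, 450.0, 520.0, 610.0]
baseline_pred = [320.0, 430.0, 500.0, 650.0]  # hypothetical baseline model
tuned_pred = [310.0, 440.0, 510.0, 630.0]     # hypothetical tuned model

baseline_mae = mean_absolute_error(actual, baseline_pred)
tuned_mae = mean_absolute_error(actual, tuned_pred)
improvement = relative_improvement(baseline_mae, tuned_mae)

print(f"baseline MAE={baseline_mae:.2f}, tuned MAE={tuned_mae:.2f}, "
      f"improvement={improvement:.2f}%")
```

Reporting the change relative to the baseline, rather than as an absolute difference, keeps the figure comparable across datasets with different price scales.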

Fashion Image Recognition

Problem: Find the best-performing model for this multi-class classification task
Approach: Constructed 10 machine learning models and evaluated their performance based on sample size, efficiency, and accuracy
Outcome: Random Forest and SVM performed best

R

Insurance Renewal Estimation

Problem: Increase profit by optimizing incentives
Approach: Improved prediction by stacking Random Forest, Neural Network, Naive Bayes, and GBM, and optimized incentives with optim()
Outcome: Renewal probability prediction & optimized incentive for each policy

R

Data Analytics

Apple App Store Monetization

Problem: Increase App Store revenue
Approach: Analyzed the correlation between conversion rate and forms of payment; ran an A/B test promoting credit card over PayPal
Outcome: Achieved a 10.71% revenue increase based on the existing purchase flow

Python

Citi Bike User Engagement Analysis

Problem: Expand user base & improve rider engagement
Approach: Analyzed rider behavior, usage seasonality, and station traffic through data cleaning, inspection, and EDA
Outcome: Recommended 3 marketing strategies

Python

NYC Taxi Ride Exploratory Analysis

Problem: Which vendor makes it easier to find a cab?
Approach: Performed EDA on the NYC Taxi Rides dataset and visualized vendor availability by time and by geography
Outcome: CMT is the vendor with higher availability & more pickup options

Python

Longitudinal Medical Records Analysis

Problem: Assess the effectiveness of each cardiology medication
Approach: Cleaned the long-format dataset, summarized utilization and crude event rates, and computed unadjusted odds ratios
Outcome: Statins > ACE Inhibitors > Beta Blockers (tentative conclusion)

R
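
For readers unfamiliar with the terms in the approach above, a crude event rate and an unadjusted odds ratio can both be read off a 2x2 table of medication use against events. The sketch below uses Python for illustration, although the project itself was done in R. The function names and all counts are hypothetical and are not results from this analysis.

```python
def crude_event_rate(events: int, no_events: int) -> float:
    """Share of patients in a group who experienced the event."""
    total = events + no_events
    if total <= 0:
        raise ValueError("group must contain at least one patient")
    return events / total


def odds_ratio(exposed_events: int, exposed_no_events: int,
               unexposed_events: int, unexposed_no_events: int) -> float:
    """Unadjusted (crude) odds ratio from a 2x2 table: (a * d) / (b * c)."""
    cells = (exposed_events, exposed_no_events,
             unexposed_events, unexposed_no_events)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    return (exposed_events * unexposed_no_events) / (
        exposed_no_events * unexposed_events)


# Hypothetical counts: patients on a medication vs. not on it.
on_drug_events, on_drug_no_events = 20, 80
off_drug_events, off_drug_no_events = 40, 60

rate_on = crude_event_rate(on_drug_events, on_drug_no_events)
rate_off = crude_event_rate(off_drug_events, off_drug_no_events)
crude_or = odds_ratio(on_drug_events, on_drug_no_events,
                      off_drug_events, off_drug_no_events)

print(f"event rate on drug={rate_on:.2f}, off drug={rate_off:.2f}, "
      f"unadjusted OR={crude_or:.3f}")
```

An odds ratio below 1 indicates lower odds of the event in the medicated group. Because the ratio is unadjusted, it does not account for confounders such as age or comorbidities.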

YouTube Trending Videos Analysis

Problem: What can we learn from trending videos, and how can a video poster gain more views?
Approach: Derived insights from a trending videos dataset using EDA, text mining, sentiment analysis, and linear regression
Outcome: Created 4 actionable strategies to drive more traffic for video posters

R

Online Education System Evaluation

Problem: Assess the effectiveness of the target e-learning system
Approach: Analyzed the time efficiency, inspected correlations with outcomes, and evaluated the exam performance over time 
Outcome: This digital system has no obvious positive impact on students' exam performance

R

Data Visualization

Marketing Survey Reporting Engine

Problem: Extract insights from survey results with just a few button clicks
Approach: Developed a dynamic reporting tool to visualize all the results using the flexdashboard package
Outcome: an interactive reporting engine

R Shiny

Economy Dataset Storytelling

Problem: What are the hidden correlations in the economy dataset?
Approach: Told the story by connecting the internal "dots" with external research
Outcome: an infographic - “An Unfiltered Exploration Into The ‘80s”

Tableau

UFO Report Data Storytelling

Problem: Do UFOs really exist?
Approach: Visualized interesting patterns in the UFO reports data and verified findings by merging external datasets
Outcome: an infographic - “A Look at the Past 600 Years of UFO ‘Conspiracy’” 

Tableau

Database Design / Data Lake Architecture

Home Property Database & Insight Sharepoint Design

Problem: Accelerate data communication for both technical and non-technical teammates
Approach: database normalization, schema design, ERD creation, data integrity inspection, data ETL, and dashboard engine setup
Outcome: a newly established database and an interactive analytical Sharepoint

Python, PostgreSQL

Data Lake Architecture Design for a Fashion Recommender Engine 

Problem: Design the data lake architecture to support a social media data stream
Approach: incorporated MongoDB and Neo4j into a broader data lake configuration consisting of Hadoop, AWS, and Spark tools
Outcome: recommendations for a full data infrastructure for Stitch Fix

(conceptual)

Experiment Design

A Research Study on Increasing Commercial Video Traffic

Problem: Design a study to uncover the essential aspects of commercial videos in the beauty industry
Approach: defined the management and research questions, hypotheses, methodology, and sampling plan
Outcome: Designed a research proposal 

(conceptual)

UX Design

An Open Data Analytical Application For New Yorkers

Problem: Improve the user experience and analytic application usability for NYC Open Data
Approach: analyzed five usability tests, created and iterated on revised wireframes, built a prototype, and ran a final usability test
Outcome: a final report with a prototype 

Balsamiq, InVision

experiences

2018 - Present

07/2019 - Present

Data Scientist

at Deloitte Digital

Support decisioning in the retail domain by building statistical models, creating automated ML pipelines, engineering features, tuning hyperparameters, and interpreting black-box models

02/2019 - 05/2019

Graduate Research Assistant at Columbia Business School

Collected and analyzed behavioral study data in R and organized in-lab quantitative research at the Behavioral Research Lab

09/2018 - 12/2018

Analytics Consultant 

at HubSpot

Constructed and presented six actionable strategies based on critical insights derived from data analysis and hypothesis testing

(Capstone Project)

01/2018 - 05/2019

M.S. in Applied Analytics at Columbia

Graduated from the STEM-designated M.S. in Applied Analytics program; core courses included Research Design, Machine Learning, and Applied Data Science

Events

contact.

"... I was extremely impressed by Claire's technical dexterity throughout the Capstone Project. As a team leader, she was also very responsible for monitoring her team’s performance from the beginning of the project until the conclusion. I highly recommend Claire in any professional and academic capacity in business analytics and data science."

Eli Joseph, Faculty Member at Columbia University, TEDx Speaker, Forbes Under 30 Scholar

"Claire is a quick starter and was able to learn and adapt quickly. At any given time, she has to manage multiple marketing projects in a very dynamic environment. She is equipped with good project management, communication, presentation, independent thinking, and cross-functional collaboration skills to deliver her role."

CK Tan, Vice President at The Lubrizol Corporation

"As one of Claire's career coaches at Columbia, I've consistently been impressed and amazed by her initiative. Each time we've spoken, I discover something new that she's gotten involved with. A new part-time job, project, or even building a personal website, Claire raises the bar when it comes to dedicating herself to her professional development. She's extremely focused, welcoming of new opportunities, and carries herself with grace and humility. Claire is also one of the kindest people I know. No matter where she finds herself in her career, she'll make a positive impact. "

Gregory Costanzo, Associate Director of Industry Relations at Columbia University

Email
LinkedIn
GitHub