Claire Jiaying Wu
Data Advocator and Translator
about me
Hi, welcome to my website, and nice to e-meet you! My name is Jiaying Wu, and I go by Claire.
I have always been passionate about how scientific reasoning and business reasoning converge to drive core decisions. With over 8 years of professional experience, I bring a unique combination of skills in decision intelligence, marketing management, and cross-cultural communication, which enables me to translate data insights into executable business pivots.
Currently, I'm a Data Scientist at Deloitte Consulting, working with the Decisioning team of HUX (Human Experience). With my specialties in machine learning, feature engineering, and model interpretation, I answer key business questions in the retail domain, such as propensity, segmentation, LTV, price optimization, product return, and more. The tech stack of my daily work includes Python, Spark, SQL, Bash, Git, Luigi, Kubeflow, Docker, AWS, PyCharm, etc.
In my personal time, I also mentor students and career changers who are interested in dedicating themselves to data science. I thrive on creating a fulfilling life for myself and making a positive impact on the lives of others. I now live on the Upper West Side of Manhattan, New York. I would love to hear from you and connect with you through the channels below.
projects
Note: The projects below are mostly side and course projects from my time studying at Columbia. For the data science focus areas of my current work, please see here.
Machine Learning / Statistical Modeling
Improved the MAE by 5.67% using XGBoost, together with imputation, model selection, and hyperparameter tuning (grid search with cross-validation)
Data Analytics
Longitudinal Medical Records Analysis
Problem: Analyzed the effectiveness of each cardiology medication
Approach: Cleaned the long-format dataset, summarized utilization and crude event rates, and computed the unadjusted odds ratios
Outcome: Statins > ACE Inhibitors > Beta Blockers (potential conclusion)
R
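For readers unfamiliar with the crude event rate and unadjusted odds ratio mentioned in the approach above, here is a minimal Python sketch (the project itself was done in R). The function names and the counts are hypothetical and purely illustrative; they are not taken from the project's data or code.

```python
def crude_event_rate(events, total):
    """Share of patients in a group who experienced the event."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def unadjusted_odds_ratio(exposed_events, exposed_no_events,
                          unexposed_events, unexposed_no_events):
    """Crude odds ratio from a 2x2 table.

    With a = exposed_events, b = exposed_no_events,
    c = unexposed_events, d = unexposed_no_events,
    the odds ratio is (a / b) / (c / d) = (a * d) / (b * c).
    """
    if exposed_no_events == 0 or unexposed_events == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (exposed_events * unexposed_no_events) / (
        exposed_no_events * unexposed_events)


# Made-up counts for illustration only:
# 100 patients on a medication (20 with an event, 80 without),
# 100 patients not on it (40 with an event, 60 without).
rate_on_medication = crude_event_rate(20, 100)
odds_ratio = unadjusted_odds_ratio(20, 80, 40, 60)
print(rate_on_medication, odds_ratio)
```

An odds ratio below 1 means the odds of the event were lower in the group taking the medication. Because the figure is unadjusted, it does not account for confounders such as age or comorbidities.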
YouTube Trending Videos Analysis
Problem: What can we learn from trending videos, and how can I gain more views?
Approach: Derived insights from a trending videos dataset using EDA, text mining, sentiment analysis, and linear regression
Outcome: Created 4 actionable strategies to drive more traffic for video posters
R
Online Education System Evaluation
Problem: Assessed the effectiveness of the target e-learning system
Approach: Analyzed the time efficiency, inspected correlations with outcomes, and evaluated the exam performance over time
Outcome: This digital system has no obvious positive impact on students' exam performance
R
Data Visualization
Database Design
Data Lake Architecture
Home Property Database & Insight SharePoint Design
Problem: Accelerate data communication for both technical and non-technical teammates
Approach: database normalization, schema design, ERD creation, data integrity inspection, data ETL, and dashboard engine setup
Outcome: a newly established database and an interactive analytical SharePoint
Python, PostgreSQL
Data Lake Architecture Design for a Fashion Recommender Engine
Problem: Design the data lake architecture to support a social media data stream
Approach: incorporating MongoDB and Neo4j into a broader data lake configuration consisting of Hadoop, AWS, and Spark tools
Outcome: recommendations for a full data infrastructure for StitchFix
(conceptual)
Experiment Design
A Study on Increasing Commercial Video Traffic
Problem: Design a study to uncover the essential aspects of commercial videos in the beauty industry
Approach: defining the management & research questions, hypotheses, methodology, and sampling plan
Outcome: Designed a research proposal
(conceptual)
UX Design
An Open Data Analytical Application For New Yorkers
Problem: Improve the user experience and analytic application usability for NYC Open Data
Approach: analyzing five usability tests, drafting and iterating on revised wireframes, prototyping, and running a final usability test
Outcome: a final report with a prototype