Biography

I am currently a researcher at the Machine Intelligence Technology Lab, DAMO Academy, Alibaba. My main focus is to break the language barriers across the Alibaba ecosystem.

My professional career is a mix of industrial research labs and startups. Previously, I spent two years at Textio, an augmented-writing startup, where I was responsible for machine learning models. Before that, I worked at Microsoft on machine learning models for wearable devices such as the HoloLens project, and I was a machine translation researcher at SDL. I also actively coach early-stage startups and young engineers in Vietnam.

Specialties: machine translation, natural language processing, speech recognition, speaker identification, machine learning, summarization, parsing, morphology, confidence estimation, language modeling.

Interests

  • Natural Language Processing
  • Speech Recognition & Synthesis
  • Computer Vision
  • Information Retrieval
  • Machine Learning

Education

  • PhD in Language Technology, 2012

    Carnegie Mellon University

  • MS in Computer Science, 2005

    Johns Hopkins University

  • BSc in Maths & CS, 2001

    Vietnam National University, Hanoi

Experience


Staff engineer

Alibaba

Jul 2018 – Present Bellevue, Washington
Breaking language barriers in the Alibaba ecosystem

Software engineer

Textio

Oct 2016 – Jul 2018 Seattle, Washington

As the first machine learning engineer, I helped build Textio’s core predictive engine and learning loop for the augmented writing platform, which is already used by thousands of companies worldwide.

  • Spearheaded the development of the Textio core models with cutting-edge technologies in statistical natural language processing and machine learning.

  • Designed, developed, shipped, and improved production features, such as prediction engines for equal opportunity employment, job type, and document type.

  • Created scoring models that helped increase predictive power significantly while preserving explainability and interpretability.


Research scientist

Microsoft

Jan 2014 – Oct 2016 Redmond, Washington

Worked on the next generation of wearable devices at Microsoft, e.g. HoloLens:

  • Brain-computer interfaces (BCI) with deep learning models, e.g. CNN, LSTM, and GRU, with a patent pending on eye-tracking technology.

  • Implemented speaker verification systems on DSP, including enrollment with MAP adaptation, verification with novel scoring methods, and a back-end training pipeline for GMMs.

  • Reduced the memory footprint and sped up the runtime of an i-Vector speaker recognition system with matrix factorization. Implemented averaged stochastic gradient descent with L2 regularization to train the sub-matrices.

Researched deep neural networks for brain-computer interfaces, i-Vectors, probabilistic linear discriminant analysis, matrix factorization, and DNNs for multiple-speaker identification.


Research scientist

SDL

Feb 2012 – Jan 2014 Los Angeles, California

R&D in commercial machine translation systems.

  • Model adaptation: worked on techniques to automatically adapt a background translation system to a specific domain or genre using information retrieval and machine learning methods.

  • Confidence estimation: explored methods for machine translation quality prediction, including SVMs and M5P decision trees. Member of the SDL Language Weaver team that won the 2012 MT quality prediction competition.

  • Reordering models: implemented lexicalized reordering models with a distributed Hadoop/Pig training pipeline and real-time decoding.

Skills

Machine translation

Build MT models from end to end

Speech & language processing

Can understand my kids

Software engineering

Enough to get things done on time

Machine learning

Able to explain deep learning to my grandma

Data munging

Extract gold from dirt

Product development

Turn research ideas into business opportunities

Publications

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge …