My data science skill set

Programming language

  • Python (ML + visualization)

Machine learning

  • XGBoost - parameter tuning best practices, reason codes, model deployment in Python / Scala-Spark
    • Talk slides - Big Data Utah meetup, 2016
    • Ranked 10 / 3274 as a solo team in Kaggle's Sberbank Russian Housing Market competition (2017)
  • Other common ML algorithms like random forests, logistic regression, SVM, etc.

Database

  • SQL Server (1+ year of experience)
  • Neo4j (graph database) - graph algorithms and APOC, complex Cypher query design

Visualization

Big data tools / platform

  • Spark for ML pipelines

Data science articles

  • XGBoost deployment made easy   Link
  • Plotting decision boundaries in 3D - Logistic regression and XGBoost   Link