© 2019 pSilent Partners Ltd

What is a Data Scientist?

Caveat emptor: we have not found a one-size-fits-all description of a Data Scientist that covers every role at every company out in the wild, nor do we believe in absolutes (we're data scientists). Do your own research and know your goals before taking ours at face value.

The Job Description

The ideal candidate for a role in Data Science has the ability to recognize sublime patterns within a subjective reality and surface probabilistic causality using mathematics as the backbone for all narratives relating to past observations and future predictions.

More plainly, the ideal candidate can intuitively see, explain, and predict patterns in reality with programmable math.

Furthermore, an expert in the field will develop algorithmic models capable of intelligently adapting to changes in subjective reality.

Presentation is a key factor in the success of a career-long data scientist, and connection with stakeholders is the operative one. Maintaining a shared professional understanding of a theory or discovery across a team is no small feat. The ideal data science team leader will present models to the organization in such a manner that both technical and non-technical stakeholders come away with the understanding they need.

Day to day, the data scientist will be immersed in tasks relating to intelligent systems rooted in machine learning methods. These methods analyze components of information for statistical significance and offer a clear path to automation for the processes they attempt to explain.
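
To make "statistical significance" a little more concrete, here is a minimal sketch of one way to test it: an exact permutation test on the difference between two group means. The function name and the sample numbers are ours, invented purely for illustration, and this is only one of many valid approaches.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of means.

    Enumerates every way of splitting the pooled data into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating point noise in the means.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical measurements, made up for this example.
treated = [12, 14, 15, 16]
control = [8, 9, 10, 11]

p_value = permutation_p_value(treated, control)
print(f"p-value: {p_value:.4f}")
```

With these made-up numbers, only 2 of the 70 possible splits are as extreme as the observed one, so the p-value is small. A real analysis would also need to account for sample size, experimental design, and multiple comparisons.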

The Skillset

A data scientist's hard skills aren't limited to the core list we're offering below. A PhD or Master's degree in statistics is not a prerequisite for success either, though it certainly suggests a higher probability of depth and breadth of competency. Primarily, we look for three core competencies: Programming, Math, and Curiosity.

Nonetheless, the ideal data scientist can demonstrate mastery in...

Computational Competencies

  1. At least two computer programming languages. Three or more preferred.

  2. Applied understanding of probability and statistics.

  3. Data visualization methods and related tools.

  4. Experimental design and adaptive experimentation methods.

  5. Data collection and cleaning methods

  6. Relational database management with SQL, or document database design with NoSQL

  7. Applied Mathematics - Linear Algebra and Calculus

  8. Algorithms and data structures - Sorting, Trees, Graphs, and Data Topology Design

Modeling Competencies

  1. Linear regression models and their evaluation methods

  2. Classifiers and their evaluation methods

  3. Principal Component Analysis

  4. Supervised learning methods - KNN Classifiers; Random Forest Classifiers; Logistic, Ridge, and Lasso Regression Methods; Support Vector Machines; Boosting Models

  5. Unsupervised learning methods - K-means, mean-shift, spectral, and affinity propagation clustering methods

  6. Neural Networks and Deep Learning - Supervised NN; Unsupervised NN; Recurrent NN; Convolutional NN; Long Short-Term Memory NN; Deep Learning (with hidden layers); Reinforcement Learning

  7. Natural Language Processing - LSA, PCA, LSTM, Vectorization, N-Grams, and Contextual Analysis Methods

  8. Computer Vision - TensorFlow/Keras; OpenCV; YOLOv3; MNIST
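
As a small illustration of the first item in this list, the sketch below fits a simple linear regression by ordinary least squares and evaluates it with R-squared, using only the Python standard library. The function names and the data points are hypothetical. In practice a library such as scikit-learn would normally do this work.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical observations, made up for this example.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
score = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={score:.3f}")
```

An R-squared close to 1 only says the line fits these points well. Evaluating a model properly also means checking it against data it was not fitted on.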

General Data and Experimentation Competencies

  1. Web scraping methods

  2. API Design Methods

  3. Big Data Analysis with Hadoop, Kafka, Spark

  4. Survey Design

  5. Data normalization and distribution methods
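
For the last item, we are assuming "normalization" in the statistical sense of feature scaling rather than database normalization. Under that assumption, here is a minimal sketch of two common approaches, z-score standardization and min-max scaling. The function names and the sample values are ours, for illustration only.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale constant data")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements, made up for this example.
ages = [18, 25, 32, 47, 60]

print(z_score(ages))
print(min_max(ages))
```

Which one to reach for depends on the model: z-scores are less distorted by the exact minimum and maximum, while min-max scaling keeps every value inside a fixed range.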

The Future

The old world rested upon absolutes. The world was flat until it wasn't. To date, humanity still reasons largely with unconscious absolutes, while information in the irreverent Universe keeps knocking on the doors of our collective perception, begging us to consider that everything is possible, if not always probable.

If you're not one to be easily convinced, and you or someone you know has a skill set that matches the description above, then you're among the few who have the tools to push open the doorway into universal perception while fusing the qualitative nature of intuition with the quantitative density of mathematical reasoning. If that's so, then we invite you to comment or reach out to us to start a dialogue and explore the possibilities of invention without limits.

Thank you for reading. Please leave your comments below.