All the tech and tools needed to be a well-rounded data scientist
Asked on 2020-11-28 19:02 by H L.
I’ve been trying to broaden my data science skills this past holiday, and I will admit it’s been a bit challenging. Data science is not just about domain knowledge and models. There’s the cloud piece, the Git piece, the programming piece, and other moving targets as tech and tools evolve.
From job descriptions, you do see there’s a need for a bit of everything.
With that said, what do you think are the top xx items that data scientists should focus on and incorporate (either in practice or in their portfolio)? And what are some good resources that have worked for you?
The few items I’m listing below are ones I lack greatly and am trying (but having trouble) to grasp for various reasons (it just seems like so much, a bit overwhelming, and I’m unable to connect the dots to what I do daily):
Cloud services - AWS, GCP, Azure... which platform?
Object-oriented programming
Deployment concepts (Airflow/Luigi/plumber, CI/CD, unit testing)
Flask/Dash (in case I want to “practice” model deployment)
Response #1 By Michelle W. on 2020-12-08 18:38
I'm still in school, but my profs have discussed this a bit. They tend to focus on two things: 1) R/Julia/Python or other languages, and 2) the underlying database technologies. Beyond that, they emphasize flexibility: concentrate on solving the business problem and pick up technologies as they are needed.
Response #2 By Sarvesh S. on 2020-12-22 16:03
I would not worry about a lot of these technologies. The focus should be on understanding the theoretical concepts of stats and ML and being able to apply them to a real business problem. A solid understanding of Python and SQL should cover the rest for an entry-level role. Almost all the other things you described can be learned on the go or while doing projects for your portfolio.