I recently watched a webinar on Big Data hosted by We Cloud Data. I still have a lot to learn, but it helped to see the whole ecosystem on one slide, together with some examples of how the frameworks handle distributed storage and fault tolerance. I took a screenshot of their description of the Big Data Ecosystem:

Big Data Ecosystem

It was reassuring to hear that it is not necessary to master every infrastructure provider. One can focus on a single provider, and learning the others will then be straightforward.

On the learning side, I am currently taking the following courses:

  • DataCamp Data Scientist with Python.
    I have completed 12 courses and have 11 more to go.
  • CS50 on edX.
    I am working on Problem Set 5. I have neglected it a bit lately, but I can still finish by December.
  • Kaggle.
    Four courses down, eleven to go.
  • HarvardX Professional Certificate in Data Science.
    I am about to finish the second course out of nine.

By December, I think I will be able to finish both of my DataCamp career tracks: Data Scientist with Python and Machine Learning with Python. I also plan to have completed all the Kaggle courses and CS50.

As for the HarvardX Professional Certificate in Data Science, there are nine courses in total and, if I continue at my current rate, I will finish by December. That would be well ahead of schedule: the website says the certificate takes 17 months, and I believe I will finish in seven or eight. Some of the later courses may turn out to be harder, though, which could push completion to next year.

I also want to share that I submitted the paperwork for my Green Card in July. With everything going on right now, it is hard to tell how long it will take, so I’ll have to wait patiently and see how it goes. Wish me luck!