Blog vs Kaggle vs GitHub: Choosing Where to Publish Your Data Science Portfolio
If you have read my beginner’s guide to building a data science portfolio, then you already know how important it is to demonstrate your skills, especially if you don’t have any work experience yet. The three most popular platforms for building a portfolio are GitHub, Kaggle, and a personal website or blog.
By the end of this blog post you should be able to pick one (or all three) of these platforms to host your portfolio.
In my opinion, all three of these platforms are needed to build a good portfolio that can help get you hired even without any work experience. Work on projects (either your own or from a competition), push them to a GitHub repository, and then write a clear, easy-to-follow blog post that tells a story or teaches someone something.
You can even combine your GitHub repo with your blog by building a static site. This blog is built with Hugo and deployed using Netlify from a GitHub repo.
GitHub
GitHub is one of the most popular platforms for storing code and building a portfolio, and it is used by developers all over the world. In addition, Git is a must-have skill for data science and developer roles.
Pros:
- It is pretty easy to set up your projects on GitHub, and most people are already familiar with the platform, so potential employers will be comfortable navigating your repositories.
- Public repositories are free.
- Editing code is simple: commits are cheap.
- GitHub lets people see what you have built and how you have built it.
Cons:
- Since GitHub is so popular, chances are that other candidates will be using the platform too. This can make it a bit more difficult to differentiate yourself from the crowd.
- GitHub shows your contribution activity to anyone viewing your profile. This means you must remain active on the platform, because it will be off-putting for potential employers if they see you haven’t done anything in a while.
Lastly, the README file is probably the most important part of each GitHub repository. GitHub automatically renders the Markdown in your README on the repository’s front page. It is the first thing anyone will see when they land on one of your repositories, so you need to make sure it stands out.
Blog
Hiring managers who land on your personal website or blog will see that you have gone the extra mile, which can be eye-catching and impressive.
Pros:
- A blog or personal website can help you stand out and get your foot in the door for a first interview.
- In addition to portfolio projects, you can write detailed blog posts, deepening your own knowledge while also helping others to understand complex topics.
- A blog allows you to create a data-based story or interactive project that can make quite the impression.
Cons:
- Building a personal website or blog takes a lot more time and skill than simply pushing code to a GitHub repo.
Kaggle
Kaggle is one of the most popular data science competition platforms. Not only can you compete against fellow data scientists, but you can also publish Kernels (now called Notebooks) on the platform and engage with other users on the forums.
Pros:
- Kaggle competitions can be a lot easier than ‘real-world’ projects, since the organizers define a clear goal, get the data for you, and even clean it into a usable form.
- A lot of new and experienced data scientists post notebooks on Kaggle describing their process, and you can learn a lot from them.
- Actively contributing to the forums can help you make a name for yourself, which can potentially bring new opportunities.
Cons:
- Having the goals and data handed to you is also a con, since you will not have demonstrated your expertise in defining a problem and cleaning data, which are some of the most important skills to have in data science.
Remember that one repository, one project, or one competition does not make you a data scientist. You need to show that you are dedicated and passionate, and that will require you to do multiple projects, continuously improving and iterating on them over time.