Understand the world around us using data science

Photo by Daniel Schludi on Unsplash

Currently, everybody's mind is occupied with the subject of COVID vaccinations. So in this story, you will see how we can use data science to understand what people think about vaccinations and what they discuss about them.

I will take tweets related to the Pfizer vaccine and show you:

Sentiment analysis, which will tell us what people think about the vaccinations

Topic detection, which will give you insight into what people discuss about the vaccinations

News impact analysis, to see if any news has an impact on people’s tweets on this subject
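To make the first item concrete, here is a minimal sketch of one very simple way to score sentiment: counting positive and negative words. It is not necessarily the method used in the story. The word lists, the function names `sentiment_score` and `label`, and the example tweets are all made up for illustration.

```python
import re

# Tiny hand-made word lists, purely illustrative (not a real sentiment lexicon).
POSITIVE = {"good", "great", "safe", "relieved", "happy", "thankful"}
NEGATIVE = {"bad", "scared", "worried", "sore", "afraid", "angry"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score: int) -> str:
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example tweets, not real data.
tweets = [
    "Got my Pfizer shot today, feeling relieved and thankful",
    "Arm is sore and I am worried about side effects",
    "Second dose booked for next week",
]

labels = [label(sentiment_score(tweet)) for tweet in tweets]
for tweet, tweet_label in zip(tweets, labels):
    print(f"{tweet_label:>8}: {tweet}")
```

A word-count approach like this ignores negation and sarcasm, so in practice a trained sentiment model would likely be a better choice. The sketch only shows the idea of turning text into a sentiment label.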

Sentiment Analysis

When we speak about sentiment…

Understanding an outlier is more important than finding it

Understanding Outliers — Image by author

Finding outliers is technical, but understanding outliers is a mix of skill and art. As data scientists, we tend to “remove” outliers because they mess up a predictive model, and this practice has led to the wrong conclusion that an outlier is something bad that must be removed.

Instead, the focus should be on understanding the outliers. In this story, I will explain some techniques for better understanding outlier data points.

We will take a dataset of cars which has columns such as make, fuel-type, aspiration, num-of-doors, price, etc.
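As a rough illustration of the difference between finding and understanding, the sketch below flags price outliers with the common 1.5 × IQR rule and then looks at which makes and fuel types the flagged cars have. The rows are invented, the IQR rule is an assumption (the story may use a different technique), and only a few of the columns mentioned above are included.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical rows, invented for illustration; not the story's dataset.
cars = [
    {"make": "toyota", "fuel-type": "gas", "price": 6000},
    {"make": "honda", "fuel-type": "gas", "price": 7000},
    {"make": "toyota", "fuel-type": "gas", "price": 7500},
    {"make": "honda", "fuel-type": "gas", "price": 8000},
    {"make": "toyota", "fuel-type": "diesel", "price": 8500},
    {"make": "honda", "fuel-type": "gas", "price": 9000},
    {"make": "toyota", "fuel-type": "gas", "price": 9500},
    {"make": "honda", "fuel-type": "gas", "price": 10000},
    {"make": "bmw", "fuel-type": "gas", "price": 36000},
    {"make": "mercedes-benz", "fuel-type": "diesel", "price": 41000},
]

# Step 1 (finding): flag prices outside 1.5 * IQR of the quartiles.
prices = [car["price"] for car in cars]
q1, _, q3 = quantiles(prices, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [car for car in cars if not lower <= car["price"] <= upper]

# Step 2 (understanding): see which categories the outliers belong to.
outlier_makes = Counter(car["make"] for car in outliers)
outlier_fuels = Counter(car["fuel-type"] for car in outliers)

print("Outlier makes:", dict(outlier_makes))
print("Outlier fuel types:", dict(outlier_fuels))
```

In this toy data the flagged cars all come from premium makes, which suggests they are a legitimate segment of the data and not errors to be deleted.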

Being euphoric right from 1st Jan

This article is my New Year greeting to you. I had far simpler ways to wish you an excellent year, such as a text message with nice words or a copied and pasted New Year greeting image.

However, I want to start the new year with a lot of optimism and euphoria. So here is how I went about it.

Collecting data related to new year wishes

First of all, I searched for all keywords related to new year's wishes. I came up with the list shown here.

Greetings data — Image by author

It is organized as follows:

  • The first column is a category column such as Health, Family, Professional, Prosperity, etc.
  • The second column…

From understanding flow to a quick trick to replace machine learning

Photo by Solen Feyissa on Unsplash + Image by Author

Sankey charts have recently become one of the important visualisation techniques for advanced analytics. They have both characteristics of any awesome visualisation: 1. They can look visually stunning. 2. They give very useful insights.

However, a visualisation makes sense only if it is used in a certain context and for a certain purpose. For example, using a bar chart to show a sales trend is not as effective as using a trend chart. Similarly, a scatter plot does not make sense if the data does not have enough variance.

So here I would like to state the use-cases where Sankey charts make sense.

Analysing flow

When Sankey…

Top Use-cases for customer analytics and how to do them

The top domain in which advanced analytics and data science are used is understanding customers. They are used there far more than in any other domain, such as supply chain, IoT, or finance. The reason is obvious: the success of your business depends on how well you understand your customers. Everything else falls into place once you start to understand your customers effectively.

Customer analytics is a very wide area. Moreover, every industry analyses customers in a different way. …

Let data do the talking and leave bias and emotions out of the stock game

Data-driven means that your decisions are driven by data and not by emotions. This approach can be very useful in stock market investment. Here is a summary of a data-driven approach which I have been taking recently.

Stop analysing stock by stock

Being a data scientist, the one thing which I do not like about existing stock market tools is that the analysis is done in a very crude way. The analysis still has to be done for each stock symbol, something like this:

Choose a stock, Analyse it, decide on buy/sell

Go to second stock, Analyse it, decide on buy/sell

Go to…

How data scientists can go beyond coding and elevate themselves

Photo by Mario Azzi on Unsplash

Let us start with a small quiz

Question 1. What makes you satisfied?

Option A. Complex data science code which can detect lung cancer with high accuracy

Option B. People's life expectancy increasing thanks to machine learning prediction capability

Question 2. What do your blogs and articles look like?

Option A. There is a lot of code in your blogs. You focus on the technical explanation of how an algorithm or a data science technique works

Option B. Your blogs do not contain code. You focus on the usefulness and purpose of data science techniques

Question 3. How do you convince your…

Enhance the power of your data exploration using textual explanations

The popular saying “A picture is worth a thousand words” may be wrong when it comes to data science. Take the example of the Uber Estimated Time of Arrival (ETA) algorithm, which informs the user when the ride is expected to arrive.

Behind the ETA, there are a lot of complex predictive algorithms and cutting-edge visualisation, with the map updated in real time. But all this is of no use without the single line of text which says “The closest driver is approximately 1 min away.”

Uber Estimated Time of Arrival (ETA) algorithm in action

A data scientist or data analyst produces lots of data visualisation during a data…

Why treating models like data is a very strategic approach

Photo by Alexander Sinn on Unsplash

Here is a very abstract question: what does an AI or data science model look like? We all use data science models in our day-to-day life. Most people who aren’t data scientists have experienced a data science model but have never seen one. So, let me reveal the secret. It may look scary. Here is what a data science model looks like.

Code-free way for teachers to better understand students' marks

Photo by Science in HD on Unsplash

Teachers and educational institutes deal with a lot of data related to student marks. In this story, you will see how they can use data science and advanced analytics to get insights into student marks data.

For this tutorial, we assume that an English teacher with a class of 26 students would like to:

  • Better understand student marks
  • See if performance in one subject can affect performance in another
  • Compare marks by gender
  • Compare marks by student

Getting the Data

The first step is to get the data. The teacher collects all the students' scores in Excel. …

Pranay Dave

Data to Insights. Creator of the YouTube channel Data Science Demonstrated
