Data Analysis and Visualizations using Python

This project is part of the “IBM Data Science Professional Certificate” on Coursera. You can check my Jupyter Notebook on GitHub.

Los Angeles Skyline (Image Source: Wikipedia)


Los Angeles is a vibrant city with many neighborhoods, each with its own character. Some neighborhoods are quiet and cozy and have conveniently located stores, while others offer plenty of entertainment and nightlife. Choosing a neighborhood to live in or to open a business in can be complicated, but location data from Foursquare and crime data can make it a little easier.

Business Problem

The objective of this capstone project…

With k-Fold Cross Validation (from scratch)

Neighbors (Image Source: Freepik)

In this article, we will see how the k-Nearest Neighbors (kNN) algorithm works and build it from the ground up. We will also evaluate our implementation using k-Fold cross-validation, which is likewise developed from scratch.

After completing this tutorial you will know:

  • How to code the k-Nearest Neighbors algorithm step-by-step
  • How to use k-Nearest Neighbors to make a prediction for new data
  • How to code the k-Fold Cross Validation step-by-step
  • How to evaluate k-Nearest Neighbors on a real dataset using k-Fold Cross Validation

Prerequisites: A basic understanding of Python and of classes and objects from object-oriented programming (OOP).

k-Nearest Neighbors
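
As a rough illustration of the two ideas named above, here is a minimal sketch of kNN classification with k-Fold cross-validation in plain Python. It is not the article's implementation (which, given the prerequisites, is presumably class-based); the function names `euclidean`, `knn_predict`, `k_fold_indices`, and `cross_val_accuracy` are hypothetical, and the sketch assumes numeric features and data that has already been shuffled.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Sort training points by distance to the query and keep the k closest.
    neighbors = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(X, y, k_neighbors=3, k_folds=5):
    """Return one accuracy score per fold (assumes X and y are already shuffled)."""
    scores = []
    for test_idx in k_fold_indices(len(X), k_folds):
        held_out = set(test_idx)
        # Train on every point that is not in the current test fold.
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return scores
```

Averaging the per-fold scores gives a single estimate of how well a given number of neighbors generalizes; a real run would shuffle the dataset first so that each fold contains a mix of classes.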


An introduction to the phases of an NLP pipeline

Natural Language Processing (Source: Wootric)

Natural Language Processing (NLP) is one of the fastest-growing fields in the world. It is a subfield of artificial intelligence that deals with interactions between humans and computers. The main challenges in NLP involve speech recognition, natural language understanding, and natural language generation. NLP is making its way into a number of products and services that we use every day. This article gives an overview of a common end-to-end NLP pipeline.

The common NLP pipeline consists of three stages:

  • Text Processing
  • Feature Extraction
  • Modeling
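
The three stages listed above can be sketched end to end with the standard library alone. This is a toy illustration under stated assumptions, not the article's code: the stop-word list, the sample sentences, the bag-of-words features, and the nearest-centroid classifier are all choices made here for brevity, and the function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "was", "and"}


def process_text(text):
    """Stage 1, text processing: lowercase, drop punctuation, tokenize, remove stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def extract_features(tokens, vocabulary):
    """Stage 2, feature extraction: bag-of-words counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def train_centroids(vectors, labels):
    """Stage 3, modeling: store the average feature vector of each label."""
    sums, per_label = {}, Counter(labels)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
    return {label: [v / per_label[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to `vec` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vec))


# Hypothetical training data for a two-class sentiment task.
docs = [
    "This movie was great and fun",
    "A great, fun film",
    "The plot was dull and boring",
    "It is a boring, dull movie",
]
labels = ["pos", "pos", "neg", "neg"]

tokenized = [process_text(d) for d in docs]
vocabulary = sorted({t for tokens in tokenized for t in tokens})
vectors = [extract_features(tokens, vocabulary) for tokens in tokenized]
centroids = train_centroids(vectors, labels)

print(predict(centroids, extract_features(process_text("a fun film"), vocabulary)))
```

Each stage only consumes the output of the one before it, which is what makes the pipeline easy to swap piece by piece, for example replacing the count vectors with TF-IDF weights or the centroid model with a trained classifier.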

Visualizations and Predictions on COVID-19 pandemic data

SARS-CoV-2 Structure (Source: Scientific Animations under CC License)


The COVID-19 pandemic, also known as the coronavirus pandemic, is the ongoing outbreak of coronavirus disease (COVID-19). It is caused by a coronavirus called severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2).

The outbreak was identified in Wuhan, China, in December 2019. The World Health Organization declared the outbreak a Public Health Emergency of International Concern on 30 January 2020, and a pandemic on 11 March 2020.

It is a respiratory disease and is thought to spread mainly through close person-to-person contact, via respiratory droplets from someone who is infected. People who are infected often have symptoms of illness. Some people without symptoms…

Extract-Transform-Load (Source: Astera)


In general, a pipeline is a linear sequence of specialized modules used to design or execute a computer instruction in successive steps. Similarly, a data pipeline is a generic term for moving data from one place to another, for example from one server to another.

Of the many data pipeline methodologies available, this article briefly discusses the most widely used one, the ETL pipeline.

Outline of this article:

  • Introduction to ETL pipeline
  • Example
  • Stages in ETL pipeline

A small code example of an ETL pipeline can be found on my GitHub.
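
Independently of that example, the extract-transform-load idea can be sketched in a few lines of standard-library Python. Everything specific here is an assumption made for illustration: the inline CSV stands in for a source system, the `payments` table and its columns are hypothetical, and an in-memory SQLite database plays the role of the destination.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a source file or upstream server.
RAW_CSV = """name,amount
alice,10.5
bob,
carol,7
"""


def extract(source):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(source)))


def transform(rows):
    """Transform: drop rows with a missing amount and normalize types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((row["name"].title(), float(row["amount"])))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the destination table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory destination, for the sketch only
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(rows)
```

Keeping the three steps as separate functions mirrors the linear, stage-by-stage structure of a pipeline: each stage can be tested or replaced on its own, for instance by pointing `extract` at a real file or `load` at a production database.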

Introduction to ETL Pipeline

ETL is a specific kind of…

Airbnb Logo (Source: Internet)


Airbnb is an online marketplace for lodging, primarily bed-and-breakfast (B&B) stays. The company does not own any of the listings on the application; it acts as a broker and receives a commission on each booking. Started in 2008, the company is based in San Francisco, California, US.

The company was conceived after its founders put an air mattress in their living room, effectively turning their apartment into a bed and breakfast, to offset the high cost of rent in San Francisco; Airbnb is a shortened version of its original name,

Airbnb’s market share has been on…

Chaitanya Krishna Kasaraneni

An avid reader, Content Creator, Masters student, and Data Science Enthusiast
