Aniket Chakraborty
A final-year MSc Data Science student at Manipal Academy of Higher Education. I started my data science journey in 2019 with Mathematics. I have completed over 15 self-hosted projects and several professional certificate courses so far.
My Skills
- Python
- SQL
- R Programming
- Tableau
- ChatGPT
- PowerPoint
- Spreadsheet
Educational Qualification
- Baranagore Ramakrishna Mission Ashram High School (2007-2017)
- Baranagore Narendranath Vidyamandir (2017-2019)
- Ramakrishna Mission Vivekananda Centenary College (2019-2022) (BSc(H) in Mathematics)
- Manipal Academy of Higher Education (2022- present) (MSc in Data Science)
My Certificate Courses
Courses Learned in My Data Science Career
- Python for Data Science
- R for Data Science
- Mathematics for Data Science
- Statistics
- SQL
- Time Series Analysis
- Machine Learning
- Deep Learning
- Simple and Multiple Linear Regression
- Logistic Regression
- Natural Language Processing
- Computer Vision
- Report Writing
My Data Science and Data Analysis Projects
- Python is used to complete the project
- Timeline: May, 2023
- A simple data analysis project that falls under exploratory data analysis (EDA)
- The pandas, NumPy and Matplotlib libraries are used
- Compares the scores obtained by one person in different years using graphs
- Python is used for this project
- Timeline: September, 2023
- A simple EDA project
- General data science libraries are used
- Helps to understand and analyse the regular human vitals captured by smart watches
- R programming language is used
- Timeline: May, 2023
- Basic data science libraries such as tidyverse, ggplot2 and dplyr are used
- EDA is done to uncover hidden information and patterns
- The most important inference is that patients in the 20-25 age range are the most affected by diabetes, which may be a result of poor eating habits
- Python is used
- Timeline: December, 2023
- Standard Python data science libraries are used
- This project shows the rate of immigration to Canada from 1980 onwards
- Some distinctive plots such as area plots, regression plots and bubble plots are used
- Inspired by the IBM Data Science data visualization segment
- Software used: Python, SQL, Tableau
- Timeline: November, 2023
- Data analysis on real-world data sourced from Google
- Infers why India and Australia were the finalists
- Discovers some hidden patterns inside the data
- Python Folium is used
- Timeline: November, 2023
- Marks some important cities on a map of India
- Uses latitude and longitude to mark each position
- Python programming language
- Timeline: September, 2023
- A t-test is used to check whether the group means are equal
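The comparison of group means can be sketched as follows. This is a minimal illustration, not the project's actual code: the two samples are made-up numbers, and the choice of Welch's variant (`equal_var=False`) and of a 0.05 significance level are assumptions.

```python
from scipy import stats

# Hypothetical sample data: scores for two independent groups.
group_a = [72, 85, 78, 90, 66, 81, 77, 88]
group_b = [64, 70, 75, 68, 73, 61, 69, 72]

# Welch's t-test (equal_var=False) does not assume equal group variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05  # significance level, assumed for this sketch
reject_null = p_value < alpha  # True means the group means differ significantly

print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

A small p-value suggests the group means are unlikely to be equal, while a large one means the data do not contradict equal means.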
- R programming is used
- Timeline: May, 2023
- In R, lm() function is used to build the model
- The adjusted R-squared score is used to judge how well the model fits
- The p-value indicates whether a feature is statistically significant
- If the p-value is larger than the significance level (0.05), we drop that feature from the linear model
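The two rules above can be written out as small helper functions. The project itself uses R's lm(); the sketch below is in Python only for illustration, and the function names `adjusted_r2` and `keep_feature` are hypothetical.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


def keep_feature(p_value: float, alpha: float = 0.05) -> bool:
    """Keep a feature only if its p-value does not exceed the significance level."""
    return p_value <= alpha


# Hypothetical numbers: R^2 of 0.8 from 101 observations and 5 features.
print(round(adjusted_r2(0.8, 101, 5), 4))
print(keep_feature(0.20), keep_feature(0.01))
```

Unlike plain R-squared, the adjusted score goes down when a feature adds little explanatory power, which is why it is the more useful figure when comparing models with different numbers of features.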
- R programming is used
- Timeline: May, 2023
- In R, the glm() function is used to fit the model
- Deals with probability
- Predicted values range between 0 and 1
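The reason the predictions stay between 0 and 1 is the logistic (sigmoid) function applied to the linear predictor. The project uses R's glm(); the sketch below is a plain Python illustration, and the coefficients `b0` and `b1` are made-up values.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical fitted coefficients for a one-feature model.
b0, b1 = -4.0, 0.08

for x in (20, 50, 80):
    probability = sigmoid(b0 + b1 * x)
    print(f"x = {x}: predicted probability = {probability:.3f}")
```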
- R programming is used
- Timeline: June, 2023
- The designs are of the form l^f, where f is the number of factors and l is the number of levels of each factor
- Used when there is more than one factor, each with one or more levels
- They have a special geometric shape depending on the number of factors and levels
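A full factorial design simply lists every combination of factor levels, which can be sketched with the standard library. The project itself is done in R; this Python version is only an illustration, and the helper name `full_factorial` and the coded levels -1 and +1 are assumptions.

```python
from itertools import product


def full_factorial(levels_per_factor):
    """Return every run of a full factorial design as a list of tuples."""
    return list(product(*levels_per_factor))


# A 2^3 design: three hypothetical factors, each at two coded levels.
design = full_factorial([(-1, 1)] * 3)

for run in design:
    print(run)
print(len(design))  # 8 runs, i.e. 2**3
```

With two levels per factor, the runs of a 2^2 design sit on the corners of a square and those of a 2^3 design on the corners of a cube.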
- Python programming is used
- Timeline: December, 2023
- Python plotting libraries such as Matplotlib and seaborn are used
- The def keyword is used to define functions
- Python programming is used
- Timeline: August, 2023
- The def keyword is used to define functions for the interior and exterior angles
- Each interior angle of a convex polygon is less than 180 degrees
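The angle functions might look like the sketch below. It assumes regular polygons, and the function names are hypothetical rather than taken from the project.

```python
def interior_angle(n_sides: int) -> float:
    """Interior angle of a regular polygon, in degrees."""
    if n_sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    return (n_sides - 2) * 180 / n_sides


def exterior_angle(n_sides: int) -> float:
    """Exterior angle of a regular polygon, in degrees."""
    if n_sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    return 360 / n_sides


for n in (3, 4, 6):
    print(n, interior_angle(n), exterior_angle(n))
```

The interior and exterior angles at a vertex always add up to 180 degrees, so the interior angle approaches but never reaches 180 as the number of sides grows.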
- Python programming is used
- Timeline: December, 2023
- The ML library scikit-learn is used in this project
- The metrics module is used to check accuracy
- Supervised machine learning process
- The regression is carried out using training and test sets
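The train/test workflow can be sketched as follows. The data here are synthetic, and the choice of `LinearRegression`, the 80/20 split and the reported metrics are assumptions made for illustration.

```python
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption): y depends linearly on x, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=200)

# Hold out 20% of the rows so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = metrics.mean_absolute_error(y_test, y_pred)
r2 = metrics.r2_score(y_test, y_pred)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```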
- Python Programming is used
- Timeline: December, 2023
- scikit-learn's KNeighborsClassifier is used
- The metrics module is used to check accuracy
- Supervised Machine Learning Process
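A minimal sketch of this kind of classifier is shown below. The iris dataset, the value k = 5 and the stratified 80/20 split are stand-ins chosen for illustration, not details of the project.

```python
from sklearn import metrics
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# The iris data stands in for the project's dataset (an assumption).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Each test point is labelled by a majority vote of its 5 nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
accuracy = metrics.accuracy_score(y_test, knn.predict(X_test))
print(f"accuracy = {accuracy:.3f}")
```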
- Python Programming is used
- Timeline: December, 2023
- scikit-learn's DecisionTreeClassifier is used
- Supervised Machine Learning process
- Metrics module is used to get the accuracy
- Python is used in this project
- Timeline: March, 2024
- scikit-learn's KMeans and metrics modules are used
- The within-cluster sum of squares (WSS) is calculated to create an elbow plot
- Silhouette scores are calculated to find the optimal number of clusters
- Cluster profiles are created along with Z-score profiles
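The elbow and silhouette calculations can be sketched as below. The three synthetic blobs, the range of k values and the settings `n_init=10` and `random_state=0` are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data with three well-separated groups (an assumption).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)
X_scaled = StandardScaler().fit_transform(X)  # Z-transform each column

wss, sil = {}, {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    wss[k] = km.inertia_  # within-cluster sum of squares, used for the elbow plot
    sil[k] = silhouette_score(X_scaled, km.labels_)

best_k = max(sil, key=sil.get)  # k with the highest silhouette score
print(wss)
print(sil)
print("best k by silhouette:", best_k)
```

Plotting `wss` against k gives the elbow plot, while the silhouette scores give a second, independent indication of the number of clusters.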
- Python is used in this project
- Timeline: March, 2024
- Just like Clustering (K-Means), for this project, the data in hand must be numeric and scaled (Z-transformed)
- Sklearn is not used. Instead, I use SciPy's dendrogram and linkage functions
- The dendrogram makes it clear how the tree-like structure is formed.
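A minimal sketch of the SciPy workflow is shown below. The synthetic data and the Ward linkage method are assumptions, and `no_plot=True` is used so that the example runs without drawing a figure.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.stats import zscore

# Synthetic numeric data: two loose groups of five points each (an assumption).
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(0, 1, size=(5, 2)),
    rng.normal(8, 1, size=(5, 2)),
])

scaled = zscore(data, axis=0)  # Z-transform each column

# Each row of Z records one merge: the two clusters joined, their distance,
# and the size of the new cluster.
Z = linkage(scaled, method="ward")

# no_plot=True returns the tree layout without rendering it.
tree = dendrogram(Z, no_plot=True)
print(Z.shape)
print(tree["leaves"])
```

With n observations there are always n - 1 merges, which is how the tree is built from single points up to one cluster. Dropping `no_plot=True` draws the dendrogram with Matplotlib.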
- Python is used to complete the project
- Timeline: March, 2024
- Linear Regression, Logistic Regression, KNN, Decision Tree, SVM
- MAE, MSE, R2, Accuracy, Jaccard Index, F1 score, Log Loss, confusion_matrix
- numpy, pandas, matplotlib, sklearn, itertools, warnings, os, magic functions
- Python's NumPy, pandas, seaborn, Matplotlib and SciPy libraries are used
- Boxplots and histograms are used to analyse the distribution
- A normality check is done using the Shapiro test
- A t-test is done on both gender and test scores
- A correlation plot is drawn with the heatmap function to show the pairwise correlation between scores.
- Timeline: May, 2024
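The normality check and the pairwise correlation can be sketched as follows. The score columns are synthetic, and the names `math_scores` and `reading_scores` are hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic scores for two hypothetical subjects.
rng = np.random.default_rng(0)
math_scores = rng.normal(65, 10, size=100)
reading_scores = 0.8 * math_scores + rng.normal(15, 6, size=100)

# Shapiro-Wilk test: a small p-value suggests the data are not normal.
w_stat, p_normal = stats.shapiro(math_scores)
print(f"Shapiro W = {w_stat:.3f}, p = {p_normal:.3f}")

# Pairwise Pearson correlation between the two score columns.
r = np.corrcoef(math_scores, reading_scores)[0, 1]
print(f"correlation = {r:.3f}")
```

With more than two score columns, the full correlation matrix is what gets passed to a heatmap function.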
My Social Presence