Introduction
Processing and manipulating the actual dataset from my own YouTube channel to produce practical insights that the YouTube dashboard does not provide.
Datasets / Channel Information
All data comes from my YouTube channel's live data. I have run the channel since April 2021 to share my daily life as a graduate student majoring in Data Analytics with my friends, family, and Korean subscribers.
Channel Name : [KOR] 공부하는섭이 / [ENG] Studying Seob
공부하는섭이 - Jay's Channel
Libraries used
- General: NumPy, Pandas, Matplotlib
- Google API: oauth2client, googleapiclient
YouTube Data API | Google Developers
- Korean Text Processing/Mining/Visualizing: KoNLPy, WordCloud
KoNLPy: Korean NLP in Python - KoNLPy 0.6.0 documentation
- Machine Learning: Scikit-learn
Purpose
- Extract the insights hidden in the data to improve the channel's reach and growth.
- Find the right topics, upload times, video durations, and activities by deriving several metrics from the data, focusing on features that cannot be checked on the YouTube Creator Dashboard:
  - Categorical comparison
  - Manual combination of different metrics
  - At-a-glance comparison of overall activity
- Cluster and visualize frequently mentioned words to get an intuitive sense of the mood among active channel viewers (including subscribers).
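The word-frequency idea in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the `comments` list is made-up sample text, `word_frequencies` is a hypothetical helper, and plain whitespace splitting stands in for the Korean morphological analysis that KoNLPy would presumably handle in the real workflow.

```python
from collections import Counter

# Made-up comments for illustration only; real input would be the
# comment text collected from the channel.
comments = [
    "great study vlog",
    "study tips please",
    "great video great tips",
]


def word_frequencies(texts):
    """Count how often each lowercase, whitespace-separated token appears.

    Whitespace splitting is an assumption made for this sketch; Korean text
    would normally need a morphological tokenizer instead.
    """
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


freqs = word_frequencies(comments)

# Show the three most frequent tokens with their counts.
print(freqs.most_common(3))
```

A frequency mapping like `freqs` is the kind of input a word cloud can be drawn from, for example with the `generate_from_frequencies` method of the Wordcloud library.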
Initial Questions I Asked Myself
- Is there a significant gap in activity (views/comments/likes) among different categories?