Text Classification

Implementing Text Classification Using Information Retrieval Methods

Natural Language Processing

Text Classification

  • Implement text classification using information retrieval methods.
  • Analyze, comprehend, and compare the techniques used.


  • Python


After successful completion of this experiment, students will be able to:

  1. Understand the concept of Information Retrieval and its application in text classification.


Text classification is a machine learning technique that assigns one or more predefined categories to free-form text. It is a fundamental task in natural language processing, with applications such as sentiment analysis, topic labeling, spam detection, and intent detection. Automatic text classifiers learn the mapping from text to category from labeled examples, so large volumes of text can be categorized faster and more consistently than by manual labeling.

Machine Learning Text Classification Algorithms

  • Naive Bayes
  • Support Vector Machines
  • Deep Learning (Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN))
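As a rough illustration of the first algorithm in the list, the sketch below trains a Naive Bayes classifier on TF-IDF features. The four training sentences and the `spam`/`ham` labels are invented for this example and are not part of the experiment's dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (invented for illustration only).
train_texts = [
    "win a free prize now",
    "free cash offer click now",
    "meeting moved to monday morning",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

# TF-IDF turns each text into a weighted term vector; Naive Bayes
# then learns per-class term weights from those vectors.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

predictions = model.predict(["free prize offer", "report for the monday meeting"])
print(list(predictions))
```

Swapping `MultinomialNB()` for `LinearSVC()` from `sklearn.svm` would give a Support Vector Machine on the same features.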

Task to be completed in PART B


  1. Select a dataset of your choice or use the News Category Dataset.
  2. Apply at least two different techniques/algorithms for classifying the text in the given dataset to develop a model.
  3. Predict the class of the test dataset using the developed models.
  4. Analyze, comprehend, and compare results using appropriate metrics.
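The four steps above could be organized roughly as follows. This is a minimal sketch under stated assumptions: `compare_models` is a hypothetical helper name, the two candidate models and the metrics (accuracy and macro-averaged F1) are one reasonable choice among many, and the small corpus is invented so the example runs on its own.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, test_size=0.25, seed=42):
    """Fit each candidate on the same split and return {name: metrics}."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    candidates = {
        "naive_bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
        "linear_svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
    }
    results = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "macro_f1": f1_score(y_test, y_pred, average="macro"),
        }
    return results


# Invented two-class corpus, only to make the sketch runnable.
sports = [
    "team wins the final match", "striker scores late goal",
    "coach praises the team defense", "match ends in a draw",
    "player signs with new team", "goal in extra time wins match",
    "team loses away match", "coach names squad for final",
]
business = [
    "stocks rally as markets open", "bank raises interest rates",
    "company reports quarterly profit", "markets fall on rate fears",
    "shares jump after profit report", "bank profit beats forecast",
    "company shares fall sharply", "interest rates hold steady",
]
texts = sports + business
labels = ["SPORTS"] * len(sports) + ["BUSINESS"] * len(business)

results = compare_models(texts, labels)
for name, metrics in results.items():
    print(name, metrics)
```

Using the same train/test split for every model keeps the comparison fair. Macro-averaged F1 is included because it weights all classes equally, which matters when class sizes differ widely.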

For further information and datasets, refer to:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
from nltk.tokenize import sent_tokenize
from wordcloud import WordCloud
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import make_scorer, roc_curve, roc_auc_score
from sklearn.metrics import precision_recall_fscore_support as score
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB


Basic exploratory data analysis (EDA) of the dataset, with visualizations:

df = pd.read_json("News_Category_Dataset_v3.json", lines=True)
(209527, 6)

   link                                               headline                                           category   short_description                                  authors               date
0  https://www.huffpost.com/entry/covid-boosters-...  Over 4 Million Americans Roll Up Sleeves For O...  U.S. NEWS  Health experts said it is too early to predict...  Carla K. Johnson, AP  2022-09-23
1  https://www.huffpost.com/entry/american-airlin...  American Airlines Flyer Charged, Banned For Li...  U.S. NEWS  He was subdued by passengers and crew when he ...  Mary Papenfuss        2022-09-23
2  https://www.huffpost.com/entry/funniest-tweets...  23 Of The Funniest Tweets About Cats And Dogs ...  COMEDY     "Until you have a dog you don't understand wha...  Elyse Wanshel         2022-09-23
3  https://www.huffpost.com/entry/funniest-parent...  The Funniest Tweets From Parents This Week (Se...  PARENTING  "Accidentally put grown-up toothpaste on my to...  Caroline Bologna      2022-09-23
4  https://www.huffpost.com/entry/amy-cooper-lose...  Woman Who Called Cops On Black Bird-Watcher Lo...  U.S. NEWS  Amy Cooper accused investment firm Franklin Te...  Nina Golgowski        2022-09-22

mean  2015-04-30 00:44:14.344308736
min   2012-01-28 00:00:00
25%   2013-08-10 00:00:00
50%   2015-03-16 00:00:00
75%   2016-11-01 00:00:00
max   2022-09-23 00:00:00
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 209527 entries, 0 to 209526
Data columns (total 6 columns):
 #   Column             Non-Null Count   Dtype         
---  ------             --------------   -----         
 0   link               209527 non-null  object        
 1   headline           209527 non-null  object        
 2   category           209527 non-null  object        
 3   short_description  209527 non-null  object        
 4   authors            209527 non-null  object        
 5   date               209527 non-null  datetime64[ns]
dtypes: datetime64[ns](1), object(5)
memory usage: 9.6+ MB
link                         object
headline                     object
category                     object
short_description            object
authors                      object
date                 datetime64[ns]
dtype: object
POLITICS          35602
WELLNESS          17945
TRAVEL             9900
STYLE & BEAUTY     9814
PARENTING          8791
QUEER VOICES       6347
FOOD & DRINK       6340
BUSINESS           5992
COMEDY             5400
SPORTS             5077
BLACK VOICES       4583
HOME & LIVING      4320
PARENTS            3955
WEDDINGS           3653
WOMEN              3572
CRIME              3562
IMPACT             3484
DIVORCE            3426
WORLD NEWS         3299
MEDIA              2944
WEIRD NEWS         2777
GREEN              2622
WORLDPOST          2579
RELIGION           2577
STYLE              2254
SCIENCE            2206
TECH               2104
TASTE              2096
MONEY              1756
ARTS               1509
ENVIRONMENT        1444
FIFTY              1401
GOOD NEWS          1398
U.S. NEWS          1377
ARTS & CULTURE     1339
COLLEGE            1144
CULTURE & ARTS     1074
EDUCATION          1014
Name: count, dtype: int64
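The outputs above look like the results of standard pandas inspection calls. The sketch below shows those calls on a small stand-in DataFrame. The three rows are invented for illustration, and the exact calls used in the original notebook are an assumption, since only their output survives.

```python
import pandas as pd

# Small stand-in frame reusing some of the dataset's column names
# (rows invented for illustration).
df = pd.DataFrame(
    {
        "headline": ["A", "B", "C"],
        "category": ["POLITICS", "TRAVEL", "POLITICS"],
        "date": pd.to_datetime(["2022-09-23", "2022-09-22", "2012-01-28"]),
    }
)

print(df.shape)                       # (rows, columns)
print(df.head())                      # first rows of the frame
print(df.describe())                  # summary statistics
df.info()                             # dtypes and non-null counts
print(df.dtypes)                      # dtype of each column
print(df["category"].value_counts())  # rows per category, descending
```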
# Convert the 'category' column to categorical
df['category'] = df['category'].astype('category')

# Adding a new column with category codes
df['category_code'] = df['category'].cat.codes
0         35
1         35
2          5
3         22
4         35
209522    32
209523    28
209524    28
209525    28
209526    28
Name: category_code, Length: 209527, dtype: int8
category = df[["category", "category_code"]].drop_duplicates().sort_values("category_code")

       category   category_code
28033  GOOD NEWS  14
0      U.S. NEWS  35
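The codes come from the sorted list of unique category names, which is why ARTS maps to 0 and U.S. NEWS to 35 in this dataset. A minimal sketch on an invented four-item series shows the mechanism. `code_to_name` is a hypothetical helper name for decoding predictions back to labels.

```python
import pandas as pd

labels = pd.Series(["U.S. NEWS", "COMEDY", "ARTS", "COMEDY"]).astype("category")

# Each code is the position of the value in the sorted category list.
codes = labels.cat.codes
code_to_name = dict(enumerate(labels.cat.categories))

print(list(codes))
print(code_to_name)
```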
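# Assumed definition (not shown in the original): category name -> integer code
category_dict = dict(zip(category["category"], category["category_code"]))
category_counts = [df[df['category'] == category].shape[0] for category in category_dict]
colors = ["skyblue"]
plt.bar(list(category_dict), category_counts, color=colors[0])
plt.title("Number of articles per category")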
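The list comprehension above filters the whole DataFrame once per category. As a hedged alternative, `value_counts()` produces the same numbers in a single pass. The sketch below checks that both approaches agree on a small invented frame. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"category": ["ARTS", "COMEDY", "ARTS", "CRIME", "ARTS"]})
names = ["ARTS", "COMEDY", "CRIME"]

# One boolean filter per category, as in the code above.
filtered_counts = [df[df["category"] == name].shape[0] for name in names]

# Single pass: count every category once, then look up each name.
counts = df["category"].value_counts()
single_pass_counts = [int(counts[name]) for name in names]

print(filtered_counts)
print(single_pass_counts)
```

On the full dataset the single-pass version avoids scanning roughly 200,000 rows once for every category.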


for category in category_dict:
   df_category = df[df['category'] == category]
   print(f"{category} = {df_category}")
ARTS =                                                      link   
44677   https://www.huffingtonpost.com/entry/an-alert-...  \
46526   https://www.huffingtonpost.com/entry/stage-doo...   
47136   https://www.huffingtonpost.com/entry/donna-que...   
47485   https://www.huffingtonpost.com/entry/top-5-siz...   
47655   https://www.huffingtonpost.com/entry/defending...   
...                                                   ...   
133555  https://www.huffingtonpost.com/entry/mark-inne...   
133574  https://www.huffingtonpost.com/entry/boys-in-t...   
133607  https://www.huffingtonpost.com/entry/first-nig...   
133631  https://www.huffingtonpost.com/entry/artists-s...   
133651  https://www.huffingtonpost.com/entry/aisle-vie...   

                                                 headline category   
44677   An Alert, Well-Hydrated Artist in No Acute Dis...     ARTS  \
46526   Stage Door: Ute Lemper's Songs From The Broken...     ARTS   
47136                           Donna Quesada: Art Review     ARTS   
47485   Top 5 Sizzling Hot Winter Music Festivals in F...     ARTS   
47655         Defending their lives in 'Ride the Cyclone'     ARTS   
...                                                   ...      ...   
133555                   Mark Innerst at DC Moore Gallery     ARTS   
133574                                  Boys in the Attic     ARTS   
133607  First Nighter: Moss Hart's "Act One" in Two Gr...     ARTS   
133631  Artists' Statements: Can't Live With Them, Can...     ARTS   
133651                    Aisle View: Kiss of the Vampire     ARTS   

                                        short_description   
44677                                                      \
...                                                   ...   
133555  I recently spoke to Mark Innerst to ask him a ...   
133574  Two recent Bay Area productions included chara...   
133631  Artists' statements are a gold mine for someon...   
133651  The comedic chameleon Arnie Burton first came ...   

                                                  authors       date   
44677   Catherine Armsden, ContributorAuthor, architec... 2017-01-28  \
46526    Fern Siegel, ContributorDeputy Editor, MediaPost 2017-01-08   
47136     Ira Israel, ContributorAuthor & Psychotherapist 2017-01-01   
47485   ZEALnyc, Contributorarts - culture - entertain... 2016-12-28   
47655   ZEALnyc, Contributorarts - culture - entertain... 2016-12-26   
...                                                   ...        ...   
133555  John Seed, ContributorProfessor of Art and Art... 2014-04-19   
133574  George Heymont, ContributorSan Francisco-based... 2014-04-19   
133607      David Finkle, ContributorWriter, Drama Critic 2014-04-18   
133631  Jane Chafin, ContributorDirector, Offramp Gall... 2014-04-18   
133651             Steven Suskin, ContributorDrama critic 2014-04-18   

        category_code  
44677               0  
46526               0  
47136               0  
47485               0  
47655               0  
...               ...  
133555              0  
133574              0  
133607              0  
133631              0  
133651              0  

[1509 rows x 7 columns]
ARTS & CULTURE =                                                     link   
15389  https://www.huffingtonpost.com/entry/modeling-...  \
16100  https://www.huffingtonpost.com/entry/actor-jef...   
16372  https://www.huffingtonpost.com/entry/new-yorke...   
16418  https://www.huffingtonpost.com/entry/jk-rowlin...   
16602  https://www.huffingtonpost.com/entry/man-girlf...   
...                                                  ...   
94910  https://www.huffingtonpost.com/entry/street-ar...   
94940  https://www.huffingtonpost.com/entry/breaking-...   
94960  https://www.huffingtonpost.com/entry/your-favo...   
94982  https://www.huffingtonpost.com/entry/dance-ban...   
95466  https://www.huffingtonpost.com/entry/scientolo...   

                                                headline        category   
15389  Modeling Agencies Enabled Sexual Predators For...  ARTS & CULTURE  \
16100  Actor Jeff Hiller Talks “Bright Colors And Bol...  ARTS & CULTURE   
16372  New Yorker Cover Puts Trump 'In The Hole' Afte...  ARTS & CULTURE   
16418  J. K. Rowling Trolls Trump For Canceled UK Vis...  ARTS & CULTURE   
16602  Man Surprises Girlfriend By Drawing Them In Di...  ARTS & CULTURE   
...                                                  ...             ...   
94910  Street Art Murals From All 196 Countries In Th...  ARTS & CULTURE   
94940  What Breaking Up On ‘The Bachelorette’ Reveals...  ARTS & CULTURE   
94960  Watch An Entire Disney Movie In The Blink Of A...  ARTS & CULTURE   
94982          Six Countries Where It's Illegal To Dance  ARTS & CULTURE   
95466  Scientology Leader David Miscavige's Father To...  ARTS & CULTURE   

                                       short_description   
15389  In the 1980s and '90s, Carolyn Kramer said she...  \
16100  This week I talked with actor Jeff Hiller abou...   
16372  The president reportedly referred to groups of...   
16418                                 Not a scaredy-cat.   
16602                    What a colorful Christmas gift.   
...                                                  ...   
94910  8. Liberia Artist: Nanook Title: N/A Location:...   
94940  The way we raise boys makes having relationshi...   
94960  Stunning film visualizations compress feature-...   
94982  Let's just say Baby could do worse than the co...   
95466  There are dysfunctional families, and then the...   

                                                 authors       date   
15389                                    Angelina Chapin 2018-01-29  \
16100  Charlotte Robinson, ContributorEmmy Award Winn... 2018-01-17   
16372                                       Sara Boboltz 2018-01-12   
16418                                          Lee Moran 2018-01-12   
16602                                      Elyse Wanshel 2018-01-10   
...                                                  ...        ...   
94910                             Keith Estiler, Pixable 2015-07-08   
94940                                      Claire Fallon 2015-07-07   
94960                                        Maddie Crum 2015-07-07   
94982                                        Mallika Rao 2015-07-07   
95466                                   Stephanie Marcus 2015-07-01   

       category_code  
15389              1  
16100              1  
16372              1  
16418              1  
16602              1  
...              ...  
94910              1  
94940              1  
94960              1  
94982              1  
95466              1  

[1339 rows x 7 columns]
BLACK VOICES =                                                      link   
455     https://www.huffpost.com/entry/mariah-carey-la...  \
456     https://www.huffpost.com/entry/diddy-bet-award...   
461     https://www.huffpost.com/entry/bet-awards-2022...   
1259    https://www.huffpost.com/entry/herbie-husker-o...   
1525    https://www.huffpost.com/entry/georgia-guarant...   
...                                                   ...   
209422  https://www.huffingtonpost.com/entry/uncf-hono...   
209423  https://www.huffingtonpost.com/entry/oprah-win...   
209424  https://www.huffingtonpost.com/entry/max-hardy...   
209425  https://www.huffingtonpost.com/entry/light-of-...   
209426  https://www.huffingtonpost.com/entry/martin-lu...   

                                                 headline      category   
455     Mariah Carey Brings Big, Big Energy To Latto's...  BLACK VOICES  \
456     Diddy Honored With Lifetime Achievement, Star-...  BLACK VOICES   
461     BET Awards 2022 Red Carpet: See The Best Looks...  BLACK VOICES   
1259    University Of Nebraska Changes Mascot's Hand S...  BLACK VOICES   
1525    Hundreds Of Black Women In Georgia Will Get $8...  BLACK VOICES   
...                                                   ...           ...   
209422  UNCF Honors Students With "An Evening With The...  BLACK VOICES   
209423           Oprah Winfrey's Style Evolution (PHOTOS)  BLACK VOICES   
209424  Max Hardy, Amare Stoudemire's Personal Chef, A...  BLACK VOICES   
209425                                Darkness Ain't Cool  BLACK VOICES   
209426         Would Martin Luther Vote for Barack Obama?  BLACK VOICES   

                                        short_description   
455     The "Queen of Christmas" joined the "Queen of ...  \
456     The music mogul was presented the highest hono...   
461     Drip or drown was the motto on the red carpet,...   
1259                     Herbie Husker cleans up his act.   
1525    The guaranteed income pilot is set to be one o...   
...                                                   ...   
209422  Still holding true to their motto, "A Mind is ...   
209423  Regal is the best way to describe Oprah Winfre...   
209424  Perhaps Michael and Magic had their personal c...   
209425  If being in the dark sucks so much, why would ...   
209426  If Luther was living today and had the opportu...   

                                                  authors       date   
455                                   Ruth Etiesit Samuel 2022-06-27  \
456                                          Taryn Finley 2022-06-27   
461                                   Ruth Etiesit Samuel 2022-06-26   
1259                                       Mary Papenfuss 2022-01-30   
1525                                  Sarah Ruiz-Grossman 2021-12-08   
...                                                   ...        ...   
209422                                                    2012-01-29   
209423                                       Julee Wilson 2012-01-29   
209424                       Jessica Cumberbatch Anderson 2012-01-29   
209425         Alisha L. Gordon, Contributor\nContributor 2012-01-29   
209426  William E. Flippin, Jr., Contributor\nAdvocate... 2012-01-29   

        category_code  
455                 2  
456                 2  
461                 2  
1259                2  
1525                2  
...               ...  
209422              2  
209423              2  
209424              2  
209425              2  
209426              2  

[4583 rows x 7 columns]
BUSINESS =                                                      link   
162     https://www.huffpost.com/entry/rei-workers-ber...  \
353     https://www.huffpost.com/entry/twitter-elon-mu...   
632     https://www.huffpost.com/entry/starbucks-leave...   
690     https://www.huffpost.com/entry/coinbase-crypto...   
727     https://www.huffpost.com/entry/us-april-jobs-r...   
...                                                   ...   
209507  https://www.huffingtonpost.com/entry/four-more...   
209508  https://www.huffingtonpost.com/entry/bank-fees...   
209509  https://www.huffingtonpost.comhttp://jobs.aol....   
209510  https://www.huffingtonpost.com/entry/world-eco...   
209511  https://www.huffingtonpost.com/entry/positive-...   

                                                 headline  category   
162     REI Workers At Berkeley Store Vote To Unionize...  BUSINESS  \
353     Twitter Lawyer Calls Elon Musk 'Committed Enem...  BUSINESS   
632     Starbucks Leaving Russian Market, Shutting 130...  BUSINESS   
690     Crypto Crash Leaves Trading Platform Coinbase ...  BUSINESS   
727     US Added 428,000 Jobs In April Despite Surging...  BUSINESS   
...                                                   ...       ...   
209507  Four More Bank Closures Mark the Week of Janua...  BUSINESS   
209508  Everything You Need To Know About Overdraft Fe...  BUSINESS   
209509            Walmart Waving Goodbye To Some Greeters  BUSINESS   
209510  At World Economic Forum, Fear of Global Contag...  BUSINESS   
209511  Positive Customer Experience: What's the Retur...  BUSINESS   

                                        short_description   
162     They follow in the footsteps of REI workers in...  \
353     Delaware Chancery Judge Kathaleen McCormick de...   
632     Starbucks' move follows McDonald's exit from t...   
690     Cryptocurrency trading platform Coinbase has l...   
727     At 3.6%, unemployment nearly reached the lowes...   
...                                                   ...   
209507  The general pattern of the FDIC closing banks ...   
209508  Don't like keeping all of your money stuffed u...   
209509  After 30 years, "People Greeters" will no long...   
209510  For decades, as crises have assailed developin...   
209511  "Analysts at Adobe combined historical purchas...   

                                                  authors       date   
162                                         Dave Jamieson 2022-08-25  \
353                                        Marita Vlachou 2022-07-20   
632                                    DEE-ANN DURBIN, AP 2022-05-23   
690                                          Matt Ott, AP 2022-05-12   
727                                      Paul Wiseman, AP 2022-05-06   
...                                                   ...        ...   
209507  Dennis Santiago, Contributor\nGlobal Risk and ... 2012-01-28   
209508                                     Harry Bradford 2012-01-28   
209509                                                    2012-01-28   
209510  Peter S. Goodman, Contributor\nExecutive Busin... 2012-01-28   
209511                Ernan Roman, Contributor\nPresident 2012-01-28   

        category_code  
162                 3  
353                 3  
632                 3  
690                 3  
727                 3  
...               ...  
209507              3  
209508              3  
209509              3  
209510              3  
209511              3  

[5992 rows x 7 columns]
COLLEGE =                                                      link   
14783   https://www.huffingtonpost.com/entry/cornell-f...  \
17323   https://www.huffingtonpost.com/entry/norman-pa...   
18529   https://www.huffingtonpost.com/entry/norman-pa...   
21887   https://www.huffingtonpost.com/entry/when-ice-...   
22375   https://www.huffingtonpost.com/entry/as-colleg...   
...                                                   ...   
133458  https://www.huffingtonpost.com/entry/college-f...   
133460  https://www.huffingtonpost.com/entry/what-to-d...   
133517  https://www.huffingtonpost.com/entry/bowdoins-...   
133586  https://www.huffingtonpost.com/entry/rejected-...   
133639  https://www.huffingtonpost.com/entry/research-...   

                                                 headline category   
14783   Cornell Frat's 'Pig Roast' Gave Points For Sex...  COLLEGE  \
17323   Norm Pattiz, Accused Of Sexual Harassment, To ...  COLLEGE   
18529   Radio Mogul Under Pressure To Resign From Powe...  COLLEGE   
21887            When ICE Comes Calling In The Ivy League  COLLEGE   
22375   As College Costs Rise, Congress Must Save The ...  COLLEGE   
...                                                   ...      ...   
133458  How Much College Football Players Should Be Ma...  COLLEGE   
133460  How To Figure Out What You Really Want To Do I...  COLLEGE   
133517                             Bowdoin's Double Bogey  COLLEGE   
133586  Why Being Rejected By Your Dream School Isn't ...  COLLEGE   
133639  'The Only Way I Can Do This Research Project I...  COLLEGE   

                                        short_description   
14783           Zeta Beta Tau was slapped with probation.  \
17323   The media executive submitted his resignation ...   
18529   Several women have accused Norman Pattiz of se...   
21887                                     We must resist.   
22375   In today’s highly competitive global economy, ...   
...                                                   ...   
133517  This is American higher education today: an an...   

                                                  authors       date   
14783                                          Ron Dicker 2018-02-07  \
17323                                      Carla Herreria 2017-12-29   
18529                                         Matt Ferner 2017-12-11   
21887   Center for Community Change Action, Contributo... 2017-10-26   
22375   Rep. Terri Sewell, ContributorRep. Terri A. Se... 2017-10-20   
...                                                   ...        ...   
133458                                                    2014-04-20   
133460                                                    2014-04-20   
133517  Peter W. Wood, ContributorPresident of the Nat... 2014-04-19   
133586                                       Jessica Kane 2014-04-18   
133639                                   Allison Bresnick 2014-04-18   

        category_code  
14783               4  
17323               4  
18529               4  
21887               4  
22375               4  
...               ...  
133458              4  
133460              4  
133517              4  
133586              4  
133639              4  

[1144 rows x 7 columns]
COMEDY =                                                      link   
2       https://www.huffpost.com/entry/funniest-tweets...  \
344     https://www.huffpost.com/entry/funniest-tweets...   
466     https://www.huffpost.com/entry/funniest-tweets...   
510     https://www.huffpost.com/entry/seth-meyers-rud...   
536     https://www.huffpost.com/entry/funniest-tweets...   
...                                                   ...   
209484  https://www.huffingtonpost.com/entry/tim-erics...   
209485  https://www.huffingtonpost.com/entry/the-best-...   
209486  https://www.huffingtonpost.com/entry/daily-sho...   
209487  https://www.huffingtonpost.com/entry/mitt-romn...   
209488  https://www.huffingtonpost.com/entry/7-amazing...   

                                                 headline category   
2       23 Of The Funniest Tweets About Cats And Dogs ...   COMEDY  \
344     23 Of The Funniest Tweets About Cats And Dogs ...   COMEDY   
466     20 Of The Funniest Tweets About Cats And Dogs ...   COMEDY   
510     Seth Meyers Has A Field Day With Rudy Giuliani...   COMEDY   
536     25 Of The Funniest Tweets About Cats And Dogs ...   COMEDY   
...                                                   ...      ...   
209484  Tim & Eric's 'Billion Dollar Movie Pledge' Sig...   COMEDY   
209485  The Best Late Night Clips of the Week (VIDEO/P...   COMEDY   
209486  Daily Show Correspondent Clip Of The Week: Al ...   COMEDY   
209487       Mitt Romney Madness: Florida Edition (VIDEO)   COMEDY   
209488                 7 Amazing Name Generators (PHOTOS)   COMEDY   

                                        short_description   
2       "Until you have a dog you don't understand wha...  \
344     “you ever bring ur pet up to a mirror and ur l...   
466     "Petition to stop ringing the doorbell on TV s...   
510     “Sorry, buddy. You just gave yourself away. No...   
536     "i keep hearing this ad for fresh cat food tha...   
...                                                   ...   
209484  The pledge also asks its viewers to not see up...   
209485  President Obama finally broke through the onsl...   
209486  If you're like us, by the time Monday rolls ar...   
209487  The apparent madness that gripped Mitt Romney ...   
209488  Let's be honest: most of our names are pretty ...   

                                                  authors       date   
2                                           Elyse Wanshel 2022-09-23  \
344                                         Elyse Wanshel 2022-07-22   
466                                         Elyse Wanshel 2022-06-25   
510                                      Josephine Harvey 2022-06-16   
536                                         Elyse Wanshel 2022-06-10   
...                                                   ...        ...   
209484                                                    2012-01-28   
209485  Matt Wilstein, Contributor\nEditor, Gotcha Med... 2012-01-28   
209486                                                    2012-01-28   
209487                                           Ben Craw 2012-01-28   
209488                                         Seena Vali 2012-01-28   

        category_code  
2                   5  
344                 5  
466                 5  
510                 5  
536                 5  
...               ...  
209484              5  
209485              5  
209486              5  
209487              5  
209488              5  

[5400 rows x 7 columns]
CRIME =                                                      link   
107     https://www.huffpost.com/entry/ap-us-jogger-ab...  \
202     https://www.huffpost.com/entry/trump-org-cfo-t...   
241     https://www.huffpost.com/entry/united-states-m...   
258     https://www.huffpost.com/entry/albuquerque-vol...   
269     https://www.huffpost.com/entry/albuquerque-new...   
...                                                   ...   
207483  https://www.huffingtonpost.com/entry/elizabeth...   
207545  https://www.huffingtonpost.comhttp://oldnorthe...   
208133  https://www.huffingtonpost.com/entry/convict-e...   
208134  https://www.huffingtonpost.com/entry/new-york-...   
208213  https://www.huffingtonpost.com/entry/karen-swi...   

                                                 headline category   
107     Memphis Police: Arrest Made In Jogger's Disapp...    CRIME  \
202     Trump Org. CFO To Plead Guilty, Testify Agains...    CRIME   
241     Officials: NH Missing Girl Case Shifts To Homi...    CRIME   
258     Albuquerque Police Share Photo Of Car Eyed In ...    CRIME   
269     Albuquerque Police Tell Muslim Community To Be...    CRIME   
...                                                   ...      ...   
207483  Elizabeth Smart, Former Kidnapping Victim, Mar...    CRIME   
207545  Hannah Kelly, Pastor's Daughter, Dies After Ac...    CRIME   
208133  Tim Cole, Convict Exonerated After Death, Gets...    CRIME   
208134  Even When the Subject Is Gun Control, Our Gove...    CRIME   
208213  Karen Swift's Funeral Planned For Saturday As ...    CRIME   

                                        short_description   
107     Police in Tennessee say an arrest has been mad...  \
202     Allen Weisselberg is charged with taking more ...   
241     Authorities say the search for a New Hampshire...   
258     Authorities have said that all four of the kil...   
269     Police are searching for the shooter, or shoot...   
...                                                   ...   
207483  ABC News announced in July it had hired Smart ...   
207545  20-year-old Hannah Kelley died Saturday mornin...   
208133  The legislature also created the Timothy Cole ...   
208134  I'm an advocate of gun control, and a knee-jer...   
208213  Police have not yet released a cause of death ...   

                                                  authors       date   
107                                                       2022-09-04  \
202                                  Michael R. Sisak, AP 2022-08-18   
241                                       Holly Ramer, AP 2022-08-11   
258                                        Nina Golgowski 2022-08-08   
269                                          Sara Boboltz 2022-08-06   
...                                                   ...        ...   
207483                                   Reuters, Reuters 2012-02-19   
207545                                                    2012-02-18   
208133                                   Reuters, Reuters 2012-02-12   
208134  Steven Strauss , Contributor\nJohn L. Weinberg... 2012-02-12   
208213                                         David Lohr 2012-02-11   

        category_code  
107                 6  
202                 6  
241                 6  
258                 6  
269                 6  
...               ...  
207483              6  
207545              6  
208133              6  
208134              6  
208213              6  

[3562 rows x 7 columns]
CULTURE & ARTS =                                                      link   
8       https://www.huffpost.com/entry/mija-documentar...  \
16      https://www.huffpost.com/entry/hulu-reboot-sho...   
45      https://www.huffpost.com/entry/alex-aster-ligh...   
65      https://www.huffpost.com/entry/ani-liu-art-sci...   
66      https://www.huffpost.com/entry/sidney-review-t...   
...                                                   ...   
208890  https://www.huffingtonpost.com/entry/the-art-o...   
208891  https://www.huffingtonpost.com/entry/wonder-wo...   
209514  https://www.huffingtonpost.com/entry/dont-thin...   
209515  https://www.huffingtonpost.com/entry/matthew-m...   
209516  https://www.huffingtonpost.com/entry/allard-va...   

                                                 headline        category   
8       How A New Documentary Captures The Complexity ...  CULTURE & ARTS  \
16      'Reboot' Is A Clever And Not Too Navel-Gazey L...  CULTURE & ARTS   
45      Meet Alex Aster, The TikToker Changing The Pub...  CULTURE & ARTS   
65      How Ani Liu Is Brilliantly Disguising Her Art ...  CULTURE & ARTS   
66      'Sidney' Tackles The Not-So-Comfortable Conver...  CULTURE & ARTS   
...                                                   ...             ...   
208890  'The Art Of Not Making' Explores The Intention...  CULTURE & ARTS   
208891  Fictional And Real Life Women Kick Butt In The...  CULTURE & ARTS   
209514  'Don't Think': A Look At The Chemical Brothers...  CULTURE & ARTS   
209515         Matthew Marks Discusses His New LA Gallery  CULTURE & ARTS   
209516  Allard Van Hoorn's 'Urban Songline' Explores R...  CULTURE & ARTS   

                                        short_description   
8       In "Mija," director Isabel Castro combined mus...  \
16      Starring Keegan-Michael Key, Judy Greer and Jo...   
45      The Colombian-American author's new book "Ligh...   
65      The research-based artist has found dynamic wa...   
66      It’s not about sensationalizing or even tarnis...   
...                                                   ...   
208890  Check out a slideshow of some of the 115 artis...   
208891  What inspired you to make the film? I'm curiou...   
209514  Amid cheers and the occasional "Here we go!" f...   
209515  Was it an obvious choice to recruit Ellsworth ...   
209516  A recent exhibition at Storefront for Art and ...   

                                  authors       date  category_code  
8                             Marina Fang 2022-09-22              7  
16      Marina Fang and Candice Frederick 2022-09-20              7  
45                    Marilyn La Jeunesse 2022-09-15              7  
65                           Xintian Wang 2022-09-12              7  
66                      Candice Frederick 2022-09-11              7  
...                                   ...        ...            ...  
208890                                    2012-02-04              7  
208891                                    2012-02-04              7  
209514                      Kia Makarechi 2012-01-28              7  
209515                                    2012-01-28              7  
209516                                    2012-01-28              7  

[1074 rows x 7 columns]
DIVORCE =                                                      link   
133683  https://www.huffingtonpost.comhttp://www.thegl...  \
133686  https://www.huffingtonpost.comhttp://www.thegl...   
133696  https://www.huffingtonpost.com/entry/blake-she...   
133702  https://www.huffingtonpost.com/entry/life-afte...   
133717  https://www.huffingtonpost.com/entry/how-to-re...   
...                                                   ...   
209337  https://www.huffingtonpost.comhttp://unioncity...   
209347  https://www.huffingtonpost.comhttp://washingto...   
209349  https://www.huffingtonpost.comhttp://www.miami...   
209355  https://www.huffingtonpost.com/entry/finding-l...   
209379  https://www.huffingtonpost.com/entry/five-unex...   

                                                 headline category   
133683  50 Empowering Songs To Help Get You Through A ...  DIVORCE  \
133686  I'm Sleeping With A Cheater, Only Complicating...  DIVORCE   
133696  Blake And Miranda Respond To Divorce Rumors In...  DIVORCE   
133702  What 'Grey's Anatomy' Taught Me About Moving O...  DIVORCE   
133717  How to Revitalize Love and Passion With Your P...  DIVORCE   
...                                                   ...      ...   
209337      Local Mom Gives A "Hand Up" To Single Mothers  DIVORCE   
209347  Fathers Challenge Jail Sentences For Child Sup...  DIVORCE   
209349            Tips To Help Your Dog Deal With Divorce  DIVORCE   
209355  Finding Love Again: Advice for the Divorced Woman  DIVORCE   
209379     Five Unexpected Behaviors That Sink a Marriage  DIVORCE   

                                        short_description   
133683  Breakups are a very special and pointed type o...  \
133686  I’ve been a sex worker for six years–my entire...   
133696  Worried about the state of Blake Shelton and M...   
133702  If there's ever a time you need a little distr...   
133717  Let's face it, when we fall in love and commit...   
...                                                   ...   
209337  Tricia Ward, 40, was a successful real estate ...   
209347  After Lance Hendrix returned from military ser...   
209349  Breaking up is hard to do, and when the family...   
209355  The legal freedom that comes from a divorce de...   
209379  In order to fully recover in a healthy way fro...   

                                                  authors       date   
133683                                                    2014-04-17  \
133686                                                    2014-04-17   
133696                                      Brittany Wong 2014-04-17   
133702                                                    2014-04-17   
133717  Terry Gaspard, Contributor\nLicensed Clinical ... 2014-04-17   
...                                                   ...        ...   
209337                                                    2012-01-30   
209347                                                    2012-01-30   
209349                                                    2012-01-30   
209355  Dr. Janet Page, Contributor\nPsychotherapist, ... 2012-01-30   
209379  Rachel A. Sussman, LCSW, Contributor\nAuthor, ... 2012-01-30   

        category_code  
133683              8  
133686              8  
133696              8  
133702              8  
133717              8  
...               ...  
209337              8  
209347              8  
209349              8  
209355              8  
209379              8  

[3426 rows x 7 columns]
EDUCATION =                                                      link   
94      https://www.huffpost.com/entry/ap-us-los-angel...  \
183     https://www.huffpost.com/entry/united-states-m...   
764     https://www.huffpost.com/entry/parents-schools...   
1890    https://www.huffpost.com/entry/nyc-teachers-st...   
1941    https://www.huffpost.com/entry/federal-judge-s...   
...                                                   ...   
133284  https://www.huffingtonpost.com/entry/the-globa...   
133334  https://www.huffingtonpost.com/entry/what-to-d...   
133337  https://www.huffingtonpost.com/entry/californi...   
133496  https://www.huffingtonpost.com/entry/common-co...   
133638  https://www.huffingtonpost.com/entry/my-son-on...   

                                                 headline   category   
94      Cyberattack Prompts Los Angeles School Distric...  EDUCATION  \
183     Minneapolis Teacher Contract Race Language Ign...  EDUCATION   
764     Despite GOP Attacks, Parents Are Pretty Happy ...  EDUCATION   
1890    COVID Vaccine Mandate Takes Effect For NYC Tea...  EDUCATION   
1941    Federal Judge Suspends New York City's Vaccine...  EDUCATION   
...                                                   ...        ...   
133284  The Global Search for Education:  The School o...  EDUCATION   
133334                        Brainstorming Middle School  EDUCATION   
133337  Staunch Majority of California Voters Support ...  EDUCATION   
133496  Why Doesn't the New York Times Understand the ...  EDUCATION   
133638         My Son Only Read One Book in Middle School  EDUCATION   

                                        short_description   
94      Such attacks have become a growing threat to U...  \
183     When Minneapolis teachers settled a 14-day str...   
764     Increasing Republican attacks on the nation's ...   
1890    Unvaccinated employees will be placed on unpai...   
1941    But the district is confident that it will pre...   
...                                                   ...   
133334  If we want our children to help us preserve, s...   
133337  Through these proposals, we can provide every ...   
133496  How can the nation's "newspaper of record" be ...   
133638  Great reading can be done in middle school if ...   

                                                  authors       date   
94        Stefanie Dazio, Frank Bajak and Zeke Miller, AP 2022-09-07  \
183                                   Steve Karnowski, AP 2022-08-21   
764                                          Sara Boboltz 2022-04-30   
1890                                                      2021-10-04   
1941                                     Michael Hill, AP 2021-09-25   
...                                                   ...        ...   
133284  C. M. Rubin, ContributorBlogger and author, 'T... 2014-04-22   
133334  Allison Gaines Pell, ContributorHead of School... 2014-04-22   
133337      Deborah Kong, ContributorDirector, Early Edge 2014-04-22   
133496  Diane Ravitch, ContributorResearch Professor o... 2014-04-20   
133638                     Franchesca Warren, Contributor 2014-04-18   

        category_code  
94                  9  
183                 9  
764                 9  
1890                9  
1941                9  
...               ...  
133284              9  
133334              9  
133337              9  
133496              9  
133638              9  

[1014 rows x 7 columns]
ENTERTAINMENT =                                                      link   
20      https://www.huffpost.com/entry/golden-globes-r...  \
28      https://www.huffpost.com/entry/james-cameron-f...   
39      https://www.huffpost.com/entry/blade-runner-20...   
43      https://www.huffpost.com/entry/the-phantom-of-...   
47      https://www.huffpost.com/entry/viola-davis-wom...   
...                                                   ...   
209449  https://www.huffingtonpost.comhttp://www.tmz.c...   
209450  https://www.huffingtonpost.comhttp://insidetv....   
209451  https://www.huffingtonpost.comhttp://www.tmz.c...   
209512  https://www.huffingtonpost.com/entry/sundance-...   
209513  https://www.huffingtonpost.com/entry/girl-with...   

                                                 headline       category   
20      Golden Globes Returning To NBC In January Afte...  ENTERTAINMENT  \
28      James Cameron Says He 'Clashed' With Studio Be...  ENTERTAINMENT   
39      Amazon Greenlights 'Blade Runner 2099' Limited...  ENTERTAINMENT   
43      'The Phantom Of The Opera' To Close On Broadwa...  ENTERTAINMENT   
47      Viola Davis Feared A Heart Attack During 'The ...  ENTERTAINMENT   
...                                                   ...            ...   
209449    Bow Wow Has Tax Liens From 2006, 2008, And 2010  ENTERTAINMENT   
209450  World Preview Of Madonna's 'Give Me All Your L...  ENTERTAINMENT   
209451  'Terminator 3' Star Nick Stahl Arrested For No...  ENTERTAINMENT   
209512  Sundance, Ice-T, and Shades of the American Ra...  ENTERTAINMENT   
209513  'Girl With the Dragon Tattoo' India Release Ca...  ENTERTAINMENT   

                                        short_description   
20      For the past 18 months, Hollywood has effectiv...  \
28      The "Avatar" director said aspects of his 2009...   
39      The director of the original 1982 film joins a...   
43      “The Phantom of the Opera” — Broadway’s longes...   
47      The Oscar winner said she worked out for five ...   
...                                                   ...   
209449  Bow Wow needs to hire himself a new accountant...   
209450  Fox and American Idol snagged the exclusive wo...   
209451  Nick Stahl found himself a little short on cas...   
209512  Representation of the collective diaspora has ...   
209513  "Sony Pictures will not be releasing The Girl ...   

                                                  authors       date   
20                                                        2022-09-20  \
28                                           Ben Blanchet 2022-09-18   
39                                      Marco Margaritoff 2022-09-16   
43                                       Mark Kennedy, AP 2022-09-16   
47                                      Marco Margaritoff 2022-09-15   
...                                                   ...        ...   
209449                                                    2012-01-29   
209450                                                    2012-01-29   
209451                                                    2012-01-29   
209512  Courtney Garcia, Contributor\nI tell stories a... 2012-01-28   
209513                                                    2012-01-28   

        category_code  
20                 10  
28                 10  
39                 10  
43                 10  
47                 10  
...               ...  
209449             10  
209450             10  
209451             10  
209512             10  
209513             10  

[17362 rows x 7 columns]
ENVIRONMENT =                                                      link   
32      https://www.huffpost.com/entry/oil-gas-coal-re...  \
34      https://www.huffpost.com/entry/bc-us-alaska-co...   
35      https://www.huffpost.com/entry/tropical-storm-...   
37      https://www.huffpost.com/entry/jackson-water-c...   
76      https://www.huffpost.com/entry/bc-us-californi...   
...                                                   ...   
209502  https://www.huffingtonpost.com/entry/boxer-pup...   
209503  https://www.huffingtonpost.com/entry/black-smo...   
209504  https://www.huffingtonpost.com/entry/green-peo...   
209505  https://www.huffingtonpost.com/entry/winter-we...   
209506  https://www.huffingtonpost.com/entry/insects-t...   

                                                 headline     category   
32      First Public Global Database Of Fossil Fuels L...  ENVIRONMENT  \
34      Alaska Prepares For 'Historic-Level' Storm Bar...  ENVIRONMENT   
35      Puerto Rico Braces For Landslides And Severe F...  ENVIRONMENT   
37      Privatization Isn’t The Answer To Jackson’s Wa...  ENVIRONMENT   
76      Severe Winds Batter Southern California As Hea...  ENVIRONMENT   
...                                                   ...          ...   
209502  Boxer Puppy And Cows Make Friends During Walk ...  ENVIRONMENT   
209503  'Black Smoker' Vents: New Species Discovered N...  ENVIRONMENT   
209504                      Green Activists: 50 And Older  ENVIRONMENT   
209505  Winter Weather Photo Contest: Submit Your Own ...  ENVIRONMENT   
209506          Insects Top Newly Discovered Species List  ENVIRONMENT   

                                        short_description   
32      On Monday, the world’s first public database o...  \
34      “In 10 years, people will be referring to the ...   
35      Puerto Rico was under a hurricane watch Saturd...   
37      Studies have repeatedly shown that ending publ...   
76      After a 10-day heat wave that nearly overwhelm...   
...                                                   ...   
209502  This bevy of otters were also filmed having a ...   
209503  Photos and captions courtesy of University of ...   
209504  If you look at some of today's most prominent ...   
209505  While severe winter weather has devastated som...   
209506  Species IDs need improvement In addition to th...   

                                                 authors       date   
32                                      Drew Costley, AP 2022-09-18  \
35                                       DÁNICA COTO, AP 2022-09-17   
37                                     Nathalie Baptiste 2022-09-17   
76                     JULIE WATSON and JOHN ANTCZAK, AP 2022-09-10   
...                                                  ...        ...   
209502                                                   2012-01-28   
209503                                                   2012-01-28   
209504                                                   2012-01-28   
209505                                                   2012-01-28   
209506                                                   2012-01-28   

        category_code  
32                 11  
34                 11  
35                 11  
37                 11  
76                 11  
...               ...  
209502             11  
209503             11  
209504             11  
209505             11  
209506             11  

[1444 rows x 7 columns]
FIFTY =                                                      link   
43952   https://www.huffingtonpost.com/entry/love-face...  \
47074   https://www.huffingtonpost.com/entry/boomers-w...   
47660   https://www.huffingtonpost.com/entry/be-gratef...   
47711   https://www.huffingtonpost.com/entry/a-no-bull...   
48303   https://www.huffingtonpost.com/entry/vocabular...   
...                                                   ...   
133470  https://www.huffingtonpost.com/entry/middleage...   
133509  https://www.huffingtonpost.com/entry/aging-gra...   
133510  https://www.huffingtonpost.com/entry/inheritan...   
133522  https://www.huffingtonpost.com/entry/eight-fac...   
133590  https://www.huffingtonpost.com/entry/spring-fa...   

                                                 headline category   
43952                       Love, Facebook and Infidelity    FIFTY  \
47074   Boomers Were Time's "Man of the Year" Fifty Ye...    FIFTY   
47660   Be Grateful At The Holidays For Sprinkles Of H...    FIFTY   
47711                        A No Bullsh-t Holiday Letter    FIFTY   
48303               How Our Vocabulary Gives Away Our Age    FIFTY   
...                                                   ...      ...   
133470             Middle-aged and Invisible at Coachella    FIFTY   
133509     How A Dinner Party Changed My Outlook On Aging    FIFTY   
133510  What Kind Of Inheritance Do You Really Owe You...    FIFTY   
133522  Eight Factors To Consider When Choosing Your O...    FIFTY   
133590         4 Stunning Spring Dresses For Boomer Women    FIFTY   

                                        short_description   
43952                                                      \
48303   We may look much younger than we really are. W...   
...                                                   ...   
133470  I accept that Coachella has become one of thos...   
133509  We were invited to a small dinner party -- a b...   
133522  If you're thinking about moving overseas and y...   
133590  Entering my closet to look for evening wear is...   

                                                  authors       date   
43952   Roz Warren, ContributorAuthor of OUR BODIES, O... 2017-02-05  \
47074   Candy Leonard, ContributorSociologist, author ... 2017-01-02   
47660     Honey Good, ContributorFounder of HoneyGood.com 2016-12-26   
47711   Iris Ruth Pastor, ContributorSlice-of-life col... 2016-12-25   
48303   Delfín Carbonell, ContributorPh.D. in Philolog... 2016-12-18   
...                                                   ...        ...   
133470  Julie Bergman Sender, ContributorDirector and ... 2014-04-20   
133509  Cathy Chester, ContributorAward-winning blogge... 2014-04-19   
133510                                                    2014-04-19   
133522  Suzan Haskins and Dan Prescher, ContributorInt... 2014-04-19   
133590  Felice Shapiro, ContributorFounder/Publisher w... 2014-04-18   

        category_code  
43952              12  
47074              12  
47660              12  
47711              12  
48303              12  
...               ...  
133470             12  
133509             12  
133510             12  
133522             12  
133590             12  

[1401 rows x 7 columns]
FOOD & DRINK =                                                      link   
280     https://www.huffpost.com/entry/stacey-truman-c...  \
294     https://www.huffpost.com/entry/dan-giusti-voic...   
395     https://www.huffpost.com/entry/orange-wine_l_6...   
411     https://www.huffpost.com/entry/how-to-make-a-d...   
415     https://www.huffpost.com/entry/is-it-safe-to-s...   
...                                                   ...   
209290  https://www.huffingtonpost.com/entry/franks-vs...   
209300  https://www.huffingtonpost.com/entry/super-bow...   
209306  https://www.huffingtonpost.com/entry/korean-re...   
209318  https://www.huffingtonpost.com/entry/leftover-...   
209332  https://www.huffingtonpost.com/entry/clean-out...   

                                                 headline      category   
280     'Cafeteria Workers Do A Lot More Than People R...  FOOD & DRINK  \
294     I Cooked For The World's 1%, But I Traded It T...  FOOD & DRINK   
395     Orange Wine: Everything You Need To Know And P...  FOOD & DRINK   
411     How To Make A Dirty Shirley, The Unofficial Dr...  FOOD & DRINK   
415     Is It Safe To Swim Right After Eating? Experts...  FOOD & DRINK   
...                                                   ...           ...   
209290  Frank's vs. Tabasco Buffalo: What's The Best W...  FOOD & DRINK   
209300           10 Dips, Nibbles And Dishes For Game Day  FOOD & DRINK   
209306     9 Korean Recipes: Go Outside Your Comfort Zone  FOOD & DRINK   
209318                       5 Ways To Use Leftover Bread  FOOD & DRINK   
209332              10 Recipes Made From Common Leftovers  FOOD & DRINK   

                                        short_description   
280     Many of these school employees quietly go abov...  \
294     Dan Giusti is changing the way institutions li...   
395                       No, it's not made from oranges.   
411     The classic “kiddie cocktail” has a new lease ...   
415     We've all been told to wait 30 minutes after e...   
...                                                   ...   
209290  The two brands are embroiled in a marketing ca...   
209300  Can't say I love football, but I always love h...   
209306  Gochujang is a red chili paste that is made fr...   
209318  There's a reason they call it the daily bread ...   
209332  Everyone has their share of leftovers sitting ...   

                                                  authors       date   
280                                        Emily Laurence 2022-08-04  \
294                                        Emily Laurence 2022-08-01   
395                                         Beth Krietsch 2022-07-11   
411                                        Julie Kendrick 2022-07-08   
415                                          Taylor Tobin 2022-07-07   
...                                                   ...        ...   
209290                                                    2012-01-31   
209300  Jennifer Segal, Contributor\nChef, Cookbook Au... 2012-01-30   
209306                       Kitchen Daily, Kitchen Daily 2012-01-30   
209318                    Food52, Contributor\nfood52.com 2012-01-30   
209332                       Kitchen Daily, Kitchen Daily 2012-01-30   

        category_code  
280                13  
294                13  
395                13  
411                13  
415                13  
...               ...  
209290             13  
209300             13  
209306             13  
209318             13  
209332             13  

[6340 rows x 7 columns]
GOOD NEWS =                                                      link   
28033   https://www.huffingtonpost.com/entry/what-if-e...  \
29809   https://www.huffingtonpost.com/entry/lobsterme...   
30745   https://www.huffingtonpost.com/entry/asheville...   
31054   https://www.huffingtonpost.com/entry/india-bil...   
31171   https://www.huffingtonpost.com/entry/80th-wedd...   
...                                                   ...   
132841  https://www.huffingtonpost.com/entry/brave-hea...   
132951  https://www.huffingtonpost.com/entry/students-...   
133048  https://www.huffingtonpost.com/entry/purritos-...   
133140  https://www.huffingtonpost.com/entry/blind-bea...   
133328  https://www.huffingtonpost.com/entry/global-so...   

                                                 headline   category   
28033                 What If Every School Had This Sign?  GOOD NEWS  \
29809   Tiny Seal Pup Found Tangled In Fishing Net Sav...  GOOD NEWS   
30745   North Carolina Cops Respond To Party Complaint...  GOOD NEWS   
31054   See The Slick Moves That Got India's 'Billy El...  GOOD NEWS   
31171   Secrets To A Happy Marriage From A 99-Year-Old...  GOOD NEWS   
...                                                   ...        ...   
132841                              Brave: Hearts, Mended  GOOD NEWS   
132951  These Kids Thought Their School Lacked A Place...  GOOD NEWS   
133048  Purritos = Cats, Burritos, The Internet. All O...  GOOD NEWS   
133140  Blind People Describe Beauty As 'Joy,' 'Truth,...  GOOD NEWS   
133328  Earth Day Project Collecting 1 Million Differe...  GOOD NEWS   

                                        short_description   
28033   The Welcome Your Neighbor signs are transforma...  \
29809                             Lobstermen for the win!   
30745   These officers didn't break up the party. They...   
31054   The son of a welder, 15-year-old Amir Shah has...   
31171   Lovebirds Donald and Vivian Hart tied the knot...   
...                                                   ...   
132841  Paris was half of one of the couples in our ci...   

                                                  authors       date   
28033   Regan Manwell Sowinski, ContributorWoke Teache... 2017-08-09  \
29809                                      Nina Golgowski 2017-07-19   
30745                                      Carla Herreria 2017-07-07   
31054                                 Dominique Mosbergen 2017-07-04   
31171                                 Dominique Mosbergen 2017-07-02   
...                                                   ...        ...   
132841  Kristin Shaw, ContributorAuthor and blogger, F... 2014-04-28   
132951                                   Alexandra Zaslow 2014-04-26   
133048                                  Melissa McGlensey 2014-04-25   
133140                                Dominique Mosbergen 2014-04-24   
133328                                  Melissa McGlensey 2014-04-22   

        category_code  
28033              14  
29809              14  
30745              14  
31054              14  
31171              14  
...               ...  
132841             14  
132951             14  
133048             14  
133140             14  
133328             14  

[1398 rows x 7 columns]
GREEN =                                                      link   
16081   https://www.huffingtonpost.com/entry/mcdonalds...  \
16386   https://www.huffingtonpost.com/entry/fourth-se...   
16440   https://www.huffingtonpost.com/entry/californi...   
16529   https://www.huffingtonpost.com/entry/californi...   
16551   https://www.huffingtonpost.com/entry/ferc-pipe...   
...                                                   ...   
133478  https://www.huffingtonpost.com/entry/bp-oil-sp...   
133499  https://www.huffingtonpost.com/entry/story_n_5...   
133529  https://www.huffingtonpost.com/entry/climate-c...   
133608  https://www.huffingtonpost.com/entry/throwing-...   
133622  https://www.huffingtonpost.com/entry/a-way-of-...   

                                                 headline category   
16081   McDonald's Says Its Packaging Will Be 100 Perc...    GREEN  \
16386   Fourth San Francisco Swimmer In A Month Attack...    GREEN   
16440   Your Questions About The California Mudslides,...    GREEN   
16529    Why The California Mudslides Have Been So Deadly    GREEN   
16551   The Agency That Approves Pipelines Is About To...    GREEN   
...                                                   ...      ...   
133478  Four Years Later, BP Oil Spill Still Taking A ...    GREEN   
133499              What Big Oil Doesn't Want You To Know    GREEN   
133529  Keystone XL May Wait on Nebraska, but Climate ...    GREEN   
133608                           Throwing Away Good Water    GREEN   
133622  A Way of Life at Risk on the Anniversary of th...    GREEN   

                                        short_description   
16081   All of its restaurants will also feature recyc...  \
16386   The woman was not seriously injured even thoug...   
16440   It's not just a tragic coincidence that the re...   
16529   Unheeded evacuation warnings, late emergency a...   
16551   Activists have been fighting with this governm...   
...                                                   ...   
133608  Folks, even if this guy is pissing out pure co...   
133622  On April 20, 2010, an explosion on BP's Deepwa...   

                                                  authors       date   
16081                                      Nina Golgowski 2018-01-17  \
16386                                      Mary Papenfuss 2018-01-12   
16440                                      Lydia O'Connor 2018-01-12   
16529                                    Antonia Blumberg 2018-01-11   
16551                                        Eoin Higgins 2018-01-10   
...                                                   ...        ...   
133478                                        Nick Visser 2014-04-20   
133499                                                    2014-04-20   
133529  Susan Casey-Lefkowitz, ContributorDirector of ... 2014-04-19   
133608  Peter H. Gleick, ContributorChief Scientist, P... 2014-04-18   
133622  Jeffrey Buchanan, ContributorSenior Domestic P... 2014-04-18   

        category_code  
16081              15  
16386              15  
16440              15  
16529              15  
16551              15  
...               ...  
133478             15  
133499             15  
133529             15  
133608             15  
133622             15  

[2622 rows x 7 columns]
HEALTHY LIVING =                                                      link   
16252   https://www.huffingtonpost.com/entry/to-the-pe...  \
16367   https://www.huffingtonpost.com/entry/eating-sh...   
16421   https://www.huffingtonpost.com/entry/anxiety-f...   
16601   https://www.huffingtonpost.com/entry/tweets-ab...   
16608   https://www.huffingtonpost.com/entry/the-real-...   
...                                                   ...   
133587  https://www.huffingtonpost.com/entry/happy-hea...   
133588  https://www.huffingtonpost.com/entry/mental-il...   
133599  https://www.huffingtonpost.com/entry/wake-up-c...   
133624  https://www.huffingtonpost.com/entry/narcissis...   
133662  https://www.huffingtonpost.com/entry/happiness...   

                                                 headline        category   
16252   To The People Who Say ‘I’m Tired’ When Someone...  HEALTHY LIVING  \
16367   Eating Shake Shack Made Me Feel Healthier Than...  HEALTHY LIVING   
16421   How To Stay Updated On The News Without Losing...  HEALTHY LIVING   
16601   27 Perfect Tweets About Whole30 That Will Make...  HEALTHY LIVING   
16608          The Real Reason Your Hands Are Always Cold  HEALTHY LIVING   
...                                                   ...             ...   
133587  Why You Need Both a 'Bouncer' and a 'Bartender...  HEALTHY LIVING   
133588  How Video Games Can Improve Dialogue on Mental...  HEALTHY LIVING   
133599  Wake-Up Calls Inspired My Change From Overdriv...  HEALTHY LIVING   
133624        Loving a Narcissist Without Losing Yourself  HEALTHY LIVING   
133662                            Reasons Not to Be Happy  HEALTHY LIVING   

                                        short_description   
16252   When you feel like this, it’s important to kno...  \
16367   I can vividly remember the first time I felt f...   
16421      Because it's only becoming more of a struggle.   
16601   "The only Whole30 I want to participate in is ...   
16608   Essentially, your hands are kept warm thanks t...   
...                                                   ...   
133587  Instead of judging whether you made the right ...   
133588  While there are strong arguments for the games...   
133599  My wake-up call marching orders were clear: No...   
133624  It is very difficult for some people to see an...   
133662  Our thoughts and feelings are powerful, but ma...   

                                                  authors       date   
16252   The Mighty, ContributorWe face disability, dis... 2018-01-16  \
16367   Colleen Werner, ContributorCampus Editor-at-Large 2018-01-12   
16421                                      Lindsay Holmes 2018-01-12   
16601                                      Lindsay Holmes 2018-01-10   
16608   Refinery29, ContributorThe #1 new-media brand ... 2018-01-10   
...                                                   ...        ...   
133587  Elizabeth Grace Saunders, ContributorFounder, ... 2014-04-18   
133588         Mona Shattell, Contributornurse researcher 2014-04-18   
133599  Jane Shure, ContributorLeadership Coach, Psych... 2014-04-18   
133624  Nancy Colier, ContributorPsychotherapist, inte... 2014-04-18   
133662  Mindy Utay, Contributor"Calming Life's Conflicts" 2014-04-18   

        category_code  
16252              16  
16367              16  
16421              16  
16601              16  
16608              16  
...               ...  
133587             16  
133588             16  
133599             16  
133624             16  
133662             16  

[6694 rows x 7 columns]
HOME & LIVING =                                                      link   
394     https://www.huffpost.com/entry/girl-in-the-pic...  \
474     https://www.huffpost.com/entry/new-movies-show...   
867     https://www.huffpost.com/entry/the-call-popula...   
1353    https://www.huffpost.com/entry/just-go-with-it...   
1414    https://www.huffpost.com/entry/movies-shows-le...   
...                                                   ...   
209387  https://www.huffingtonpost.com/entry/kelly-wea...   
209461  https://www.huffingtonpost.com/entry/diy-ideas...   
209462  https://www.huffingtonpost.com/entry/ikea-shop...   
209470  https://www.huffingtonpost.com/entry/design-in...   
209477  https://www.huffingtonpost.com/entry/on-the-fe...   

                                                 headline       category   
394     The Most Popular Movies On Netflix Right Now B...  HOME & LIVING  \
474     New On Netflix July 2022: 'Persuasion,' 'Virgi...  HOME & LIVING   
867     The Most Popular Movies On Netflix Right Now B...  HOME & LIVING   
1353    The Most Popular Movies On Netflix Right Now B...  HOME & LIVING   
1414        Here's What's Leaving Netflix In January 2022  HOME & LIVING   
...                                                   ...            ...   
209387  Kelly Wearstler Designs New Hollywood Home, St...  HOME & LIVING   
209461  DIY Ideas: 9 Projects To Enhance Your Home Thi...  HOME & LIVING   
209462   IKEA Shopping: The Best Items You Can Buy Online  HOME & LIVING   
209470  Design Inspiration: Francis Ford Coppola Berna...  HOME & LIVING   
209477                                       On the Fence  HOME & LIVING   

394     A new animated film and action comedy are also...  \
474     The streaming service announced the movies and...   
867     A new Polish crime drama and SpaceX documentar...   
1353    Two Adam Sandler movies are trending on the st...   
1414    "Episodes" and all five films of “The Twilight...   
...                                                   ...   
209387  We've been wondering what designer and the que...   
209461  Looking to update and refresh your home with s...   
209462  Long considered the go-to for budget-friendly ...   
209470  Allow us to dream a little this weekend. Ever ...   
209477  When designing a home, I generally believe the...   

                                                  authors       date   
394                                      Caroline Bologna 2022-07-11  \
474                                      Caroline Bologna 2022-06-23   
867                                      Caroline Bologna 2022-04-11   
1353                                     Caroline Bologna 2022-01-10   
1414                                     Caroline Bologna 2021-12-29   
...                                                   ...        ...   
209387                                       Dickson Wong 2012-01-30   
209461                                    Diana N. Nguyen 2012-01-28   
209462                                      Kaitlyn Davis 2012-01-28   
209470                                       Dickson Wong 2012-01-28   
209477  Ron Radziner and Leo Marmol, Contributor\nDesi... 2012-01-28   

394                17  
474                17  
867                17  
1353               17  
1414               17  
...               ...  
209387             17  
209461             17  
209462             17  
209470             17  
209477             17  

[4320 rows x 7 columns]
IMPACT =                                                      link   
3834    https://www.huffpost.com/entry/why-you-shouldn...  \
3931    https://www.huffpost.com/entry/hummingbird-sav...   
3964    https://www.huffpost.com/entry/companies-clima...   
4504    https://www.huffpost.com/entry/lisbon-airbnb-l...   
4534    https://www.huffpost.com/entry/america-history...   
...                                                   ...   
209436  https://www.huffingtonpost.com/entry/texana-ho...   
209437  https://www.huffingtonpost.com/entry/malarias-...   
209499  https://www.huffingtonpost.com/entry/hands-on-...   
209500  https://www.huffingtonpost.com/entry/maternal-...   
209501  https://www.huffingtonpost.com/entry/tom-brady...   

                                                 headline category   
3834                   Why You Shouldn't Recycle Receipts   IMPACT  \
3931    How One Of The World's Rarest Hummingbirds Is ...   IMPACT   
3964    Companies Are Making Major Climate Pledges. He...   IMPACT   
4504    Lisbon Says Airbnb Forced Out Locals. Here’s I...   IMPACT   
4534    Behind America’s Mutual Aid Boom Lies A Long H...   IMPACT   
...                                                   ...      ...   
209436  Texana Hollis, 101-Year-Old Evicted Detroit Wo...   IMPACT   
209437                  Malaria's Defeat, Africa's Future   IMPACT   
209499                        Tinker and Change the World   IMPACT   
209500          Pregnant and Displaced: Double the Danger   IMPACT   
209501  Tom Brady Helps Mentor, Tom Martinez, Find A K...   IMPACT   

3834    CVS and other companies are shortening excessi...  \
3931    The marvelous spatuletail hummingbird inspired...   
3964    How to decode corporate climate change targets...   
4504    The mayor of Portugal's capital city plans to ...   
4534    The tradition of mutual aid is as old as the c...   
...                                                   ...   
209436  A local contracting company offered to install...   
209437  Africa is taking command of its future by tack...   
209499  Tinkering -- that hands-on, garage-based tradi...   
209500  It's time we all step up our efforts to ensure...   
209501  Since Brady started promoting Martinez's cause...   

                                                  authors       date   
3834                                       Amanda Schupak 2020-10-30  \
3931                                       Amanda Schupak 2020-10-13   
3964                                          Kyla Mandel 2020-10-07   
4504                                       Laura Paddison 2020-07-07   
4534                                       Amanda Schupak 2020-07-02   
...                                                   ...        ...   
209436                                                    2012-01-29   
209437  Ellen Johnson-Sirleaf, Contributor\nPresident ... 2012-01-29   
209499  Larry Bock, Contributor\nFounder and Organizer... 2012-01-28   
209500  Sarah Costa, Contributor\nExecutive Director o... 2012-01-28   
209501                                                    2012-01-28   

3834               18  
3931               18  
3964               18  
4504               18  
4534               18  
...               ...  
209436             18  
209437             18  
209499             18  
209500             18  
209501             18  

[3484 rows x 7 columns]
LATINO VOICES =                                                      link   
2880    https://www.huffpost.com/entry/protest-chicago...  \
8932    https://www.huffingtonpost.com/entry/aaron-sch...   
9025    https://www.huffingtonpost.com/entry/fiesta-pr...   
9130    https://www.huffingtonpost.com/entry/starbucks...   
9217    https://www.huffingtonpost.com/entry/edward-su...   
...                                                   ...   
133087  https://www.huffingtonpost.com/entry/latin-bil...   
133108  https://www.huffingtonpost.com/entry/are-depor...   
133298  https://www.huffingtonpost.com/entry/latina-am...   
133318  https://www.huffingtonpost.com/entry/latino-wh...   
133341  https://www.huffingtonpost.com/entry/living-an...   

                                                 headline       category   
2880    Hundreds Protest Police Killing Of 13-Year-Old...  LATINO VOICES  \
8932    Attorney Aaron Schlossberg Insists Anti-Spanis...  LATINO VOICES   
9025    Protesters Throw A Fiesta To Razz Lawyer Who R...  LATINO VOICES   
9130    Latino Man Insulted When Starbucks Barista Wri...  LATINO VOICES   
9217    Angry White Dude's Rant About People Speaking ...  LATINO VOICES   
...                                                   ...            ...   
133087  LOOK: Latin Billboard Music Awards Fashion Hit...  LATINO VOICES   
133108  Are Deportations Rising or Falling? A Focus on...  LATINO VOICES   
133298  When I Chose to Stop Feeling Small: The Story ...  LATINO VOICES   
133318     More Latino Than White Students Admitted To UC  LATINO VOICES   
133341        Living and Breathing Gabriel García Márquez  LATINO VOICES   

2880    Outraged by video showing the boy with his han...  \
8932    In the apology, Aaron Schlossberg claims he mo...   
9025    Aaron Schlossberg was treated to outrage, mari...   
9130    The coffee giant tried to apologize with a $50...   
9217    The man tells an employee “Your staff is speak...   
...                                                   ...   
133108  Although clear and correct statistics are no d...   
133298  No woman in my family had ever been this tall,...   
133341  Thirty years ago I lived García Márquez. Maybe...   

                                                  authors       date   
2880                                  Sarah Ruiz-Grossman 2021-04-17  \
8932                                           David Moye 2018-05-22   
9025                                       Mary Papenfuss 2018-05-19   
9130                                           David Moye 2018-05-17   
9217                                           David Moye 2018-05-16   
...                                                   ...        ...   
133087                                    Carolina Moreno 2014-04-25   
133108  Laura E. Enriquez, ContributorAssistant Profes... 2014-04-24   
133298  Laura Elizabeth Hernandez, ContributorWriter a... 2014-04-22   
133318                                     Lydia O'Connor 2014-04-22   
133341  Elio Leturia, ContributorAssociate Professor, ... 2014-04-22   

2880               19  
8932               19  
9025               19  
9130               19  
9217               19  
...               ...  
133087             19  
133108             19  
133298             19  
133318             19  
133341             19  

[1130 rows x 7 columns]
MEDIA =                                                      link   
319     https://www.huffpost.com/entry/chris-cuomo-new...  \
450     https://www.huffpost.com/entry/alex-wagner-msn...   
886     https://www.huffpost.com/entry/fox-news-benjam...   
1036    https://www.huffpost.com/entry/meta-facebook-h...   
1079    https://www.huffpost.com/entry/new-york-times-...   
...                                                   ...   
133409  https://www.huffingtonpost.com/entry/nbc-news-...   
133437  https://www.huffingtonpost.com/entry/glenn-gre...   
133480  https://www.huffingtonpost.com/entry/david-bro...   
133596  https://www.huffingtonpost.com/entry/otis-care...   
133661  https://www.huffingtonpost.com/entry/natalie-m...   

                                                 headline category   
319     Chris Cuomo Returning To Cable News After CNN ...    MEDIA  \
450     MSNBC Names Rachel Maddow's Successor: Alex Wa...    MEDIA   
886     Fox News Reporter Feels ‘Damn Lucky’ After Los...    MEDIA   
1036    Meta Grants Exemption To Hate Speech Rules, Al...    MEDIA   
1079         New York Times Tech Workers Vote To Unionize    MEDIA   
...                                                   ...      ...   
133409  NBC News Makes Bizarre Move To Boost David Gre...    MEDIA   
133437           Glenn Greenwald Reacts To Pulitzer Prize    MEDIA   
133480  David Brooks: Obama Has A 'Manhood Problem In ...    MEDIA   
133596          Magazine Faces Lawsuit For Racist Article    MEDIA   
133661                              Ouch, Natalie Morales    MEDIA   

                                        short_description           authors   
Show Text Column of Dataset:

short_description = df["short_description"]
0    Health experts said it is too early to predict...
1    He was subdued by passengers and crew when he ...
2    "Until you have a dog you don't understand wha...
3    "Accidentally put grown-up toothpaste on my to...
4    Amy Cooper accused investment firm Franklin Te...
5    The 63-year-old woman was seen working at the ...
6    "Who's that behind you?" an anchor for New Yor...
7    More than half a million people remained witho...
8    In "Mija," director Isabel Castro combined mus...
9    White House officials say the crux of the pres...
Name: short_description, dtype: object

Show Category Column of Dataset:

category = df['category']
0         U.S. NEWS
1         U.S. NEWS
2            COMEDY
3         PARENTING
4         U.S. NEWS
5         U.S. NEWS
6         U.S. NEWS
7        WORLD NEWS
9        WORLD NEWS
Name: category, dtype: category
Categories (42, object): ['ARTS', 'ARTS & CULTURE', 'BLACK VOICES', 'BUSINESS', ..., 'WELLNESS', 'WOMEN', 'WORLD NEWS', 'WORLDPOST']

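Remove all tags:

def remove_tags(text):
  # The pattern was empty in the exported notebook, which makes this a no-op.
  # An HTML-style tag pattern is assumed here, matching the function's name.
  remove = re.compile(r'<.*?>')
  return re.sub(remove, '', text)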
df['Text'] = df['short_description'].apply(remove_tags)

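Remove all special characters:

def special_char(text):
  # Keep letters and digits; replace every other character with a space.
  reviews = ''
  for x in text:
    if x.isalnum():
      reviews = reviews + x
    else:
      reviews = reviews + ' '
  return reviews
# Continue from the tag-stripped column rather than the raw description.
df['Text'] = df['Text'].apply(special_char)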
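
The same cleanup, replacing each non-alphanumeric character with a space, can probably be written as a single regular-expression substitution, which tends to be faster than building the string one character at a time. The sketch below is only an illustration: the name `special_char_re` is hypothetical and not part of the notebook, and the loop version is repeated just to compare the two.

```python
import re

# Hypothetical alternative to special_char (name is an assumption).
# [\W_] matches any character that is not a letter or digit:
# \W excludes word characters, and "_" is added because \w includes it.
_NON_ALNUM = re.compile(r'[\W_]')

def special_char_re(text):
    """Replace every non-alphanumeric character with a single space."""
    return _NON_ALNUM.sub(' ', text)

def special_char_loop(text):
    """Reference loop: keep alphanumerics, turn everything else into a space."""
    out = ''
    for ch in text:
        out += ch if ch.isalnum() else ' '
    return out

sample = "don't stop-me_now!"
cleaned = special_char_re(sample)
```

If this variant were adopted, it would be applied in the same way, for example `df['Text'].apply(special_char_re)`.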

Converting everything (all articles) into lowercase for uniformity:

def convert_lower(text):
   return text.lower()
df['short_description'] = df['short_description'].apply(convert_lower)
"he was subdued by passengers and crew when he fled to the back of the aircraft after the confrontation, according to the u.s. attorney's office in los angeles."

Removing all stopwords:

# Build the stop-word set once instead of on every call.
stop_words = set(stopwords.words('english'))
def remove_stopwords(text):
  words = word_tokenize(text)
  return [x for x in words if x not in stop_words]
df['short_description'] = df['short_description'].apply(remove_stopwords)
# df['short_description'][1]

Lemmatizing all words:

def lemmatize_word(text):
  wordnet = WordNetLemmatizer()
  return " ".join([wordnet.lemmatize(word) for word in text])
df['short_description'] = df['short_description'].apply(lemmatize_word)
"subdued passenger crew fled back aircraft confrontation , according u.s. attorney 's office los angeles ."

After cleaning, our dataset looks like this:


 | link | headline | category | short_description | authors | date | category_code | Text
0 | https://www.huffpost.com/entry/covid-boosters-... | Over 4 Million Americans Roll Up Sleeves For O... | U.S. NEWS | [health, expert, said, early, predict, whether... | Carla K. Johnson, AP | 2022-09-23 | 35 | Health experts said it is too early to predict...
1 | https://www.huffpost.com/entry/american-airlin... | American Airlines Flyer Charged, Banned For Li... | U.S. NEWS | [subdued, passenger, crew, fled, back, aircraf... | Mary Papenfuss | 2022-09-23 | 35 | He was subdued by passengers and crew when he ...
2 | https://www.huffpost.com/entry/funniest-tweets... | 23 Of The Funniest Tweets About Cats And Dogs ... | COMEDY | [``, dog, n't, understand, could, eaten, ., ``] | Elyse Wanshel | 2022-09-23 | 5 | Until you have a dog you don t understand wha...
3 | https://www.huffpost.com/entry/funniest-parent... | The Funniest Tweets From Parents This Week (Se... | PARENTING | [``, accidentally, put, grown-up, toothpaste, ... | Caroline Bologna | 2022-09-23 | 22 | Accidentally put grown up toothpaste on my to...
4 | https://www.huffpost.com/entry/amy-cooper-lose... | Woman Who Called Cops On Black Bird-Watcher Lo... | U.S. NEWS | [amy, cooper, accused, investment, firm, frank... | Nina Golgowski | 2022-09-22 | 35 | Amy Cooper accused investment firm Franklin Te...
... | ... | ... | ... | ... | ... | ... | ... | ...
209522 | https://www.huffingtonpost.com/entry/rim-ceo-t... | RIM CEO Thorsten Heins' 'Significant' Plans Fo... | TECH | [verizon, wireless, &, already, promoting, lte... | Reuters, Reuters | 2012-01-28 | 32 | Verizon Wireless and AT T are already promotin...
209523 | https://www.huffingtonpost.com/entry/maria-sha... | Maria Sharapova Stunned By Victoria Azarenka I... | SPORTS | [afterward, ,, azarenka, ,, effusive, press, n... |  | 2012-01-28 | 28 | Afterward Azarenka more effusive with the pr...
209524 | https://www.huffingtonpost.com/entry/super-bow... | Giants Over Patriots, Jets Over Colts Among M... | SPORTS | [leading, super, bowl, xlvi, ,, talked, game, ... |  | 2012-01-28 | 28 | Leading up to Super Bowl XLVI the most talked...
209525 | https://www.huffingtonpost.com/entry/aldon-smi... | Aldon Smith Arrested: 49ers Linebacker Busted ... | SPORTS | [correction, :, earlier, version, story, incor... |  | 2012-01-28 | 28 | CORRECTION An earlier version of this story i...
209526 | https://www.huffingtonpost.com/entry/dwight-ho... | Dwight Howard Rips Teammates After Magic Loss ... | SPORTS | [five-time, all-star, center, tore, teammate, ... |  | 2012-01-28 | 28 | The five time all star center tore into his te...

209527 rows × 8 columns

x = df['short_description']
y = df['category_code']

Task 2: Apply at least two different techniques/algorithms for classifying the text in the given dataset to develop a model.

x = np.array(df.iloc[:,0].values)
y = np.array(df.category_code.values)
cv = CountVectorizer(max_features = 5000)
x = cv.fit_transform(df.Text).toarray()
print("X.shape = ", x.shape)
print("y.shape = ", y.shape)
X.shape =  (209527, 5000)
y.shape =  (209527,)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 0, shuffle = True)
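
To make the bag-of-words step more concrete, here is a minimal pure-Python sketch of roughly what `CountVectorizer(max_features=...)` does: lowercase the text, extract tokens of two or more word characters, keep the most frequent terms across the corpus, and count them per document. The function names and toy documents are assumptions made for illustration, and tie-breaking among equally frequent terms may differ from scikit-learn.

```python
import re
from collections import Counter

# Tokens of two or more word characters, similar to CountVectorizer's default.
TOKEN = re.compile(r'\b\w\w+\b')

def tokenize(doc):
    return TOKEN.findall(doc.lower())

def bag_of_words(docs, max_features):
    """Return (vocabulary, count rows) for a list of documents."""
    totals = Counter()
    for doc in docs:
        totals.update(tokenize(doc))
    # Keep the most frequent terms, then order the columns alphabetically.
    vocab = sorted(term for term, _ in totals.most_common(max_features))
    index = {term: i for i, term in enumerate(vocab)}
    rows = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok in tokenize(doc):
            if tok in index:
                row[index[tok]] += 1
        rows.append(row)
    return vocab, rows

# Toy corpus (illustrative only, not taken from the dataset).
docs = ["the cat sat", "the cat ate the fish", "a dog sat"]
vocab, rows = bag_of_words(docs, max_features=3)
```

Most entries of such a matrix are zero. With 209527 rows and 5000 columns, converting it with `.toarray()` needs roughly 8 GB if the counts are stored as 64-bit integers, so keeping the sparse matrix returned by `fit_transform` may be worth considering for the estimators that accept sparse input.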
perform_list = []
def run_model(model_name, est_c, est_pnlty):
    # est_c and est_pnlty are accepted but not used by any of the models below.
    if model_name == 'Logistic Regression':
        mdl = LogisticRegression()
    elif model_name == 'Random Forest':
        mdl = RandomForestClassifier(n_estimators=100 ,criterion='entropy' , random_state=0)
    elif model_name == 'Multinomial Naive Bayes':
        mdl = MultinomialNB(alpha=1.0,fit_prior=True)
    elif model_name == 'Support Vector Classifier':
        mdl = SVC()
    elif model_name == 'Decision Tree Classifier':
        mdl = DecisionTreeClassifier()
    elif model_name == 'K Nearest Neighbour':
        mdl = KNeighborsClassifier(n_neighbors=10 , metric= 'minkowski' , p = 4)
    elif model_name == 'Gaussian Naive Bayes':
        mdl = GaussianNB()
    else:
        raise ValueError(f'Unknown model name: {model_name}')
    oneVsRest = OneVsRestClassifier(mdl)
    oneVsRest.fit(x_train, y_train)
    y_pred = oneVsRest.predict(x_test)
    # Performance metrics
    accuracy = round(accuracy_score(y_test, y_pred) * 100, 2)
    # Get precision, recall, f1 scores
    precision, recall, f1score, support = score(y_test, y_pred, average='micro')
    print(f'Test Accuracy Score of Basic {model_name}: {accuracy} %')
    print(f'Precision : {precision}')
    print(f'Recall : {recall}')
    print(f'F1-score : {f1score}')
    # Add performance parameters to list
    perform_list.append(dict([
        ('Model', model_name),
        ('Test Accuracy', round(accuracy, 2)),
        ('Precision', round(precision, 2)),
        ('Recall', round(recall, 2)),
        ('F1', round(f1score, 2))
    ]))
run_model('Logistic Regression', est_c=None, est_pnlty=None)
run_model('Random Forest', est_c=None, est_pnlty=None)
run_model('Multinomial Naive Bayes', est_c=None, est_pnlty=None)
run_model('Decision Tree Classifier', est_c=None, est_pnlty=None)
run_model('K Nearest Neighbour', est_c=None, est_pnlty=None)
run_model('Gaussian Naive Bayes', est_c=None, est_pnlty=None)
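
A note on the metrics reported by run_model: with `average='micro'` on a single-label multiclass problem, precision, recall, and F1 all reduce to the same value as accuracy, so they add little beyond the accuracy score. Macro averaging, which weights every class equally, is one common alternative when classes are imbalanced. The sketch below computes both by hand on made-up labels; the function name and the data are assumptions for illustration only.

```python
def micro_macro(y_true, y_pred):
    """Return accuracy plus micro and macro precision/recall for label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class p, but it was wrong
            fn[t] += 1   # true class t was missed
    total_tp = sum(tp.values())
    micro_p = total_tp / (total_tp + sum(fp.values()))
    micro_r = total_tp / (total_tp + sum(fn.values()))

    def ratio(num, den):
        return num / den if den else 0.0   # treat undefined scores as 0

    macro_p = sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / len(labels)
    macro_r = sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / len(labels)
    accuracy = total_tp / len(y_true)
    return accuracy, micro_p, micro_r, macro_p, macro_r

# Made-up, imbalanced labels: class 0 dominates and class 2 is never predicted.
y_true_demo = [0, 0, 0, 0, 1, 1, 2]
y_pred_demo = [0, 0, 0, 0, 0, 1, 0]
acc, micro_p, micro_r, macro_p, macro_r = micro_macro(y_true_demo, y_pred_demo)
```

In this toy case the micro scores equal the accuracy, while the macro scores are noticeably lower because the rare classes are predicted poorly.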
model_performance = pd.DataFrame(data=perform_list)
model_performance = model_performance[['Model', 'Test Accuracy', 'Precision', 'Recall', 'F1']]
# Look up the best-scoring row instead of hard-coding the model name.
best = model_performance.loc[model_performance["Test Accuracy"].idxmax()]
print("The best accuracy of model is", best["Test Accuracy"], "from", best["Model"])
classifier = RandomForestClassifier(n_estimators=100 ,criterion='entropy' , random_state=0).fit(x_train, y_train)
y_pred = classifier.predict(x_test)
# Vectorize a new piece of text with the already fitted CountVectorizer.
sample_vec = cv.transform(['Hour ago, I contemplated retirement for a lot of reasons. I felt like people were not sensitive enough to my injuries. I felt like a lot of people were backed, why not me? I have done no less. I have won a lot of games for the team, and I am not feeling backed, said Ashwin'])
yy = classifier.predict(sample_vec)
# The dataset has 42 categories, so map the predicted code back to its
# category name from the data instead of using a hard-coded five-class chain.
code_to_category = dict(zip(df['category_code'], df['category']))
result = code_to_category[yy[0]]
print(result)


After cleaning and preprocessing the data, splitting it into training and test sets, and building a bag-of-words representation, we trained several machine learning models and compared their accuracy scores. Among the models tested, Random Forest gave the best accuracy.

Finally, we used the trained classifier to predict the category of a new piece of news text.

