FDP on Natural Language Processing

The Department of CSE, MVJ College of Engineering, organized a Faculty Development Program on Natural Language Processing on 4th February, 2023. The lecture was delivered by Mr. Samrat Sengupta, Sr. Lead Data Scientist, Happiest Mind Technologies, Bangalore. Mr. Samrat Sengupta has a history of leading projects in AI and Machine Learning. His contributions and research in the areas of NLP, Computer Vision and Predictive Analytics culminated in multiple IP launches. He also has over 13 years of experience in Cloud, Integration and backend programming, along with ML/AI.

The first session began at 11.30 am in Seminar Hall 5, MVJCE. The faculty members from the Departments of Computer Science and Engineering and Information Science Engineering participated in this programme. Dr. M B Sudhan, Head of the Department (Computer Science and Engineering) welcomed Mr. Samrat Sengupta and presented a bouquet to him. The session was then taken over by the Speaker Mr. Samrat Sengupta.

Mr. Samrat Sengupta started his lecture by speaking about how Natural Language Processing strives to build machines that understand and respond to text or voice data – and respond with text or speech of their own – in much the same way that humans do.

Points discussed by Mr. Samrat Sengupta about NER

Named Entity Recognition (NER) is one of the most popular data preprocessing tasks. It involves the identification of key information in the text, and classification into a set of predefined categories.

An entity is essentially the thing that is consistently talked about or referred to in the text. NER is a form of NLP. At its core, NER is a two-step process: detecting the entities in the text, and classifying them into different categories.

The most important entity categories in NER are:

Person
Organization
Place/ Location

Other common tasks include classifying the following:

Date/time
Expression
Numerical measurement (money, percentage, weight, etc.)
E-mail address
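
The two steps of NER described above (detecting entities, then classifying them) can be illustrated with a small sketch. This is a toy, rule-based example and not the approach presented in the session: real NER systems normally use trained statistical or neural models. The gazetteer entries, the regular expressions, the sample sentence and the function name recognize_entities are all assumptions made for illustration, and only some of the categories listed above are covered.

```python
import re

# Hypothetical lookup table (gazetteer) for names; a real system would
# learn these from annotated data instead of listing them by hand.
GAZETTEER = {
    "Samrat Sengupta": "PERSON",
    "MVJ College of Engineering": "ORGANIZATION",
    "Bangalore": "LOCATION",
}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Illustrative patterns for entity types that follow a regular shape.
PATTERNS = {
    "DATE": re.compile(
        r"\b\d{1,2}(?:st|nd|rd|th)?\s+(?:" + MONTHS + r"),?\s+\d{4}\b"
    ),
    "TIME": re.compile(r"\b\d{1,2}[.:]\d{2}\s?(?:am|pm)\b", re.IGNORECASE),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?%"),
}


def recognize_entities(text):
    """Return (entity_text, category) pairs in order of appearance."""
    found = []
    # Step 1: detect candidate spans. Step 2: attach a predefined category.
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(), label))
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    found.sort()  # order by position in the text
    return [(entity, label) for _, entity, label in found]


SAMPLE = (
    "Samrat Sengupta spoke at MVJ College of Engineering in Bangalore "
    "on 4th February, 2023 at 11.30 am. "
    "Write to info@example.com for a 10% discount."
)

for entity, label in recognize_entities(SAMPLE):
    print(f"{label:<13}{entity}")
```

Hand-written rules such as these work only for the names and formats they anticipate, which is the main reason trained models are preferred in practice.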

The next session started at 1.45 pm, where Mr. Samrat Sengupta spoke about how topic modeling is implemented in the field of NLP. He explained that topic modeling is a text mining technique that extracts the topic or topics from a collection of documents, and that it is widely used in Natural Language Processing to gain insights from text documents. It can be considered the process of obtaining the required features from a bag of words. This is very important because, in NLP, each word present in the corpus is considered a feature. Thus, feature reduction helps us focus on the right content instead of spending time going through all the text in the data. Mr. Samrat Sengupta suggested that, for a better understanding of the concepts, we should not begin with the mathematical background.
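
To make the idea of "each word is a feature" concrete, here is a minimal bag-of-words sketch using only the Python standard library. The stop-word list, the sample sentences and the function names tokenize and bag_of_words are assumptions chosen for illustration. Dropping stop words is shown as one simple form of feature reduction.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "in", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def bag_of_words(documents):
    """Return (vocabulary, matrix): one row per document, one column per word."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


documents = [
    "The model learns the topics of the documents",
    "Topics summarize a collection of documents",
]

vocabulary, matrix = bag_of_words(documents)
print(vocabulary)
for row in matrix:
    print(row)
```

Every column of the matrix is one feature, so even a small corpus quickly produces a wide table. Topic modeling works on a document-word table of this kind.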

This highly important process can be performed by various algorithms or methods. Some of them are:

Latent Dirichlet Allocation (LDA)
Latent Semantic Analysis (LSA)
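
As an illustration of the first method, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the standard library only. It is not the implementation shown in the session. Gibbs sampling is just one of several ways to fit LDA, and the hyperparameters alpha and beta, the iteration count, the sample documents and the function names lda_gibbs and top_words are all assumptions. LSA is not shown, since it relies on a singular value decomposition that is usually taken from a numerical library.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized docs; return (doc_topic_probs, topic_word_counts, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(n + alpha) / (sum(row) + num_topics * alpha) for n in row]
        for row in doc_topic
    ]
    return theta, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most frequent words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


documents = [
    "cricket match team score player".split(),
    "team player coach match win".split(),
    "election vote party minister policy".split(),
    "party policy vote government election".split(),
]

theta, topic_word, vocab = lda_gibbs(documents, num_topics=2)
for words in top_words(topic_word, vocab):
    print(words)
for proportions in theta:
    print([round(p, 2) for p in proportions])
```

The sampler is random, so the topics found depend on the seed and on the number of iterations. On a corpus this small the result should be read as a demonstration of the mechanics rather than a meaningful model.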

Mr. Samrat Sengupta motivated the faculty to explore the field of NLP further. The session concluded with a Q&A session, where the faculty cleared their doubts relating to NER and topic modeling. A vote of thanks was proposed at the end.