These Class 10 AI Important Questions for Chapter 6 Natural Language Processing, with answers, help in building a strong foundation in artificial intelligence.
Natural Language Processing Class 10 Important Questions
Natural Language Processing Class 10 Subjective Questions
Question 1.
Which chatbot did you try? Name any one.
Answer:
I tried the chatbot named Mitsuku.
Question 2.
What is the purpose of the Mitsuku chatbot?
Answer:
The purpose of the Mitsuku chatbot is to have conversations with users and provide them with information, entertainment, and assistance. It can chat on various topics, answer questions, play games, and even tell jokes. It’s designed to mimic human conversation and make the interaction enjoyable and helpful.
Question 3.
When we talk to Mitsuku, does the chat feel like talking to a human or a robot? Why do you think so?
Answer:
When we talk to Mitsuku, the chat can sometimes feel like talking to a human because it uses natural language and responds to our questions and statements in a friendly and engaging manner. However, there are times when it feels more like talking to a robot, especially when it doesn’t understand a question or gives a repetitive or irrelevant answer. This is because, despite its advanced programming, Mitsuku still relies on pre-defined rules and algorithms to generate responses.
Question 4.
What are the possible difficulties a machine would face in processing natural language?
Answer:
Machines face several difficulties in processing natural language, including understanding context, handling ambiguity, recognizing idiomatic expressions, and managing diverse syntax structures. These challenges stem from the complexity and variability of human language.
Question 5.
Consider the example of different syntax and same semantics: 2 + 3 = 3 + 2. Give some other examples of different syntax and same semantics, and vice versa.
Answer:
For example, consider the expressions “I read the book” and “The book was read by me.” Both sentences have different syntax but the same semantics (they mean the same thing). Conversely, “I will book a flight” and “I read a book” have similar syntax but different semantics (they mean different things).
Question 6.
Examples: The red car zoomed past his nose -> talking about the color of the car.
His face turns red after consuming the medicine -> talking about an allergic reaction.
From the above examples, we can see that the word red has a different meaning in each sentence.
Write some other words which can have multiple meanings and use them in sentences.
Answer:
Words with multiple meanings (polysemy) can be challenging. For instance:
- Bank: “He sat on the river bank.” (river’s edge) vs. “She deposited money in the bank.” (financial institution)
- Bark: “The tree’s bark is rough.” (tree’s outer covering) vs. “The dog’s bark was loud.” (sound made by a dog)
- Light: “The room was filled with light.” (illumination) vs. “The package was light.” (not heavy)
These examples illustrate how context determines the meaning of words with multiple interpretations.
Question 7.
Here is a corpus for you to challenge yourself with the given tasks. The corpus is followed by some challenging questions based on the corpus. Online tools can be used for answering these questions.
The Corpus
Document 1 We can use health chatbots for treating stress
Document 2 We can use NLP to create chatbots and we will be making health chatbots now!
Document 3 Health chatbots cannot replace human counsellors now. YAY><!!@1nteLA!4Y
Accomplish the following challenges on the basis of the corpus given above. Use the tools available online for these challenges. Links for the tools are given below:
1. Sentence Segmentation: https://tinyurl.com/y36hd92n
2. Tokenization: https://text-processing.com/demo/tokenize/
3. Stopwords Removal: https://demos.datasciencedojo.com/demo/stopwords/
4. Lowercase Conversion: https://caseconverter.com
5. Stemming: https://textanalysisonline.com/nltk-porter-stemmer
6. Lemmatisation: https://textanalysisonline.com/spacy-word-lemmatize
7. Bag of Words: Create a document vector table for all documents
8. Generate TF-IDF values for all the words
9. Find the words having the highest value
10. Find the words having the least value
Answer:
Sentence Segmentation We need to split each document into individual sentences.
Document 1: “We can use health chatbots for treating stress.”
Document 2: “We can use NLP to create chatbots and we will be making health chatbots now!”
Document 3: “Health Chatbots cannot replace human counsellors now.” “Yay >< II @inteLA14Y”
Tokenisation Tokenize each sentence to break them into individual words.
Document 1: [“We”, “can”, “use”, “health”, “chatbots”, “for”, “treating”, “stress”, “.”]
Document 2: [“We”, “can”, “use”, “NLP”, “to”, “create”, “chatbots”, “and”, “we”, “will”, “be”, “making”, “health”, “chatbots”, “now”, “!”]
Document 3: [“Health”, “Chatbots”, “cannot”, “replace”, “human”, “counsellors”, “now”, “.”, “Yay”, “><”, “II”, “@inteLA14Y”]
Stopwords Removal Remove common stopwords from each document.
Document 1: [“use”, “health”, “chatbots”, “treating”, “stress”]
Document 2: [“use”, “NLP”, “create”, “chatbots”, “making”, “health”, “chatbots”]
Document 3: [“Health”, “Chatbots”, “replace”, “human”, “counsellors”, “Yay”, “><”, “II”, “@inteLA14Y”]
Lowercase Conversion Convert all tokens to lowercase.
Document 1: [“use”, “health”, “chatbots”, “treating”, “stress”]
Document 2: [“use”, “nlp”, “create”, “chatbots”, “making”, “health”, “chatbots”]
Document 3: [“health”, “chatbots”, “replace”, “human”, “counsellors”, “yay”, “><“, “ii”, “@intela14y”]
Stemming Apply stemming to each word to reduce them to their root forms.
Document 1: [“use”, “health”, “chatbot”, “treat”, “stress”]
Document 2: [“use”, “nlp”, “creat”, “chatbot”, “make”, “health”, “chatbot”]
Document 3: [“health”, “chatbot”, “replac”, “human”, “counsel”, “yay”, “><“, “ii”, “@intela14y”]
Lemmatisation Apply lemmatization to each word to convert them to their base form.
Document 1: [“use”, “health”, “chatbot”, “treat”, “stress”]
Document 2: [“use”, “nlp”, “create”, “chatbot”, “make”, “health”, “chatbot”]
Document 3: [“health”, “chatbot”, “replace”, “human”, “counsellor”, “yay”, “><“, “ii”, “@intela14y”]
Bag of Words Create a document vector table for all documents.
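The document vector table and TF-IDF values asked for in challenges 7-10 can be produced with any of the online tools above, or with a few lines of code. Below is a minimal sketch assuming scikit-learn is available; the exact numbers depend on the tokeniser, the stopword list and the IDF smoothing a tool uses, and the noisy trailing token of Document 3 is left out here for simplicity.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "We can use health chatbots for treating stress",
    "We can use NLP to create chatbots and we will be making health chatbots now!",
    "Health chatbots cannot replace human counsellors now.",
]

# Challenge 7: Bag of Words -- document vector table (word counts per document)
bow = CountVectorizer(stop_words="english")
counts = bow.fit_transform(docs)
print(bow.get_feature_names_out())  # vocabulary after stopword removal
print(counts.toarray())             # one row of counts per document

# Challenges 8-10: TF-IDF value of every word in every document
tfidf = TfidfVectorizer(stop_words="english")
scores = tfidf.fit_transform(docs).toarray()
for i, row in enumerate(scores, start=1):
    print(f"Document {i}:", dict(zip(tfidf.get_feature_names_out(), row.round(3))))
```

Words that occur in only one document (such as “stress”, “treating”, “nlp” or “counsellors”) come out with the highest TF-IDF values, while words that occur in every document (“health”, “chatbots”) get the lowest.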
Natural Language Processing Class 10 Very Short Type Answer Questions
Question 1.
What is NLP?
Answer:
NLP is a field of artificial intelligence focused on the interaction between computers and humans through natural language.
Question 2.
Name any two commonly used applications of NLP.
Answer:
Sentiment analysis, language translation
Question 3.
Name the process of dividing whole corpus into sentences.
Answer:
Sentence segmentation
Question 4.
Identify the given chat bot type:
It learns from its environment and experience. It also builds on its capabilities based on the knowledge. These can collaborate with humans, working alongside them and learning their behaviour.
Answer:
Smart bot
Question 5.
What NLP stands for?
Answer:
Natural Language Processing
Question 6.
What is a chatbot?
Answer:
Chatbots are computer programs designed to simulate conversation with human users.
Question 7.
Ayushi was learning about NLP. She wanted to know the term used for the whole textual data from all the documents altogether. Help her in identifying the term used for it.
Answer:
Corpus
Question 8.
What is the full form of TF-IDF?
Answer:
Term Frequency Inverse Document Frequency
Question 9.
Identify the type of chatbot with the information given below
These bots work on pre-programmed instructions inside the application/machine and are generally easy to develop. They are deployed in the customer care section of various companies. Their job is to answer some basic queries that they are coded for and connect them to human executives once they are unable to handle the conversation.
Answer:
Script bot
Question 10.
What will be the output of the word “studies” if we do the following:
(a) Lemmatization
(b) Stemming
Answer:
(a) The output of the word after lemmatization will be study.
(b) The output of the word after stemming will be studi (stemming simply removes the suffix, so the result may not be a valid dictionary word).
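A quick way to verify this difference (a minimal sketch, assuming NLTK and its WordNet data are installed):

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer
# The WordNet data may need a one-time download: nltk.download('wordnet')

print(PorterStemmer().stem("studies"))           # studi  (stemming)
print(WordNetLemmatizer().lemmatize("studies"))  # study  (lemmatisation)
```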
Question 11.
How many tokens are there in the sentence given below?
Traffic Jams have become a common part of our lives nowadays. Living in an urban area means you have to face traffic each and every time you get out on the road. Mostly, school students opt for buses to go to school.
Answer:
46 tokens are there in the given sentence.
Question 12.
Identify any 2 stopwords in the given sentence:
Pollution is the introduction of contaminants into the natural environment that cause adverse change. The three types of pollution are air pollution, water pollution and land pollution.
Answer:
Stopwords in the given sentence include: is, the, of, that, into, are, and (any two may be mentioned).
Natural Language Processing Class 10 Short Type Answer Questions
Question 1.
What is Tokenisation? Count how many tokens are present in the following statement:
I find that the harder I work, the more luck I seem to have.
Answer:
Tokenisation is the process of breaking down text into individual words or tokens. In the given statement, there are 14 tokens: “I”, “find”, “that”, “the”, “harder”, “I”, “work”, “the”, “more”, “luck”, “I”, “seem”, “to”, “have”.
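A rough way to reproduce this count (a minimal sketch that counts only word tokens, as in the answer above; a tokenizer that also emits punctuation, such as NLTK's word_tokenize, would additionally count the comma and the full stop):

```python
import re

sentence = "I find that the harder I work, the more luck I seem to have."
tokens = re.findall(r"[A-Za-z]+", sentence)  # word tokens only, punctuation ignored
print(tokens)
print(len(tokens))  # 14
```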
Question 2.
Kaira, a beginner in the field of NLP is trying to understand the process of stemming. Help her in filling up the following table by suggesting appropriate affixes and stem of the words mentioned there:
Answer:
Question 3.
Explain the following picture which depicts one of the processes on NLP. Also mention the purpose which will be achieved by this process.
Answer:
The process depicted is case normalisation, where all letters are converted to lowercase for consistency. The purpose achieved by this process is to ensure that text processing algorithms treat words with different capitalisations as equivalent, thus improving accuracy in tasks like word frequency counting or search.
Question 4.
Identify any two stop words which should not be removed from the given sentence and why?
Get help and support whether you’re shopping now or need help with a past purchase.
Contact us at abc@pwershel.com or on our website www.pwershel.com
Answer:
Two stopwords in the given sentence which should not be removed are:
@ (at), . (full stop)
In the above sentence, these tokens are part of the email ID and website address; removing them would make the email ID and website address invalid, so they should not be removed.
Question 5.
What will be the results of conversion of the term, ‘happily’ in the process of stemming and lemmatisation? Which process takes longer time for execution?
Answer:
In stemming, the term “happily” would typically be reduced to a root form such as “happili”. Stemming operates by removing suffixes or prefixes to reduce a word to its base or root form, so the result may not be a valid dictionary word.
In lemmatisation, however, the term “happily” would be converted to its lemma, which is the base or dictionary form of the word. In this case, “happily” would likely be lemmatised to “happy.”
In terms of execution time, lemmatisation generally takes longer than stemming.
Question 6.
What do we get from the “bag of words” algorithm?
Answer:
Bag of words gives us two things:
1. A vocabulary of words for the corpus
2. The frequency of these words (the number of times each word has occurred in the whole corpus)
Question 7.
Write any two applications of TF-IDF.
Or
With reference to data processing, expand the term TF-IDF. Also give any two applications of TF-IDF.
Answer:
TF-IDF – Term Frequency – Inverse Document Frequency
1. Document Classification
Helps in classifying the type and genre of a document.
2. Topic Modelling
It helps in predicting the topic for a corpus.
Question 8.
Write down the steps to implement bag of words algorithm.
Answer:
The steps to implement the bag of words algorithm are as follows (a minimal code sketch is given after the list):
- Text Normalisation Collect data and pre-process it
- Create Dictionary Make a list of all the unique words occurring in the corpus. (Vocabulary)
- Create document vectors For each document in the corpus, find out how many times the word from the unique list of words has occurred.
- Create document vectors for all the documents.
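A minimal sketch of these steps in plain Python, using for illustration the small three-document corpus that appears in the long-answer section below (stopword removal is skipped here for brevity):

```python
documents = [
    "Sahil likes to play cricket",
    "Sajal likes cricket too",
    "Sajal also likes to play basketball",
]

# Step 1: text normalisation -- lowercase and tokenise each document
tokenised = [doc.lower().split() for doc in documents]

# Step 2: create the dictionary (vocabulary) of unique words in the corpus
vocabulary = sorted({word for doc in tokenised for word in doc})
print(vocabulary)

# Steps 3-4: create a document vector for every document
for doc in tokenised:
    print([doc.count(word) for word in vocabulary])
```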
Question 9.
What are the primary differences between Script-bots and Smart-bots?
Answer:
Script bots are pre-programmed with a specific use case in mind. They cannot functionally ‘learn’ or change. Smart bots leverage AI to learn from data and their environment, which lets them adapt and perform a wide array of complex tasks.
Question 10.
Define Chatbot. What are its types?
Answer:
A chatbot is a computer program designed to simulate conversation with human users. Chatbots are broadly of two types: Script bots, which work on pre-programmed instructions for a specific use case, and Smart bots, which use AI to learn from data and their environment and can handle more flexible, complex conversations.
Natural Language Processing Class 10 Long Answer Type Questions
Question 1.
Consider the text of the following documents:
Document 1 Sahil likes to play cricket
Document 2 Sajal likes cricket too
Document 3 Sajal also likes to play basketball
Apply all the four steps of Bag of words model of NLP on the above given documents and generate the output.
Answer:
Step 1: Tokenization
Document 1 [Sahil, likes, to, play, cricket]
Document 2 [Sajal, likes, cricket, too]
Document 3 [Sajal, also, likes, to, play, basketball]
Step 2 Vocabulary Creation
Vocabulary: [Sahil, likes, to, play, cricket, Sajal, too, also, basketball]
Step 3 Document-Term Matrix (DTM) Construction

| Document | Sahil | likes | to | play | cricket | Sajal | too | also | basketball |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Document 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| Document 2 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
| Document 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 |
Step 4 Term Frequency-Inverse Document Frequency (TF-IDF) Calculation
TF-IDF values for each term in each document can be calculated using the term frequency and inverse document frequency formulas, which provide a measure of the importance of each term in each document relative to the entire corpus.
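For instance, taking TF as (occurrences of the term in the document ÷ total terms in the document) and IDF as log10(total documents ÷ documents containing the term): “likes” appears in all three documents, so IDF(“likes”) = log(3/3) = 0 and its TF-IDF is 0 in every document, whereas “basketball” appears only in Document 3 (1 of its 6 words), giving TF-IDF(“basketball”, Document 3) = (1/6) × log(3/1) ≈ 0.167 × 0.477 ≈ 0.08. Rare terms therefore score higher than terms common to the whole corpus (exact values vary with the logarithm base and any smoothing a particular tool applies).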
Question 2.
With reference to NLP, explain the following terms in details with the help of suitable example:
(i) Term Frequency
(ii) Inverse Document Frequency
Answer:
(i) Term Frequency Term Frequency (TF) is a measure used in natural language processing to quantify the frequency of a term (word) in a document relative to the total number of terms in that document. It indicates how often a particular term appears within a document.
Example Suppose we have a document containing 100 words, and the word “apple” appears 5 times in that document. The term frequency of “apple” in this document would be:
TF(“apple”, document) = 5 / 100 = 0.05
(ii) Inverse Document Frequency Inverse Document Frequency (IDF) is a measure used to determine the importance of a term in a collection of documents. It quantifies how rare or common a term is across all documents in the corpus. Terms that occur frequently in many documents are considered less important, while terms that occur rarely in few documents are considered more important.
Example: Suppose we have a corpus containing 1,000 documents, and the term “apple” appears in 100 of these documents. The inverse document frequency of “apple” would be:
IDF (“apple”) = log (1000 / 100) = log (10) = 1
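Combining the two measures from parts (i) and (ii), the score of the term is TF-IDF(“apple”, document) = TF × IDF = 0.05 × 1 = 0.05; a higher product indicates a term that is frequent within a document yet rare across the corpus.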
Question 3.
We, human beings, can read, write and understand many languages. But computers can understand only machine language. Do you think we might face any challenges if we try to teach computers how to understand and interact in human languages? Explain.
Answer:
Yes, we might face several challenges if we try to teach computers how to understand and interact in human languages.
The possible difficulties are
1. Arrangement of the words and meaning: The computer has to identify the different parts of speech. It may also be extremely difficult for a computer to understand the meaning behind the language we use.
2. Multiple meanings of a word: The same word can be used in a number of different ways, and the context of the statement can change its meaning completely.
3. Perfect syntax, no meaning: Sometimes a statement can have a perfectly correct syntax but not mean anything. For example, take a look at this statement:
Chickens feed extravagantly while the moon drinks tea. This statement is grammatically correct, but does it make any sense? In human language, a perfect balance of syntax and semantics is important for better understanding.
Question 4.
Samiksha, a student of class X was exploring the Natural Language Processing domain. She got stuck while performing the text normalisation. Help her to normalise the text on the segmented sentences given below:
Document 1: Akash and Ajay are best friends.
Document 2: Akash likes to play football but Ajay prefers to play online games.
Answer:
1. Tokenisation
Document 1: Akash, and, Ajay, are, best, friends
Document 2: Akash, likes, to, play, football, but, Ajay, prefers, to, play, online, games
2. Removal of stopwords
Document 1: Akash, Ajay, best, friends
Document 2: Akash, likes, play, football, Ajay, prefers, play, online, games
3. Converting text to a common case
Document 1: akash, ajay, best, friends
Document 2: akash, likes, play, football, ajay, prefers, play, online, games
4. Stemming/Lemmatisation
Document 1: akash, ajay, best, friend
Document 2: akash, like, play, football, ajay, prefer, play, online, game
Question 5.
Consider the following two documents:
Document 1 ML and DL are part of AI.
Document 2 DL is a subset of ML.
Implement all four steps of the Bag of Words (BoW) model to create a document vector table. Depict the outcome of each step.
Answer:
Four steps of the Bag of Words (BoW) model for the given documents:
Step 1 Tokenization
Tokenization involves breaking the text into individual words or tokens.
For Document 1:
Tokens: [“ML”, “and”, “DL”, “are”, “part”, “of”, “AI”]
For Document 2:
Tokens: [“DL”, “is”, “a”, “subset”, “of”, “ML”]
Step 2 Lowercasing
Lowercasing converts all words to lowercase to ensure consistency.
For Document 1:
Tokens: [“ml”, “and”, “dl”, “are”, “part”, “of”, “ai”]
For Document 2:
Tokens: [“dl”, “is”, “a”, “subset”, “of”, “ml”]
Step 3 Removing Stopwords (Optional)
Stopwords are common words like “and”, “is”, “a”, etc., that may not contribute much to the meaning of the text. This step is optional and can be skipped depending on the application.
For Document 1:
Tokens: [“ml”, “dl”, “part”, “ai”]
For Document 2:
Tokens: [“dl”, “subset”, “ml”]
Step 4 Creating Document Vectors
Document vectors represent the frequency of each word in the document.
For Document 1:
Document Vector: {“ml”: 1, “dl”: 1, “part”: 1, “ai”: 1}
For Document 2:
Document Vector: {“dl”: 1, “subset”: 1, “ml”: 1}
Document Vector Table
| Word | Document 1 | Document 2 |
| --- | --- | --- |
| ml | 1 | 1 |
| dl | 1 | 1 |
| part | 1 | 0 |
| ai | 1 | 0 |
| subset | 0 | 1 |
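As a cross-check, the same table can be reproduced with a few lines of code (a minimal sketch assuming scikit-learn; the stopword list is passed explicitly here so that it matches exactly the words removed in Step 3):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["ML and DL are part of AI.", "DL is a subset of ML."]
# Explicit stopword list chosen to match Step 3 above (an assumption, not a standard list)
vectorizer = CountVectorizer(stop_words=["and", "are", "of", "is", "a"])
matrix = vectorizer.fit_transform(docs)

print(vectorizer.get_feature_names_out())  # ['ai' 'dl' 'ml' 'part' 'subset']
print(matrix.toarray())                    # [[1 1 1 1 0]
                                           #  [0 1 1 0 1]]
```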
Question 6.
Create a document vector table from the following documents by implementing all the four steps of Bag of words model. Also depict the outcome of each step.
Document 1: Sameera and Sanya are classmates.
Document 2: Sameera likes dancing but Sanya loves to study mathematics.
Answer:
Steps of the Bag of Words (BoW) model for the given documents:
Step 1 Tokenization
Tokenization involves breaking the text into individual words or tokens.
For Document 1:
Tokens: [“Sameera”, “and”, “Sanya”, “are”, “classmates”]
For Document 2:
Tokens: [“Sameera”, “likes”, “dancing”, “but”, “Sanya”, “loves”, “to”, “study”, “mathematics”]
Step 2 Lowercasing
Lowercasing converts all words to lowercase to ensure consistency.
For Document 1:
Tokens: [“sameera”, “and”, “sanya”, “are”, “classmates”]
For Document 2:
Tokens: [“sameera”, “likes”, “dancing”, “but”, “sanya”, “loves”, “to”, “study”, “mathematics”]
Step 3 Removing Stopwords (Optional)
Stopwords are common words like “and”, “are”, “to”, etc., that may not contribute much to the meaning of the text. This step is optional and can be skipped depending on the application.
For Document 1:
Tokens: [“sameera”, “sanya”, “classmates”]
For Document 2:
Tokens: [“sameera”, “likes”, “dancing”, “sanya”, “loves”, “study”, “mathematics”]
Step 4 Creating Document Vectors
Document vectors represent the frequency of each word in the document.
For Document 1:
Document Vector: {“sameera”: 1, “sanya”: 1, “classmates”: 1}
For Document 2:
Document Vector: {“sameera”: 1, “likes”: 1, “dancing”: 1, “sanya”: 1, “loves”: 1, “study”: 1, “mathematics”: 1}
Document Vector Table
| Word | Document 1 | Document 2 |
| --- | --- | --- |
| sameera | 1 | 1 |
| sanya | 1 | 1 |
| classmates | 1 | 0 |
| likes | 0 | 1 |
| dancing | 0 | 1 |
| loves | 0 | 1 |
| study | 0 | 1 |
| mathematics | 0 | 1 |
This table shows the frequency of each word (token) in each document. The BoW model represents each document as a vector where each element corresponds to the frequency of a word in the document.