Previous Year Exam Questions for Natural Language Processing - NLP of 2018 - CEC by Bput Toppers

  • Natural Language Processing - NLP
  • 2018
  • PYQ
  • Biju Patnaik University of Technology (BPUT)
  • Information Technology Engineering
  • B.Tech


Registration No :
Total Number of Pages : 03
B.Tech.
PIT6J004
6th Semester Regular Examination 2017-18
NATURAL LANGUAGE PROCESSING
BRANCH : IT
Time : 3 Hours
Max Marks : 100
Q.CODE : C370
Answer Part-A, which is compulsory, and any four questions from Part-B.
The figures in the right-hand margin indicate marks.

Part – A (Answer all the questions)

Q1  Answer the following questions (multiple choice or fill-in-the-blank type): (2 x 10)

a)  N-grams are defined as combinations of N keywords taken together. How many bi-grams can be generated from the sentence "Analytics Sandhya is a great source to learn data science"?
    (A) 7  (B) 8  (C) 9  (D) 10  (E) 11

b)  What is the right order of the components of a text classification model?
    1. Text cleaning
    2. Text annotation
    3. Gradient descent
    4. Model tuning
    5. Text to predictors
    (A) 1-2-3-4-5  (B) 1-3-4-2-5  (C) 1-2-5-3-4  (D) 1-3-4-5-2

c)  Which of the following techniques is not a part of flexible text matching?
    (A) Soundex  (B) Metaphone  (C) Edit Distance  (D) Keyword Hashing

d)  While working with text data obtained from news sentences, which are structured in nature, which of the grammar-based text parsing techniques can be used for noun phrase detection, verb phrase detection, subject detection and object detection?
    (A) Part-of-speech tagging  (B) Dependency parsing and constituency parsing  (C) Skip-gram and N-gram extraction  (D) Continuous Bag of Words

e)  While creating a machine learning model on text data, you created a document-term matrix of the input data of 100K documents. Which of the following remedies can be used to reduce the dimensionality of the data?
    1. Latent Dirichlet Allocation
    2. Latent Semantic Indexing
    3. Keyword Normalization
    (A) Only 1  (B) 2 and 3  (C) 1 and 3  (D) 1, 2 and 3

f)  While working on context extraction from text data, you encountered two different sentences: "The tank is full of soldiers." and "The tank is full of nitrogen." Which of the following measures can be used to resolve the word sense ambiguity in these sentences?
    (A) Compare the dictionary definition of the ambiguous word with the terms contained in its neighborhood
    (B) Co-reference resolution, in which one resolves the meaning of the ambiguous word using the proper noun present in the previous sentence
    (C) Use dependency parsing of the sentence to understand the meanings
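For Q1(a), the answer can be checked mechanically: a sentence of n tokens yields n-1 consecutive bi-grams. The following is a minimal Python sketch, assuming plain whitespace tokenization (the question does not name a tokenizer):

    # Count bi-grams in the Q1(a) sentence.
    # Assumption: tokens are produced by whitespace splitting.
    sentence = "Analytics Sandhya is a great source to learn data science"
    tokens = sentence.split()                # 10 tokens
    bigrams = list(zip(tokens, tokens[1:]))  # consecutive token pairs
    print(len(bigrams))                      # 9, i.e. option (C)

With 10 tokens the sketch prints 9, matching option (C).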


g)  Retrieval-based models and generative models are the two popular techniques used for building chatbots. Which of the following is an example of a retrieval model and a generative model, respectively?
    (A) Dictionary-based learning and Word2Vec model  (B) Rule-based learning and Sequence-to-Sequence model  (C) Word2Vec and Sentence-to-Vector model  (D) Recurrent neural network and convolutional neural network

h)  What is the major difference between a CRF (Conditional Random Field) and an HMM (Hidden Markov Model)?
    (A) CRF is a generative model whereas HMM is a discriminative model
    (B) CRF is a discriminative model whereas HMM is a generative model
    (C) Both CRF and HMM are generative models
    (D) Both CRF and HMM are discriminative models

i)  You have created a document-term matrix of the data, treating every tweet as one document. Which of the following is correct with regard to the document-term matrix?
    1. Removal of stopwords from the data will affect the dimensionality of the data
    2. Normalization of words in the data will reduce the dimensionality of the data
    3. Converting all the words to lowercase will not affect the dimensionality of the data
    (A) Only 1  (B) Only 2  (C) Only 3  (D) 1 and 2  (E) 2 and 3  (F) 1, 2 and 3

j)  Which of the following features can be used to improve the accuracy of a classification model?
    (A) Frequency count of terms  (B) Vector notation of the sentence  (C) Part-of-speech tag  (D) Dependency grammar  (E) All of these

Q2  Answer the following questions (short answer type): (2 x 10)

a)  A sentence can easily have more than one parse tree that is consistent with a given CFG. How do PCFGs and non-probabilistic CFGs differ in their handling of parsing ambiguity?

b)  What is the difference between phrase-level and sentence-level construction?

c)  Draw the top-ranked parse tree for the sentence "Ask the grandma with scissors." by applying the PCFG below. Does the result seem reasonable to you? Why or why not?

    Production rule                                                    Probability
    S → VP                                                             1.0
    VP → Verb NP                                                       0.7
    VP → Verb NP PP                                                    0.3
    NP → NP PP                                                         0.3
    NP → Det Noun                                                      0.7
    PP → Prep Noun                                                     1.0
    Det → the                                                          0.1
    Verb → Cut | Ask | Find | ...                                      0.1
    Prep → with | in | ...                                             0.1
    Noun → envelope | grandma | scissors | men | suits | summer | ...  0.1

d)  Describe why production rules with zero probability are problematic.

e)  Describe one method to avoid zero probabilities for lexicalized PCFGs.

f)  4-grams are better than trigrams for part-of-speech tagging. Is this true or false? Explain your answer.

g)  Noun phrase coreference resolution includes pronoun resolution, proper noun resolution, and common noun resolution. Which of the three would you expect to be the most difficult to handle computationally? Explain why.

h)  Information extraction is harder than text categorization. Is this true or false? Explain your answer.
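For Q2(c), the two candidate parses differ only in where the PP attaches, so the top-ranked tree can be found by multiplying the rule probabilities along each tree. The Python sketch below is hand-worked arithmetic under the given PCFG, not a parser; both tree structures are written out manually:

    # Score the two parses of "Ask the grandma with scissors."
    # under the PCFG of Q2(c).
    lexical = 0.1 ** 5  # Ask, the, grandma, with, scissors: probability 0.1 each

    # Parse A: PP attaches to the VP ("ask ... with scissors").
    # S -> VP (1.0), VP -> Verb NP PP (0.3),
    # NP -> Det Noun (0.7), PP -> Prep Noun (1.0)
    p_vp_attach = 1.0 * 0.3 * 0.7 * 1.0 * lexical  # 2.1e-06

    # Parse B: PP attaches to the NP ("the grandma with scissors").
    # S -> VP (1.0), VP -> Verb NP (0.7), NP -> NP PP (0.3),
    # NP -> Det Noun (0.7), PP -> Prep Noun (1.0)
    p_np_attach = 1.0 * 0.7 * 0.3 * 0.7 * 1.0 * lexical  # 1.47e-06

    print(p_vp_attach > p_np_attach)  # True: VP attachment is ranked higher

Since the same five lexical probabilities occur in both trees, the comparison reduces to 0.3 x 0.7 versus 0.7 x 0.3 x 0.7, so this PCFG ranks the VP attachment (scissors as the instrument of asking) above the NP attachment (the grandma who has scissors). Whether that matches the intuitive reading is exactly what the "does the result seem reasonable" part of the question probes.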


i)  State two advantages of partial parsers over parsers that provide in-depth syntactic information.

j)  What is the difference between phrase-based and feature-based NLP?

Part – B (Answer any four questions)

Q3 a)  Explain the issues in computational morphology with suitable examples. (10)
   b)  Discuss the applications of natural language processing. (5)

Q4 a)  Given the grammar and lexicon below, show the final chart for the sentence "Find the men in suits." after applying the bottom-up chart parser. Remember that the final chart contains all edges added during the parsing process. You may use either the notation from class (i.e. nodes/links) or the notation from the book to depict the chart. (10)

       S → VP
       VP → Verb NP
       NP → NP PP
       NP → Det Noun
       PP → Prep Noun
       Det → the
       Verb → Find
       Prep → in
       Noun → men | suits

   b)  Give an account of the CYK parser. (5)

Q5 a)  Discuss language as a rule-based system. (10)
   b)  Discuss stochastic part-of-speech tagging. (5)

Q6 a)  Write an algorithm for a simple top-down parser, with an example. (10)
   b)  Explain the five verb forms. (5)

Q7 a)  Describe the unification method with suitable examples. (10)
   b)  Explain how unification is implemented. (5)

Q8 a)  Between the words "eat" and "find", which would you expect to be more effective for selectional-restriction-based sense disambiguation? (10)
   b)  Explain the applications of semantics. (5)

Q9 a)  Consider the following article for this problem: (10)

       "I bought my wireless keyboard/mouse set several months ago, and, like a lot of new products, it has some unanticipated issues. On the plus side, obviously, is the styling. The design is fresh, clean, and interesting. The keyboard can tilt at different angles, which was important because I had some difficulty typing with it flat. The bluetooth receiver in the charger was functional, and I appreciated having a bluetooth hub for my cellphone. The mouse and the keyboard have both proved durable and reliable despite a number of mishaps."

       For each of the inferences (a) through (d) below:
       1. State whether the inference depends on the discourse context, knowledge about actions, and/or general world knowledge; and
       2. Describe what natural language processing techniques, if any, might enable a system to make the inference automatically.

       (a) The reviewer owns the keyboard.
       (b) The charger is part of the keyboard.
       (c) The reviewer had difficulty typing with the keyboard.
       (d) The reviewer likes the keyboard.

   b)  What is text summarization? Explain with an example. (5)
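Q4(b) asks for an account of the CYK parser, which fills a triangular chart bottom-up over a grammar in Chomsky Normal Form. Below is a minimal Python recognizer sketch for the Q4 grammar, under one stated assumption: the unary rule S → VP is folded into the binary entry for (Verb, NP) so that every rule is binary or lexical:

    from itertools import product

    # Grammar of Q4 in CNF; the fold of S -> VP into (Verb, NP)
    # is an assumption made to fit CYK's binary-rule requirement.
    lexicon = {"find": {"Verb"}, "the": {"Det"}, "in": {"Prep"},
               "men": {"Noun"}, "suits": {"Noun"}}
    binary = {("Verb", "NP"): {"VP", "S"},  # VP -> Verb NP, plus folded S -> VP
              ("NP", "PP"): {"NP"},         # NP -> NP PP
              ("Det", "Noun"): {"NP"},      # NP -> Det Noun
              ("Prep", "Noun"): {"PP"}}     # PP -> Prep Noun

    def cyk(words):
        n = len(words)
        # chart[i][j] holds the nonterminals that span words i..j-1
        chart = [[set() for _ in range(n + 1)] for _ in range(n)]
        for i, w in enumerate(words):
            chart[i][i + 1] = set(lexicon[w.lower()])
        for span in range(2, n + 1):          # widen spans bottom-up
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):     # try every split point
                    for b, c in product(chart[i][k], chart[k][j]):
                        chart[i][j] |= binary.get((b, c), set())
        return chart

    chart = cyk("Find the men in suits".split())
    print("S" in chart[0][5])  # True: the sentence is derivable

Running the sketch places both S and VP in the full-span cell, so the sentence is derivable. A bottom-up chart parser, which Q4(a) asks about, additionally records every edge (complete and partial constituent) added along the way, rather than only the completed spans.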
