Author: Nathan Schneider
This is a pen-and-paper assignment to give you practice with syntactic formalisms and parsing algorithms. Point values (60 in total) are shown in brackets before each question. (Your score will be scaled so that it is worth the same amount as other homework assignments.)
Warning: Allow plenty of time for the last 3 parts!
Questions will concern the sentence:
What will you purchase me for my birthday in July?
You may find it useful to consult the textbook's description of syntactic analysis of English (ignore the part about traces and movement).
[1 pt] What is the sentence type: declarative, imperative, interrogative, or interjection?
[2 pts] Identify two words in the sentence that are ambiguous with respect to part-of-speech (POS). Explain.
[2 pts] Assuming the contextually correct POS tags, identify a structural ambiguity in the sentence.
[5 pts] Assign the contextually correct POS to words (ignoring punctuation) from the following list:
ADJ: adjective
ADV: adverb
AUX: auxiliary
CONJ: conjunction
DET: determiner
NOUN: common noun
NUM: number
PREP: preposition
PROPN: proper noun
PRON:INDEF: indefinite pronoun
PRON:PERS: personal pronoun
PRON:POSS: possessive pronoun
PRON:WH: wh pronoun
VERB: verb
[10 pts] Draw two dependency trees: one for each way the structural ambiguity can be disambiguated. You can try running an automatic parser (e.g., CoreNLP parser). Note that an automatic parse may contain errors, so you should double-check the output and correct it if necessary. Also note that the corrected tree will correspond to one of the two trees you need for this problem, so you will have to draw the other tree by hand! Use dependency relations from the following list (ignore punctuation), and use content heads:
advmod: adverb modifier
amod: adjective modifier
aux: auxiliary
case: case marker (the preposition of a prepositional phrase, or a possessive ending of a noun phrase)
det: determiner
iobj: indirect object
nmod: noun modifier
nmod:poss: possessive modifier
nsubj: subject
nummod: number modifier
obj: direct object
obl: oblique case (nominal modifying a verb, adjective, or adverb, possibly with a preposition)
root: root
If you are interested, take a look at the Universal Dependencies guidelines.
[10 pts] For each dependency tree, draw a corresponding constituency (phrase structure) tree.
Include parts of speech as preterminal categories. Use the following phrasal categories (ignore punctuation):
S: sentence
NP: noun phrase
VP: verb phrase
PP: prepositional phrase
Hint 1: Neither the subject nor anything before it should be in a VP.
Hint 2: Apart from the terminals, there should be 4 unary branching structures.
[4 pts] Give a small context-free grammar that allows the two constituency trees. You do not have to write out rules for the lexicon entries.
Hint: My solution has 9 rules.
[1 pt] Assuming a finite lexicon, does your CFG describe a finite or infinite string language?
[1 pt] Give one example where your CFG overgenerates, i.e., allows a sentence that is not really grammatical English.
[4 pts] Binarize the grammar so it is in Chomsky Normal Form (CNF).
Hint: There are multiple valid ways to do this. My solution has 23 binary rules, which contain only nonterminals (phrasal categories and POS tags) on the RHS.
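To make the binarization step concrete, here is a minimal Python sketch on a toy grammar (not the grammar for this assignment). It only splits rules whose right-hand side has more than two symbols; full CNF conversion also requires collapsing unary rules between nonterminals, which this sketch does not attempt. The function name `binarize`, the `toy_rules` list, and the `VP|NP_PP` naming scheme for intermediate symbols are all illustrative assumptions.

```python
def binarize(rules):
    """Split every rule with more than two RHS symbols into binary rules.

    rules: list of (lhs, rhs_tuple) pairs.
    Returns a new list in which every RHS has at most two symbols.
    Intermediate symbols are named like 'VP|NP_PP' (a made-up convention).
    """
    out = []
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        cur = lhs
        while len(rhs) > 2:
            # Peel off the first symbol; a fresh symbol covers the rest.
            new_sym = f"{lhs}|{'_'.join(rhs[1:])}"
            out.append((cur, (rhs[0], new_sym)))
            cur, rhs = new_sym, rhs[1:]
        out.append((cur, rhs))
    return out


# Toy grammar for illustration only.
toy_rules = [
    ("S", ("NP", "VP")),
    ("VP", ("VERB", "NP", "PP")),
    ("NP", ("DET", "ADJ", "ADJ", "NOUN")),
]

binary_rules = binarize(toy_rules)
for lhs, rhs in binary_rules:
    print(lhs, "->", " ".join(rhs))
```

This is a right-branching binarization; peeling symbols off the right edge instead would be equally valid, which is one reason different solutions can end up with different rule counts.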
[10 pts] Use the CNF CFG to fill out the CKY chart for constituency parsing the sentence. A template is provided. Assume the correct POS tags are already determined. (Ignore punctuation.) Ensure both valid parses are encoded in the chart. There should also be some possible constituents that do not lead to a full parse.
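If you want to check your understanding of how the chart is filled, the following Python sketch runs CKY recognition over a POS-tag sequence with a toy CNF grammar (again, not the grammar for this assignment). The function name `cky_chart`, the `(i, j)` span indexing, and the toy rules are assumptions made for illustration; your hand-drawn chart may be laid out differently from this data structure.

```python
from collections import defaultdict


def cky_chart(tags, binary_rules):
    """Fill a CKY chart bottom-up.

    tags: list of POS tags, one per word (assumed already correct).
    binary_rules: list of (lhs, (b, c)) pairs, all binary.
    Returns a dict mapping spans (i, j) to the set of labels covering tags[i:j].
    """
    n = len(tags)
    chart = defaultdict(set)
    # Width-1 spans hold the POS tags themselves.
    for i, tag in enumerate(tags):
        chart[(i, i + 1)].add(tag)
    # Wider spans are built from every split point k of every span (i, j).
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, (b, c) in binary_rules:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        chart[(i, j)].add(lhs)
    return chart


# Toy example: "the dog saw a cat"
toy_tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
toy_cnf = [
    ("S", ("NP", "VP")),
    ("NP", ("DET", "NOUN")),
    ("VP", ("VERB", "NP")),
]

chart = cky_chart(toy_tags, toy_cnf)
print(sorted(chart[(0, len(toy_tags))]))
```

The sketch only records which labels cover each span. To recover the trees themselves (and to see an ambiguity as two analyses of the same cell), each entry would also need backpointers to the split point and the two child labels.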
[10 pts] For the first dependency parse you drew, write out the steps for the transition-based algorithm to yield the structure. (Ignore the edge labels and punctuation.) Show the stack, buffer, and relations columns, and use the transition operations:
S: shift
LA: left arc
RA: right arc
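As a way to check a hand-written derivation, the sketch below replays a sequence of S/LA/RA operations and records the stack, buffer, and relations after each step. It assumes the arc-standard convention found in common textbook presentations: the stack starts with ROOT, LA makes the top of the stack the head of the item beneath it, and RA makes the item beneath the top the head of the top. If the version covered in class differs, adjust accordingly. The function name `run_transitions` and the toy sentence are illustrative.

```python
def run_transitions(words, transitions):
    """Replay arc-standard transitions (assumed convention, see lead-in).

    Returns a list of (operation, stack, buffer, relations) snapshots,
    where relations are (head, dependent) pairs without edge labels.
    """
    stack = ["ROOT"]
    buffer = list(words)
    relations = []
    trace = []
    for op in transitions:
        if op == "S":
            # Shift: move the next buffer word onto the stack.
            stack.append(buffer.pop(0))
        elif op == "LA":
            # Left arc: top of stack heads the item beneath it.
            dependent = stack.pop(-2)
            relations.append((stack[-1], dependent))
        elif op == "RA":
            # Right arc: item beneath the top heads the top.
            dependent = stack.pop()
            relations.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown transition: {op}")
        trace.append((op, list(stack), list(buffer), list(relations)))
    return trace


# Toy example: "the dog barked"
toy_words = ["the", "dog", "barked"]
toy_ops = ["S", "S", "LA", "S", "LA", "RA"]

trace = run_transitions(toy_words, toy_ops)
for op, stack, buffer, relations in trace:
    print(f"{op:<3} stack={stack} buffer={buffer} relations={relations}")
```

A derivation is complete when the buffer is empty and only ROOT remains on the stack. Words are stored as plain strings here, so a sentence with a repeated word would need word indices instead.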