Author: Nathan Schneider
This is a pen-and-paper assignment to give you practice with syntactic formalisms and parsing algorithms. The breakdown into 60 points is shown. (Your score will be scaled so that it is worth the same amount as other homework assignments.)
Warning: Allow plenty of time for the last 3 parts!
Questions will concern the sentence:
What will you purchase me for my birthday in July?
[1 pt] What is the sentence type: declarative, imperative, interrogative, or interjection?
[2 pts] Identify two words in the sentence that are ambiguous with respect to part-of-speech (POS). Explain.
[2 pts] Assuming the contextually correct POS tags, identify a structural ambiguity in the sentence.
[5 pts] Assign each word (ignoring punctuation) its contextually correct POS tag from the following list:
ADJ
: adjective
ADV
: adverb
AUX
: auxiliary
CONJ
: conjunction
DET
: determiner
NOUN
: common noun
NUM
: number
PREP
: preposition
PROPN
: proper noun
PRON:INDEF
: indefinite pronoun
PRON:PERS
: personal pronoun
PRON:POSS
: possessive pronoun
PRON:WH
: wh pronoun
VERB
: verb
[10 pts] Draw two dependency trees: one for each way the structural ambiguity can be disambiguated. Use dependency relations from the following list (ignore punctuation), and use content heads:
adjmod
: adjective modifier
advmod
: adverb modifier
aux
: auxiliary
det
: determiner
dobj
: direct object
iobj
: indirect object
nmod
: noun modifier
nummod
: number modifier
pobj
: object of preposition
possmod
: possessive modifier
prep
: preposition
root
: root
subj
: subject
[10 pts] For each dependency tree, draw a corresponding constituency (phrase structure) tree.
Include parts of speech as preterminal categories. Use the following phrasal categories (ignore punctuation):
S
: sentence
NP
: noun phrase
VP
: verb phrase
PP
: prepositional phrase
Hint 1: Neither the subject nor anything before it should be in a VP.
Hint 2: Apart from the terminals, there should be 4 unary branching structures.
[4 pts] Give a small context-free grammar that allows the two constituency trees. You do not have to write out rules for the lexicon entries.
Hint: My solution has 9 rules.
[1 pt] Assuming a finite lexicon, does your CFG describe a finite or infinite string language?
[1 pt] Give one example where your CFG overgenerates, i.e., allows a sentence that is not really grammatical English.
[4 pts] Binarize the grammar so it is in Chomsky-Normal Form (CNF).
Hint: There are multiple valid ways to do this. My solution has 23 binary rules, which contain only nonterminals (phrasal categories and POS tags) on the RHS.
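As a reminder of one mechanical step in binarization, here is a minimal Python sketch that splits rules with more than two right-hand-side symbols into chains of binary rules. It is only an illustration under stated assumptions: the rules use placeholder symbols A through E rather than the grammar for this assignment, the fresh nonterminal names X1, X2 are a hypothetical naming scheme, and the sketch does not handle unary rules, which full conversion to CNF must also eliminate.

```python
def binarize(rules):
    """Split every rule with more than two RHS symbols into binary rules.

    rules: list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    Rules with one or two RHS symbols are copied unchanged.
    """
    out = []
    fresh = 0  # counter for new intermediate nonterminals (hypothetical names)
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            fresh += 1
            new_sym = f"X{fresh}"
            # Keep the first symbol; push the rest under a new nonterminal.
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, rhs))
    return out


# Placeholder rules, not the assignment grammar.
toy_rules = [("A", ("B", "C", "D", "E")), ("A", ("B", "C"))]
binary_rules = binarize(toy_rules)
for lhs, rhs in binary_rules:
    print(lhs, "->", " ".join(rhs))
```

This sketch binarizes to the right (the new nonterminal covers the tail of the rule); binarizing to the left is equally valid, which is one reason solutions can differ.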
[10 pts] Use the CNF CFG to fill out the CKY chart for constituency parsing the sentence. A template is provided. Assume the correct POS tags are already determined. (Ignore punctuation.) Ensure both valid parses are encoded in the chart. There should also be some possible constituents that do not lead to a full parse.
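To recall how the chart is filled, the following is a minimal sketch of CKY over a sequence of POS tags, assuming a grammar that contains only binary rules. The three-rule toy grammar and the five-tag input are illustrative assumptions and are not the grammar or sentence for this assignment; the function name `cky_chart` is likewise hypothetical. Each cell (start, end) holds the set of categories that can span those positions.

```python
from collections import defaultdict


def cky_chart(tags, binary_rules):
    """Fill a CKY chart bottom-up; chart[i, j] is the set of categories spanning i..j."""
    n = len(tags)
    chart = defaultdict(set)
    # Width-1 cells: the POS tags are assumed to be given.
    for i, tag in enumerate(tags):
        chart[i, i + 1].add(tag)
    # Wider cells: try every split point and every binary rule.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for lhs, (left, right) in binary_rules:
                    if left in chart[start, mid] and right in chart[mid, end]:
                        chart[start, end].add(lhs)
    return chart


# Toy grammar and input (assumptions for illustration only).
toy_rules = [
    ("S", ("NP", "VP")),
    ("VP", ("VERB", "NP")),
    ("NP", ("DET", "NOUN")),
]
toy_tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
chart = cky_chart(toy_tags, toy_rules)
print(sorted(chart[0, len(toy_tags)]))
```

This version only records which categories fit in each cell. To encode multiple parses, as the question asks, each cell entry would also need to record the split point and child categories that produced it.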
[10 pts] For the first dependency parse you drew, write out the steps for the transition-based algorithm to yield the structure. (Ignore the edge labels and punctuation.) Show the stack, buffer, and relations columns, and use the transition operations:
S
: shift
LA
: left arc
RA
: right arc
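The three operations can be sketched in Python as follows. This is a minimal illustration that assumes the arc-standard variant: LA makes the top of the stack the head of the item beneath it, and RA makes the item beneath the top the head of the top. If the course uses a different variant, the definitions of LA and RA change accordingly. The three-word input and the function name `run_transitions` are hypothetical and unrelated to the assignment sentence.

```python
def run_transitions(words, actions):
    """Apply a sequence of S/LA/RA operations and return the final configuration."""
    stack = ["ROOT"]
    buffer = list(words)
    relations = []  # (head, dependent) pairs, unlabeled
    for action in actions:
        if action == "S":
            # Shift: move the next buffer word onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LA":
            # Left arc: top of stack heads the item beneath it, which is removed.
            dependent = stack.pop(-2)
            relations.append((stack[-1], dependent))
        elif action == "RA":
            # Right arc: item beneath the top heads the top, which is removed.
            dependent = stack.pop()
            relations.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown action: {action}")
        print(f"{action:<3} stack={stack} buffer={buffer} relations={relations}")
    return stack, buffer, relations


# Toy input (an assumption for illustration only).
stack, buffer, relations = run_transitions(
    ["she", "reads", "books"], ["S", "S", "LA", "S", "RA", "RA"]
)
```

Each printed line corresponds to one row of the stack, buffer, and relations columns. A parse is complete when the buffer is empty and only ROOT remains on the stack.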