CitationSentimentClassifier/example.arff at master · detonator413/CitationSentimentClassifier · GitHub

@relation 'C:\\work\\data\\arff\\postwriteup\\aan3withauth.txt'

@attribute @@class@@ {n,p,o}
@attribute @@id@@ string
@attribute @@sentence@@ string
@attribute @@author@@ string
@attribute @@dependencies@@ string

@data
o,811,'This iterative optimiser , derived from a word disambiguation technique <CIT> , finds the nearest local maximum in the lexical cooccurrence network from each concept seed ',Yarowsky,'det_optimiser_This amod_optimiser_iterative nsubj_finds_optimiser partmod_optimiser_derived prep_derived_from det_<CIT>_a nn_<CIT>_word nn_<CIT>_disambiguation nn_<CIT>_technique pobj_from_<CIT> det_maximum_the amod_maximum_nearest amod_maximum_local dobj_finds_maximum prep_maximum_in det_network_the amod_network_lexical nn_network_cooccurrence pobj_in_network prep_network_from det_seed_each nn_seed_concept pobj_from_seed '
p,812,'1 Introduction <CIT> introduced minimum error rate training -LRB- MERT -RRB- for optimizing feature weights in statistical machine translation -LRB- SMT -RRB- models , and demonstrated that it produced higher translation quality scores than maximizing the conditional likelihood of a maximum entropy model using the same features ',Och,'num_<CIT>_1 nn_<CIT>_Introduction nsubj_introduced_<CIT> amod_training_minimum nn_training_error nn_training_rate dobj_introduced_training abbrev_training_MERT prep_training_for pcomp_for_optimizing nn_weights_feature dobj_optimizing_weights prep_optimizing_in amod_models_statistical nn_models_machine nn_models_translation nn_models_SMT pobj_in_models cc_introduced_and conj_introduced_demonstrated complm_produced_that nsubj_produced_it ccomp_demonstrated_produced amod_scores_higher nn_scores_translation nn_scores_quality dobj_produced_scores prep_scores_than pcomp_than_maximizing det_likelihood_the amod_likelihood_conditional dobj_maximizing_likelihood prep_likelihood_of det_model_a amod_model_maximum nn_model_entropy pobj_of_model partmod_model_using det_features_the amod_features_same dobj_using_features '
n,813,'<OTH> applied the parser of <CIT> developed for English , to Czech , and found thatthe performance wassubstantially lower when compared to the results for English ',Collins,'nsubj_applied_<OTH> det_parser_the dobj_applied_parser prep_parser_of pobj_of_<CIT> partmod_<CIT>_developed prep_developed_for pobj_for_English prep_applied_to pobj_to_Czech cc_applied_and conj_applied_found amod_performance_thatthe nsubj_compared_performance advmod_lower_wassubstantially dep_compared_lower advmod_compared_when xcomp_found_compared prep_compared_to det_results_the pobj_to_results prep_results_for pobj_for_English '
o,814,'This model is related to the averaged perceptron algorithm of <CIT> ',Collins,'det_model_This nsubjpass_related_model auxpass_related_is prep_related_to det_algorithm_the amod_algorithm_averaged nn_algorithm_perceptron pobj_to_algorithm prep_algorithm_of '
o,815,'a22 a14 is the sufficient statistic of a16 a14 Then , we can rewrite a2a24a3 a10a27 a42a7 a25 as : a5a7a6a9a8a11a10 a23 a3 a10 a7 a15 a27 a25a18a17a26a25 a12a28a27 a5a7a6a29a8a30a10 a23 a3 a10 a7 a15 a27 a25a18a17 3 Loss Functions for Label Sequences Given the theoretical advantages of discriminative models over generative models and the empirical support by <CIT> , and that CRFs are the state-of-the-art among discriminative models for label sequences , we chose CRFs as our model , and trained by optimizing various objective functions a31 a3 a10a36 a25 with respect to the corpus a36 The application of these models to the label sequence problems vary widely ',Klein,'amod_a14_a22 nsubj_statistic_a14 cop_statistic_is det_statistic_the amod_statistic_sufficient prep_statistic_of amod_a14_a16 pobj_of_a14 advmod_statistic_Then nsubj_rewrite_we aux_rewrite_can ccomp_statistic_rewrite amod_a25_a2a24a3 amod_a25_a10a27 nn_a25_a42a7 dobj_rewrite_a25 prep_rewrite_as amod_Functions_a5a7a6a9a8a11a10 nn_Functions_a23 amod_Functions_a3 amod_Functions_a10 nn_Functions_a7 amod_Functions_a15 amod_Functions_a27 nn_Functions_a25a18a17a26a25 amod_Functions_a12a28a27 amod_Functions_a5a7a6a29a8a30a10 nn_Functions_a23 amod_Functions_a3 amod_Functions_a10 nn_Functions_a7 amod_Functions_a15 amod_Functions_a27 nn_Functions_a25a18a17 nn_Functions_3 nn_Functions_Loss dep_as_Functions dep_Functions_for nn_Sequences_Label pobj_for_Sequences partmod_Sequences_Given det_advantages_the amod_advantages_theoretical dobj_Given_advantages prep_advantages_of amod_models_discriminative pobj_of_models prep_Given_over amod_models_generative pobj_over_models cc_models_and det_support_the amod_support_empirical conj_models_support prep_Given_by pobj_by_<CIT> cc_statistic_and dep_state-of-the-art_that nsubj_state-of-the-art_CRFs cop_state-of-the-art_are det_state-of-the-art_the ccomp_chose_state-of-the-art prep_state-of-the-art_among amod_models_discriminative pobj_among_models prep_models_for nn_sequences_label pobj_for_sequences nsubj_chose_we conj_statistic_chose dobj_chose_CRFs prep_chose_as poss_model_our pobj_as_model cc_chose_and conj_chose_trained prep_trained_by pcomp_by_optimizing amod_functions_various nn_functions_objective nsubj_a25_functions amod_a25_a31 amod_a25_a3 amod_a25_a10a36 xcomp_optimizing_a25 prep_a25_with pobj_with_respect prep_optimizing_to det_a36_the nn_a36_corpus pobj_to_a36 det_application_The dobj_optimizing_application prep_application_of det_models_these pobj_of_models prep_models_to det_label_the pobj_to_label nn_problems_sequence nsubj_vary_problems rcmod_label_vary advmod_vary_widely '
o,816,'Pustejovsky confronted with the problem of automatic acquisition more extensively in <OTH> ',Dunning,'advmod_confronted_Pustejovsky prep_confronted_with det_problem_the pobj_with_problem prep_problem_of amod_acquisition_automatic pobj_of_acquisition advmod_extensively_more dep_confronted_extensively prep_extensively_in pobj_in_<OTH> '
o,817,'PropBank encodes propositional information by adding a layer of argument structure annotation to the syntactic structures of the Penn Treebank <CIT> ',Marcus,'nsubj_encodes_PropBank amod_information_propositional dobj_encodes_information prep_encodes_by pcomp_by_adding det_layer_a dobj_adding_layer prep_layer_of nn_annotation_argument nn_annotation_structure pobj_of_annotation prep_adding_to det_structures_the amod_structures_syntactic pobj_to_structures prep_structures_of det_Treebank_the nn_Treebank_Penn pobj_of_Treebank '
o,818,'However , while similarity measures -LRB- such as WordNet distance or Lins similarity metric -RRB- only detect cases of semantic similarity , association measures -LRB- such as the ones used by Poesio et al , or by Garera and Yarowsky -RRB- also find cases of associative bridg497 Lin98 RFF TheY TheY : G2 PL03 Land -LRB- country\\/state\\/land -RRB- Staat Staat Kemalismus Regierung Kontinent state state Kemalism government continent Stadt Stadt Bauernfamilie Prasident Region city city agricultural family president region Region Landesregierung Bankgesellschaft Dollar Stadt region country government banking corporation dollar city Bundesrepublik Bundesregierung Baht Albanien Staat federal republic federal government Baht Albania state Republik Gewerkschaft Gasag Hauptstadt Bundesland republic trade union -LRB- a gas company -RRB- capital state Medikament -LRB- medical drug -RRB- Arzneimittel Pille RU Patient Arzneimittel pharmaceutical pill -LRB- a drug -RRB- patient pharmaceutical Praparat Droge Abtreibungspille Arzt Lebensmittel preparation drug -LRB- non-medical -RRB- abortion pill doctor foodstuff Pille Praparat Viagra Pille Praparat pill preparation Viagra pill preparation Hormon Pestizid Pharmakonzern Behandlung Behandlung hormone pesticide pharmaceutical company treatment treatment Lebensmittel Lebensmittel Praparat Abtreibungspille Arznei foodstuff foodstuff preparation abortion pill drug highest ranked words , with very rare words removed : RU 486 , an abortifacient drug Lin98 : Lins distributional similarity measure <OTH> RFF : Geffet and Dagans Relative Feature Focus measure <OTH> TheY : association measure introduced by Garera and Yarowsky <OTH> TheY : G2 : similar method using a log-likelihood-based statistic <CIT> this statistic has a preference for higher-frequency terms PL03 : semantic space association measure proposed by Pado and Lapata <OTH> Table 1 : Similarity and association measures : most similar items ing like 1a , b ; the result of this can be seen in table -LRB- 2 -RRB- : while the similarity measures -LRB- Lin98 , RFF -RRB- list substitutable terms -LRB- which behave like synonyms in many contexts -RRB- , the association measures -LRB- Garera and Yarowskys TheY measure , Pado and Lapatas association measure -RRB- also find non-compatible associations such as countrycapital or drugtreatment , which is why they are commonly called relationfree ',Dunning,'advmod_find_However mark_detect_while nn_measures_similarity nsubj_detect_measures dep_as_such dep_measures_as amod_distance_WordNet pobj_as_distance cc_distance_or nn_metric_Lins nn_metric_similarity conj_distance_metric advmod_detect_only dep_find_detect dobj_detect_cases prep_cases_of amod_similarity_semantic pobj_of_similarity nn_measures_association nsubj_find_measures dep_as_such dep_measures_as det_ones_the pobj_as_ones partmod_ones_used prep_used_by pobj_by_Poesio cc_Poesio_et conj_Poesio_al cc_by_or conj_by_by pobj_by_Garera cc_Garera_and conj_Garera_Yarowsky advmod_find_also dobj_find_cases prep_cases_of amod_TheY_associative amod_TheY_bridg497 nn_TheY_Lin98 nn_TheY_RFF nn_TheY_TheY pobj_of_TheY dep_list_G2 nn_Land_PL03 dobj_G2_Land dep_Land_country\\/state\\/land nn_Staat_Staat nn_Staat_Staat nn_Staat_Kemalismus nn_Staat_Regierung nn_Staat_Kontinent nn_Staat_state nn_Staat_state nn_Staat_Kemalism nn_Staat_government nn_Staat_continent nn_Staat_Stadt nn_Staat_Stadt nn_Staat_Bauernfamilie nn_Staat_Prasident nn_Staat_Region nn_Staat_city nn_Staat_city nn_Staat_agricultural nn_Staat_family nn_Staat_president nn_Staat_region nn_Staat_Region nn_Staat_Landesregierung nn_Staat_Bankgesellschaft nn_Staat_Dollar nn_Staat_Stadt nn_Staat_region nn_Staat_country nn_Staat_government nn_Staat_banking nn_Staat_corporation nn_Staat_dollar nn_Staat_city nn_Staat_Bundesrepublik nn_Staat_Bundesregierung nn_Staat_Baht nn_Staat_Albanien nsubj_G2_Staat amod_republic_federal dep_Bundesland_republic amod_government_federal dep_republic_government nn_Gewerkschaft_Baht nn_Gewerkschaft_Albania nn_Gewerkschaft_state nn_Gewerkschaft_Republik dep_Bundesland_Gewerkschaft nn_Bundesland_Gasag nn_Bundesland_Hauptstadt dep_seen_Bundesland nn_union_republic nn_union_trade dep_Bundesland_union det_company_a nn_company_gas appos_union_company nn_state_capital dep_union_state nn_patient_Medikament amod_drug_medical appos_patient_drug nn_patient_Arzneimittel nn_patient_Pille nn_patient_RU nn_patient_Patient nn_patient_Arzneimittel amod_patient_pharmaceutical nn_patient_pill det_drug_a appos_patient_drug nsubjpass_seen_patient amod_drug_pharmaceutical nn_drug_Praparat nn_drug_Droge nn_drug_Abtreibungspille nn_drug_Arzt nn_drug_Lebensmittel nn_drug_preparation dep_foodstuff_drug dep_drug_non-medical nn_Arznei_abortion nn_Arznei_pill nn_Arznei_doctor nn_Arznei_foodstuff nn_Arznei_Pille nn_Arznei_Praparat nn_Arznei_Viagra nn_Arznei_Pille nn_Arznei_Praparat nn_Arznei_pill nn_Arznei_preparation nn_Arznei_Viagra nn_Arznei_pill nn_Arznei_preparation nn_Arznei_Hormon nn_Arznei_Pestizid nn_Arznei_Pharmakonzern nn_Arznei_Behandlung nn_Arznei_Behandlung nn_Arznei_hormone nn_Arznei_pesticide amod_Arznei_pharmaceutical nn_Arznei_company nn_Arznei_treatment nn_Arznei_treatment nn_Arznei_Lebensmittel nn_Arznei_Lebensmittel nn_Arznei_Praparat nn_Arznei_Abtreibungspille dep_foodstuff_Arznei rcmod_patient_foodstuff nn_drug_foodstuff nn_drug_preparation nn_drug_abortion nn_drug_pill iobj_foodstuff_drug dobj_foodstuff_highest partmod_highest_ranked dobj_ranked_words prep_ranked_with advmod_rare_very amod_words_rare pobj_with_words partmod_words_removed dep_highest_RU num_RU_486 det_Lin98_an amod_Lin98_abortifacient nn_Lin98_drug conj_RU_Lin98 nn_RFF_Lins amod_RFF_distributional nn_RFF_similarity nn_RFF_measure nn_RFF_<OTH> dep_RU_RFF dep_RU_Geffet cc_RU_and nn_TheY_Dagans nn_TheY_Relative nn_TheY_Feature nn_TheY_Focus nn_TheY_measure nn_TheY_<OTH> conj_RU_TheY nn_measure_association dep_highest_measure partmod_measure_introduced prep_introduced_by pobj_by_Garera cc_Garera_and nn_TheY_Yarowsky nn_TheY_<OTH> conj_Garera_TheY dep_highest_G2 amod_method_similar dep_G2_method partmod_method_using det_<CIT>_a amod_<CIT>_log-likelihood-based nn_<CIT>_statistic dobj_using_<CIT> det_statistic_this nsubj_has_statistic rcmod_<CIT>_has det_preference_a dobj_has_preference prep_preference_for amod_PL03_higher-frequency nn_PL03_terms pobj_for_PL03 amod_measure_semantic nn_measure_space nn_measure_association dep_highest_measure partmod_measure_proposed prep_proposed_by pobj_by_Pado cc_highest_and nn_Table_Lapata nn_Table_<OTH> conj_highest_Table dep_Table_1 dep_Table_Similarity cc_Similarity_and nn_measures_association conj_Similarity_measures advmod_similar_most amod_items_similar nn_ing_items dep_Table_ing prep_ing_like pobj_like_1a appos_1a_b det_result_the dep_Table_result prep_result_of pobj_of_this aux_seen_can auxpass_seen_be dep_G2_seen prep_seen_in pobj_in_table appos_table_2 mark_list_while det_measures_the nn_measures_similarity nsubj_list_measures appos_measures_Lin98 dep_Lin98_RFF parataxis_find_list amod_terms_substitutable dobj_list_terms nsubj_behave_which parataxis_list_behave prep_behave_like pobj_like_synonyms prep_synonyms_in amod_contexts_many pobj_in_contexts det_measures_the nn_measures_association nsubj_find_measures nn_measure_Garera cc_Garera_and conj_Garera_Yarowskys nn_measure_TheY dep_find_measure conj_measure_Pado cc_measure_and nn_measure_Lapatas nn_measure_association conj_measure_measure advmod_find_also dep_list_find amod_associations_non-compatible dobj_find_associations dep_as_such prep_associations_as pobj_as_countrycapital cc_countrycapital_or conj_countrycapital_drugtreatment nsubj_is_which rcmod_countrycapital_is advmod_called_why nsubjpass_called_they auxpass_called_are advmod_called_commonly advcl_is_called dep_called_relationfree '
o,819,'Jiao et al propose semi-supervised conditional random fields <CIT> that try to maximize the conditional log-likelihood on the training data and simultaneously minimize the conditional entropy of the class labels on the unlabeled data ',Jiao,'nsubj_propose_Jiao cc_Jiao_et conj_Jiao_al amod_fields_semi-supervised amod_fields_conditional amod_fields_random nsubj_<CIT>_fields ccomp_propose_<CIT> complm_try_that ccomp_<CIT>_try aux_maximize_to xcomp_try_maximize det_log-likelihood_the amod_log-likelihood_conditional dobj_maximize_log-likelihood prep_maximize_on det_data_the nn_data_training pobj_on_data cc_maximize_and advmod_maximize_simultaneously conj_maximize_minimize det_entropy_the amod_entropy_conditional dobj_minimize_entropy prep_entropy_of det_labels_the nn_labels_class pobj_of_labels prep_minimize_on det_data_the amod_data_unlabeled pobj_on_data '
o,820,'<CIT> report extracting database records by learning record field compatibility ',Wick,'nsubj_report_<CIT> xcomp_report_extracting nn_records_database dobj_extracting_records prep_extracting_by pcomp_by_learning amod_compatibility_record nn_compatibility_field dobj_learning_compatibility '
o,821,'Unfortunately , a counterexample illustrated in <OTH> shows that the max function does not produce valid kernels in general ',Pedersen,'advmod_illustrated_Unfortunately det_counterexample_a nsubj_illustrated_counterexample prep_illustrated_in amod_shows_<OTH> pobj_in_shows complm_produce_that det_function_the nn_function_max nsubj_produce_function aux_produce_does neg_produce_not ccomp_illustrated_produce amod_kernels_valid dobj_produce_kernels prep_produce_in pobj_in_general '
p,822,'1 Introduction When data have distinct sub-structures , models exploiting latent variables are advantageous in learning <CIT> ',Matsuzaki,'num_Introduction_1 nsubj_advantageous_Introduction advmod_have_When nsubj_have_data dep_Introduction_have amod_sub-structures_distinct dobj_have_sub-structures nsubj_exploiting_models dep_have_exploiting amod_variables_latent dobj_exploiting_variables cop_advantageous_are prep_advantageous_in pcomp_in_learning '
o,823,'2 Detecting Discourse-New Definite Descriptions 21 Vieira and Poesio Poesio and Vieira <OTH> carried out corpus studies indicating that in corpora like the Wall Street Journal portion of the Penn Treebank <CIT> , around 52 \% of DDs are discourse-new <OTH> , and another 15 \% or so are bridging references , for a total of about 66-67 \% firstmention ',Marcus,'num_Vieira_2 nn_Vieira_Detecting nn_Vieira_Discourse-New nn_Vieira_Definite nn_Vieira_Descriptions num_Vieira_21 nsubj_carried_Vieira cc_Vieira_and nn_Poesio_Poesio conj_Vieira_Poesio cc_Vieira_and nn_<OTH>_Vieira conj_Vieira_<OTH> prt_carried_out nn_studies_corpus dobj_carried_studies xcomp_carried_indicating dobj_indicating_that prep_that_in pobj_in_corpora prep_indicating_like det_portion_the nn_portion_Wall nn_portion_Street nn_portion_Journal pobj_like_portion prep_portion_of det_<CIT>_the nn_<CIT>_Penn nn_<CIT>_Treebank pobj_of_<CIT> prep_carried_around num_\%_52 pobj_around_\% prep_\%_of nn_discourse-new_DDs dep_discourse-new_are pobj_of_discourse-new num_discourse-new_<OTH> cc_carried_and dep_\%_another number_\%_15 nsubj_bridging_\% cc_\%_or conj_\%_so aux_bridging_are conj_carried_bridging dobj_bridging_references prep_bridging_for det_total_a pobj_for_total prep_total_of quantmod_66-67_about num_\%_66-67 pobj_of_\% partmod_\%_firstmention '
o,824,'The distinction between lexical and relational similarity for word pair comparison is recognized by <CIT> -LRB- hecallstheformer attributional similarity -RRB- , though the methods he presents focus on relational similarity ',Turney,'det_distinction_The nsubjpass_recognized_distinction prep_distinction_between amod_similarity_lexical cc_lexical_and conj_lexical_relational pobj_between_similarity prep_similarity_for nn_comparison_word nn_comparison_pair pobj_for_comparison auxpass_recognized_is prep_recognized_by pobj_by_<CIT> amod_similarity_hecallstheformer amod_similarity_attributional appos_<CIT>_similarity advmod_<CIT>_though det_methods_the dep_<CIT>_methods nsubj_presents_he dep_recognized_presents dobj_presents_focus prep_focus_on amod_similarity_relational pobj_on_similarity '
o,825,'The POS disambiguation has usually been performed by statistical approaches mainly using hidden markov model -LRB- HMM -RRB- -LRB- <CIT> et al , 1992 ; Kupiec ',Cutting,'det_disambiguation_The dep_disambiguation_POS nsubjpass_performed_disambiguation aux_performed_has advmod_performed_usually auxpass_performed_been prep_performed_by amod_approaches_statistical pobj_by_approaches advmod_using_mainly partmod_approaches_using amod_model_hidden amod_model_markov dobj_using_model abbrev_model_HMM dep_model_<CIT> cc_<CIT>_et conj_<CIT>_al dep_<CIT>_1992 dep_approaches_Kupiec '
o,826,'As a baseline model we used a maximum entropy tagger , very similar to the one described in <CIT> ',Ratnaparkhi,'prep_used_As det_model_a nn_model_baseline pobj_As_model nsubj_used_we det_tagger_a amod_tagger_maximum nn_tagger_entropy dobj_used_tagger advmod_similar_very dep_used_similar prep_similar_to det_one_the pobj_to_one partmod_one_described prep_described_in '
o,827,'We assign tags of part-of-speech -LRB- POS -RRB- to the words with MXPOST that adopts the Penn Treebank tag set <CIT> ',Ratnaparkhi,'nsubj_assign_We dobj_assign_tags prep_tags_of pobj_of_part-of-speech appos_part-of-speech_POS prep_assign_to det_words_the pobj_to_words prep_words_with pobj_with_MXPOST nsubj_adopts_that rcmod_words_adopts det_tag_the nn_tag_Penn nn_tag_Treebank nsubj_set_tag ccomp_adopts_set '
o,828,'Such coarse-grained inventories can be produced manually from scratch <OTH> or by automatically relating <OTH> or clustering <CIT> existing word senses ',Navigli,'amod_inventories_Such amod_inventories_coarse-grained nsubjpass_produced_inventories aux_produced_can auxpass_produced_be advmod_produced_manually prep_produced_from nn_<OTH>_scratch pobj_from_<OTH> cc_from_or conj_from_by advmod_relating_automatically pcomp_by_relating dobj_relating_<OTH> cc_<OTH>_or amod_senses_clustering nn_senses_<CIT> amod_senses_existing nn_senses_word conj_<OTH>_senses '
o,829,'Given a contextual word cw that occurs in the paragraphs of bc , a log-likelihood ratio -LRB- G2 -RRB- test is employed <CIT> , which checks if the distribution of cw in bc is similar to the distribution of cw in rc ; p -LRB- cw bc -RRB- = p -LRB- cw rc -RRB- -LRB- null hypothesis -RRB- ',Dunning,'prep_employed_Given det_cw_a amod_cw_contextual nn_cw_word dep_Given_cw nsubj_occurs_that rcmod_cw_occurs prep_occurs_in det_paragraphs_the pobj_in_paragraphs prep_paragraphs_of pobj_of_bc det_ratio_a amod_ratio_log-likelihood appos_bc_ratio appos_bc_G2 dep_bc_test auxpass_employed_is nsubjpass_employed_<CIT> nsubj_checks_which rcmod_<CIT>_checks mark_similar_if det_distribution_the nsubj_similar_distribution prep_distribution_of pobj_of_cw prep_cw_in pobj_in_bc cop_similar_is advcl_checks_similar prep_similar_to det_distribution_the pobj_to_distribution prep_distribution_of pobj_of_cw prep_cw_in pobj_in_rc dep_cw_p nn_bc_cw nsubj_p_bc dep_p_= ccomp_p_p nn_rc_cw appos_cw_rc amod_hypothesis_null appos_cw_hypothesis '
o,830,'In this paper we use the so-called Model 4 from <CIT> ',Brown,'prep_use_In det_paper_this pobj_In_paper nsubj_use_we det_4_the amod_4_so-called number_4_Model dobj_use_4 prep_use_from '
o,831,'We would expect the opposite effect with hand-aligned data <CIT> ',Galley,'nsubj_expect_We aux_expect_would det_effect_the amod_effect_opposite dobj_expect_effect prep_expect_with amod_data_hand-aligned pobj_with_data '
o,832,'Extensions to Hiero Several authors describe extensions to Hiero , to incorporate additional syntactic information <CIT> , or to combine it with discriminative latent models <OTH> ',Zhang,'nsubj_describe_Extensions prep_Extensions_to nn_authors_Hiero amod_authors_Several pobj_to_authors dobj_describe_extensions prep_describe_to pobj_to_Hiero aux_incorporate_to dep_describe_incorporate amod_<CIT>_additional amod_<CIT>_syntactic nn_<CIT>_information dobj_incorporate_<CIT> cc_incorporate_or aux_combine_to conj_incorporate_combine dobj_combine_it prep_combine_with amod_<OTH>_discriminative amod_<OTH>_latent nn_<OTH>_models pobj_with_<OTH> '
o,833,'The other form of hybridization ? ? a statistical MT model that is based on a deeper analysis of the syntactic 33 structure of a sentence ? ? has also long been identified as a desirable objective in principle -LRB- consider <CIT> -RRB- ',Wu,'det_form_The amod_form_other dep_?_form prep_form_of pobj_of_hybridization det_MT_a amod_MT_statistical nsubjpass_identified_model nsubjpass_based_that auxpass_based_is dep_model_based prep_based_on det_analysis_a amod_analysis_deeper pobj_on_analysis prep_analysis_of det_structure_the amod_structure_syntactic tmod_syntactic_33 pobj_of_structure prep_structure_of det_sentence_a pobj_of_sentence aux_identified_has advmod_identified_also advmod_identified_long auxpass_identified_been rcmod_MT_identified prep_identified_as det_objective_a amod_objective_desirable pobj_as_objective prep_MT_in pobj_in_principle dep_principle_consider acomp_consider_<CIT> '
o,834,'272 Similarity-based estimation was first used for language modeling in the cooccurrence smoothing method of Essen and Steinbiss <OTH> , derived from work on acoustic model smoothing by Sugawara et al ',Brown,'amod_estimation_272 amod_estimation_Similarity-based nsubjpass_used_estimation auxpass_used_was advmod_used_first prep_used_for nn_modeling_language pobj_for_modeling prep_used_in det_method_the amod_method_cooccurrence amod_method_smoothing pobj_in_method prep_method_of nn_<OTH>_Essen cc_Essen_and conj_Essen_Steinbiss pobj_of_<OTH> partmod_method_derived prep_derived_from pobj_from_work prep_derived_on amod_model_acoustic pobj_on_model partmod_model_smoothing prep_smoothing_by pobj_by_Sugawara cc_Sugawara_et conj_Sugawara_al '
o,835,'Following the setup in <CIT> , we initialize the transition and emission distributions to be uniform with a small amount of noise , and run EM and VB for 1000 iterations ',Johnson,'prep_initialize_Following det_setup_the pobj_Following_setup prep_setup_in pobj_in_<CIT> nsubj_initialize_we det_distributions_the nn_distributions_transition cc_transition_and conj_transition_emission nsubj_uniform_distributions aux_uniform_to cop_uniform_be xcomp_initialize_uniform prep_uniform_with det_amount_a amod_amount_small pobj_with_amount prep_amount_of pobj_of_noise cc_uniform_and conj_uniform_run dobj_run_EM cc_EM_and conj_EM_VB prep_run_for num_iterations_1000 pobj_for_iterations '
o,836,'Our method uses assumptions similar to <CIT> et al 1996 but is naturally suitable for distributed parallel computations ',Berger,'poss_method_Our nsubj_uses_method dobj_uses_assumptions amod_assumptions_similar prep_similar_to pobj_to_<CIT> cc_<CIT>_et conj_<CIT>_al tmod_similar_1996 cc_uses_but cop_suitable_is advmod_suitable_naturally conj_uses_suitable prep_suitable_for amod_computations_distributed amod_computations_parallel pobj_for_computations '
o,837,'The agreement on identifying the boundaries of units , using the statistic discussed in <CIT> , was = 9 -LRB- for two annotators and 500 units -RRB- ; the agreement on features -LRB- 2 annotators and at least 200 units -RRB- was as follows : UTYPE : = 76 ; VERBED : = 9 ; FINITE : = 81 ',Carletta,'det_agreement_The prep_agreement_on pcomp_on_identifying det_boundaries_the dobj_identifying_boundaries prep_boundaries_of pobj_of_units dep_identifying_using det_statistic_the dobj_using_statistic dep_agreement_discussed prep_discussed_in pobj_in_<CIT> aux_=_was dep_<CIT>_= dobj_=_9 dep_discussed_for num_annotators_two pobj_for_annotators cc_annotators_and num_units_500 conj_annotators_units det_agreement_the nsubj_was_agreement prep_agreement_on pobj_on_features dep_features_2 dep_2_annotators cc_annotators_and quantmod_200_at dep_at_least num_units_200 conj_annotators_units parataxis_discussed_was mark_follows_as advcl_was_follows parataxis_discussed_UTYPE dep_UTYPE_= dobj_=_76 parataxis_discussed_VERBED parataxis_discussed_= dobj_=_9 parataxis_discussed_FINITE dep_FINITE_= dobj_=_81 '
n,838,'Hanks and <CIT> proposed using pointwise mutual information to identify collocations in lexicography ; however , the method may result in unacceptable collocations for low-count pairs ',Church,'nsubj_proposed_Hanks cc_Hanks_and conj_Hanks_<CIT> xcomp_proposed_using amod_information_pointwise amod_information_mutual dobj_using_information aux_identify_to xcomp_using_identify dobj_identify_collocations prep_identify_in pobj_in_lexicography advmod_result_however det_method_the nsubj_result_method aux_result_may parataxis_proposed_result prep_result_in amod_collocations_unacceptable pobj_in_collocations prep_collocations_for amod_pairs_low-count pobj_for_pairs '
o,839,'In comparison we introduce 28 several metrics coefficients reported in Albrecht and Hwa <OTH> including smoothed BLEU <OTH> , METEOR <OTH> , HWCM <CIT> , and the metric proposed in Albrecht and Hwa <OTH> using the full feature set ',Liu,'prep_reported_In pobj_In_comparison nsubj_introduce_we rcmod_comparison_introduce num_coefficients_28 amod_coefficients_several nn_coefficients_metrics dobj_introduce_coefficients prep_reported_in pobj_in_Albrecht cc_Albrecht_and conj_Albrecht_Hwa nsubj_reported_<OTH> prep_<OTH>_including nn_<OTH>_smoothed nn_<OTH>_BLEU pobj_including_<OTH> nn_<OTH>_METEOR conj_<OTH>_<OTH> nn_<CIT>_HWCM conj_<OTH>_<CIT> cc_<OTH>_and det_metric_the conj_<OTH>_metric amod_metric_proposed prep_metric_in pobj_in_Albrecht cc_<OTH>_and nn_<OTH>_Hwa conj_<OTH>_<OTH> partmod_<OTH>_using det_set_the amod_set_full nn_set_feature dobj_using_set '
o,840,'This is one manifestation of what is commonly referred to as the data sparseness problem , and was discussed by <CIT> as a side-effect of specificity ',Rapp,'nsubj_manifestation_This cop_manifestation_is num_manifestation_one prep_manifestation_of nsubjpass_referred_what auxpass_referred_is advmod_referred_commonly pcomp_of_referred prep_referred_to mark_sparseness_as det_data_the nsubjpass_sparseness_data pcomp_to_sparseness dobj_sparseness_problem cc_sparseness_and auxpass_discussed_was conj_sparseness_discussed prep_discussed_by pobj_by_<CIT> prep_discussed_as det_side-effect_a pobj_as_side-effect prep_side-effect_of pobj_of_specificity '
o,841,'Techniques for weakening the independence assumptions made by the IBM models 1 and 2 have been proposed in recent work <CIT> ',Berger,'nsubj_made_Techniques prep_Techniques_for pcomp_for_weakening det_assumptions_the nn_assumptions_independence dobj_weakening_assumptions prep_made_by det_models_the nn_models_IBM pobj_by_models nsubjpass_proposed_1 cc_1_and conj_1_2 aux_proposed_have auxpass_proposed_been dep_made_proposed prep_proposed_in amod_work_recent pobj_in_work '
o,842,'C3BTC5 and CCCDCA were used in <OTH> and <CIT> , respectively ',Turney,'nsubjpass_used_C3BTC5 cc_C3BTC5_and conj_C3BTC5_CCCDCA auxpass_used_were prep_used_in pobj_in_<OTH> cc_<OTH>_and conj_<OTH>_<CIT> advmod_used_respectively '
o,843,'In this work we will use structured linear classifiers <CIT> ',Collins,'prep_use_In det_work_this pobj_In_work nsubj_use_we aux_use_will amod_classifiers_structured amod_classifiers_linear dobj_use_classifiers '
o,844,'This is the best automatically learned part-of-speech tagging result known to us , representing an error reduction of 44 \% on the model presented in <CIT> , using the same data splits , and a larger error reduction of 121 \% from the more similar best previous loglinear model in Toutanova and Manning <OTH> ',Collins,'nsubj_best_This cop_best_is det_best_the advmod_learned_automatically partmod_best_learned amod_result_part-of-speech amod_result_tagging dobj_learned_result partmod_result_known prep_known_to pobj_to_us csubj_splits_representing det_reduction_an nn_reduction_error dobj_representing_reduction prep_reduction_of num_\%_44 pobj_of_\% prep_representing_on det_model_the pobj_on_model partmod_model_presented prep_presented_in pobj_in_<CIT> dep_representing_using det_data_the amod_data_same dobj_using_data ccomp_best_splits cc_best_and det_reduction_a amod_reduction_larger nn_reduction_error nsubj_<OTH>_reduction prep_reduction_of num_\%_121 pobj_of_\% prep_\%_from det_model_the amod_model_more amod_model_similar dep_similar_best amod_model_previous nn_model_loglinear pobj_from_model prep_\%_in pobj_in_Toutanova cc_Toutanova_and conj_Toutanova_Manning conj_best_<OTH> '
o,845,'The template we use here is similar to <CIT> , but we have added extra context words before the X and after the Y Our morphological processing also differs from <CIT> ',Turney,'det_template_The nsubj_use_we dep_template_use nsubj_similar_here cop_similar_is ccomp_use_similar aux_<CIT>_to xcomp_similar_<CIT> cc_use_but nsubj_added_we aux_added_have conj_use_added amod_words_extra nn_words_context dobj_added_words prep_added_before det_X_the pobj_before_X cc_before_and conj_before_after det_Y_the pobj_after_Y poss_processing_Our amod_processing_morphological nsubj_differs_processing advmod_differs_also dep_template_differs prep_differs_from '
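
Each data row above carries five fields in the order declared by the `@attribute` lines: class, id, sentence, author, dependencies. The following is a minimal Python sketch of how one such row might be read. It is not part of the repository. The names `parse_row`, `parse_dependency`, `CitationRow` and `Dependency` are hypothetical, and the `relation_head_dependent` reading of each dependency token is an assumption inferred from rows such as `nsubj_finds_optimiser`.

```python
import csv
from typing import List, NamedTuple


class Dependency(NamedTuple):
    relation: str
    head: str
    dependent: str


class CitationRow(NamedTuple):
    label: str            # one of n, p, o, as declared in the @@class@@ attribute
    row_id: int
    sentence: str
    author: str
    dependencies: List[Dependency]


def parse_dependency(token: str) -> Dependency:
    # Assumed layout: relation_head_dependent, e.g. "nsubj_finds_optimiser".
    # Splitting at most twice keeps any further underscores in the dependent.
    relation, head, dependent = token.split("_", 2)
    return Dependency(relation, head, dependent)


def parse_row(line: str) -> CitationRow:
    # ARFF string values are single-quoted and use a backslash as the escape
    # character, so the csv module can split one row into its five fields.
    reader = csv.reader([line], delimiter=",", quotechar="'", escapechar="\\")
    label, row_id, sentence, author, deps = next(reader)
    return CitationRow(
        label=label,
        row_id=int(row_id),
        sentence=sentence.strip(),
        author=author,
        dependencies=[parse_dependency(tok) for tok in deps.split()],
    )
```

Calling `parse_row` on the row with id 814 would give a `CitationRow` whose `dependencies` list starts with `Dependency('det', 'model', 'This')`. The sketch only handles one complete record per input line and skips the `@relation`, `@attribute` and `@data` header lines entirely; a full ARFF reader would need to interpret those as well.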