Improve nsub and nn dependencies analysis #106

Merged
merged 41 commits into from
Feb 11, 2015
41 commits
366d817
improve doc
yhamoudi Feb 5, 2015
1701850
Merge branch 'master' into nsubj_revolution
yhamoudi Feb 5, 2015
e8c8875
better
yhamoudi Feb 5, 2015
22e7e3d
better
yhamoudi Feb 5, 2015
f80dcdb
clean + doc + fix unittests
yhamoudi Feb 6, 2015
ac7af03
bla
yhamoudi Feb 6, 2015
2d26746
Merge branch 'master' into nsubj_revolution
yhamoudi Feb 6, 2015
ac79fa9
doc
yhamoudi Feb 6, 2015
cc66c48
doc + improvement
yhamoudi Feb 6, 2015
eb677da
Merge remote-tracking branch 'origin/master' into nsubj_revolution
Ezibenroc Feb 7, 2015
988ca0c
doc
yhamoudi Feb 7, 2015
4b898be
Merge branch 'nsubj_revolution' of github.com:ProjetPP/PPP-QuestionPa…
yhamoudi Feb 7, 2015
1c93ec6
better dependency analysis
yhamoudi Feb 7, 2015
f4291c5
fix unittests
yhamoudi Feb 7, 2015
951f5dc
Add deep tests.
Ezibenroc Feb 7, 2015
cb22b20
more deep tests
yhamoudi Feb 7, 2015
4c27aad
Add dependencyTree correction.
Ezibenroc Feb 7, 2015
747df55
Add tests. Rename function.
Ezibenroc Feb 7, 2015
96c3275
Typo.
Ezibenroc Feb 7, 2015
8876d76
bla
yhamoudi Feb 7, 2015
d814a0e
done
yhamoudi Feb 7, 2015
52a55eb
Merge pull request #108 from ProjetPP/dependency_tree_correction
yhamoudi Feb 7, 2015
a6cc8dd
more deep tests
yhamoudi Feb 7, 2015
6d22685
bla
yhamoudi Feb 7, 2015
7c46184
fix nn analysis
yhamoudi Feb 7, 2015
1e7ecff
More deep tests.
Ezibenroc Feb 7, 2015
8074de6
bla
yhamoudi Feb 7, 2015
0996e86
bli
yhamoudi Feb 7, 2015
89dcb9d
51 deep tests
yhamoudi Feb 7, 2015
eb911cb
bla
yhamoudi Feb 7, 2015
05933d6
bla
yhamoudi Feb 7, 2015
d740479
bla
yhamoudi Feb 8, 2015
ae6ba11
More deep tests.
Ezibenroc Feb 8, 2015
8ae626d
bli
yhamoudi Feb 8, 2015
bbed6e8
Merge branch 'nsubj_revolution' of github.com:ProjetPP/PPP-QuestionPa…
yhamoudi Feb 8, 2015
14313dc
bla
yhamoudi Feb 8, 2015
2dcf677
bla
yhamoudi Feb 8, 2015
76c791b
bla
yhamoudi Feb 8, 2015
009934b
One more question in deep test, and fix spaces.
Ezibenroc Feb 11, 2015
5b35a2a
Merge branch 'master' into nsubj_revolution
yhamoudi Feb 11, 2015
10aa145
bla
yhamoudi Feb 11, 2015
81 changes: 81 additions & 0 deletions documentation/Cases_review.md
@@ -0,0 +1,81 @@

Improving nsubj/dobj with instance_of
=====================================

#### instance_of + nsubj(pass)

* nsubjpass + prep_in : What language is spoken in Argentina?
* nsubj + dobj : What actor married John F. Kennedy's sister?
* nsubj + prep_by : List movies directed by Spielberg
* nsubjpass : Which president has been killed by Oswald?
* nsubjpass : which book was authored by Victor Hugo

#### instance_of + dobj

* Which books did Suzanne Collins write?
* How many films did Ingmar Bergman make?
* How many children does Barack Obama have?
* How many gas stations are there in the United States?

#### nsubj with a required verb

* What is the most beautiful country in Europe?
* Who was the first Taiwanese President?
* What was the monetary value of the Nobel Peace Prize in 1989?
* When was Benjamin Disraeli prime minister?
* nsubjpass : Where was Ulysses S. Grant born?
* nsubjpass : Where is Inoco based?
* What was the first Gilbert and Sullivan opera?
* Where is the ENS of Lyon?
* What did Bob write ?
* Who is the author of Sea and Sky?
* Is there a ghost in my house
* Are there computers in your room

#### Question word nsubj

No subject after preprocessing

* Who wrote the song, "Stardust"?
* Who invented the hula hoop?
* Who elected the president ?
* Who was killed by Oswald?

#### Others

* tmod : which day was the president born
* prep_of : Of which country is Paris the capital? > badly parsed
* prep_in : In which countries is the Lake Victoria? https://www.google.fr/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=%22in+which+countries%22
* prep_from : From which country is Alan Turing?

_________________________________________________________________________________________________________________________________


Exists
======

* Is there a ghost in my house
* Is there a pilot in the plane
* Is there a capital in France
* Is there a king of england > https://www.wikidata.org/wiki/Q18810062
* http://english.stackexchange.com/questions/34353/is-there-versus-are-there
* Are there any articles available on the subject?
* Are there computers in your room
* Does a king of England exist?

_________________________________________________________________________________________________________________________________

Semi question words
===================

* Show me Star Wars movies > badly parsed
* List movies directed by Spielberg
* List books by Roald Dahl
* List albums of Pink Floyd
* List films with Jack Nicholson
* List of US presidents
* List of presidents of France
* Give me the capital of France
* Give the capital of France
* Give us the capital of France
* list of president of usa > badly parsed
220 changes: 177 additions & 43 deletions documentation/General_questions.md
@@ -1,14 +1,12 @@
General
=======

* Yes/no question: product dobj relation?
* verb+ing: do something special (look at the POS tag)? : What did Richard Feynman say upon hearing he would receive the Nobel Prize in Physics?
* If nounification becomes powerful enough: use it to analyse superlative (biggest > size...)
* Multiple words : Where is Inoco based? > base + place = base place :( >> does "base" actually nounify to "place"?
* Article: remove the number appearing in nodes (same for the tree) / remove the box
* data model: allow lists of predicates in sorts?
* rewrite demo3 to make it depend on DependencyTree
* can t5 be removed?
* In question word processing (and more generally): the connectors are not only the 1000s. The 1000s cover conj but not superlatives.
* Work on the normalized form?
>> guarantee that, on input to normalize, each node contains a single alternative
@@ -20,6 +18,19 @@ General
* Multiple predicates for sorts
* Tell me where the DuPont company is located. Name the Ranger who was always after Yogi Bear.
* How do you solve "Rubik's Cube"? > what is "how" transformed into
* reduce the number of maps, add more information
* other auxiliaries (have) : What dictator has the nickname "El Maximo"
* type propagation : nsubjRule + qw in strongQuestionWord = R5s
* Who was the leader of the Branch Davidian Cult confronted by the FBI in Waco, Texas in 1993? >> very large subject
* Where is Inoco based? >> review the associated nounification
* Who Clinton defeated? >> why does nounification fail? not lemmatized?
* Group/rename the similar R.. rules
* __How many__ : counting operator
> How many films did Ingmar Bergman make?
> How many children does Barack Obama have?
> How many gas stations are there in the United States?
> cf. instance_of on dobj >> take the list produced as output and return its size
> How much did Mercury spend on advertising in 1993?
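
The counting operator described above (resolve the question as a list, then return the size of that list) could look roughly like the following. This is a minimal sketch, not the project's implementation: `TOY_KB`, `resolve` and `count_operator` are hypothetical names, and the data is a placeholder.

```python
# Placeholder knowledge base: (subject, predicate) -> list of objects.
# The entries are synthetic and only serve the illustration.
TOY_KB = {
    ("Bob", "child"): ["child_1", "child_2"],
}

def resolve(subject, predicate):
    """Hypothetical resolver: return the list of answers to (subject, predicate, ?)."""
    return list(TOY_KB.get((subject, predicate), []))

def count_operator(answers):
    """Counting operator: the answer to "How many ...?" is the size of the answer list."""
    return len(answers)

# "How many children does Bob have?" -> resolve the list first, then count it.
print(count_operator(resolve("Bob", "child")))  # 2
```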

Remarks
=======
@@ -33,7 +44,7 @@ Handling prep(c)_x
=====================

go -prep_to-> ... = go to -prep->
What is Frozen based on?
* What is Frozen based on?

* What two US biochemists won the Nobel Prize in medicine in 1992?
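
The rewriting `go -prep_to-> ... = go to -prep->` given at the top of this section moves the preposition out of the collapsed edge label and into the head word. A minimal sketch of that step, assuming labels of the form `prep_x` / `prepc_x`; the function name `fold_preposition` is hypothetical and not taken from the code base:

```python
def fold_preposition(head, relation):
    """Move the preposition of a collapsed prep_x / prepc_x label into the head word.

    Returns the new head word and the shortened relation label. Labels that
    do not start with prep_ or prepc_ are returned unchanged.
    """
    for prefix in ("prepc_", "prep_"):
        if relation.startswith(prefix):
            # Multi-word prepositions are joined with underscores in the label.
            preposition = relation[len(prefix):].replace("_", " ")
            return head + " " + preposition, prefix[:-1]
    return head, relation

print(fold_preposition("go", "prep_to"))  # ('go to', 'prep')
```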

@@ -49,55 +60,178 @@ Superlative
Yes/No questions
================

Exists
======
* Yes/no question: product dobj relation?
* nsubj + prep_from : Are you from Germany? > (you,origin,Germany) > yes/no : (subj | pred:be from, do..live | cpt)

* Is there a ghost in my house
* Is there a pilot in the plane
* Is there a capital in France
* Is there a king of england > https://www.wikidata.org/wiki/Q18810062
* http://english.stackexchange.com/questions/34353/is-there-versus-are-there
* Are there any articles available on the subject?
* Are there computers in your room
* Does a king of England exist?
Conjunction
===========

Semi question words
Bad:
* What was the first Gilbert and Sullivan opera?

Examples:
---------
* Who makes and distributes bells?
* Who is the author of Sea and Sky?
* What percentage of the world's plant and animal species can be found in the Amazon forests?
* Good: Who is section manager for guidance and control systems at JPL?
* Bad: How many people did the United Nations commit to help restore order and distribute humanitarian relief in Somalia in September 1992?
* Bad: Which Italian city is home to the Cathedral of Santa Maria del Fiore or the Duomo?

Problem with merging:
---------------------
* What is the length of border between the Ukraine and Russia?

How to build the subtrees
-------------------------
* What was the first Gilbert and Sullivan opera?
* When was General Manuel Noriega ousted as the leader of Panama and turned over to U.S. authorities?
* When did Princess Diana and Prince Charles get married?
* When did the royal wedding of Prince Andrew and Fergie take place?
* ++ How many people did the United Nations commit to help restore order and distribute humanitarian relief in Somalia in September 1992?
>> maybe propagate the preps afterwards?
>> same problem as for nn

Merge nn with the 2 nodes if nn above them:
- When did Princess Diana and Charles get married?
- When did Princess Diana and Prince Charles get married?
- Who is section manager for guidance and control systems at JPL?

_________________________________________________________________________________________________________________________________
_________________________________________________________________________________________________________________________________

Improve MWE recognition
=======================

Recover from a bad parse:
* who is the president of the United states of america
* Where is the ENS of Lyon? (merge because of the capital letter?)

Good:
* Who is the United States president
* What was the first Gilbert and Sullivan opera?
* Obama is the United States president.

Amod:
* Who is the French president? >> requires first transforming French into France
* Who was the first Taiwanese President?

What organization was founded by the Rev. Jerry Falwell? >> tag Rev because of the capital letter

_________________________________________________________________________________________________________________________________

Processing of prep
==================

* Show me Star Wars movies
Switch prep to Rnew
* auxiliary verb:
-
* non-auxiliary verb:
- List movies directed by Spielberg
- What language is spoken in Argentina? :(
- What kings ruled on France?
- Who was born on 1984?
* noun:
- List of books by Roald Dahl
- president of France

_________________________________________________________________________________________________________________________________

### nsubj

R5
==

* Where does the president live?

R3
==

R5 or R3
========

* What did George Orwell write?
* Which books did Suzanne Collins write?

### nsubjpass

R5
==

R3
==

R5 or R3
========

* Where was Ulysses S. Grant born?
* Where is Inoco based?

### agent

R5
==

R3
==

R5 or R3
========

* Who was killed by Oswald?
* Which president has been killed by Oswald?
* Which books were authored by Victor Hugo?

----------------

### dobj

R5
==

R3
==

R5 or R3
========

* Who developed Microsoft?
* What actor married John F. Kennedy's sister?
* Who has written "The Hitchhiker's Guide to the Galaxy"?
* Who wrote the song, "Stardust"?
* Who invented the hula hoop?
* Who elected the president ?
* Who killed Gandhi?

### prep (+ V)

R5
==

R3
==

* Which kings ruled on France
* List movies directed by Spielberg
* List books by Roald Dahl
* List albums of Pink Floyd
* List films with Jack Nicholson
* List of US presidents
* List of presidents of France
* Give me the capital of France
* Give the capital of France
* Give us the capital of France

Root with multiple children
===========================

* nsubj + prep_from : Are you from Germany? > (you,origin,Germany) > yes/no : (subj | pred:be from, do..live | cpt)
* nsubj + prep_by : List movies directed by Spielberg
* prep_of + prep_of : list of president of usa
* nsubj + prep_by : List movies directed by Spielberg
R5 or R3
========

* What language is spoken in Argentina?
* Who followed Willy Brandt as chancellor of the Federal Republic of Germany?
* Who was born on 1984

----------------

The animal | lives in | the farm.
Subject Predicate Object >> ( animal , residence , farm )

instance of:
The animal | lives in | the farm.
Object Predicate Subject >> ( farm , inhabitant , animal )
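
The two readings of "The animal lives in the farm" above differ only in direction: the instance-of reading swaps subject and object and replaces the predicate with its inverse. A minimal sketch under that assumption; `Triple`, `INVERSE` and `invert` are hypothetical names, and the inverse table contains only the pair used in the example:

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical inverse-predicate table, limited to the example's pair.
INVERSE = {"residence": "inhabitant", "inhabitant": "residence"}

def invert(triple):
    """Swap subject and object and replace the predicate with its inverse."""
    return Triple(triple.object, INVERSE[triple.predicate], triple.subject)

direct = Triple("animal", "residence", "farm")
print(invert(direct))
# Triple(subject='farm', predicate='inhabitant', object='animal')
```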

* prep_from + prep_to + prep_on : carpool from Lyon to Paris on December 31 > (?, instance of, carpool) ∩ (?, from, Lyon) ∩ (?, to, Paris) ∩ (?, day, December 31st)
* nsubjpass + prep_in : What language is spoken in Argentina? > (Argentina, language, ?)
* nsubj + dobj : Which books did Suzanne Collins write? > (Suzanne Collins, author, ?) + typing "book" on ?
* nsubj + dobj (+ do) : What albums did Pearl Jam record?
* nsubj + dobj : What dictator has the nickname "El Maximo"?
* nsubj + dobj : What actor married John F. Kennedy's sister? > (?, instance of, actor) ∩ (?, wife, (John F. Kennedy, sister, ?))
* nsubj + prep_in : How many gas stations are there in the United States?
---------------
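
When a root has several children, as in the carpool question above, each dependency yields one triple with a hole, and the answer is the intersection of the individual answer sets. A minimal sketch of that intersection over a toy fact set; `FACTS`, `subjects` and `intersect` are hypothetical names, and the rides are synthetic data invented for the illustration:

```python
# Synthetic facts, invented for illustration only.
FACTS = {
    ("ride1", "instance of", "carpool"),
    ("ride1", "from", "Lyon"),
    ("ride1", "to", "Paris"),
    ("ride1", "day", "December 31"),
    ("ride2", "instance of", "carpool"),
    ("ride2", "from", "Lyon"),
    ("ride2", "to", "Grenoble"),
    ("ride2", "day", "December 31"),
}

def subjects(predicate, obj, facts=FACTS):
    """Resolve (?, predicate, obj): every subject for which the triple is a known fact."""
    return {s for (s, p, o) in facts if p == predicate and o == obj}

def intersect(constraints, facts=FACTS):
    """Intersect the answer sets of several (?, predicate, obj) triples."""
    answer_sets = [subjects(p, o, facts) for (p, o) in constraints]
    return set.intersection(*answer_sets) if answer_sets else set()

carpool_query = [
    ("instance of", "carpool"),
    ("from", "Lyon"),
    ("to", "Paris"),
    ("day", "December 31"),
]
print(intersect(carpool_query))  # {'ride1'}
```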

* see Problematic questions in hierarchy review
processQuestionInfo in questionWordProcessing must be the only one allowed to weaken a rule to R2 (or R3 bis) (but not to R0: where is the residence)
dependency analysis sets an R5/R3, then processQuestionInfo weakens the higher-level dependencies if it finds the information below

Improving the question maps
===========================

* How much: add cost
* More generally: reduce the number of maps, add more information