This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

Updates from last 9 months #23

Open: wants to merge 75 commits into base: master

Changes from 1 commit (of 75)
35f1e35  nlp merge (lenkaB, Oct 3, 2019)
7c6578d  nlp merge2 (lenkaB, Oct 3, 2019)
0c175ae  add warnings on missing java (as-the-crow-flies, Oct 16, 2019)
800f97b  adding comments (lenkaB, Oct 16, 2019)
5864c64  merge (lenkaB, Oct 16, 2019)
4e40701  add howto (as-the-crow-flies, Oct 17, 2019)
8cfba56  Merge branch 'develop' of https://github.com/cltl/pepper into develop (as-the-crow-flies, Oct 17, 2019)
6d37d9a  add oid object detection (as-the-crow-flies, Oct 17, 2019)
e893ad8  Merge branch 'develop' of https://github.com/cltl/pepper into develop (as-the-crow-flies, Oct 17, 2019)
54bf109  add handling of scene objects like office/room/restaurant (as-the-crow-flies, Oct 18, 2019)
92c5fc3  add subtitles component (as-the-crow-flies, Oct 18, 2019)
b8bb255  add unicode to tablet.show typing (as-the-crow-flies, Oct 18, 2019)
770a10a  add high rate naoqi camera code example (as-the-crow-flies, Oct 18, 2019)
943a477  add better support for moving object positioning (as-the-crow-flies, Oct 18, 2019)
ab2869d  nlp documentation and minor changes (lenkaB, Oct 19, 2019)
2f43996  include perspective in responso of question (Oct 20, 2019)
47de2dd  minor tweaks to HOWTO.md (as-the-crow-flies, Oct 21, 2019)
be15e4d  Merge branch 'develop' of https://github.com/cltl/pepper into develop (as-the-crow-flies, Oct 21, 2019)
45d5cc9  add OWL:sameAs relations (Oct 27, 2019)
7007f7f  fix phrasing of thoughts (Oct 27, 2019)
96c029b  included negation (lenkaB, Oct 27, 2019)
03ed468  Merge branch 'develop' of https://github.com/cltl/pepper into develop (lenkaB, Oct 27, 2019)
420a4a6  fixes for iswc (Oct 28, 2019)
2c52698  config tweaks (as-the-crow-flies, Oct 28, 2019)
bb60e81  Merge branch 'develop' of https://github.com/cltl/pepper into develop (as-the-crow-flies, Oct 28, 2019)
91b6967  add (tmp) brexit news sentences (as-the-crow-flies, Oct 28, 2019)
c3f65e5  fix typo (as-the-crow-flies, Oct 28, 2019)
57a2605  add hmk app + fixes (as-the-crow-flies, Oct 31, 2019)
ca8276b  Update README.md (as-the-crow-flies, Oct 31, 2019)
e5a7683  remove subtitles as it created weird errors: needs fix (as-the-crow-flies, Jan 13, 2020)
116bfdb  add more jokes and phrases (Feb 14, 2020)
d53c741  Merge branch 'develop' of https://github.com/cltl/pepper into develop (Feb 14, 2020)
80bc6b1  modify general responder to include wikipedia (as-the-crow-flies, Mar 6, 2020)
ad6eaab  Merge branch 'develop' of https://github.com/cltl/pepper into develop (as-the-crow-flies, Mar 6, 2020)
13809a3  load subtitle component properly (as-the-crow-flies, Mar 6, 2020)
469d8f2  masters day app (as-the-crow-flies, Mar 6, 2020)
e56d1e3  last minute fixes for masters day (as-the-crow-flies, Mar 10, 2020)
e8f3801  visual graph (selBaez, Mar 20, 2020)
3e227da  configurations for visual graphs, refactoring of brain package, fix u… (selBaez, Mar 24, 2020)
dd0a96c  Update README.md (selBaez, May 25, 2020)
1bc1cf7  Major updates: 1) Add trust calculation (includes tailored queries) 2… (selBaez, May 25, 2020)
ff1ffa9  Merge branch 'develop' of https://github.com/cltl/pepper into develop (selBaez, May 25, 2020)
3968703  Update README.md (selBaez, May 26, 2020)
45e31a1  update documentation (selBaez, May 26, 2020)
305762c  Merge branch 'develop' of https://github.com/cltl/pepper into develop (selBaez, May 26, 2020)
e40435d  last fixes to ensure running on new system (selBaez, May 31, 2020)
233d39d  clean up tests (selBaez, Jun 30, 2020)
3b2ed30  Bump nltk from 3.4 to 3.4.5 (dependabot[bot], May 31, 2020)
0ca4f87  leftover changes in windows laptop, mostly documentation (selBaez, Jul 3, 2020)
f30eb41  Merge branch 'develop' of https://github.com/cltl/pepper into develop (selBaez, Jul 3, 2020)
19c3d33  fixing trust network computation, improve on logs for reasoning and t… (selBaez, Jul 21, 2020)
5815555  location responder can now reason location on cue (selBaez, Jul 21, 2020)
75926db  Add required folders to git (numblr, Jul 20, 2020)
5fd7efd  testing factual responders (selBaez, Aug 5, 2020)
6e55d96  fixing trust network computation, improve on logs for reasoning and t… (selBaez, Jul 21, 2020)
736229a  location responder can now reason location on cue (selBaez, Jul 21, 2020)
855760b  fix typos in location responder (selBaez, Aug 7, 2020)
662d894  add test for factual responder (selBaez, Aug 26, 2020)
e038e15  Merge branch 'develop' into bug-2/wolfram-responder-init (selBaez, Aug 27, 2020)
71482a0  Merge pull request #33 from cltl/bug-2/wolfram-responder-init (selBaez, Aug 27, 2020)
0c05de6  advances to phrasing (selBaez, Aug 31, 2020)
433ffcf  changes to response generation, bug on windows parsing single quotes (selBaez, Sep 1, 2020)
d3424b8  apologise if knowledge is lacking (lkra, Sep 1, 2020)
40b1d80  Merge remote-tracking branch 'origin/develop' into develop (lkra, Sep 1, 2020)
11b0da8  changes on the go to responders (selBaez, Sep 2, 2020)
4d57d56  refactor trust, add documentation to language package (selBaez, Oct 2, 2020)
529cf56  update documentation (selBaez, Nov 18, 2020)
6afc85c  eliza app (piekvossen, Jan 29, 2021)
3db1a58  Merge remote-tracking branch 'origin/develop' into develop (piekvossen, Jan 29, 2021)
b13bd21  eliza (piekvossen, Jan 29, 2021)
6b2f664  last fixes to documentation (selBaez, Jun 3, 2021)
47e2d0c  Merge branch 'develop' of https://github.com/cltl/pepper into develop (selBaez, Jun 3, 2021)
e81ca73  optimize trust (selBaez, Jun 3, 2021)
3ea67df  prepare or archive (selBaez, Jun 3, 2021)
5d34fc5  add citation and authors information (selBaez, Jun 4, 2021)
nlp documentation and minor changes
lenkaB committed Oct 19, 2019
commit ab2869d3659fabe8107427fe95be8efdc64a7476
50 changes: 25 additions & 25 deletions pepper/brain/long_term_memory.py
@@ -55,7 +55,7 @@ def get_thoughts_on_entity(self, entity_label, reason_types=False):

triple = self._rdf_builder.fill_triple_from_label('leolani', 'see', entity_label)

# Check how many items of the same type as subject and object we have
# Check how many items of the same type as subject and complement we have
entity_novelty = self.thought_generator.fill_entity_novelty(entity.id, entity.id)

# Check for gaps, in case we want to be proactive
@@ -95,25 +95,25 @@ def update(self, utterance, reason_types=False):

if reason_types:
# Try to figure out what this entity is
if not utterance.triple.object.types:
object_type, _ = self.type_reasoner.reason_entity_type(str(utterance.triple.object_name),
if not utterance.triple.complement.types:
complement_type, _ = self.type_reasoner.reason_entity_type(str(utterance.triple.complement_name),
exact_only=True)
utterance.triple.object.add_types([object_type])
utterance.triple.complement.add_types([complement_type])

if not utterance.triple.subject.types:
subject_type, _ = self.type_reasoner.reason_entity_type(str(utterance.triple.subject_name),
exact_only=True)
utterance.triple.object.add_types([subject_type])
utterance.triple.subject.add_types([subject_type])

# Create graphs and triples
instance = self._model_graphs_(utterance)

# Check if this knowledge already exists on the brain
statement_novelty = self.thought_generator.get_statement_novelty(instance.id)

# Check how many items of the same type as subject and object we have
# Check how many items of the same type as subject and complement we have
entity_novelty = self.thought_generator.fill_entity_novelty(utterance.triple.subject.id,
utterance.triple.object.id)
utterance.triple.complement.id)

# Find any overlaps
overlaps = self.thought_generator.get_overlaps(utterance)
@@ -124,20 +124,20 @@ def update(self, utterance, reason_types=False):

# Check for conflicts after adding the knowledge
negation_conflicts = self.thought_generator.get_negation_conflicts(utterance)
object_conflict = self.thought_generator.get_object_cardinality_conflicts(utterance)
complement_conflict = self.thought_generator.get_complement_cardinality_conflicts(utterance)

# Check for gaps, in case we want to be proactive
subject_gaps = self.thought_generator.get_entity_gaps(utterance.triple.subject,
exclude=utterance.triple.object)
object_gaps = self.thought_generator.get_entity_gaps(utterance.triple.object,
exclude=utterance.triple.complement)
complement_gaps = self.thought_generator.get_entity_gaps(utterance.triple.complement,
exclude=utterance.triple.subject)

# Report trust
trust = 0 if self.when_last_chat_with(utterance.chat_speaker) == '' else 1

# Create JSON output
thoughts = Thoughts(statement_novelty, entity_novelty, negation_conflicts, object_conflict,
subject_gaps, object_gaps, overlaps, trust)
thoughts = Thoughts(statement_novelty, entity_novelty, negation_conflicts, complement_conflict,
subject_gaps, complement_gaps, overlaps, trust)
output = {'response': code, 'statement': utterance, 'thoughts': thoughts}

else:
@@ -377,29 +377,29 @@ def _create_instance_graph(self, utterance):
elif utterance.type == UtteranceType.EXPERIENCE:
self._link_leolani()

# Object
utterance.triple.object.add_types(['Instance'])
self._link_entity(utterance.triple.object, self.instance_graph)
# Complement
utterance.triple.complement.add_types(['Instance'])
self._link_entity(utterance.triple.complement, self.instance_graph)

# Claim graph
predicate = utterance.triple.predicate if utterance.type == UtteranceType.STATEMENT \
else self._rdf_builder.fill_predicate('see')

claim = self._create_claim_graph(utterance.triple.subject, predicate, utterance.triple.object,
claim = self._create_claim_graph(utterance.triple.subject, predicate, utterance.triple.complement,
utterance.type)

return claim

def _create_claim_graph(self, subject, predicate, object, claim_type=UtteranceType.STATEMENT):
def _create_claim_graph(self, subject, predicate, complement, claim_type=UtteranceType.STATEMENT):
# Statement
claim_label = hash_claim_id([subject.label, predicate.label, object.label])
claim_label = hash_claim_id([subject.label, predicate.label, complement.label])

claim = self._rdf_builder.fill_entity(claim_label, ['Event', 'Instance', claim_type.name.title()], 'LW')
self._link_entity(claim, self.claim_graph)

# Create graph and add triple
graph = self.dataset.graph(claim.id)
graph.add((subject.id, predicate.id, object.id))
graph.add((subject.id, predicate.id, complement.id))

return claim

@@ -471,11 +471,11 @@ def _create_perspective_graph(self, utterance, subevent, claim_type, detection=N
# Bidirectional link between mention and individual instances
if claim_type == UtteranceType.STATEMENT:
self.instance_graph.add((utterance.triple.subject.id, self.namespaces['GRASP']['denotedIn'], mention.id))
self.instance_graph.add((utterance.triple.object.id, self.namespaces['GRASP']['denotedIn'], mention.id))
self.instance_graph.add((utterance.triple.complement.id, self.namespaces['GRASP']['denotedIn'], mention.id))
self.perspective_graph.add(
(mention.id, self.namespaces['GRASP']['containsDenotation'], utterance.triple.subject.id))
self.perspective_graph.add(
(mention.id, self.namespaces['GRASP']['containsDenotation'], utterance.triple.object.id))
(mention.id, self.namespaces['GRASP']['containsDenotation'], utterance.triple.complement.id))
else:
self.instance_graph.add((detection.id, self.namespaces['GRASP']['denotedIn'], mention.id))
self.perspective_graph.add((mention.id, self.namespaces['GRASP']['containsDenotation'], detection.id))
@@ -514,11 +514,11 @@ def _create_query(self, utterance):
?author rdfs:label ?authorlabel .
}
""" % (utterance.triple.predicate_name,
utterance.triple.object_name,
utterance.triple.complement_name,
utterance.triple.predicate_name)

# Query object
elif utterance.triple.object_name == empty:
# Query complement
elif utterance.triple.complement_name == empty:
query = """
SELECT distinct ?olabel ?authorlabel
WHERE {
@@ -556,7 +556,7 @@ def _create_query(self, utterance):
}
""" % (utterance.triple.predicate_name,
utterance.triple.subject_name,
utterance.triple.object_name,
utterance.triple.complement_name,
utterance.triple.predicate_name)

query = self.query_prefixes + query
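The `_create_query` hunks above fill a SPARQL template with the triple's element labels through `%`-formatting. A minimal standalone sketch of that pattern (the function name and template here are simplified illustrations, not the repository's exact query, which adds prefixes and author labels from perspective data):

```python
def build_complement_query(subject_name, predicate_name):
    """Fill a SPARQL template for querying the missing complement of a triple.

    Simplified sketch: the real _create_query in long_term_memory.py also
    prepends query prefixes and selects author labels.
    """
    template = """
SELECT distinct ?olabel
WHERE {
    ns:%s ns:%s ?complement .
    ?complement rdfs:label ?olabel .
}
"""
    # Two placeholders, filled in the same order the repository uses:
    # predicate path after the subject, complement left as the unknown
    return template % (subject_name, predicate_name)


query = build_complement_query('selene', 'like')
```

The same fill-in-order applies when the subject is the unknown instead; only the placement of the `?` variable changes.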
100 changes: 50 additions & 50 deletions pepper/brain/utils/response.py
@@ -242,7 +242,7 @@ def _fix_predicate_morphology(subject, predicate, object, format='triple'):


class Triple(object):
def __init__(self, subject, predicate, object):
def __init__(self, subject, predicate, complement):
# type: (Entity, Predicate, Entity) -> None
"""
Construct Triple Object
@@ -252,13 +252,13 @@ def __init__(self, subject, predicate, object):
Instance that is the subject of the information just received
predicate: Predicate
Predicate of the information just received
object: Entity
complement: Entity
Instance that is the complement of the information just received
"""

self._subject = subject
self._predicate = predicate
self._object = object
self._complement = complement

@property
def subject(self):
@@ -271,9 +271,9 @@ def predicate(self):
return self._predicate

@property
def object(self):
def complement(self):
# type: () -> Entity
return self._object
return self._complement

@property
def subject_name(self):
@@ -286,19 +286,19 @@ def predicate_name(self):
return self._predicate.label if self._predicate is not None else None

@property
def object_name(self):
def complement_name(self):
# type: () -> str
return self._object.label if self._object is not None else None
return self._complement.label if self._complement is not None else None

@property
def subject_types(self):
# type: () -> str
return self._subject.types_names if self._subject is not None else None

@property
def object_types(self):
def complement_types(self):
# type: () -> str
return self._object.types_names if self._object is not None else None
return self._complement.types_names if self._complement is not None else None

# TODO not good practice and not used, might think of deleting three setters below
def set_subject(self, subject):
@@ -309,9 +309,9 @@ def set_predicate(self, predicate):
# type: (Predicate) -> ()
self._predicate = predicate

def set_object(self, object):
def set_complement(self, complement):
# type: (Entity) -> ()
self._object = object
self._complement = complement

def casefold(self, format='triple'):
# type (str) -> ()
@@ -326,11 +326,11 @@ def casefold(self, format='triple'):
"""
self._subject.casefold(format)
self._object.casefold(format)
self._predicate.casefold(self.subject, self.object, format)
self._complement.casefold(format)
self._predicate.casefold(self.subject, self.complement, format)

def __iter__(self):
return iter([('subject', self.subject), ('predicate', self.predicate), ('object', self.object)])
return iter([('subject', self.subject), ('predicate', self.predicate), ('complement', self.complement)])

def __repr__(self):
return '{} [{}])'.format(hash_claim_id([self.subject_name
@@ -339,12 +339,12 @@ def __repr__(self):
self.predicate_name
if self.predicate_name is not None
and self.predicate_name not in ['', Literal('')] else '?',
self.object_name
if self.object_name is not None
and self.object_name not in ['', Literal('')] else '?']),
self.complement_name
if self.complement_name is not None
and self.complement_name not in ['', Literal('')] else '?']),
hash_claim_id([self.subject_types if self.subject_types is not None else '?',
'->',
self.object_types if self.object_types is not None else '?']))
self.complement_types if self.complement_types is not None else '?']))


class Perspective(object):
@@ -698,29 +698,29 @@ def __repr__(self):


class Gaps(object):
def __init__(self, subject_gaps, object_gaps):
def __init__(self, subject_gaps, complement_gaps):
# type: (List[Gap], List[Gap]) -> None
"""
Construct Gap Object
Parameters
----------
subject_gaps: List[Gap]
List of gaps with potential things to learn about the original subject
object_gaps: List[Gap]
List of gaps with potential things to learn about the original object
complement_gaps: List[Gap]
List of gaps with potential things to learn about the original complement
"""
self._subject = subject_gaps
self._object = object_gaps
self._complement = complement_gaps

@property
def object(self):
def subject(self):
# type: () -> List[Gap]
return self._subject

@property
def subject(self):
def complement(self):
# type: () -> List[Gap]
return self._object
return self._complement

def casefold(self, format='triple'):
# type (str) -> ()
@@ -736,12 +736,12 @@ def casefold(self, format='triple'):
"""
for g in self._subject:
g.casefold(format)
for g in self._object:
for g in self._complement:
g.casefold(format)

def __repr__(self):
s = random.choice(self._subject) if self._subject else ''
o = random.choice(self._object) if self._object else ''
o = random.choice(self._complement) if self._complement else ''
return '{} - {}'.format(s.__repr__(), o.__repr__())


@@ -808,29 +808,29 @@ def __repr__(self):


class Overlaps(object):
def __init__(self, subject_overlaps, object_overlaps):
def __init__(self, subject_overlaps, complement_overlaps):
# type: (List[Overlap], List[Overlap]) -> None
"""
Construct Overlap Object
Parameters
----------
subject_overlaps: List[Overlap]
List of overlaps shared with original subject
object_overlaps: List[Overlap]
List of overlaps shared with original object
complement_overlaps: List[Overlap]
List of overlaps shared with original complement
"""
self._subject = subject_overlaps
self._object = object_overlaps
self._complement = complement_overlaps

@property
def subject(self):
# type: () -> List[Overlap]
return self._subject

@property
def object(self):
def complement(self):
# type: () -> List[Overlap]
return self._object
return self._complement

def casefold(self, format='triple'):
# type (str) -> ()
@@ -846,18 +846,18 @@ def casefold(self, format='triple'):
"""
for g in self._subject:
g.casefold(format)
for g in self._object:
for g in self._complement:
g.casefold(format)

def __repr__(self):
s = random.choice(self._subject) if self._subject else ''
o = random.choice(self._object) if self._object else ''
o = random.choice(self._complement) if self._complement else ''
return '{} - {}'.format(s.__repr__(), o.__repr__())


class Thoughts(object):
def __init__(self, statement_novelty, entity_novelty, negation_conflicts, object_conflict,
subject_gaps, object_gaps, overlaps, trust):
def __init__(self, statement_novelty, entity_novelty, negation_conflicts, complement_conflict,
subject_gaps, complement_gaps, overlaps, trust):
# type: (List[StatementNovelty], EntityNovelty, List[NegationConflict], List[CardinalityConflict], Gaps, Gaps, Overlaps, float) -> None
"""
Construct Thoughts object
@@ -869,12 +869,12 @@ def __init__(self, statement_novelty, entity_novelty, negation_conflicts, object
Information if the entities involved are novel
negation_conflicts: Optional[List[NegationConflict]]
Information regarding conflicts of opposing statements heard
object_conflict: List[CardinalityConflict]
complement_conflict: List[CardinalityConflict]
Information regarding conflicts by violating one to one predicates
subject_gaps: Gaps
Information about what can be learned of the subject
object_gaps: Gaps
Information about what can be learned of the object
complement_gaps: Gaps
Information about what can be learned of the complement
overlaps: Overlaps
Information regarding overlaps of this statement with things heard so far
trust: float
@@ -884,15 +884,15 @@ def __init__(self, statement_novelty, entity_novelty, negation_conflicts, object
self._statement_novelty = statement_novelty
self._entity_novelty = entity_novelty
self._negation_conflicts = negation_conflicts
self._object_conflict = object_conflict
self._complement_conflict = complement_conflict
self._subject_gaps = subject_gaps
self._object_gaps = object_gaps
self._complement_gaps = complement_gaps
self._overlaps = overlaps
self._trust = trust

def object_conflict(self):
def complement_conflicts(self):
# type: () -> List[CardinalityConflict]
return self._object_conflict
return self._complement_conflict

def negation_conflicts(self):
# type: () -> List[NegationConflict]
@@ -906,9 +906,9 @@ def entity_novelty(self):
# type: () -> EntityNovelty
return self._entity_novelty

def object_gaps(self):
def complement_gaps(self):
# type: () -> Gaps
return self._object_gaps
return self._complement_gaps

def subject_gaps(self):
# type: () -> Gaps
@@ -938,16 +938,16 @@ def casefold(self, format='triple'):
n.casefold(format)
for c in self._negation_conflicts:
c.casefold(format)
for c in self._object_conflict:
for c in self._complement_conflict:
c.casefold(format)
self._subject_gaps.casefold(format)
self._object_gaps.casefold(format)
self._complement_gaps.casefold(format)
self._overlaps.casefold(format)

def __repr__(self):
representation = {'statement_novelty': self._statement_novelty, 'entity_novelty': self._entity_novelty,
'negation_conflicts': self._negation_conflicts, 'object_conflict': self._object_conflict,
'subject_gaps': self._subject_gaps, 'object_gaps': self._object_gaps,
'negation_conflicts': self._negation_conflicts, 'complement_conflict': self._complement_conflict,
'subject_gaps': self._subject_gaps, 'complement_gaps': self._complement_gaps,
'overlaps': self._overlaps}

return '{}'.format(representation)
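The rename running through `response.py` replaces every `object` accessor with `complement`. A minimal sketch of the resulting `Triple` surface, with a stand-in `Entity` that carries only a label (both classes here are simplified stand-ins for the pepper.brain versions):

```python
class Entity:
    """Stand-in for pepper.brain's Entity: only the label is modeled."""

    def __init__(self, label):
        self.label = label


class Triple:
    """Sketch of the renamed Triple API: 'object' becomes 'complement'."""

    def __init__(self, subject, predicate, complement):
        self._subject = subject
        self._predicate = predicate
        self._complement = complement

    @property
    def complement(self):
        # Renamed accessor (was 'object')
        return self._complement

    @property
    def complement_name(self):
        # Label of the complement, or None when it is missing
        return self._complement.label if self._complement is not None else None


triple = Triple(Entity('lenka'), Entity('know'), Entity('leolani'))
```

Callers that previously read `triple.object_name` now read `triple.complement_name`; the underlying `(subject, predicate, complement)` storage is otherwise unchanged.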
521 changes: 291 additions & 230 deletions pepper/language/analyzer.py

Large diffs are not rendered by default.

20 changes: 13 additions & 7 deletions pepper/language/language.py
@@ -26,12 +26,18 @@


class Time(enum.Enum):
"""
This will be used in the future to represent tense
"""
PAST = -1
PRESENT = 0
FUTURE = 1


class Emotion(enum.Enum): # Not used yet
class Emotion(enum.Enum):
"""
This will be used in the future to represent emotion
"""
ANGER = 0
DISGUST = 1
FEAR = 2
@@ -346,11 +352,11 @@ def analyze(self):
if not analyzer:
return "I cannot parse your input"

for el in ["subject", "predicate", "object"]:
for el in ["subject", "predicate", "complement"]:
Analyzer.LOG.info(
"RDF {:>10}: {}".format(el, json.dumps(analyzer.rdf[el], sort_keys=True, separators=(', ', ': '))))
"RDF {:>10}: {}".format(el, json.dumps(analyzer.triple[el], sort_keys=True, separators=(', ', ': '))))

self.pack_triple(analyzer.rdf, analyzer.utterance_type)
self.pack_triple(analyzer.triple, analyzer.utterance_type)

if analyzer.utterance_type == UtteranceType.STATEMENT:
self.pack_perspective(analyzer.perspective)
@@ -380,10 +386,10 @@ def pack_triple(self, rdf, utterance_type):
subject = builder.fill_entity(casefold_text(rdf['subject']['text'], format='triple'),
rdf['subject']['type'])
predicate = builder.fill_predicate(casefold_text(rdf['predicate']['text'], format='triple'))
object = builder.fill_entity(casefold_text(rdf['object']['text'], format='triple'),
rdf['object']['type'])
complement = builder.fill_entity(casefold_text(rdf['complement']['text'], format='triple'),
rdf['complement']['type'])

self.set_triple(Triple(subject, predicate, object))
self.set_triple(Triple(subject, predicate, complement))

def pack_perspective(self, persp):
self.set_perspective(Perspective(persp['certainty'], persp['polarity'], persp['sentiment']))
157 changes: 84 additions & 73 deletions pepper/language/utils/helper_functions.py
@@ -21,38 +21,36 @@
lexicon = json.load(open(os.path.join(ROOT, 'data', 'lexicon.json')))


def trim_dash(triple):
"""
:param triple: a dictionary with three elements (subject, predicate, complement)
:return: clean triple with extra dashes removed
"""
for el in triple:
if triple[el]:
if triple[el].startswith('-'):
triple[el] = triple[el][1:]
if triple[el].endswith('-'):
triple[el] = triple[el][:-1]
return triple


def trim_dash(rdf):
'''
:param rdf:
:return: clean rdf-triple with extra dashes removed
'''
for el in rdf:
if rdf[el]:
if rdf[el].startswith('-'):
rdf[el] = rdf[el][1:]
if rdf[el].endswith('-'):
rdf[el] = rdf[el][:-1]
return rdf


def get_type(element, forest):
'''
:param element: text of rdf element
def get_triple_element_type(element, forest):
"""
:param element: text of one element from the triple
:param forest: parsed tree
:return: semantic type of the el.
'''
type = {}
:return: dictionary with semantic types of the element or sub-elements
"""

types = {}

if '-' in element:
text = ''
for el in element.split('-'):
text+=el+' '

text += el+' '

text = text.strip()
uris = get_uri(text)
uris = get_uris(text)

print('LOOKUP: ', text, len(uris))

@@ -64,38 +62,38 @@ def get_type(element, forest):
return 'NE-col'

# collocations which exist in WordNet
syns = wu.get_synsets(text, get_node_label(forest, text))
if len(syns):
typ = wu.get_lexname(syns)
synsets = wu.get_synsets(text, get_node_label(forest, text))
if len(synsets):
typ = wu.get_lexname(synsets)
return typ+'-col'

# if entity does not exist in DBP or WN it is considered composite
for el in element.split('-'):
type[el] = get_word_type(el, forest)
types[el] = get_word_type(el, forest)
else:
type[element] = get_word_type(element, forest)
return type
types[element] = get_word_type(element, forest)

return types


def get_word_type(word, forest):
'''
:param word: one word from rdf element
"""
:param word: one word from triple element
:param forest: parsed syntax tree
:return: semantic type of word
'''
"""

if word== '':
if word == '':
return ''

type = get_lexname(word, forest)
lexname = get_lexname(word, forest)

if type is not None:
return type
if lexname is not None:
return lexname

# words which don't have a lexname are looked up in the lexicon
entry = lexicon_lookup(word)
if entry is not None:
#print(element, entry)
if 'proximity' in entry:
return 'deictic:'+entry['proximity']+','+entry['number']
if 'person' in entry:
@@ -108,6 +106,7 @@ def get_word_type(word, forest):
return 'numeral:'+entry['integer']

types = {'NN': 'person', 'V': 'verb', 'IN': 'prep', 'TO': 'prep', 'MD': 'modal'}

# for words which are not in the lexicon nor have a lexname, the sem.type is derived from the POS tag
pos = pos_tag([word])[0][1]
if pos in types:
@@ -120,17 +119,18 @@ def get_word_type(word, forest):

def get_lexname(word, forest):
'''
:param word:
:param forest:
:param word: word for which we want a WordNet lexname
:param forest: parsed forest of the sentence, to extract the POS tag
:return: lexname of the word
https://wordnet.princeton.edu/documentation/lexnames5wn
'''
if word== '':
if word == '':
return

label = get_node_label(forest[0], word)
if label=='':
if label == '':
label = pos_tag([word])
if label=='':
if label == '':
return None
label = label[0][1]

@@ -145,20 +145,20 @@ def get_lexname(word, forest):

def fix_pronouns(pronoun, self):
"""
:param pronoun:
:param self:
:return:
:param pronoun: personal pronoun as said in the sentence
:param self: Utterance object from which we can get the speaker and lexicon
:return: disambiguated first or second person pronoun
For third person pronouns it guesses or asks follow-up questions (plural forms are not yet handled)
"""
#print('fixing', dict)
lexicon = self.LEXICON
speaker = self.chat.speaker

dict = lexicon_lookup(pronoun, lexicon)
speaker = self.chat.speaker
entry = lexicon_lookup(pronoun, lexicon)

if dict and 'person' in dict:
if dict['person'] == 'first':
if entry and 'person' in entry:
if entry['person'] == 'first':
return speaker
elif dict['person'] == 'second':
elif entry['person'] == 'second':
return 'leolani'
else:
#print('disambiguate third person')
@@ -168,12 +168,12 @@ def fix_pronouns(pronoun, self):


def lemmatize(word, tag=''):
'''
"""
This function uses the WordNet lemmatizer
:param word:
:param word: word to be lemmatized
:param tag: POS tag of word
:return: word lemma
'''
"""
lem = ''
if len(word.split()) > 1:
for el in word.split():
@@ -185,29 +185,30 @@ def lemmatize(word, tag=''):


def get_node_label(tree, word):
'''
"""
This function extracts POS tag of a word from the parsed syntax tree
:param tree: syntax tree gotten from initial CFG parsing
:param word: word whose POS tag we want
:return: POS tag of the word
'''

#if '-' in word:
# word = word.replace('-',' ')

"""
label = ''
for el in tree:
for node in el:
if type(node)==ntree.Tree:
if type(node) == ntree.Tree:
for subtree in node.subtrees():
for n in subtree:
if n==word:
if n == word:
label = str(subtree.label())
return label



def lexicon_lookup(word, typ=None):
""" Look up and return features of a given word in the lexicon. """
"""
Look up and return features of a given word in the lexicon.
:param word: word which we're looking up
:param typ: type of word; if 'category', the matching category is returned along with the entry
:return: lexicon entry of the word
"""

# Define pronoun categories.
pronouns = lexicon["pronouns"]
@@ -315,19 +316,22 @@ def lexicon_lookup(word, typ=None):
question_words,
kinship]

# print("looking up: ", word)

for category in categories:
for item in category:
if word == item:
if typ=='category':
#print(type(category), category)
return category, category [item]
if typ == 'category':
return category, category[item]
return category[item]
return None


def dbp_query(q, baseURL, format="application/json"):
def dbp_query(q, base_url, format="application/json"):
"""
:param q: query for DBpedia
:param base_url: URL to connect to DBpedia
:param format: format for query, typically json
:return: json with DBpedia responses
"""
params = {
"default-graph": "",
"should-sponge": "soft",
@@ -338,19 +342,26 @@ def dbp_query(q, baseURL, format="application/json"):
"save": "display",
"fname": ""
}

querypart = urllib.urlencode(params)
response = urllib.urlopen(baseURL, querypart).read()
response = urllib.urlopen(base_url, querypart).read()
return json.loads(response)


def get_uri(string):
def get_uris(string):
"""
:param string: string which we are querying for
:return: set of URIS from DBpedia for the queried string
"""
query = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?pred WHERE {
?pred rdfs:label """ + "'" + string + "'" + """@en .
}
ORDER BY ?pred"""

results = dbp_query(query, "http://dbpedia.org/sparql")
uris = []
for x in results['results']['bindings']:
uris.append(x['pred']['value'])

return uris
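`get_uris` builds an `rdfs:label` SPARQL query and sends it to the DBpedia endpoint via `dbp_query`. A network-free sketch of the query construction and parameter encoding (using Python 3's `urllib.parse` in place of the Python 2 `urllib` the repository targets; helper names are illustrative):

```python
import urllib.parse  # Python 3 counterpart of the Python 2 urllib used in the repo


def build_label_query(string):
    """Build the rdfs:label SPARQL query that get_uris sends to DBpedia."""
    return ("PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
            "SELECT ?pred WHERE {\n"
            "    ?pred rdfs:label '" + string + "'@en .\n"
            "}\n"
            "ORDER BY ?pred")


def encode_query_params(query):
    """URL-encode the request parameters, as dbp_query does before the request."""
    params = {"query": query, "format": "application/json"}
    return urllib.parse.urlencode(params)


query = build_label_query('Amsterdam')
encoded = encode_query_params(query)
```

Note that interpolating the label directly into the query string, as the repository does, breaks on labels containing single quotes; a production version would escape them first.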
9 changes: 4 additions & 5 deletions test/language/data/statements.txt
@@ -1,13 +1,12 @@
I have three white cats: lenka have three-white-cats
where is she: she be ?
I know you: lenka know leolani
my best friend is he: lenka best-friend-is he

I have three white cats: lenka have three-white-cats
I think Selene doesn't like cheese: selene like cheese
I think Selene hates cheese: selene hate cheese
selene might come today: selene might-come today


I don't think selene likes cheese: selene like cheese


lana can read a book: lana can-read a-book
lana must read: lana must-read
john will come to Amsterdam: john will-come-to Amsterdam
32 changes: 16 additions & 16 deletions test/language/test.py
@@ -31,31 +31,31 @@ def load_golden_triples(filepath):
gold = []

for sample in test:
rdf = {}
triple = {}
if sample == '\n':
break

# print(sample.split(':')[0],sample.split(':')[1])
test_suite.append(sample.split(':')[0])
rdf['subject'] = sample.split(':')[1].split()[0].lower()
rdf['predicate'] = sample.split(':')[1].split()[1].lower()
triple['subject'] = sample.split(':')[1].split()[0].lower()
triple['predicate'] = sample.split(':')[1].split()[1].lower()
if len(sample.split(':')[1].split()) > 2:
rdf['object'] = sample.split(':')[1].split()[2].lower()
triple['complement'] = sample.split(':')[1].split()[2].lower()
else:
rdf['object'] = ''
triple['complement'] = ''

if len(sample.split(':')) > 2:
rdf['perspective'] = {}
rdf['perspective']['certainty'] = float(sample.split(':')[2].split()[0])
rdf['perspective']['polarity'] = float(sample.split(':')[2].split()[1])
rdf['perspective']['sentiment'] = float(sample.split(':')[2].split()[2])
# print('stored perspective ', rdf['perspective'])
triple['perspective'] = {}
triple['perspective']['certainty'] = float(sample.split(':')[2].split()[0])
triple['perspective']['polarity'] = float(sample.split(':')[2].split()[1])
triple['perspective']['sentiment'] = float(sample.split(':')[2].split()[2])
# print('stored perspective ', triple['perspective'])

for el in rdf:
if rdf[el] == '?':
rdf[el] = ''
for el in triple:
if triple[el] == '?':
triple[el] = ''

gold.append(rdf)
gold.append(triple)

return test_suite, gold

@@ -102,10 +102,10 @@ def compare_triples(triple, gold):
else:
print('MISMATCH: ', triple.subject, gold['subject'])

if str(triple.object) == gold['object']:
if str(triple.complement) == gold['complement']:
correct += 1
else:
print('MISMATCH: ', triple.object, gold['object'])
print('MISMATCH: ', triple.complement, gold['complement'])

return correct
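`load_golden_triples` above parses lines of the form `utterance: subject predicate complement`, with `?` marking an unknown element. A minimal sketch of that per-line parsing under the renamed keys (the helper name is illustrative, not from the repository):

```python
def parse_golden_line(line):
    """Parse one 'utterance: subject predicate complement' gold line,
    mirroring the renamed dictionary keys in load_golden_triples (sketch).
    """
    utterance, annotation = line.split(':')[0], line.split(':')[1]
    parts = annotation.split()
    triple = {
        'subject': parts[0].lower(),
        'predicate': parts[1].lower(),
        'complement': parts[2].lower() if len(parts) > 2 else '',
    }
    # A '?' marks an unknown element and is stored as the empty string
    for el in triple:
        if triple[el] == '?':
            triple[el] = ''
    return utterance, triple


utterance, gold = parse_golden_line('I know you: lenka know leolani')
```

Lines with a trailing `: certainty polarity sentiment` section would add a `perspective` entry on top of this, as the full loader does.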