Added example #48

GHLgh · 2017-04-25T01:39:33Z

@danyaljj Example for the first bullet point in #44

We can close this PR once the example has been moved into an IPython notebook.

danyaljj · 2017-04-25T01:41:36Z

demo.py

@@ -0,0 +1,43 @@
+from sioux import remote_pipeline


Maybe we create an examples folder/module?

Actually, I'm thinking maybe we should make it an IPython notebook. All the tutorials and examples could be IPython notebooks. @bhargav, any comment on this?

It would be good to have ipython notebooks. 👍

danyaljj · 2017-04-25T01:53:30Z

How about we change it slightly in this way?
In the example, retrieve a random document (say, this one: https://github.com/ryanmcdermott/trump-speeches/blob/master/speeches.txt).
Then count all the verbs (POS = VB, VBB, VBD, VBG, VBN, VBZ, VBP) that occur "immediately after" a person (NER = PER). (By "immediately after" I mean after the person, in the same sentence, within a window of 3 words.)

What do you think?
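
Roughly, the counting rule above could look like the sketch below. It assumes the text is already tokenized and tagged; the `(token, pos, ner)` triples and the `count_verbs_after_person` helper are hypothetical stand-ins for whatever the pipeline returns, not part of the sioux API.

```python
from collections import Counter

# Verb tags exactly as listed in the comment above.
VERB_TAGS = {"VB", "VBB", "VBD", "VBG", "VBN", "VBZ", "VBP"}


def count_verbs_after_person(sentences, window=3):
    """Count verbs found within `window` tokens after a PER token.

    `sentences` is a list of sentences; each sentence is a list of
    (token, pos, ner) triples. This input shape is an assumption.
    """
    counts = Counter()
    for sentence in sentences:
        seen = set()  # verb positions already counted in this sentence
        for i, (_, _, ner) in enumerate(sentence):
            if ner != "PER":
                continue
            # The range is clipped at the sentence end, so the window
            # never reaches into the next sentence.
            for j in range(i + 1, min(i + 1 + window, len(sentence))):
                token, pos, _ = sentence[j]
                if pos in VERB_TAGS and j not in seen:
                    seen.add(j)
                    counts[token.lower()] += 1
    return counts


# Hand-tagged toy sentence, for illustration only.
example = [[
    ("Obama", "NNP", "PER"), ("said", "VBD", "O"), ("that", "IN", "O"),
    ("he", "PRP", "O"), ("left", "VBD", "O"), (".", ".", "O"),
]]
print(count_verbs_after_person(example))
```

The `seen` set keeps a verb from being counted twice when a multi-token person name (two adjacent PER tokens) puts it inside two overlapping windows.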

GHLgh · 2017-04-25T03:01:50Z

It's doable, I can try that.

When you said 3 words, did you mean 3 tokens? I ask because punctuation marks are also counted as tokens, right?

danyaljj · 2017-04-25T04:05:35Z

Yeah tokens should be fine.

danyaljj · 2017-04-25T04:06:55Z

BTW, we shouldn't send everything to the pipeline all at once. We can split the text on newlines and tabs before sending it to the pipeline.
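
A minimal sketch of that splitting step, using only the standard library. The `annotate` argument is a placeholder for the real pipeline call, which is not shown in this thread.

```python
import re


def split_into_chunks(text):
    """Split raw text on runs of newlines and tabs, dropping empty pieces."""
    pieces = re.split(r"[\n\t]+", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def annotate_in_chunks(text, annotate):
    """Send each chunk to `annotate` separately and collect the results.

    `annotate` is a hypothetical callable standing in for the pipeline.
    """
    return [annotate(chunk) for chunk in split_into_chunks(text)]
```

For instance, `annotate_in_chunks(raw_text, my_pipeline_call)` would issue one request per paragraph-like chunk instead of one request for the whole file.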

bhargav · 2017-04-25T05:37:09Z

Also as a general comment, the usage is not easy. We should make it easier to access neighboring tokens somehow. https://github.com/CogComp/sioux/pull/48/files#diff-dc8b50acc65729bc37a3b573f4ab541eR31

Also being able to iterate over a view would be useful IMO.

for ner_token in pipeline.get_ner(doc):
    print(ner_token['label'])

GHLgh · 2017-04-25T14:12:35Z

Good idea, I can make the class an iterator, then we can get rid of some_view_class.get_cons()

@bhargav how would you like accessing neighboring tokens to be made easier? If we can iterate over the view and find a constituent by index, would that be sufficient?

I can make the usage simpler by adding the corresponding tokens to the constituent (then we have ner_con['tokens'] == 'tokens of this constituent'). Right now we have to do some_view.get_cons(key='token')[constituent_index]
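
To make the proposal concrete, here is a rough sketch of an iterable, index-addressable view. The `View` class and its `neighbors` helper are illustrative assumptions, not the actual sioux classes.

```python
class View:
    """Toy stand-in for a view; not the actual sioux class."""

    def __init__(self, constituents):
        # Each constituent is a dict such as {"label": "PER", "tokens": "Obama"}.
        self._constituents = list(constituents)

    def __iter__(self):
        # Allows `for con in view:` without calling get_cons().
        return iter(self._constituents)

    def __len__(self):
        return len(self._constituents)

    def __getitem__(self, index):
        # Lookup by index, which also gives access to adjacent constituents.
        return self._constituents[index]

    def neighbors(self, index, window=1):
        """Return the constituents up to `window` positions before and after `index`."""
        start = max(0, index - window)
        before = self._constituents[start:index]
        after = self._constituents[index + 1:index + 1 + window]
        return before, after
```

With something like this, `for con in view: print(con['label'])` works directly, and `view.neighbors(i, window=3)` returns the surrounding constituents without going through a separate token view.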

Added example

13322ed

danyaljj reviewed Apr 25, 2017

GHLgh closed this Apr 26, 2017
