Project Scope. First Draft. #11

Open
anvk opened this issue Feb 18, 2015 · 7 comments

Comments

@anvk
Collaborator

anvk commented Feb 18, 2015

I think this project is missing a project scope. Before jumping into development, designs, or UI, it would be great to outline the main functionality we are trying to achieve and think about how it would help solve real-life problems for the people who will be using TweetGeoViz.

Jared, could you formulate and write down the user problems and use cases for TweetGeoViz? What exactly is this project meant to solve? How can it be helpful to others? What examples and use cases do you foresee for it?

Once you have the user problems, could you think through the main functionality points of the app and list them here? Please also rank them from 1 to 6 (1 = very important, core functionality; 6 = nice to have).

Once we have this preliminary set, we can discuss it during our next online meetup and work out the first milestone and its features. This will spec out EXACTLY what we are trying to build and which UI, tools, components, and modules we need.

@JaredHawkins
Owner

User problem
We have a large database of geotagged tweets (approaching 6 billion). We want to be able to efficiently search tweets based on time/location/text and display this in a user-friendly and interactive visualization. Ideally, we also want to be able to automatically detect clusters of tweets in space/time.

Use cases

  1. Public health: Identifying outbreaks and tracking their spread, in close to real-time (this is my interest, and the inspiration for this project!).
    more to come

Functionality (note: ranked with a score from 1-6, with 1 - very important and main functionality and 6 - nice thing to have).

  • 3 - I would like to be able to visualize tweet text (presumably a small random subset of tweets). This should be presented in a clean UI.
  • 1 - I would like to track tweets geo-temporally, with the goal of being able to visualize the construction of a network over time. One example would be tweets marked by GPS pins, with a network connecting them (perhaps using an animation?).
  • 2 - It would be great to automatically detect (and visualize) clusters, as defined by a close proximity in space and/or time (parameters set by user).
  • 4 - Population density differs widely across different regions. It would be great to normalize aggregated results (e.g., a heat map) based on known population values (for example, in the US this could be done with census data).
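
As a rough illustration of items 2 and 4 above, here is a minimal Python sketch. It is not the project's implementation: the `Tweet` record, the function names, and the parameters (`max_km`, `max_seconds`, `min_size`) are all hypothetical. Clusters are taken to be groups of tweets connected by chains of neighbours that are close in both space and time (a simple single-linkage grouping, chosen here only for brevity), and normalization is shown as tweets per 100,000 residents.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record; the real tweet schema is not specified in this thread.
    lat: float
    lon: float
    timestamp: float  # seconds since the epoch
    text: str = ""


def haversine_km(a: Tweet, b: Tweet) -> float:
    """Great-circle distance between two tweets in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def find_clusters(tweets, max_km, max_seconds, min_size=2):
    """Group tweets linked by chains of neighbours close in space and time."""
    parent = list(range(len(tweets)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(n^2) pairwise comparison: fine for a sketch, not for billions of tweets.
    for i in range(len(tweets)):
        for j in range(i + 1, len(tweets)):
            close_in_time = abs(tweets[i].timestamp - tweets[j].timestamp) <= max_seconds
            if close_in_time and haversine_km(tweets[i], tweets[j]) <= max_km:
                parent[find(i)] = find(j)

    groups = {}
    for i, tweet in enumerate(tweets):
        groups.setdefault(find(i), []).append(tweet)
    return [g for g in groups.values() if len(g) >= min_size]


def tweets_per_100k(counts, population):
    """Normalize per-region tweet counts by population.

    Regions with missing or zero population are skipped.
    """
    return {r: 100_000 * n / population[r] for r, n in counts.items() if population.get(r)}
```

For a database approaching 6 billion tweets, the pairwise loop would have to be replaced by a spatial index or a database-side query; the sketch only pins down what "close proximity in space and/or time (parameters set by user)" could mean in code.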

@anvk
Collaborator Author

anvk commented Feb 19, 2015

Thanks! That looks like a great start already.

We can pick the most important piece of functionality and break it down further to see what exactly we need. That would be:

I would like to track tweets geo-temporally, with the goal of being able to visualize the construction of a network over time. One example would be tweets marked by GPS pins, with a network connecting them (perhaps using an animation?).

But even before going into the nitty-gritty details of the functionality, UI, data, and so on, there are still some unanswered questions:

  • Can anyone get access to this large database of geotagged tweets? Can someone like me, who has just joined the project, query this database or get the data?
  • What is the process for regular user tweets to become geotagged?
  • How is this project different from the existing HealthMap website http://www.healthmap.org/en/ ? Are we trying to build something similar but OSS?
  • What general UI do you foresee for this project?
  • Would any other medical school be able to use this OSS project, or is it something that would mainly be used by Boston Children’s Hospital?

@JaredHawkins
Owner

  • Can anyone get access to this large database of geotagged tweets? Can someone like me, who has just joined the project, query this database or get the data?

The specific database I spoke of is proprietary to our research group. I may be able to get a large data cut for development purposes. We can discuss that more for sure.

  • What is the process for regular user tweets to become geotagged?

Twitter users are able to turn on/off geotagging in their preferences. From our own estimates, ~2% of all tweets are geotagged.

  • How is this project different from the existing HealthMap website http://www.healthmap.org/en/ ? Are we trying to build something similar but OSS?

We are making OSS that any researcher, with a database of tweets (which are freely obtainable via the public Twitter API), can utilize.

HealthMap (of which I am a member) is run out of Boston Children's Hospital and uses various forms of digital data to track disease spread. Primarily, this has been from news articles, blogs, governmental reports, etc. We have done a bit of work in the social media space, but much of this is not live on HealthMap.org (rather, it has been in various side projects). One goal of this endeavor is to use this OSS, and our own tweet database, to push our results to HealthMap.org and other research projects we run.

  • What general UI do you foresee for this project?

  • Heatmap (or another way of visualizing tweet density)
  • Network/graph of tweets
  • Dynamic cluster detection based on user-defined text/time/spatial parameters

This needs to be fleshed out...

  • Would any other medical school be able to use this OSS project, or is it something that would mainly be used by Boston Children’s Hospital?

Anyone can use the OSS project. They will need to acquire their own tweets, but as I said this is free through the public Twitter API.

@anvk
Collaborator Author

anvk commented Feb 20, 2015

Great! Thanks for answering those. I wanted to be sure that the project is not tied to a proprietary code base or organization. Furthermore, it is often much easier to bring new people on board if they know that everything is open source and available to the public.

I think the next step would be to discuss the first milestone in the channel. Next, document everything discussed here in this issue, to make it available to everyone for comments. Finally, create a list of tasks/issues to work on and start developing.

@JaredHawkins
Owner

Sounds great - I agree completely!


@anvk
Collaborator Author

anvk commented Mar 3, 2015

So here are the basic requirements for the first version, as discussed during our last meeting in Slack:

  • Ability to display clusters as a heatmap (ideas for other variations are welcome for discussion)
  • Clusters are configured by the user, who specifies a set of related words/mentions over a specified time frame
  • Ability to show changes in the data visualization (heatmap) over time
  • Ability to see a list of related tweets when clicking on an area (for now, this could simply be determined by a radius around the cursor click)
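
To make these requirements a bit more concrete, here is a small Python sketch of the data side only (no UI). Everything in it is an assumption for illustration: tweets are plain dicts with hypothetical keys `text`, `time` (seconds), `lat`, and `lon`, and the function names and the `cell_deg` grid size are made up. It filters tweets by keywords and time frame, bins the matches into grid cells per time frame (the input a heatmap that changes over time would need), and selects the tweets within a radius of a click.

```python
from collections import Counter
from math import asin, ceil, cos, floor, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def matches(tweet, keywords, start, end):
    """True if the tweet falls in [start, end) and mentions any keyword."""
    text = tweet["text"].lower()
    in_window = start <= tweet["time"] < end
    return in_window and any(k.lower() in text for k in keywords)


def heatmap_frames(tweets, keywords, start, end, frame_seconds, cell_deg=0.5):
    """Count matching tweets per grid cell, with one Counter per time frame."""
    n_frames = ceil((end - start) / frame_seconds)
    frames = [Counter() for _ in range(n_frames)]
    for t in tweets:
        if matches(t, keywords, start, end):
            frame = int((t["time"] - start) // frame_seconds)
            # Grid cell indices; cell_deg is the cell size in degrees.
            cell = (floor(t["lat"] / cell_deg), floor(t["lon"] / cell_deg))
            frames[frame][cell] += 1
    return frames


def tweets_near_click(tweets, click_lat, click_lon, radius_km):
    """Tweets within radius_km of the clicked point."""
    return [
        t for t in tweets
        if haversine_km(t["lat"], t["lon"], click_lat, click_lon) <= radius_km
    ]
```

To list only the related tweets around a click, the two pieces can be combined, for example by passing `[t for t in tweets if matches(t, keywords, start, end)]` to `tweets_near_click`.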

@JaredHawkins
Owner

This sounds good to me - thanks Alexey!


@anvk anvk added this to the 1.0.0 milestone Apr 16, 2016