redditPulse

A website for assisting with training topic models on Reddit data.

Currently, this tool automatically pulls large data sets from Pushshift and delivers them either in their original format or in the format expected by jsLDA 2.0. The long-term plan is for this website to offer features similar to those of jsLDA 2.0, but tailored to Reddit data.

Notice:

This tool relies heavily on Pushshift's aggregation feature to simulate a random distribution of comments over time. Pushshift has currently disabled this feature because it was too computationally costly. It is unclear when, or whether, Pushshift will re-enable it, and as long as it is down, this tool will not work. I am working on a fix that will automatically detect whether the aggregation feature is running and offer an option to gather data in a way that does not control for changes in the frequency of posts over time (and therefore does not require any aggregation calls).
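As a rough illustration of why per-period counts matter here, the sketch below splits a fixed sample size across time buckets in proportion to how many comments each bucket holds, so busy periods contribute more comments than quiet ones. This is one possible reading of the approach, not the tool's actual code: the function name allocateSamples and the bucket shape { key, count } are hypothetical, and the counts are assumed to come from some aggregation result.

```javascript
// Hypothetical helper (not part of redditPulse): split `total` samples
// across time buckets in proportion to each bucket's comment count.
// `buckets` is assumed to look like [{ key: <bucket start>, count: <comments> }].
function allocateSamples(buckets, total) {
  const sum = buckets.reduce((acc, b) => acc + b.count, 0);
  // Never ask for more comments than exist in the buckets.
  const capped = Math.min(total, sum);
  if (sum === 0 || capped <= 0) {
    return buckets.map((b) => ({ key: b.key, samples: 0 }));
  }

  // Start from the floor of each exact proportional share.
  const shares = buckets.map((b) => {
    const exact = (capped * b.count) / sum;
    const whole = Math.floor(exact);
    return { key: b.key, samples: whole, frac: exact - whole };
  });

  // Hand the leftover samples to the buckets with the largest fractional
  // parts (largest-remainder method) so the allocations add up to `capped`.
  let remaining = capped - shares.reduce((acc, s) => acc + s.samples, 0);
  const byFrac = [...shares].sort((a, b) => b.frac - a.frac);
  for (const s of byFrac) {
    if (remaining <= 0) break;
    s.samples += 1;
    remaining -= 1;
  }

  return shares.map((s) => ({ key: s.key, samples: s.samples }));
}

// Example: three periods with 10, 30, and 60 comments, sampling 10 in total.
const plan = allocateSamples(
  [
    { key: "period-1", count: 10 },
    { key: "period-2", count: 30 },
    { key: "period-3", count: 60 },
  ],
  10
);
console.log(plan);
```

Without counts, a collector can only request the same number of comments from every period, which is the kind of gathering that does not control for changes in posting frequency.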

Installation:

To use, run npm install and then npm start.

theobayard/redditPulse
