From 012b1246f3e8009bb80058700eb31e867d7952f8 Mon Sep 17 00:00:00 2001
From: SzymonPajzert <szymonpajzert@gmail.com>
Date: Tue, 12 Apr 2016 17:00:27 +0200
Subject: [PATCH] similiar papers on Stack Overflow

---
 .../StackOverflowCrawler.md                   |  0
 docs/StackOverflowCrawler/similiarPapers.md   | 31 +++++++++++++++++++
 2 files changed, 31 insertions(+)
 rename docs/{ => StackOverflowCrawler}/StackOverflowCrawler.md (100%)
 create mode 100644 docs/StackOverflowCrawler/similiarPapers.md

diff --git a/docs/StackOverflowCrawler.md b/docs/StackOverflowCrawler/StackOverflowCrawler.md
similarity index 100%
rename from docs/StackOverflowCrawler.md
rename to docs/StackOverflowCrawler/StackOverflowCrawler.md
diff --git a/docs/StackOverflowCrawler/similiarPapers.md b/docs/StackOverflowCrawler/similiarPapers.md
new file mode 100644
index 0000000..d414da1
--- /dev/null
+++ b/docs/StackOverflowCrawler/similiarPapers.md
@@ -0,0 +1,31 @@
+# List of similar projects and notes
+Most of the papers in this list are taken from:
+http://meta.stackexchange.com/questions/134495/academic-papers-using-stack-exchange-data
+
+### [Mining StackOverflow to Turn the IDE into a Self-Confident Programming Prompter](http://www.inf.usi.ch/phd/ponzanelli/profile/publications/2014b/Ponz2014b.pdf) + [Prompter: A Self-confident Recommender System](http://www.inf.usi.ch/phd/ponzanelli/profile/publications/2014d/Ponz2014d.pdf)
+An IDE plugin that queries with code snippets and retrieves evaluated solutions. Unfortunately, the algorithm relies on search engines instead of the authors' own machine learning algorithm.
+
+**Possible project value:** important
+
+### [Predicting Tags for StackOverflow Posts](http://chil.rice.edu/research/pdf/StanleyByrne2013StackOverflow.pdf)
+Predicts tags for a given text with 65% accuracy, using a Bayesian probabilistic model.
+
+**Possible project value:** significant
+
+### [StORMeD: Stack Overflow Ready Made Data](http://www.inf.usi.ch/phd/ponzanelli/profile/publications/2015a/Ponz2015a.pdf)
+A ready-made model and algorithms for mining Stack Overflow data.
+
+**Possible project value:** meagre
+
+### [Mining Questions Asked by Web Developers](http://salt.ece.ubc.ca/publications/docs/kartik-msr14.pdf)
+Unsupervised learning (topic clustering). The data consisted of questions about HTML5, JavaScript and CSS. The main goal was to divide and label the questions using natural language processing and Latent Dirichlet Allocation, a type of statistical model that can be used to discover hidden topics in
+a collection of documents, based on the statistics of words in each document.
+
+**Possible project value:** meagre
+
+### [Automatic categorization of questions from Q&A sites](http://lascam.facom.ufu.br/cms/userfiles/downloads/2014/SAC2014CameraReady.pdf)
+Classification algorithms for Q&A questions. Questions on SO are divided into 3 categories: how-to-do-it, need-to-know, seeking-something. The presented algorithms classify the data with varying effectiveness; the best turned out to be Naive Bayes.
+
+Naive Bayes: these classifiers assume that all attributes are independent and that each contributes equally to the categorization. A category is assigned to a question by combining the contribution of each feature.
+
+**Possible project value:** meagre