##########
# Project Deliverables
## About delivering the project material
The project files, including the code and the report, should be placed
in the GitHub repo of the team leader. All teammates
update their README files with the information about the project,
including the names of the teammates in the same order. Everyone who is
not the leader must include the field
duplicate: yes
in their project. You must also indicate whether it is a project or a report
in the type field:
type: project
or
type: report
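As a rough illustration of the fields above, the following Python sketch checks that a README contains a valid `type` field and, for non-leaders, `duplicate: yes`. The names `parse_fields`, `check_readme_fields`, and the `is_leader` flag are assumptions made for this example; this is not the class's actual checker, and it only handles simple `key: value` lines.

```python
def parse_fields(text):
    """Collect simple 'key: value' lines into a dict (the last occurrence wins)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def check_readme_fields(text, is_leader):
    """Return a list of problems; an empty list means the fields look fine."""
    fields = parse_fields(text)
    problems = []
    if fields.get("type") not in ("project", "report"):
        problems.append("type must be 'project' or 'report'")
    if not is_leader and fields.get("duplicate") != "yes":
        problems.append("non-leaders must include 'duplicate: yes'")
    return problems


# A teammate (not the leader) with both fields set correctly.
print(check_readme_fields("type: project\nduplicate: yes", is_leader=False))
# A teammate with an invalid type and no duplicate field.
print(check_readme_fields("type: paper", is_leader=False))
```

The first call prints an empty list; the second reports both problems.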
When collaborating on the code,
the teammates can make pull requests to the repository of the leader,
which is the proper way to work on GitHub. However, if the other
teammates need write access to the repo, they can
post to Piazza (mentioning the names and HIDs of the teammates) so
that the TAs can grant them the privilege. Please do not share your
git account with other team members, as we will check whether all team
members have contributed to the project. Team members who do not
contribute will not get any credit. Part of the class goal is to use
git properly.
## About reproducibility of the project
You are responsible for
providing clear and concise instructions on how to replicate your
project. If your code has a dependency on external libraries, you
might need to create a requirements file, Makefile, or just some bash
script that takes care of installing the dependencies and running your
project.
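One possible shape for such a script is sketched below in Python. It assumes a `requirements.txt` file and an entry point named `main.py`; both names are placeholders for this example, not class requirements. By default it only prints the commands (a dry run).

```python
import subprocess
import sys


def build_commands(requirements="requirements.txt", entry_point="main.py"):
    """Return the commands that install the dependencies and then run the project."""
    python = sys.executable  # use the interpreter that runs this script
    return [
        [python, "-m", "pip", "install", "-r", requirements],
        [python, entry_point],
    ]


def replicate(commands, dry_run=True):
    """Print each command; when dry_run is False, also run it and stop on failure."""
    for command in commands:
        print("running:", " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


replicate(build_commands())
```

Calling `replicate(build_commands(), dry_run=False)` would actually install the dependencies and run the project.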
## About the project report
See the class requirements for the length and format of the
paper. Tables, figures, and algorithms/code snippets are excluded
from the page count. Here is a suggestion for its structure:
2 pages to introduce your project, in which you provide a title,
authors, abstract, keywords, HID, explain the problem, provide a
review, and motivate your solution to the problem.
2 pages for how you set up your solution, its architecture, and
the methods you use.
2 pages for results, in which you evaluate your solution in terms
of performance and accuracy and compare it concretely to other work or
systems if available.
Your paper must be compilable with the Makefile. This is also mandated
for students using ShareLaTeX. At some point you need to install LaTeX
locally on your machine and make sure your paper compiles without
errors. If you set up the paper in ShareLaTeX exactly as you would
on your local computer, ShareLaTeX also reports the
errors. However, in the past we have seen that ShareLaTeX produces a
PDF even when there are errors, and students have ignored the errors it
reported. Thus we make it mandatory that you check your paper locally.
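Part of the local check can be automated. By default, LaTeX writes error messages to the log file on lines that start with `!`, so a short Python sketch can list them. The function name `latex_errors` and the sample log contents are assumptions for this example.

```python
def latex_errors(log_text):
    """Return the lines of a LaTeX log that report errors (they start with '!')."""
    return [line for line in log_text.splitlines() if line.startswith("!")]


# A made-up log excerpt: a PDF is still written although an error occurred.
sample_log = "\n".join([
    "This is pdfTeX",
    "! Undefined control sequence.",
    "l.12 \\badcommand",
    "Output written on report.pdf (6 pages).",
])

for error in latex_errors(sample_log):
    print(error)
```

In practice the log text would be read from the `.log` file that LaTeX writes next to the paper.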
success stories
https://piazza.com/class/j5wll7vzylg25j?cid=892
video
https://piazza.com/class/j5wll7vzylg25j?cid=679
NIST
https://piazza.com/class/j5wll7vzylg25j?cid=675
can be simplified as we no longer need to readd ```
https://github.com/bigdata-i523/hid231/blob/master/experiment/parse-readme/checkreadme.py
# DAL
http://sc17.supercomputing.org/2017/08/22/5863/
https://youtu.be/QymcdxZt4sM
################
secchi
https://piazza.com/class/j5wll7vzylg25j?cid=806
https://github.com/bigdata-i523/hid201/blob/master/experiment/Secchi%20Disk%20Data.ipynb
pi cluster
https://piazza.com/class/j5wll7vzylg25j?cid=150
bashrc
https://piazza.com/class/j5wll7vzylg25j?cid=147
virtualbox
https://piazza.com/class/j5wll7vzylg25j?cid=145
backups
https://piazza.com/class/j5wll7vzylg25j?cid=213
machines at iu
https://piazza.com/class/j5wll7vzylg25j?cid=212
article writing, not sure if useful
https://piazza.com/class/j5wll7vzylg25j?cid=203
https://piazza.com/class/j5wll7vzylg25j?cid=178
no windows 7
https://piazza.com/class/j5wll7vzylg25j?cid=192
vtx
https://piazza.com/class/j5wll7vzylg25j?cid=156
https://piazza.com/class/j5wll7vzylg25j?cid=246
old github
https://docs.google.com/document/d/1spfJdha1-9bb69tBgAGKGK2L0b8rb8wWJiY1r6VLKsc/edit
jabref video
https://piazza.com/class/j5wll7vzylg25j?cid=231
health care system cost
https://f100-res.cloudinary.com/image/upload/s--m6ebL-06--/c_fill,f_auto,g_faces,w_1200/v1/a/public/j7qglicviwcwfaofwgw7.jpg
connect to pi
https://piazza.com/class/j5wll7vzylg25j?cid=217
https://www.youtube.com/watch?v=gk7JAS5W7cE
experiments
https://piazza.com/class/j5wll7vzylg25j?cid=242
https://piazza.com/class/j5wll7vzylg25j?cid=241
https://piazza.com/class/j5wll7vzylg25j?cid=240
https://piazza.com/class/j5wll7vzylg25j?cid=239
https://piazza.com/class/j5wll7vzylg25j?cid=238
https://piazza.com/class/j5wll7vzylg25j?cid=237
https://piazza.com/class/j5wll7vzylg25j?cid=236
https://piazza.com/class/j5wll7vzylg25j?cid=235
# bibtex, latex:
code into latex: https://piazza.com/class/j5wll7vzylg25j?cid=846
https://piazza.com/class/j5wll7vzylg25j?cid=345
no use of names https://piazza.com/class/j5wll7vzylg25j?cid=312
no proposals: https://piazza.com/class/j5wll7vzylg25j?cid=434
sharelatex: https://piazza.com/class/j5wll7vzylg25j?cid=421
work breakdown if working in teams
see https://cloudmesh.github.io/classes/i523/2017/project.html#common-deleiverable
we actually use em https://piazza.com/class/j5wll7vzylg25j?cid=405
page length https://piazza.com/class/j5wll7vzylg25j?cid=387
grammarly.com
videos: https://piazza.com/class/j5wll7vzylg25j?cid=359
latex ubuntu: https://piazza.com/class/j5wll7vzylg25j?cid=1021
https://www.tablesgenerator.com/
print(df.to_latex())
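To show roughly what such a generated table looks like without requiring pandas, here is a small standard-library sketch that turns rows into a LaTeX `tabular`. The helper names `escape` and `rows_to_latex` are hypothetical and the sample values are made up. It is not a replacement for `df.to_latex()`: it only escapes a few special characters and left-aligns every column.

```python
def escape(cell):
    """Escape a few characters that are special in LaTeX."""
    text = str(cell)
    for char in "&%_#":
        text = text.replace(char, "\\" + char)
    return text


def rows_to_latex(header, rows):
    """Build a simple left-aligned tabular environment from a header and rows."""
    lines = ["\\begin{tabular}{" + "l" * len(header) + "}", "\\hline"]
    lines.append(" & ".join(escape(h) for h in header) + " \\\\")
    lines.append("\\hline")
    for row in rows:
        lines.append(" & ".join(escape(c) for c in row) + " \\\\")
    lines.append("\\hline")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


# Illustrative values only.
print(rows_to_latex(["name", "size_gb"], [["logs", 12], ["images", 340]]))
```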
ref, label: https://piazza.com/class/j5wll7vzylg25j?cid=1065
turnitin.com
https://piazza.com/class/j5wll7vzylg25j?cid=603
https://piazza.com/class/j5wll7vzylg25j?cid=733
figure ref https://piazza.com/class/j5wll7vzylg25j?cid=759
no howpublished https://piazza.com/class/j5wll7vzylg25j?cid=769
https://piazza.com/class/j5wll7vzylg25j?cid=773
# Content of a paper
What should my paper include?
a) a statement and description of your chosen topic/problem
b) an explanation of how and why Big Data is involved in your
topic/problem
c) an explanation of how Big Data analytics and methods can help to
study your topic or solve the problem
d) an overview of what infrastructure, methods, etc. exist to perform
c), or how other people have tried to do so
However, I feel that point (c) does not apply to my paper. I have
chosen the topic, Big Data Analytics using Spark. Since my paper
itself revolves around how big data analytics can be performed in an
efficient manner using Spark, I don't think that question (c) applies
to my topic. Please let me know if I am missing anything here.
Similarly, for point (d), I have included a comparison of Spark with
other frameworks like MapReduce to show how these two methods differ
when analyzing Big Data. However, I am not sure if that is what you
expect for point (d).
contributing
https://cloudmesh.github.io/classes/lesson/contrib/contributing.html
policy inactive students
https://piazza.com/class/j5wll7vzylg25j?cid=253
cmd5
https://piazza.com/class/j5wll7vzylg25j?cid=346
https://piazza.com/class/j5wll7vzylg25j?cid=337
pi
https://github.com/bigdata-i523/hid201/blob/master/python_stuff/raspberry_pi.md
https://github.com/seashiva94/cloudmesh.pi/blob/master/cloudmesh/pi/led_bar.py
piazza description example
https://piazza.com/iu/fall2017/i523/home?
some update tips (policy)
a) Update notebook.md every week, even if you have not done
anything. This is not much to ask: even if you were busy, I am sure
you can include a single entry in notebook.md saying that you did not
have time. You can see the example in the Piazza resources link.
b) If you worked on the paper, you need to check the updated
version into GitHub. You can either work directly in GitHub, use
advanced command-line tools (some of you already know them; we will
explain them in the upcoming weeks), or copy and paste from ShareLaTeX
or your local copy into the GitHub repository.
c) If you work in teams, it is up to you to identify a pathway so that
the version in GitHub is the one your team developed over that week.
d) We say: "mille viae ducunt homines per saecula Romam", which
translates to "a thousand roads lead men forever to Rome" and is these
days simply rendered as "All roads lead to Rome".
location in survey
https://piazza.com/class/j5wll7vzylg25j?cid=308
fork
https://piazza.com/class/j5wll7vzylg25j?cid=311
readme.yaml must be better documented
https://piazza.com/class/j5wll7vzylg25j?cid=372
gregors mqtt
https://github.com/bigdata-i523/sample-hid000/tree/master/experiment/mqtt
pi project
https://piazza.com/class/j5wll7vzylg25j?cid=368
https://piazza.com/class/j5wll7vzylg25j?cid=547
yaml
use quotes ("...") if there is a : in the text
there is always a space after the : that follows an attribute name
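The two YAML notes above can be sketched in Python as follows. The helper `yaml_line` is hypothetical and only covers these two rules (a space after the colon, and double quotes around values that contain a colon); it is not a full YAML writer.

```python
def yaml_line(attribute, value):
    """Format one 'attribute: value' line, quoting values that contain a colon."""
    text = str(value)
    if ":" in text:
        # Escape backslashes and double quotes before wrapping in double quotes.
        text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return attribute + ": " + text


print(yaml_line("type", "project"))
print(yaml_line("title", "Big Data: Analytics using Spark"))
```

The first value is written as is; the second is wrapped in double quotes because it contains a colon.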
piazza post hid
plagiarism: https://piazza.com/class/j5wll7vzylg25j?cid=503
miniproject
https://piazza.com/class/j5wll7vzylg25j?cid=526
https://piazza.com/class/j5wll7vzylg25j?cid=527
https://piazza.com/class/j5wll7vzylg25j?cid=528
https://piazza.com/class/j5wll7vzylg25j?cid=529
https://piazza.com/class/j5wll7vzylg25j?cid=530
https://piazza.com/class/j5wll7vzylg25j?cid=531
https://piazza.com/class/j5wll7vzylg25j?cid=532
https://piazza.com/class/j5wll7vzylg25j?cid=533
paper review
https://piazza.com/class/j5wll7vzylg25j?cid=536
what is good
what can be improved
slurm: https://piazza.com/class/j5wll7vzylg25j?cid=550
for gregor
https://piazza.com/class/j5wll7vzylg25j?cid=1018
https://piazza.com/class/j5wll7vzylg25j?cid=1056
CANVAS: https://piazza.com/class/j5wll7vzylg25j?cid=1057
CANVAS: https://piazza.com/class/j5wll7vzylg25j?cid=1058
https://piazza.com/class/j5wll7vzylg25j?cid=1070
https://piazza.com/class/j5wll7vzylg25j?cid=551
https://piazza.com/class/j5wll7vzylg25j?cid=601
https://piazza.com/class/j5wll7vzylg25j?cid=673
https://piazza.com/class/j5wll7vzylg25j?cid=717
https://piazza.com/class/j5wll7vzylg25j?cid=836
https://piazza.com/class/j5wll7vzylg25j?cid=861
https://piazza.com/class/j5wll7vzylg25j?cid=854
maps:
https://github.com/vgm64/gmplot
https://peak5390.wordpress.com/2012/12/08/matplotlib-basemap-tutorial-plotting-points-on-a-simple-map/
histogram
https://piazza.com/class/j5wll7vzylg25j?cid=563
https://piazza.com/class/j5wll7vzylg25j?cid=564
https://piazza.com/class/j5wll7vzylg25j?cid=566
https://piazza.com/class/j5wll7vzylg25j?cid=572
https://piazza.com/class/j5wll7vzylg25j?cid=574
latex praise:
https://piazza.com/class/j5wll7vzylg25j?cid=600
python
wordcount
greyscale: https://piazza.com/class/j5wll7vzylg25j?cid=622
cmd5 projects:https://piazza.com/class/j5wll7vzylg25j?cid=644
https://piazza.com/class/j5wll7vzylg25j?cid=658
https://piazza.com/class/j5wll7vzylg25j?cid=780
https://piazza.com/class/j5wll7vzylg25j?cid=782
get data: https://piazza.com/class/j5wll7vzylg25j?cid=848
https://piazza.com/class/j5wll7vzylg25j?cid=886
https://piazza.com/class/j5wll7vzylg25j?cid=912
I do not recommend using tkinter. Maybe you can look at guizero or kivy. The other approach is to develop the GUI side in JavaScript but have the backend in Python.
histogram
https://github.com/bigdata-i523/hid331/blob/master/experiment/jupyter/JupyterExperiment.ipynb
singularity
https://piazza.com/class/j5wll7vzylg25j?cid=656
jupyter
https://piazza.com/class/j5wll7vzylg25j?cid=657
https://piazza.com/class/j5wll7vzylg25j?cid=803
water
https://piazza.com/class/j5wll7vzylg25j?cid=704
assignments
https://piazza.com/class/j5wll7vzylg25j?cid=722
chameleon access
https://piazza.com/class/j5wll7vzylg25j?cid=784
git
https://www.youtube.com/watch?v=WN0wGDfq57M&feature=youtu.be
ska lecture
https://piazza.com/class/j5wll7vzylg25j?cid=810
ethics and big data
https://piazza.com/class/j5wll7vzylg25j?cid=837
chameleon
resource limitation: https://piazza.com/class/j5wll7vzylg25j?cid=840
no tableau
https://piazza.com/class/j5wll7vzylg25j?cid=871
guizero
https://piazza.com/class/j5wll7vzylg25j?cid=875
cover page
https://piazza.com/class/j5wll7vzylg25j?cid=980
poll
https://piazza.com/class/j5wll7vzylg25j?cid=981
images 300dpi if not scalable
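As a quick worked example of the 300 dpi note: the number of pixels a raster image needs is its printed size in inches times the resolution. The function name `pixels_needed` and the 3.5 by 2.0 inch figure size are assumptions for illustration only.

```python
import math


def pixels_needed(width_in, height_in, dpi=300):
    """Return the minimum (width, height) in pixels for the given printed size."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


print(pixels_needed(3.5, 2.0))  # prints (1050, 600)
```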
look at pinned discussions.
https://piazza.com/class/j5wll7vzylg25j?cid=694