Review the brief and ask any questions on this project #1

barrytarter · 2022-10-03T13:50:51Z

@hardcommitoneself if you have any technical questions, feel free to post them in this issue, as that should allow us to document the development process better.

hardcommitoneself · 2022-10-04T07:51:44Z

I'd like to know more about the spreadsheet and the roster table. Please send me a sample spreadsheet file.

Schema::create('rosters', function (Blueprint $table) {
    $table->id();
    $table->string('university');
    $table->string('url');
    $table->string('sport');
    $table->timestamps();
});

hardcommitoneself · 2022-10-04T15:30:40Z

Should we use the TALL stack in our app?

barrytarter · 2022-10-04T18:13:31Z

@hardcommitoneself

Here is what @edgrosvenor shared with me -- this will mainly be all back-end functionality so feel free to use whatever you prefer, e.g. browsershot, curl, even python is ok, etc. The early output might be CSVs of the profile data just to check it (e.g. name, position, year in school, etc).

If you are planning to do a front-end piece, TALL would be useful.

Does that make sense?

hardcommitoneself · 2022-10-04T18:29:33Z

Thanks for letting me know, @barrytarter.
It makes sense. So first I will scrape the basic profile data (name, position, year, etc.) from the URLs provided in the Excel file.
I am not sure if you saw my Slack message.
I mentioned that I will use Roach PHP to scrape data from the other sites.

barrytarter · 2022-10-04T18:53:07Z

@hardcommitoneself great, yes, this is the best place to reach both me and Ed!

hardcommitoneself · 2022-10-05T14:07:51Z

@barrytarter

I just finished the Excel import feature and now I am going to build the scraper.
After importing the Excel file, should the scraper run automatically, or do we need to trigger it manually (a "start scraping" button or something like that)?

barrytarter · 2022-10-05T15:13:40Z

@hardcommitoneself For now, whatever is easiest to get a 'test' version live that successfully pulls and stores data. If @edgrosvenor has any tips, he'll share them here as well.

You'll need to create unique decision rules for pulling the roster data as some rosters are very similar and others are different, e.g.
these two are sites that use "Sidearm Sports" templates:
https://acusports.com/sports/womens-volleyball/roster
https://asugrizzlies.com/sports/mens-soccer/roster

These ones also use Sidearm Sports, but a different template I think:
https://aupanthers.com/sports/mens-soccer/roster
https://bamastatesports.com/sports/womens-volleyball/roster
https://auwolves.com/sports/mens-soccer/roster

These are both from Presto Sports templates, but the templates look different:
https://goamcats.com/sports/msoc/2017-18/roster
https://www.sunyadktimberwolves.com/sports/msoc/2017-18/roster
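
A rough Python sketch of a first-pass decision rule based on the URL alone. It rests on an assumption drawn only from the example links above: the Presto-style paths carry a season segment such as `2017-18`, while the Sidearm-style paths do not. The function name and the labels it returns are hypothetical, and the two Sidearm template variants cannot be told apart this way; that would need a look at the page markup.

```python
import re
from urllib.parse import urlparse

# Season segment such as "2017-18", seen in the Presto-style example URLs.
SEASON_SEGMENT = re.compile(r"^\d{4}-\d{2}$")


def guess_template_family(roster_url: str) -> str:
    """Guess a template family from the URL path alone (heuristic, unverified)."""
    segments = [s for s in urlparse(roster_url).path.split("/") if s]
    # Every example roster URL ends in "/roster"; anything else is left unclassified.
    if not segments or segments[-1] != "roster":
        return "unknown"
    if any(SEASON_SEGMENT.match(s) for s in segments):
        return "presto-like"
    return "sidearm-like"
```

A URL that lands in "unknown" would simply be flagged for manual review rather than scraped with the wrong rule set.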

hardcommitoneself · 2022-10-06T18:14:33Z

@barrytarter

How about checking the number of tr elements in every table on each page?
From what I have seen so far, each page has only one table with many rows (I think that is the one we want).
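
The row-counting idea could be sketched like this in Python, using only the standard library parser. The class name, function name, and the `min_rows` threshold are hypothetical; the project itself uses Roach PHP, so this only illustrates the heuristic.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Counts <tr> elements per <table>, in document order."""

    def __init__(self):
        super().__init__()
        self.row_counts = []  # one entry per table seen
        self._open = []       # stack of indexes of currently open tables

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.row_counts.append(0)
            self._open.append(len(self.row_counts) - 1)
        elif tag == "tr" and self._open:
            # Credit the row to the innermost open table.
            self.row_counts[self._open[-1]] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()


def largest_table_index(html, min_rows=5):
    """Index of the table with the most rows, or None if no table is big enough."""
    counter = TableRowCounter()
    counter.feed(html)
    if not counter.row_counts:
        return None
    best = max(range(len(counter.row_counts)), key=counter.row_counts.__getitem__)
    return best if counter.row_counts[best] >= min_rows else None
```

Returning `None` when no table reaches the threshold leaves room for a different extraction path on pages that do not use tables.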

barrytarter · 2022-10-06T19:41:49Z

@hardcommitoneself I like that approach. We might need a way to decipher the type of content listed.

e.g. grade level (aka "graduation year") values could be categorized by word, e.g. 'freshman', 'sophomore', 'junior', 'senior'. I look forward to seeing how you figure it out!
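
One possible way to decipher a column by its content, sketched in Python: treat a column as the grade-level column when most of its cells are recognizable class-year words. The vocabulary, the function names, and the 0.8 threshold are assumptions for illustration; real rosters may use other spellings.

```python
import re

# Assumed vocabulary of class-year words and abbreviations.
YEAR_WORDS = {
    "fr": "Freshman", "freshman": "Freshman",
    "so": "Sophomore", "sophomore": "Sophomore",
    "jr": "Junior", "junior": "Junior",
    "sr": "Senior", "senior": "Senior",
}


def normalize_year(value):
    """Return the canonical class year for a word-like value, else None."""
    key = re.sub(r"[^a-z]", "", value.lower())  # "Fr." -> "fr"
    return YEAR_WORDS.get(key)


def looks_like_year_column(values, threshold=0.8):
    """True if most non-empty cells are recognizable class-year words."""
    cells = [v for v in values if v.strip()]
    if not cells:
        return False
    hits = sum(1 for v in cells if normalize_year(v) is not None)
    return hits / len(cells) >= threshold
```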

hardcommitoneself · 2022-10-07T02:12:29Z

@barrytarter

I just noticed that some rosters have no tables (they use a list instead), e.g. https://www.artuathletics.com/sports/womens-volleyball/roster
I think we need to build logic for the ul list as well.

hardcommitoneself · 2022-10-07T18:52:01Z

This is what I have reached so far. I think it will be the base of our scraper. Please check it out and let me know your feedback.

hardcommitoneself · 2022-10-12T13:39:16Z

Please take a look at this screenshot.
Notice the Year field: its values are different from the others.
How can I convert the numbers (1, 3, etc.) to real year values (Fr., Sr., etc.)?

barrytarter · 2022-10-12T21:46:54Z

@hardcommitoneself here is one possible guide on how to map the data: https://docs.google.com/spreadsheets/d/1QBCGpvXjoDAH50wQTTnYLj5cWzb3TlXWWUPn-g3kk78/edit?usp=sharing.

Specifically for the numbers, it could map as 1 = Freshman, 2 = Sophomore, 3 = Junior, 4 = Senior, 5 = Senior, 6 = Senior.
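
As a small Python sketch, the numeric mapping above is a lookup table. The function name is hypothetical, and returning `None` for values outside 1 to 6 is an assumption about how unknown values should be handled.

```python
# Mapping taken from the comment above: 4, 5 and 6 all count as Senior.
YEAR_BY_NUMBER = {
    1: "Freshman",
    2: "Sophomore",
    3: "Junior",
    4: "Senior",
    5: "Senior",
    6: "Senior",
}


def year_from_number(raw):
    """Map a numeric Year cell such as '3' to a class year; None if not mappable."""
    try:
        number = int(raw.strip())
    except ValueError:
        return None  # not a number, e.g. "Fr."
    return YEAR_BY_NUMBER.get(number)
```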

hardcommitoneself · 2022-10-15T00:34:53Z

@barrytarter

https://www.loom.com/share/262f7d29525f45eba0caa4e8455a965d
Please check this video and give me feedback.

hardcommitoneself · 2022-10-17T16:08:53Z

@barrytarter @edgrosvenor

Regarding the extra field of the athlete table, should we add the following fields to it?

barrytarter · 2022-10-17T16:14:34Z

@hardcommitoneself ,

Thanks for sharing. Can we store both as text for now? The first is a height field and the second is where they played in high school. These are pretty common, so good to collect.

barrytarter · 2022-10-17T16:18:06Z

@hardcommitoneself will you be able to begin developing the crawler that will find the missing Twitter and Instagram IDs?

Step 3 in https://docs.google.com/document/d/1YmfAFYu4Cyl99ninB4KAeML4y-nmRW0gzI6Xeydg_2g/edit?usp=drivesdk

Can you get a v1 of that part ready by Wednesday?

edgrosvenor · 2022-10-17T16:19:42Z

@hardcommitoneself Go ahead and add any data that you think might be valuable as key / value pairs in the extra column. While you're at it, enable this package for that column: https://github.com/spatie/laravel-schemaless-attributes That will allow you to do things like $athlete->extra->set('height', '5\'9"');. I think maybe I've included the package in composer (maybe not), but I haven't added the trait to the model.

hardcommitoneself · 2022-10-19T03:06:54Z

@barrytarter @edgrosvenor

Regarding the second crawler, I think we can use opendorse.com to scrape our athletes' contact info.
The following is just my opinion.

First, we need to search for the university by name:
https://opendorse.com/search?showAthletesNotOptedInToDeals=true&showUnclaimedAccounts=true&term=Abilene+Christian+University
Then we need to go to the relevant university page:
https://opendorse.com/abilenechristian-wildcats
Then we need to filter by sport:
https://opendorse.com/abilenechristian-wildcats?sports=Soccer
Then we should find our athletes on that page:
https://opendorse.com/profile/ellen-joss?from=abilenechristian-wildcats

That's it. I am not sure this approach works for all rosters, so I want to test it with real links.
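
A minimal Python sketch of how the first three URLs in these steps could be built. It assumes the search path is `/search` followed by a query string, and that the parameter names are the ones visible in the example links; the function names are hypothetical. The school slug (e.g. `abilenechristian-wildcats`) cannot be derived from the university name, so it is taken as an input that would come from the search results.

```python
from urllib.parse import urlencode

OPENDORSE_BASE = "https://opendorse.com"


def school_search_url(university):
    """Search URL for a university name (parameter names copied from the example)."""
    params = {
        "showAthletesNotOptedInToDeals": "true",
        "showUnclaimedAccounts": "true",
        "term": university,
    }
    # urlencode turns spaces into "+", matching the example link.
    return f"{OPENDORSE_BASE}/search?{urlencode(params)}"


def school_sport_url(school_slug, sport):
    """School page filtered by sport; the slug must come from the search step."""
    return f"{OPENDORSE_BASE}/{school_slug}?{urlencode({'sports': sport})}"
```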

hardcommitoneself · 2022-10-22T00:36:10Z

@barrytarter @edgrosvenor

I wrote my suggestion below.
I think we had better use Google search, querying by name, sport, and college, for our contact crawler.
I checked manually with many athletes and the results looked good.

Example search queries:
https://www.google.com/search?q=twitter+Nicole+Barham+ACU+soccer
https://www.google.com/search?q=instagram+Nicole+Barham+ACU+soccer

Please take a look and let me know what you think.
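
The example queries follow one pattern (network, name, college, sport), which could be generated like this in Python. The function name and argument order are hypothetical; only the URL shape comes from the examples.

```python
from urllib.parse import urlencode


def social_search_url(network, name, school, sport):
    """Google search URL for an athlete's profile on the given network."""
    query = " ".join([network, name, school, sport])
    # urlencode escapes the query and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})
```

Calling it once with "twitter" and once with "instagram" for the same athlete gives the two query forms shown above.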

barrytarter · 2022-10-22T01:00:30Z

Sure, we can test that out and see how the data looks.

hardcommitoneself · 2022-10-23T02:58:43Z

@barrytarter @edgrosvenor

Hi, hope you are having a nice weekend!

Please take a look at this video.
https://www.loom.com/share/96661444867a4df98f6fdef1756662e3
You can see that the scraper is working well.
Please give me your feedback.

Sorry to bother you. :)

barrytarter · 2022-10-24T13:29:14Z

@hardcommitoneself here are some more good roster links. Can you check how many profiles you can pull from the rosters (100%?) and how much data is filled in for position, height, weight, grad year, and "from"? How many Twitter links do you get? How many Instagram? How many Opendorse?

https://artuathletics.com/sports/mens-soccer/roster
https://asugrizzlies.com/sports/mens-soccer/roster
https://aupanthers.com/sports/mens-soccer/roster
https://adrianbulldogs.com/sports/msoc/roster
https://www.albertusfalcons.com/sports/msoc/2022-23/roster
https://gobrits.com/sports/mens-soccer/roster
https://albrightathletics.com/sports/mens-soccer/roster
https://alfredstate.prestosports.com/sports/msoc/2022-23/roster
https://gosaxons.com/sports/mens-soccer/roster
https://alicelloydeagles.com/sports/msoc/2022-23/roster
https://www.ahcbulldogs.com/sports/msoc/2022-23/roster
https://www.allegany.edu/athletics/mens-soccer.html
https://alleghenygators.com/sports/mens-soccer/roster
https://sccstorm.com/sports/msoc/2022-23/roster
https://almascots.com/sports/msoc/2022-23/roster
https://auwolves.com/sports/mens-soccer/roster
https://www.aicyellowjackets.com/sports/msoc/2022-23/roster
https://www.arcbeavers.com/sports/msoc/2022-23/roster
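
The coverage questions above come down to a fill rate per field. A minimal Python sketch, assuming each scraped profile is available as a dict; the function name and field names are hypothetical and not the actual athlete table columns.

```python
def fill_rates(profiles, fields):
    """Share of profiles with a non-empty value for each field (0.0 to 1.0)."""
    total = len(profiles)
    rates = {}
    for field in fields:
        # Missing keys, None, and blank strings all count as "not filled in".
        filled = sum(1 for p in profiles if str(p.get(field) or "").strip())
        rates[field] = filled / total if total else 0.0
    return rates
```

Running this per roster URL would show which sites the scraper handles poorly, not just the overall totals.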

hardcommitoneself · 2022-10-25T00:42:47Z

@barrytarter @edgrosvenor

Please take a look at the following.
https://www.loom.com/share/0708dffa27714d9eb3f0ac3072bb77c7
I implemented 100% automation for scraping Twitter IDs as a test.
I think this scraper got almost all the Twitter IDs, so please check it manually and give me feedback.
I already implemented the Opendorse logic last week, so I need to implement the Instagram logic now.

barrytarter · 2022-10-25T14:06:39Z

@hardcommitoneself could we add a method that would allow us to get this person's Instagram and Twitter?
It's Caleb Kendra; you'll see his name at 0:57 in https://www.loom.com/share/0708dffa27714d9eb3f0ac3072bb77c7.
e.g. https://www.instagram.com/c_kendra2/

hardcommitoneself · 2022-10-25T15:44:38Z

@barrytarter

So, do you want to get the full Twitter link of athletes, like https://www.instagram.com/c_kendra2/ ?

barrytarter · 2022-10-25T15:51:12Z

@hardcommitoneself yes, we want the twitter, instagram, opendorse links for all athletes in the crawler.
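
If the crawler only has a handle or ID, turning it into a full link could look like the Python sketch below. The Instagram and Opendorse patterns follow the links posted in this thread; the Twitter pattern is the standard profile URL form. The function name and the choice to return `None` for an empty handle are assumptions.

```python
PROFILE_URL_TEMPLATES = {
    "twitter": "https://twitter.com/{handle}",
    "instagram": "https://www.instagram.com/{handle}/",
    "opendorse": "https://opendorse.com/profile/{handle}",
}


def profile_url(network, handle):
    """Full profile link for a handle, or None when there is no handle to store."""
    cleaned = handle.strip().lstrip("@")  # accept "@name" as well as "name"
    if not cleaned:
        return None
    return PROFILE_URL_TEMPLATES[network].format(handle=cleaned)
```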

hardcommitoneself · 2022-10-25T16:25:39Z

@barrytarter

OK. As we discussed before, we cannot get social links for many athletes, since most of them don't have any.
Anyway, please take a look at the following.

barrytarter · 2022-10-25T16:39:53Z

@hardcommitoneself yes, if it doesn't exist, we definitely can't store one.

Caleb Kendra does have one but we didn't store it -- how do we fix that?

hardcommitoneself · 2022-10-25T16:48:34Z

@barrytarter

I think we can store it. What's the problem?

This is the structure of athlete table.

barrytarter · 2022-10-25T16:49:39Z

Great! Why didn't it store previously?

hardcommitoneself · 2022-10-26T09:42:59Z

@barrytarter

Please take a look at it. I implemented the Opendorse scrape method, so we can get not only the Opendorse link but also the Twitter or Instagram link from there.
https://www.loom.com/share/2b18bd1d24a04f5bbdcd018221ab7a4a

barrytarter added the Current Sprint label Oct 3, 2022

barrytarter assigned hardcommitoneself Oct 3, 2022

hardcommitoneself closed this as completed Oct 4, 2022

hardcommitoneself reopened this Oct 4, 2022

hardcommitoneself added the question Further information is requested label Oct 4, 2022
