Would anyone be interested in a script for scraping data from Euroleague's website?
Euroleague data scraping
-
OK, I assume someone will need it sooner or later, so I've put the script here for anyone who wants it.
You need to install Python 2.7.x (I used 2.7.13) with pip (I suggest installing Python with all of its features), then run these commands from the command prompt:
> pip install beautifulsoup4
> pip install requests
> pip install openpyxl
and you should be ready to run the script. It generates a data.xlsx file with the season totals for each player currently on the roster of a Euroleague team. I hope it helps someone.
-
Originally posted by Oly_fan:
Thanks, I'll check it out. I haven't practiced python in a while.
PS. I've added the comments in the script so you can understand it better.
Last edited by unnamed; 12-28-2016, 12:56 AM.
-
Just a heads-up: the code as it currently stands breaks on Darussafaka because Zizic doesn't have a position assigned to him yet. It's very easily fixed, and other than that it works fine. Since it was my first time scraping data from sites, I'm glad I learnt something.
I had this optimistic idea: since Euroleague only has shot charts for each game, I thought I could get the datasets behind each chart, aggregate them and produce shot charts for players/teams over whole seasons. However, the actual data seems to be unavailable. I know nothing about HTML, so I could be missing it.
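The aggregation step of that idea could be sketched roughly as follows, assuming the per-game shot data were available as lists of records. The record layout and the keys `player`, `x`, `y` and `made` are hypothetical, chosen only for illustration; nothing here reflects what the site actually exposes.

```python
from collections import defaultdict


def aggregate_shots(games):
    """Group shots from many games by player.

    `games` is a list of games; each game is a list of shot records
    (dicts). The keys used here are hypothetical placeholders.
    """
    by_player = defaultdict(list)
    for game in games:
        for shot in game:
            by_player[shot["player"]].append(
                (shot["x"], shot["y"], shot["made"])
            )
    return dict(by_player)


def shooting_summary(shots):
    """Return (attempts, makes) for one player's list of shots."""
    attempts = len(shots)
    makes = sum(1 for _, _, made in shots if made)
    return attempts, makes


# Made-up sample data for two games
games = [
    [{"player": "A", "x": 100, "y": 200, "made": True},
     {"player": "B", "x": -50, "y": 300, "made": False}],
    [{"player": "A", "x": 10, "y": 650, "made": False}],
]
season = aggregate_shots(games)
```

Once the shots are grouped per player, the same structure could feed a plotting step in whichever tool is most comfortable.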
-
Concerning the script, I had several different ideas and decided to use the most straightforward one. I realize that if I had used a dictionary instead of a list, it wouldn't break on Zizic the way it does now, but then again I assumed every player would have a playing position.
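A minimal sketch of the difference, assuming each player's details arrive as label/value pairs. The column labels and sample values below are made up for illustration and are not taken from the actual script or from the site.

```python
def player_row(fields, columns, default=""):
    """Build one spreadsheet row from (label, value) pairs.

    Looking values up by label means a missing field (such as a
    position) yields `default` instead of shifting or breaking the row.
    """
    info = dict(fields)
    return [info.get(column, default) for column in columns]


COLUMNS = ["Name", "Position", "Height"]  # hypothetical column labels

# Hypothetical scraped data: the second player has no position listed
complete = [("Name", "Player One"), ("Position", "Guard"), ("Height", "1.90")]
missing = [("Name", "Player Two"), ("Height", "2.10")]

row_complete = player_row(complete, COLUMNS)
row_missing = player_row(missing, COLUMNS)

# With a positional list, missing[1] would be the height, not the position.
```

The row for the player without a position keeps its columns aligned and simply gets an empty cell.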
Getting the shot chart would require a completely different script. You'd need to go game by game and collect the data. You'll probably need images as well: one for the court and one each for the made and missed attempts. That would require you to install yet another library, pillow, should you decide to use openpyxl.
Last edited by unnamed; 12-28-2016, 10:29 PM.
-
Originally posted by Oly_fan:
I did go game by game but I couldn't find the data. I think, if I had it, I could make the rest using R and shiny; I'm more familiar with those. I saw people sharing python code for making shot charts too if that didn't work out.
-
I did that but I couldn't find the coordinates data. Maybe I missed it, I'll have another look when I can.
-
ng-attr-cx="{{point.COORD_Y * 776 / 2800 + 56}}" ng-attr-cy="{{point.COORD_X * 416 / 1500 + 218}}"
ng-attr-cx="{{(800 - (point.COORD_Y * 776 / 2800 + 56))}}" ng-attr-cy="{{point.COORD_X * 416 / 1500 + 218}}"
bg-shooting-md.jpg
The point.COORD_X and point.COORD_Y are the actual variables for each shot.
-
Tag looks like this:
<circle ng-class="selected == point.NUM_ANOT ? 'selected' : ''" ng-attr-cx="{{point.COORD_Y * 776 / 2800 + 56}}" ng-attr-cy="{{point.COORD_X * 416 / 1500 + 218}}" ng-click="setPlayer(point.EQUIPO, point.JUGADOR, point.MINUTO, point.CONSOLA, point.ID_JUGADOR, point.PUNTOS_A, point.PUNTOS_B, point.NUM_ANOT)" r="10" fill="#FFFFFF" stroke="#F7941E" stroke-width="2" cx="101.17428571428572" cy="320.61333333333334"></circle>
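The two expressions quoted in the thread map a shot's COORD_X and COORD_Y onto pixel positions on the court image, so they can also be inverted to recover the raw coordinates from a rendered circle. Here is a small sketch of both directions. The function names and the `mirrored` flag are my own labels for the second expression (the one with `800 - ...`); the thread does not say when the page uses which variant.

```python
def shot_to_pixels(coord_x, coord_y, mirrored=False):
    """Apply the page's expressions: shot coordinates -> (cx, cy) pixels."""
    cx = coord_y * 776.0 / 2800 + 56
    if mirrored:
        cx = 800 - cx  # second variant quoted in the thread
    cy = coord_x * 416.0 / 1500 + 218
    return cx, cy


def pixels_to_shot(cx, cy, mirrored=False):
    """Invert the expressions: (cx, cy) pixels -> (COORD_X, COORD_Y)."""
    if mirrored:
        cx = 800 - cx
    coord_y = (cx - 56) * 2800 / 776.0
    coord_x = (cy - 218) * 1500 / 416.0
    return coord_x, coord_y


# cx and cy taken from the <circle> tag quoted above
coord_x, coord_y = pixels_to_shot(101.17428571428572, 320.61333333333334)
```

For the circle in the tag above, this comes out at approximately COORD_X = 370 and COORD_Y = 163 (up to floating-point rounding), assuming that circle was drawn with the first, non-mirrored expression.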