feat(sportsbooks): dockerized Airflow + Postgres sportsbook webscraper pipeline #4
omkar055 wants to merge 24 commits into
metadata = MetaData()
table = Table(table_name, metadata, autoload_with=db_eng)

with db_eng.begin() as conn:  # transaction automatically commits
May want to include a verification step to confirm the commit actually succeeded.
Thoughts on adding something to roll back a partial write if the whole operation fails halfway through?
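For reference, a minimal sketch of both points, not the PR's actual code. SQLAlchemy's `engine.begin()` commits when the block exits normally and rolls back if the block raises, so a failure halfway through should not leave partial rows behind. The row-count check afterwards is one possible verifier. The in-memory SQLite engine, the `player` column, and the helper names `count_rows` and `load_rows` are assumptions for illustration; the real `player_lines` schema is not shown here.

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table,
    create_engine, func, insert, select,
)

# Assumption: in-memory SQLite stands in for the Postgres engine (db_eng).
db_eng = create_engine("sqlite:///:memory:")
metadata = MetaData()

# Hypothetical columns; only the table name comes from this PR.
player_lines = Table(
    "player_lines",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("player", String, nullable=False),
)
metadata.create_all(db_eng)


def count_rows(engine, table):
    """Return the number of rows currently committed in the table."""
    with engine.connect() as conn:
        return conn.execute(select(func.count()).select_from(table)).scalar_one()


def load_rows(engine, table, rows):
    """Insert all rows in one transaction, then verify the committed count."""
    before = count_rows(engine, table)
    # begin() commits on normal exit and rolls back if anything inside raises,
    # so a failure on row N also undoes rows 1..N-1.
    with engine.begin() as conn:
        for row in rows:
            conn.execute(insert(table), row)
    inserted = count_rows(engine, table) - before
    if inserted != len(rows):
        raise RuntimeError(f"expected {len(rows)} new rows, found {inserted}")
    return inserted
```

Calling `load_rows` with a batch whose last row violates a constraint (for example `player=None` here) raises, and the table keeps its earlier row count. Note that the count check assumes no other writer touches the table at the same time.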
JonathanPLev
left a comment
Overall good work, just a few changes. Logic looks great. Could use fewer comments that explain basic functions (less "what does it do" and more "why does it do it", where a comment is needed at all). Basically, ask yourself: do I know why this is doing X, and can I easily understand it? If the answer is no, leave a comment.
Please add a schema for the database and include it in your docker compose file so the tables are created when the database is created. Then provide commands in the README to roll back the table creation (SQL down) or create the tables again if something happens (SQL up).
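A rough sketch of what the up/down pair could look like, under stated assumptions. The column list is hypothetical (only the table name `player_lines` appears in this PR), the helpers `run_sql` and `table_exists` are illustrative names, and in-memory SQLite stands in for the Postgres container. In the real setup the "up" statement would more likely live in a `.sql` file that the official Postgres image runs from `/docker-entrypoint-initdb.d` on first initialization.

```python
from sqlalchemy import create_engine, inspect, text

# Assumption: in-memory SQLite stands in for the Postgres container.
db_eng = create_engine("sqlite:///:memory:")

# Hypothetical schema; the real column list is not part of this PR page.
UP_SQL = """
CREATE TABLE IF NOT EXISTS player_lines (
    id INTEGER PRIMARY KEY,
    player TEXT NOT NULL,
    line REAL
)
"""
DOWN_SQL = "DROP TABLE IF EXISTS player_lines"


def run_sql(engine, sql):
    """Run one statement inside a transaction that commits on success."""
    with engine.begin() as conn:
        conn.execute(text(sql))


def table_exists(engine, name):
    """Check whether the table is currently present in the database."""
    return inspect(engine).has_table(name)


# "SQL up": create the table (safe to run more than once).
run_sql(db_eng, UP_SQL)
```

Running `run_sql(db_eng, DOWN_SQL)` drops the table again, and `IF EXISTS` / `IF NOT EXISTS` keep both directions safe to repeat.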
Initial implementation of the Airflow ETL pipeline to scrape NBA data from various sportsbooks
Includes Airflow + Postgres Docker setup and NBA webscraper DAG
With this setup, everyone should be able to run the Airflow pipeline and query Postgres for NBA player line data
This PR migrates the standalone sportsbook scraping pipeline from the Data team's webscrape repository into the TransformerPredictionModel repository
For review and testing — not for merge yet.
How it was tested
docker compose up --build
SELECT COUNT(*) FROM player_lines;