lectures/polars.md (227 additions, 0 deletions)
@@ -592,4 +592,231 @@ Note that polars offers many other file type alternatives.

Polars offers [a wide variety](https://docs.pola.rs/user-guide/io/) of methods for reading Excel, JSON, and Parquet files, or for connecting directly to a database server.

## Exercises

```{exercise-start}
:label: pl_ex1
```

With these imports:

```{code-cell} ipython3
import datetime as dt
import yfinance as yf
```

Write a program to calculate the percentage price change over 2021 for the following shares using Polars:

```{code-cell} ipython3
ticker_list = {'INTC': 'Intel',
               'MSFT': 'Microsoft',
               'IBM': 'IBM',
               'BHP': 'BHP',
               'TM': 'Toyota',
               'AAPL': 'Apple',
               'AMZN': 'Amazon',
               'C': 'Citigroup',
               'QCOM': 'Qualcomm',
               'KO': 'Coca-Cola',
               'GOOG': 'Google'}
```

Here's the first part of the program that reads data into a Polars DataFrame:

```{code-cell} ipython3
def read_data_polars(ticker_list,
                     start=dt.datetime(2021, 1, 1),
                     end=dt.datetime(2021, 12, 31)):
    """
    Read closing price data from Yahoo for each ticker in
    ticker_list and return a Polars DataFrame.
    """
    # Start with an empty list to collect DataFrames
    dataframes = []

    for tick in ticker_list:
        stock = yf.Ticker(tick)
        prices = stock.history(start=start, end=end)

        # Create a Polars DataFrame from the closing prices
        df = pl.DataFrame({
            'Date': pd.to_datetime(prices.index.date),
            tick: prices['Close'].values
        })
        dataframes.append(df)

    # Join all DataFrames on the Date column; a full join keeps every date,
    # and coalesce=True merges the key columns into a single Date column
    result = dataframes[0]
    for df in dataframes[1:]:
        result = result.join(df, on='Date', how='full', coalesce=True)

    return result

ticker = read_data_polars(ticker_list)
```

Complete the program by computing the percentage changes with Polars operations and plotting the result as a bar graph with matplotlib.

```{exercise-end}
```

```{solution-start} pl_ex1
:class: dropdown
```

Here's a solution using Polars operations to calculate percentage changes:

```{code-cell} ipython3
# ...
ax.set_ylabel('percentage change in price', fontsize=12)
df_pandas['pct_change'].plot(kind='bar', ax=ax)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```

```{solution-end}
```

```{exercise-start}
:label: pl_ex2
```

Using the function `read_data_polars` introduced in {ref}`pl_ex1`, write a program to obtain the year-on-year percentage change for the following indices using Polars operations:

```{code-cell} ipython3
indices_list = {'^GSPC': 'S&P 500',
                '^IXIC': 'NASDAQ',
                '^DJI': 'Dow Jones',
                '^N225': 'Nikkei'}
```

Complete the program to show summary statistics and plot the result as a time series graph, using Polars operations for the data manipulation.

```{exercise-end}
```

```{solution-start} pl_ex2
:class: dropdown
```

Following the work you did in {ref}`pl_ex1`, you can query the data using `read_data_polars` by updating the start and end dates accordingly.

```{code-cell} ipython3
indices_data = read_data_polars(
    indices_list,
    start=dt.datetime(1971, 1, 1),  # common start date
    end=dt.datetime(2021, 12, 31)
)

# Add year column for grouping
indices_data = indices_data.with_columns(
    pl.col('Date').dt.year().alias('year')
)

print("Data shape:", indices_data.shape)
print("\nFirst few rows:")
print(indices_data.head())
```

Calculate yearly returns using Polars `group_by` operations:

```{code-cell} ipython3
# Calculate first and last price for each year and each index