From b450fed05b0f4626b5204cd54d19f94f909ef0ce Mon Sep 17 00:00:00 2001
From: nidhiraju10
Date: Fri, 7 Nov 2025 21:57:37 -0500
Subject: [PATCH 1/3] Assignment 2

---
 README.md |  42 +++++++++++++++++-
 image.png | Bin 0 -> 34779 bytes
 main.py   | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 171 insertions(+), 1 deletion(-)
 create mode 100644 image.png
 create mode 100644 main.py

diff --git a/README.md b/README.md
index 05aa109..20282d0 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,43 @@
 # Text-Analysis-Project
+Project Title: Text Analysis Project – Alice in Wonderland and Frankenstein
 
-Please read the [instructions](instructions.md).
+1. Project Overview
+
+For this project, I used two books from Project Gutenberg: Alice’s Adventures in Wonderland by Lewis Carroll and Frankenstein by Mary Shelley. The goal was to explore how Python can be used to analyze, compare, and visualize text data. I applied techniques such as text cleaning, stopword removal, word frequency analysis, and summary statistics. Through this, I aimed to learn how language, tone, and theme differ between these two distinct genres and to explore deeper qualitative insights such as emotional tone.
+
+2. Implementation
+
+The system is built with Python and uses several libraries for different analysis techniques:
+
+Text Cleaning: Unnecessary characters, punctuation, stopwords, and headers were removed, and text was converted to lowercase.
+
+Word Frequency: The remaining words were counted with Python’s collections.Counter to identify the most common words and their frequencies, and an ASCII bar chart visualizes the top 20 words.
+
+Sentiment Analysis: NLTK’s SentimentIntensityAnalyzer determined emotional tone per text.
+
+Cosine Similarity: Scikit-learn’s TF-IDF vectorizer calculated how similar the two books were in vocabulary and themes.
+
+Design Decision: Instead of heavy plotting libraries, I used an ASCII bar chart for visualization.
+
+GenAI (ChatGPT) helped and guided me in optimizing the code.
+
+3. Results
+The project achieved the following results:
+
+Word Frequency:
+Alice in Wonderland – Common words included said, Alice, little, Queen, and thought, reflecting a story driven by character dialogue and whimsical interactions.
+
+Frankenstein – Frequent words such as life, father, eyes, shall, and man indicate a more reflective and emotional tone centered on human experience and morality.
+
+Cosine Similarity:
+The similarity score between the two texts was 0.25, meaning limited overlap in vocabulary and subject matter, which makes sense as they are in completely different genres.
+
+Sentiment Analysis:
+Alice in Wonderland had generally neutral to positive sentiment, whereas Frankenstein showed more negative or somber sentiment, with words suggesting conflict, guilt, or emotional struggle.
+
+Visualization:
+The ASCII bar chart clearly highlighted differences: Alice is dominated by character dialogue, while Frankenstein emphasizes abstract and emotional words.
+![alt text](image.png)
+
+4. Reflection
+This project was both challenging and insightful. From a learning perspective, I realized the versatility of text analysis in understanding themes, sentiment, and content generation. Alice in Wonderland used simple, lively language with lots of dialogue, while Frankenstein had a heavier tone and more emotional depth. The low similarity score showed how different their writing styles really are. Cleaning the text, removing stopwords, and looking at word frequencies made me see how much detail is hidden in plain text. I also learned how sentiment analysis can capture the overall mood of a story without needing to read every line.
diff --git a/image.png b/image.png
new file mode 100644
index 0000000000000000000000000000000000000000..4bf68dfe87bb2b13c81f0339b8ba507025f01dbc
Binary files /dev/null and b/image.png differ
diff --git a/main.py b/main.py
new file mode 100644
index 0000000..01bb712
--- /dev/null
+++ b/main.py
@@ -0,0 +1,130 @@
+import urllib.request, urllib.error, re, os, sys, json
+from collections import Counter
+from typing import List, Tuple, Dict
+
+import nltk
+from nltk.sentiment import SentimentIntensityAnalyzer
+from sklearn.feature_extraction.text import TfidfVectorizer
+from sklearn.metrics.pairwise import cosine_similarity
+
+nltk.download('vader_lexicon')
+
+STOPWORDS = {
+    "a","an","the","and","or","but","if","then","else","when","while","of","to","in","on","for","from","by",
+    "with","as","at","is","are","was","were","be","been","being","that","this","it","its","they","them","their",
+    "she","her","he","his","you","your","i","we","us","our","not","no","do","does","did","so","such","than","too",
+    "very","can","could","should","would","may","might","will","just","my","me","him","her","his","hers","ours","theirs","our","your","yours",
+    "which","who","whom","whose","what","this","that","these","those","am","is","are","was","were","be","been","being","have","has","had","do","does","did"
+}
+
+def fetch_text(url: str) -> str:
+    req = urllib.request.Request(url, headers={"User-Agent": "TextAnalysisProject/1.0"})
+    with urllib.request.urlopen(req, timeout=40) as f:
+        return f.read().decode("utf-8", errors="ignore")
+
+def strip_gutenberg(text: str) -> str:
+    lines = text.splitlines()
+    start, end = 0, len(lines)
+    for i, line in enumerate(lines):
+        if "start of the project gutenberg" in line.lower():
+            start = i + 1
+            break
+    for j in range(len(lines)-1, -1, -1):
+        if "end of the project gutenberg" in lines[j].lower():
+            end = j
+            break
+    return "\n".join(lines[start:end])
+
+def clean_text(text: str) -> str:
+    text = re.sub(r"\[[^\]]*\]", " ", text)
+    text = re.sub(r"[^A-Za-z\s'\-]", " ", text)
+    text = re.sub(r"\s+", " ", text)
+    text = text.lower().strip()
+    text = re.sub(r"\b[a-zA-Z]\b", " ", text)
+    text = re.sub(r"\s+", " ", text).strip()
+    return text
+
+
+def tokenize(text: str) -> List[str]:
+    return [t for t in re.split(r"\s+", text) if t]
+
+def remove_stopwords(tokens: List[str]) -> List[str]:
+    return [t for t in tokens if t not in STOPWORDS]
+
+def word_frequencies(tokens: List[str]) -> Counter:
+    return Counter(tokens)
+
+def ascii_bar_chart(pairs: List[Tuple[str, int]], width: int = 50) -> str:
+    maxcount = max((c for _, c in pairs), default=0) or 1
+    return "\n".join(f"{w:>15} | {'#' * int((c / maxcount) * width)} {c}" for w, c in pairs)
+
+def summary_stats(tokens: List[str]) -> Dict[str, float]:
+    if not tokens:
+        return {"num_tokens": 0, "vocab_size": 0, "avg_word_len": 0.0}
+    lengths = [len(t) for t in tokens]
+    return {
+        "num_tokens": len(tokens),
+        "vocab_size": len(set(tokens)),
+        "avg_word_len": sum(lengths) / len(lengths),
+    }
+
+def sentiment_sample(text: str, take: int = 10):
+    sia = SentimentIntensityAnalyzer()
+    sentences = re.split(r'(?<=[.!?])\s+', text)
+    return [(s.strip(), sia.polarity_scores(s.strip())["compound"]) for s in sentences[:take] if s.strip()]
+
+def tfidf_similarity(text1: str, text2: str) -> float:
+    vec = TfidfVectorizer(
+        stop_words='english',
+        token_pattern=r'(?u)\b[a-zA-Z]{2,}\b',  # ignore 1-letter tokens
+        min_df=1,
+        max_df=1.0  # allow terms that appear in both documents
+    )
+    X = vec.fit_transform([text1, text2])
+    S = cosine_similarity(X)
+    return float(S[0, 1])
+
+def main():
+    url = "https://www.gutenberg.org/cache/epub/11/pg11.txt"  # Alice in Wonderland
+    compare_url = "https://www.gutenberg.org/cache/epub/84/pg84.txt"  # Frankenstein
+
+    raw = fetch_text(url)
+    text = clean_text(strip_gutenberg(raw))
+    tokens = remove_stopwords(tokenize(text))
+    freqs = word_frequencies(tokens)
+    top_words = freqs.most_common(20)
+    stats = summary_stats(tokens)
+
+    print("\n=== Top Words ===")
+    print(ascii_bar_chart(top_words))
+    print("\n=== Summary Stats ===")
+    for k, v in stats.items():
+        print(f"{k}: {v}")
+
+    comp_text = clean_text(strip_gutenberg(fetch_text(compare_url)))
+    comp_tokens = remove_stopwords(tokenize(comp_text))
+    comp_freqs = word_frequencies(comp_tokens)
+    comp_top = comp_freqs.most_common(20)
+
+    print("\n=== Top Words (Comparison Book) ===")
+    print(ascii_bar_chart(comp_top))
+
+    similarity = tfidf_similarity(text, comp_text)
+    print(f"\nCosine Similarity with comparison text: {similarity:.3f}")
+
+    sentiment = sentiment_sample(strip_gutenberg(raw))
+
+    os.makedirs("data", exist_ok=True)
+    with open("data/top_words.txt", "w", encoding="utf-8") as f:
+        f.write(ascii_bar_chart(top_words))
+    with open("data/summary.json", "w", encoding="utf-8") as f:
+        json.dump({"top_words": top_words, "stats": stats, "similarity": similarity, "sentiment": sentiment}, f, indent=2)
+
+if __name__ == "__main__":
+    main()
+
+
+# AI Usage:
+# I used ChatGPT to help me structure the functions for text cleaning and frequency analysis.
+# I provided prompts describing the desired functionality, and ChatGPT generated code snippets which I then reviewed, tested, and modified as needed to fit the overall program.
+# I also used ChatGPT to help debug some syntax errors and optimize certain parts of the code. However, I ensured that I understood all the code and made final decisions on implementing the code.
\ No newline at end of file

From ab61b6694a2f40c492a30ffb29c5187c46519c63 Mon Sep 17 00:00:00 2001
From: nidhiraju10
Date: Fri, 7 Nov 2025 21:59:28 -0500
Subject: [PATCH 2/3] Assignment 2

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 20282d0..3529616 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
 # Text-Analysis-Project
 Project Title: Text Analysis Project – Alice in Wonderland and Frankenstein
 
-1. Project Overview
+# Project Overview
 
 For this project, I used two books from Project Gutenberg: Alice’s Adventures in Wonderland by Lewis Carroll and Frankenstein by Mary Shelley. The goal was to explore how Python can be used to analyze, compare, and visualize text data. I applied techniques such as text cleaning, stopword removal, word frequency analysis, and summary statistics. Through this, I aimed to learn how language, tone, and theme differ between these two distinct genres and to explore deeper qualitative insights such as emotional tone.
 
-2. Implementation
+# Implementation
 
 The system is built with Python and uses several libraries for different analysis techniques:
 
@@ -21,7 +21,7 @@ Design Decision: Instead of heavy plotting libraries, I used an ASCII bar chart
 
 GenAI (ChatGPT) helped and guided me in optimizing the code.
 
-3. Results
+# Results
 The project achieved the following results:
 
 Word Frequency:
@@ -39,5 +39,5 @@ Visualization:
 The ASCII bar chart clearly highlighted differences: Alice is dominated by character dialogue, while Frankenstein emphasizes abstract and emotional words.
 ![alt text](image.png)
 
-4. Reflection
+# Reflection
 This project was both challenging and insightful. From a learning perspective, I realized the versatility of text analysis in understanding themes, sentiment, and content generation. Alice in Wonderland used simple, lively language with lots of dialogue, while Frankenstein had a heavier tone and more emotional depth. The low similarity score showed how different their writing styles really are. Cleaning the text, removing stopwords, and looking at word frequencies made me see how much detail is hidden in plain text. I also learned how sentiment analysis can capture the overall mood of a story without needing to read every line.

From 1df00cfcbd11926e03feaad22ae97db83cd62b03 Mon Sep 17 00:00:00 2001
From: nidhiraju10
Date: Fri, 7 Nov 2025 22:16:35 -0500
Subject: [PATCH 3/3] Assignment 2

---
 main.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/main.py b/main.py
index 01bb712..0eb3bfa 100644
--- a/main.py
+++ b/main.py
@@ -76,9 +76,9 @@ def sentiment_sample(text: str, take: int = 10):
 def tfidf_similarity(text1: str, text2: str) -> float:
     vec = TfidfVectorizer(
         stop_words='english',
-        token_pattern=r'(?u)\b[a-zA-Z]{2,}\b',  # ignore 1-letter tokens
+        token_pattern=r'(?u)\b[a-zA-Z]{2,}\b',
         min_df=1,
-        max_df=1.0  # allow terms that appear in both documents
+        max_df=1.0
     )
     X = vec.fit_transform([text1, text2])
     S = cosine_similarity(X)
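
The similarity step in `tfidf_similarity` can be hard to picture from the scikit-learn calls alone. The following is a minimal standard-library sketch of the same idea, not the code in this patch. It assumes raw term counts weighted by a smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, which is what scikit-learn documents as the `TfidfVectorizer` default. The helper names `tfidf_vectors` and `cosine` are hypothetical, and the sketch skips the `stop_words='english'` filtering that `main.py` applies, so its scores will not match the 0.25 reported in the README.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic tokens of two or more letters,
    # similar in spirit to the token_pattern used in main.py.
    return re.findall(r"\b[a-zA-Z]{2,}\b", text.lower())


def tfidf_vectors(docs):
    # Hypothetical helper: build one TF-IDF vector per document
    # over a shared, sorted vocabulary.
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    n = len(docs)
    vectors = []
    for c in counts:
        vec = []
        for term in vocab:
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in counts if term in other)
            # Smoothed IDF (assumed to mirror scikit-learn's default).
            idf = math.log((1 + n) / (1 + df)) + 1
            vec.append(c[term] * idf)
        vectors.append(vec)
    return vocab, vectors


def cosine(u, v):
    # Cosine similarity: dot product divided by the product of norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    _, (a, b) = tfidf_vectors(["alice said queen", "father said life"])
    print(f"similarity: {cosine(a, b):.3f}")
```

Because cosine similarity divides by both vector norms, the explicit L2 normalization that `TfidfVectorizer` performs does not change the result, which is why the sketch omits it. With only two documents, a term shared by both gets the minimum IDF weight of 1, while a term unique to one book is weighted more heavily, so a low score means most of the weight sits on vocabulary the books do not share.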