tutorials on pitchfork data (ULMfit) by jlealtru · Pull Request #2 · datawrestler/Tutorials

jlealtru · 2019-07-05T14:09:57Z

Adding tutorials on pitchfork data and some old code.

…odel original file

datawrestler

Overall, good start - add intro sections to both scripts, take advantage of headers to break things up, change the training process to iteratively unfreeze weights, possibly check out fastprogress, use relative paths, and never put keys/secrets in source code again.

datawrestler · 2019-07-11T00:45:08Z

TextAnalytics/pitchfork_data/pitchfork_classification_script.ipynb

+   ],
+   "source": [
+    "print(os.getcwd())\n",
+    "path='/media/jlealtru/data_files/github/Tutorials/TextAnalytics/pitchfork_data'"


Use relative paths - add either a standalone script that secures data from source or run it in an intro section, but show how to download the source data directly so all your steps can be rebuilt.
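A minimal sketch of the relative-path suggestion, assuming the notebook is launched from `TextAnalytics/pitchfork_data` so that `.` refers to that folder. The helper `data_file` and the file name `reviews.csv` are hypothetical and only illustrate the pattern.

```python
from pathlib import Path

# Assumption: the working directory is the notebook's own folder,
# so no machine-specific prefix such as /media/<user>/... is needed.
path = Path(".")


def data_file(name, base=path):
    """Build a path to a data file relative to the notebook folder."""
    return base / name


# Placeholder file name, used only to show the call.
reviews_path = data_file("reviews.csv")
print(reviews_path)
```

Anyone who clones the repository can then run the notebook without editing the path, provided the data download step writes into the same folder.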

datawrestler · 2019-07-11T00:46:14Z

TextAnalytics/pitchfork_data/pitchfork_classification_script.ipynb

+    "learn_classifier.freeze_to(-2)\n",
+    "lr /= 2\n",
+    "learn_classifier.fit_one_cycle(1, slice(lr/(2.6**4),lr), moms=(0.8,0.7))\n",
+    "#learn_classifier.fit_one_cycle(2, slice(1e-4/2,1e-2/2), moms=(0.8,0.7))"


Look at fastprogress - I think folks would find it really interesting to be able to iteratively build a training graph as you progress.

datawrestler · 2019-07-11T00:48:23Z

TextAnalytics/pitchfork_data/pitchfork_classification_script.ipynb

+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "learn_classifier.unfreeze()\n",


The fastai folks recommend unfreezing layers gradually: start with -1, then -2, then -3, then unfreeze all, training after each step. That will likely help.
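The schedule described above could be sketched roughly as follows. It assumes a fastai v1 style learner exposing `freeze_to`, `unfreeze`, and `fit_one_cycle`; the function name `gradual_unfreeze`, the starting rate, and the epoch counts are hypothetical, while the `2.6**4` divisor, the halving of `lr`, and the `moms` values are taken from the notebook diff.

```python
def gradual_unfreeze(learner, lr=2e-2, stages=(-1, -2, -3), final_epochs=2):
    """Train one cycle per stage, making one more layer group trainable each time."""
    for n in stages:
        # freeze_to(n) leaves only the last |n| layer groups trainable.
        learner.freeze_to(n)
        # Use a single rate for the head, a slice of rates once deeper groups train.
        max_lr = lr if n == -1 else slice(lr / (2.6 ** 4), lr)
        learner.fit_one_cycle(1, max_lr, moms=(0.8, 0.7))
        # Assumption: halve the rate at each stage, as the diff does with `lr /= 2`.
        lr /= 2
    # Finally make every layer group trainable and fine-tune the whole model.
    learner.unfreeze()
    learner.fit_one_cycle(final_epochs, slice(lr / (2.6 ** 4), lr), moms=(0.8, 0.7))
```

With the notebook's learner this would be called as `gradual_unfreeze(learn_classifier)`, replacing the single `unfreeze()` call in the cell above.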

datawrestler · 2019-07-11T00:48:56Z

TextAnalytics/pitchfork_data/pitchfork_language_model.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this tutorial we are going to implement a transfer learning model for text version of the ULMfit. \n",


Format this markdown and add a TOC with hyperlinks, plus additional sources to review.
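As a rough illustration of the TOC suggestion, the sketch below turns a list of headings into a nested markdown list with in-page links. The helper `build_toc` and the section titles are hypothetical, and the anchor scheme (spaces replaced by hyphens) is an assumption about how the notebook renderer names heading anchors, so check it against the rendered notebook.

```python
def build_toc(headings):
    """Return a markdown list linking to each (level, title) heading."""
    lines = []
    for level, title in headings:
        # Assumption: the renderer builds anchors by replacing spaces with hyphens.
        anchor = title.strip().replace(" ", "-")
        indent = "  " * (level - 1)
        lines.append(f"{indent}- [{title}](#{anchor})")
    return "\n".join(lines)


# Hypothetical section titles for the language model notebook.
toc = build_toc([
    (1, "Introduction"),
    (2, "Load the data"),
    (1, "Train the language model"),
])
print(toc)
```

The printed text can be pasted into the first markdown cell of the notebook.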

datawrestler · 2019-07-11T00:53:46Z

TextAnalytics/pitchfork_data/pitchfork_language_model.ipynb

+    "# username \n",
+    "os.environ['KAGGLE_USERNAME'] = \"jlealtru\" \n",
+    "# key\n",
+    "os.environ['KAGGLE_KEY'] = \"6c3a4d6b4d8e7804780d6cb02879ac53\""


@jlealtru never post secrets/keys in source code. You have a couple of options. The easiest, although not the safest, is to create a separate file, import it, and reference only the variable name in the code. Alternatively, you can use something like Azure Key Vault (easy to use, super powerful: think of 1Password or LastPass, except at scale and programmatically).
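A minimal sketch of the first option, assuming the credentials live in a JSON file with `username` and `key` fields that is kept out of version control (for example via `.gitignore`). The function name `load_kaggle_credentials` and the default file name are hypothetical; the environment variable names are the ones used in the diff.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(path="kaggle.json"):
    """Copy 'username' and 'key' from an untracked JSON file into the environment."""
    creds = json.loads(Path(path).read_text())
    # Only the variable names appear in the notebook; the values stay in the file.
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
```

The notebook cell then shrinks to a single `load_kaggle_credentials()` call. Since a key has already been committed here, it should also be revoked and regenerated, because it remains in the repository history.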

datawrestler · 2019-07-11T00:55:58Z

TextAnalytics/pitchfork_data/pitchfork_language_model.ipynb

+   ],
+   "source": [
+    "#learn.fit_one_cycle(10, 2e-3, moms=(0.8,0.7), wd=0.1)\n",
+    "learn_pitchfork.fit_one_cycle(12, 2e-3/3, moms=(0.8,0.7), wd= 0.1)"


Again, iteratively unfreeze layers and train, and track progress using something like fastprogress.

jlealtru added 4 commits February 20, 2019 01:42

adding additional neural network models LSTM and more advanced ones

074bddb

Merge remote-tracking branch 'upstream/master' to include financial m…

1dca606

…odel original file

first version of lstm model working

29b8bf6

scripts for classification using ulmfit and pitchfork data

1e2b55f

datawrestler self-requested a review July 11, 2019 00:41

datawrestler assigned jlealtru Jul 11, 2019

datawrestler requested changes Jul 11, 2019

jlealtru added 2 commits October 2, 2019 16:15

adding first version of bert tutorial as well as drafts of other files

079deb3

BERT pretraining and classification tutorials

b8c6339

jlealtru wants to merge 6 commits into datawrestler:master from jlealtru:master
