contributing
The DCL project is a mixture of data science, script writing, and (hopefully eventually) software package development.
We aim to keep the barriers to becoming a contributor low, which means that we aim to have a tutorial for EVERYTHING, even for things where some people might just say "google it". In some cases, our tutorials will provide links to webpages that describe what to do or give more details.
If you see an area where a tutorial doesn't exist or could be improved, please add an issue to the overview repo. Your issue should state what you were trying to do, where you got stuck, and what you plan to add.
There are a number of steps to contributing using git (version control software) and GitHub (cloud-based hosting for git repositories). The major steps are:
- Set up a remote GitHub account (in the cloud)
- Fork a repository (repo for short) from the remote DCL account, e.g. mitolin, to your remote origin account
  - Pictures of the fork button can be found on GitHub Help pages
- Using a terminal on your 'local' computer, navigate to a directory where you want to keep your new DCL repo
- Use the command line to clone your forked repo from your remote origin account to your local machine, e.g.
  - `git clone https://github.com/deena-b/mitolin.git` (replace 'deena-b' with your own GitHub username)
- Connect your local repo to the upstream remote repo
  - `git remote add upstream https://github.com/deepcelllineage/mitolin.git`
  - If you mistakenly cloned the repo from upstream, rename the remote from origin to upstream and connect your local repo to your remote origin
    - `git remote rename origin upstream`
    - `git remote add origin https://github.com/deena-b/mitolin.git`
- View all branches
  - `git branch -a`
  - Note the presence of the following branches
    - remotes/upstream/master
    - remotes/upstream/dev
  - If these branches are not listed, run `git fetch upstream` first and check again
- If you don't see a local dev branch, make one, move onto it and tell it to track the upstream version, all with the single command
  - `git checkout --track upstream/dev`
- If a dev branch already exists, move onto it
  - `git checkout dev`
- Determine your current status
  - `git status`
  - You should see the following
    - You are on branch dev
    - Your branch is up to date with 'upstream/dev'
      - If you do not see the above line, set your branch to track upstream/dev with `git branch -u upstream/dev`
    - Nothing to commit, working tree clean
- Create and move onto your very own local feature branch
  - `git checkout -b feature_name`
  - Take a minute to think of a good name for your feature branch (naming things in programming is notoriously hard, but don't worry, you will get better with practice and that's what the DCL project is all about)
    - Start your branch name with a short word that helps you remember what you plan to work on in this branch, e.g. "distcalc" for the distance calculator tutorial issue
    - Next use CAPS to write your initials (use 3 initials!!!), e.g. mine are "DRB"
    - Your feature should always relate to an issue. If a relevant issue doesn't exist, then submit one! At the end of your branch name write the issue number as "i#", e.g. "i12" for issue 12.
    - A full branch name looks like this: "distcalcDRBi12"
- Make files
  - Make a file in the nb directory
    - This can be a .md or .ipynb
      - `touch distcalctut.md`
    - In the first line write what you aim to accomplish for your new 'feature', e.g.:
      - "The aim of this feature is to create a tutorial that breaks down BioPython's distance calculations for nucleotides"
  - Make any other files or folders that you need, e.g. a .sh file in the `src/` directory or a dated folder for generated data in the `data/gen/nguyen_nc_2018/` directory
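The setup steps above can be rehearsed without touching GitHub. The sketch below is a sandbox built on assumptions: two throwaway bare repositories in a temporary directory stand in for the upstream DCL repo and for your fork, and the user name and email are placeholders. With real remotes you would use the GitHub URLs from the steps instead of the local paths.

```shell
#!/bin/sh
# Rehearse fork -> clone -> add upstream -> dev -> feature branch in a sandbox.
# Assumption: local bare repos stand in for the GitHub remotes.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the upstream repo (deepcelllineage/mitolin on GitHub)
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed config user.name "Example User"        # placeholder identity
git -C seed config user.email "user@example.com"
echo "# mitolin" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git -C seed branch dev
git -C seed push -q "$sandbox/upstream.git" master dev

# Stand-in for pressing the Fork button: a copy that will act as origin
git clone -q --bare upstream.git fork.git

# Clone the fork; git names this remote "origin" automatically
git clone -q fork.git mitolin
cd mitolin
git config user.name "Example User"
git config user.email "user@example.com"

# Connect to upstream and fetch so remotes/upstream/* become visible
git remote add upstream "$sandbox/upstream.git"
git fetch -q upstream
git branch -a

# Create a local dev branch that tracks upstream/dev, then check status
git checkout -q --track upstream/dev
git status

# Build the feature branch name: short word + 3 initials + "i" + issue number
topic="distcalc"
initials="DRB"
issue=12
feature="${topic}${initials}i${issue}"
git checkout -q -b "$feature"

# Make the first file for the feature in the nb directory
mkdir -p nb
echo "The aim of this feature is to create a tutorial that breaks down BioPython's distance calculations for nucleotides" > nb/distcalctut.md
git status --short
```

Building the branch name from three variables is only a convenience for the sketch; typing `git checkout -b distcalcDRBi12` directly does the same thing.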

- When to add (stage), when to commit, and when to make a pull request (PR) to merge your branch with the dev branch?
  - Why stage?
    - Staging allows you to customize what goes into a commit. For example, if you make three changes and only two relate to each other, you can stage and commit those two, then stage and commit the other change separately
  - When to commit?
    - After you made something work
    - After you made a meaningful change
    - Mantra: Commit Often, Perfect Later
  - How to use `git diff`
    - `git diff` can compare commits, branches, files and more
    - `git diff` with no arguments shows the changes in your working tree that you have not staged yet; use `git diff HEAD` to compare against the last commit
    - `git diff branch1 branch2`
      - a space means compare the tips of each branch
      - instead of a space you could use two dots between the branches
      - three dots (`git diff branch1...branch2`) replace branch1 with the common ancestor commit shared by the two branches, so you only see what changed on branch2 since the branches diverged
    - `git diff branch1 branch2 fileA`
      - just shows differences of fileA between the two branches
  - Squash commits
    - before a PR (note that there is no `git squash` command; squashing is usually done with an interactive rebase, `git rebase -i`)
  - `git push`
    - to origin often
    - to remote - when you have finished with a feature and are ready to merge it to the dev branch
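The staging and `git diff` points above can be tried out in a throwaway repository. This is a minimal sketch, assuming hypothetical files `fileA`, `fileB` and `fileC`, placeholder user details, and the branch names `branch1` and `branch2` used in the list; it illustrates the commands and is not the DCL project's actual history.

```shell
#!/bin/sh
# Sandbox demo: selective staging, then the different forms of git diff.
set -eu
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo one > fileA
echo one > fileB
echo one > fileC
git add fileA fileB fileC
git commit -q -m "Add three files"

# Three edits, but only the fileA and fileB edits belong together
echo two >> fileA
echo two >> fileB
echo two >> fileC
git add fileA fileB
git diff --name-only             # not yet staged: fileC
git diff --cached --name-only    # staged for the next commit: fileA, fileB
git commit -q -m "Update fileA and fileB together"
git add fileC
git commit -q -m "Update fileC separately"

# Two branches that diverge from the same commit
git branch branch1
git branch branch2
git checkout -q branch1
echo three >> fileA
git commit -q -a -m "branch1 edits fileA"
git checkout -q branch2
echo three >> fileB
git commit -q -a -m "branch2 edits fileB"

git diff --name-only branch1 branch2            # tip vs tip: fileA and fileB
git diff --name-only branch1..branch2           # same comparison as above
git diff --name-only branch1...branch2          # since common ancestor: fileB
git diff --name-only branch1 branch2 -- fileA   # restricted to fileA
```

Dropping `--name-only` prints the full line-by-line differences instead of just the file names. The `--` before `fileA` is optional here, but it tells git unambiguously that what follows is a path rather than a branch.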
- Push your work to your remote origin whenever you get interrupted, regardless of whether it is the end of the day or you need to work on another DCL issue
  - `git status`
  - `git add filename`
  - `git commit -m "describe what you changed in the file"`
    - (repeat `git add ...` & `git commit ...` for each file, or just use `git commit -a -m "summarize what you did"`, which commits every change to files git already tracks but skips brand-new files)
    - You could add (aka stage) and commit changes as you go if you want to keep track of your changes in smaller steps
  - `git push origin feature_branch_name`
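The interruption routine can be sketched end to end as follows. The example assumes a local bare repository standing in for your fork on GitHub, a placeholder identity, and the example branch and file names used earlier on this page.

```shell
#!/bin/sh
# Sandbox demo: status -> add -> commit -> push to origin.
set -eu
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"   # stand-in for your GitHub fork
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$sandbox/origin.git"
mkdir nb
echo "draft" > nb/distcalctut.md
git add nb/distcalctut.md
git commit -q -m "Start distance calculator tutorial"
git checkout -q -b distcalcDRBi12

# You get interrupted: record what changed and push it to origin
echo "more notes" >> nb/distcalctut.md
git status --short
git add nb/distcalctut.md
git commit -q -m "Add notes on distance calculations"
git push -q origin distcalcDRBi12
```

After the push, the copy on origin matches your local branch, so the work is safe even if you switch to another issue or another computer.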
- Return to work
  - `git checkout dev`
  - `git fetch upstream`
  - `git status`
  - If there are differences, `git rebase upstream/dev`
    - You should be able to see a log of diffs somewhere...
      - `git show` displays the most recent commit with its diff; `git log -p` lists every commit with its diff
  - `git checkout your_feature`
  - Pull any updates into your feature branch, so you are working with the most up-to-date files
    - `git rebase upstream/dev`
  - If your uncommitted changes clash with the differences you have two choices
    - `git stash` (and then what if you can't `git stash pop`????) or
    - Commit the changes and manually go through the diffs to choose which to keep
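A sandbox version of the return-to-work routine is sketched below. It assumes a local bare repository standing in for the upstream remote, a hypothetical file `notes.md` on the feature branch, and a placeholder identity. The uncommitted edit is created just before the rebase purely to simulate leftover work, and it does not clash with the upstream change, so `git stash pop` succeeds here.

```shell
#!/bin/sh
# Sandbox demo: update dev from upstream, then rebase the feature branch.
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the upstream remote, with a dev branch
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/dev
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/dev
git -C seed config user.name "Example User"        # placeholder identity
git -C seed config user.email "user@example.com"
echo "v1" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git -C seed push -q "$sandbox/upstream.git" dev

# Your local repo with one committed change on a feature branch
git clone -q -o upstream upstream.git work
cd work
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b your_feature
echo "notes" > notes.md
git add notes.md
git commit -q -m "Add notes"

# Meanwhile, someone else updates dev on upstream
echo "v2" >> "$sandbox/seed/README.md"
git -C "$sandbox/seed" commit -q -a -m "Update README"
git -C "$sandbox/seed" push -q "$sandbox/upstream.git" dev

# Return to work: bring dev up to date
git checkout -q dev
git fetch -q upstream
git status -sb                  # reports that dev is behind upstream/dev
git rebase -q upstream/dev

# Bring the feature branch up to date
git checkout -q your_feature
echo "unfinished idea" >> notes.md   # simulate an uncommitted edit
git stash push -q                    # rebase refuses to run on a dirty tree
git rebase -q upstream/dev
git stash pop -q
git status --short                   # the uncommitted edit is back
```

If the stashed edit did clash with what came from upstream, `git stash pop` would report a conflict and leave the stash entry in place, so nothing is lost while you resolve it by hand.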
- Done with your feature? Rebase it to the dev branch
  - `git checkout dev`
  - `git rebase feature_branch`
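What this final step does can be checked in a small sandbox. The sketch assumes the feature branch was started from the current tip of dev (or has already been rebased onto it), in which case `git rebase feature_branch` simply moves dev forward to the feature's latest commit. File names and identity are placeholders.

```shell
#!/bin/sh
# Sandbox demo: move dev forward to a finished feature branch.
set -eu
sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "v1" > README.md
git add README.md
git commit -q -m "Initial commit"

# A feature branch with one commit on top of dev
git checkout -q -b feature_branch
echo "tutorial" > tutorial.md
git add tutorial.md
git commit -q -m "Add tutorial"

# Done with the feature: rebase dev onto it
git checkout -q dev
git rebase -q feature_branch
git log --oneline
```

Afterwards dev and feature_branch point at the same commit, which is why updating the feature branch from upstream/dev first keeps this step simple.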
- When do we get to rebase dev to master?
  - When the group moves from one biggish issue to another. For example, when we feel we fully understand and have created a cohesive set of notebooks that explain distance calculations with BioPython
- If you are interested in contributing, email Deena (deenab7 at gmail dot com) with your GitHub username. Deena will add you as a collaborator with Write permissions to the overview repo and as a collaborator with Triage permissions to any other repo you request. Write permissions allow you to push directly to all branches of the overview and overview/wiki repos. Triage permissions allow you to accept PRs.
- Note that GitHub wiki pages are weird! You should clone the overview/wiki from the remote upstream (i.e. `git clone https://github.com/deepcelllineage/overview.wiki.git`) and push directly there. When I tried to connect my origin overview.wiki it deleted all my upstream history!! If you have a better understanding of how this works, please make an issue in the upstream overview repo to tell us what you know.
Zvonimir Spajic's 3 part Hackernoon blog on Data Storage, Branching, Indexing
Indexing/Staging
Git Diff
Git Workflow
Best Practices
If you don't already have a github account, set one up here.
The first steps in becoming a contributor (or a user) are to fork and clone a repo.
- See instructions from GitHub Help pages
To open the file where this information is stored (`.git/config`), type `git config --edit`.
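As a small illustration, and assuming "this information" refers to the remotes set up when you fork and clone, the sketch below adds the two example remotes from this page to an empty throwaway repository and then reads them back from `.git/config`. No network access is needed because `git remote add` only records the URL.

```shell
#!/bin/sh
# Sandbox demo: remotes are recorded in .git/config.
set -eu
sandbox=$(mktemp -d)
git init -q "$sandbox/mitolin"
cd "$sandbox/mitolin"
git remote add origin https://github.com/deena-b/mitolin.git
git remote add upstream https://github.com/deepcelllineage/mitolin.git

git remote -v                          # list remotes with their URLs
git config --get remote.upstream.url   # read a single setting
cat .git/config                        # the file that git config --edit opens
```

Reading the file with `cat` is a safe way to look before editing; `git config --edit` opens the same file in your configured text editor.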
The DCL group aims to use the common git workflow pictured below, with some modifications related to the nature of our project.
