A shell is a computing environment where commands are interpreted and evaluated and their output is displayed (i.e., an instance of a read–eval–print loop (REPL)). A good shell provides access to a rich set of commands and allows simple programming of commands, which can be used to create powerful scripts and tools.
But with great power comes great bullshittery. Commands and their options can be terse, inconsistent, and difficult to learn. A steep learning curve often prevents novices from enjoying the eventual payoff.
If you've hardly used a command line environment before, you might want to go review this more thorough tutorial: http://swcarpentry.github.io/shell-novice/index.html -- this page is more of a collection of pointers and resources.
If you have Windows 10, you can use the Windows Subsystem for Linux! Follow the Windows 10 Installation Guide.
If you have an earlier version of Windows, do one of the following:
- Install Git for Windows. Git for Windows provides a Bash shell that includes several helpful commands.
- Install Chocolatey on Windows.
  - Install gnuwin basic utils for Windows:
    choco install gnuwin32-coreutils.install
  - Update your System Environment Variables Path:
    C:\Program Files (x86)\GnuWin32\bin
  - Restart your shell (the gnuwin32 commands will work in Git Bash too!)
The UNIX shell tools push data from sources through filters along pipes. In a shell there are three standard I/O streams: standard in, standard out, and standard error. Standard error is a second output stream, normally used for error messages and diagnostics, so we'll focus on standard in and standard out. The default for standard in is the keyboard and the default for standard out is to print to the shell (or console).
Pipes and redirects change standard in and standard out from defaults.
command # default standard in and standard out
command < inputFile # redirect of inputFile contents to command as standard in
command > outputFile # redirect command output to outputFile as standard out
command1 | command2 # pipes output of command1 as standard in to command2
command & # run in background, typically used for applications

What does tar -zxvf ph.tar.gz do?
http://explainshell.com/explain?cmd=tar+-zxvf
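As a rough breakdown of that command: z filters the archive through gzip, x extracts, v lists each file as it is processed, and f names the archive file to read. The sketch below builds a tiny throwaway archive first, so the extraction can be tried without downloading anything. The demo directory and file names are made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched (names are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir demo
echo "hello" > demo/a.txt

# c = create, z = gzip-compress, f = name of the archive file to write
tar -czf demo.tar.gz demo

# Remove the original so the extraction visibly recreates it
rm -r demo

# z = gunzip, x = extract, v = verbose (list each file), f = archive file to read
tar -zxvf demo.tar.gz

cat demo/a.txt
```

Swapping x for t (tar -ztvf demo.tar.gz) lists the contents without extracting, which is a cheap way to check an archive before unpacking it.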
A list of command line examples for interesting tasks:
http://www.commandlinefu.com/commands/browse
Essential commands:
- ls: list the contents of a directory.
- cd: change directories to a new path.
- mkdir: make a new directory.
- pwd: output the current directory.
- cp: copy files.
- rm: remove files.
- touch: make a new file or update its timestamp.
- cat: output the contents of a file.
- head: output the first lines of a file.
- tail: output the last lines of a file.
- grep: search files for a key phrase.
- wget: retrieve a file from the web.
- cut: extract columns from a file.
- awk and sed: magic commands for extracting, searching, and transforming content.
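To see how these commands combine with redirects and pipes, here is a small sketch that works on a made-up file. The file names, the variable names, and the data are all assumptions for illustration, and sort is an extra standard command that is not in the list above.

```shell
# Scratch directory and sample data (all hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Redirect standard out into a new file
printf 'apple,3\nbanana,5\ncherry,7\napple,2\n' > fruits.txt

# head and tail show the first and last lines
head -n 2 fruits.txt
tail -n 1 fruits.txt

# grep reads the file as standard in; -c counts matching lines
apple_count=$(grep -c apple < fruits.txt)

# cut extracts the first comma-separated column; the pipe feeds it to sort,
# and sort -u keeps one copy of each name
cut -d, -f1 fruits.txt | sort -u > names.txt

# awk adds up the second column
total=$(awk -F, '{ sum += $2 } END { print sum }' fruits.txt)

echo "apple lines: $apple_count, total: $total"
cat names.txt
```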
You want to keep your research files organized on your local system. We suggest creating a Research folder with subfolders for specific projects. Set up a research environment using the command line.
Note that if you're using Linux on Windows, the Linux system is a virtual machine. That means that the files that make up the Linux system are not easy to find and, if you do find them, you should treat them as read-only when accessing them from outside Linux.
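A minimal sketch of the Research folder setup suggested above might look like this. The project folder names and the RESEARCH_ROOT variable are hypothetical. The sketch defaults to a scratch directory so it is safe to try; set RESEARCH_ROOT to your home directory to create the folders for real.

```shell
# RESEARCH_ROOT is an assumed variable name; it falls back to a scratch directory.
RESEARCH_ROOT=${RESEARCH_ROOT:-$(mktemp -d)}

# -p creates parent folders as needed and does not complain if they already exist
mkdir -p "$RESEARCH_ROOT/Research/project-a" "$RESEARCH_ROOT/Research/project-b"

cd "$RESEARCH_ROOT/Research" || exit 1
pwd   # confirm where we are
ls    # should list project-a and project-b
```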
- Create a product-hunt-dataset folder in your research environment.
- Download the Product Hunt Data:
  wget https://s3-us-west-2.amazonaws.com/producthunt-downloads/ph-export--2016-04-01.tar.gz
- Extract dataset:
  tar -zxvf ph-export--2016-04-01.tar.gz
- Change into product-hunt folder.
- Inspect the file contents:
  head posts--2016-04-01_14-36-24-UTC.csv
- Estimate number of contents:
  cat posts--2016-04-01_14-36-24-UTC.csv | wc -l
Missing wget? https://chocolatey.org/packages/Wget
- Write a simple shell script to find the largest file.
  Hint: loop through each file in the directory and run wc -l.
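One possible approach to this exercise is sketched below. It takes "largest" to mean "most lines", following the hint. The function name largest_by_lines and the demo files are made up for illustration.

```shell
# largest_by_lines DIR: print "<line count> <path>" for the regular file
# in DIR that has the most lines. (Hypothetical helper, not a standard command.)
largest_by_lines() {
    dir=${1:-.}
    best_file=""
    best_lines=-1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and unmatched globs
        n=$(($(wc -l < "$f")))             # redirect so wc prints only the number
        if [ "$n" -gt "$best_lines" ]; then
            best_lines=$n
            best_file=$f
        fi
    done
    if [ -n "$best_file" ]; then
        echo "$best_lines $best_file"
    fi
}

# Try it on two throwaway files
demo=$(mktemp -d)
printf 'a\n' > "$demo/small.txt"
printf 'a\nb\nc\n' > "$demo/big.txt"
largest_by_lines "$demo"
```

Reading each file with a redirect (wc -l < file) keeps the file name out of the output of wc, which makes the number easy to compare.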
The shell is a general programming language, and shell scripts can be parameterized. You may find it useful to create a bash script to automate frequent commands as part of data collection and analysis. There are many online resources (like Ryans Tutorials - Bash Scripting Tutorial) to help with the creation of a bash script.
e=expansion
$e
$(command)
'literal string'
"string with \$ $e"

Commands can run sequentially or conditionally:
command1 ; command2
(command1 ; command2) # in a sub-shell
command1 || command2 # do command2 only if command1 fails
command1 && command2 # do command2 only if command1 succeeds

Usual conditionals and loops:
if command; then
commands
fi
while command; do
commands
done
while read var; do
commands
done
# looping over lists
for var in a b c; do
commands # that can access $var
done
# looping over numerics
for ((x=1; x<=10; x++)); do
commands # that can access $x
done
case word in
pattern1) commands1;;
pattern2) commands2;;
esac
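The pieces above (expansion, quoting, conditional execution, loops, and case) can be combined in a short script. The following is only a sketch: the function name classify, the file names, and the variable names are invented for illustration.

```shell
# classify NAME: map a file name to a rough type using case patterns.
classify() {
    case $1 in
        *.csv) echo "data" ;;
        *.sh)  echo "script" ;;
        *)     echo "other" ;;
    esac
}

# Loop over a list; $(...) captures the output of a command
summary=""
for name in posts.csv run.sh notes.txt; do
    kind=$(classify "$name")
    summary="$summary$name=$kind "
done
echo "$summary"

# Single quotes keep the text literal; double quotes expand $name
echo 'literal: $name'
echo "expanded: $name"

# && runs the second command only on success, || only on failure
true  && ok="ran after success"
false || fallback="ran after failure"

# while read consumes standard in one line at a time (here, from a here-document)
count=0
while read -r line; do
    count=$((count + 1))
done <<EOF
first
second
EOF

echo "$ok; $fallback; lines read: $count"
```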