Qish is an implementation of OSTEP's wish shell with additional pipe operator support.
Thanks for visiting, and I would love for you to try the program out, and give some feedback on the code.
Simply compile ./shell.c
then run the executable ./shell
- Built in command: exit, cd, path
- File redirection: >
- Parallel Commands: &
- Pipe functionality (e.g.
ls&ls >output.txt |wc -l) - Simple Program Errors
- External Commands: Should run almost any exec where it's input and output (additionally, even man and ssh work)
- No command history (up/down arrows)
- No nested redirection (e.g.,
ls > out1.txt > out2.txt) - No environment variable support
- Limited path management
Here are some commands to try out:

This passes the WISH correctness tests. The tests/, tests-out/, tester/, test-shell.sh folders are from the OSTEP repo to help me do correctness tests.
bash test-shell.sh to run these tests.
Qish Overall average: 0.298ms
Bash Overall average: 30.606ms
Interestingly, qish performs quite fast despite the lack of optimization relative to bash. However, this is likely due to the lack of functionalities of the shell. (E.g. No histories from up down arrow).
The tests cover a variety of operations including basic file operations, file content operations, directory manipulation, complex piping operations, to system information in attempt to simulate performance.
More details on the tests can be found in performance.c.
Shell Performance Benchmark Results
Date: Sun Jan 26 15:24:02 2025
Number of iterations per test: 100
Results for Bash:
----------------------------------------
Basic Command Tests:
Parallel execution time: 108.886 ms
Redirection time: 2.651 ms
Built-in command time: 2.014 ms
External Command Tests:
Command Time (ms)
----------------------------------------
Simple directory listing (ls) 3.322
File Content Analysis (wc shell.c) 3.643
File Content Viewing (more shell.c) 3.793
File Comparison (diff shell.c performance.c) 5.720
Directory Creation (mkdir TEST) 3.025
Directory Removal (rmdir TEST) 3.747
Recursive directory traversal (ls -R /etc) 6.848
Process Information - Heavy system call (ps aux) 51.784
System information (uname -a) 2.564
Directory listing with pipe and counting (ls -1 /etc | wc -l) 4.267
sort unique with output (cat shell.c | sort | uniq > uniq.txt) 7.842
remove test file, if exists (rm -f uniq.txt) 3.133
Summary:
Average external command time: 8.871 ms
Overall average: 30.606 ms
Results for qish:
----------------------------------------
Basic Command Tests:
Parallel execution time: 0.302 ms
Redirection time: 0.294 ms
Built-in command time: 0.289 ms
External Command Tests:
Command Time (ms)
----------------------------------------
Simple directory listing (ls) 0.296
File Content Analysis (wc shell.c) 0.302
File Content Viewing (more shell.c) 0.312
File Comparison (diff shell.c performance.c) 0.318
Directory Creation (mkdir TEST) 0.314
Directory Removal (rmdir TEST) 0.329
Recursive directory traversal (ls -R /etc) 0.324
Process Information - Heavy system call (ps aux) 0.293
System information (uname -a) 0.288
Directory listing with pipe and counting (ls -1 /etc | wc -l) 0.295
sort unique with output (cat shell.c | sort | uniq > uniq.txt) 0.341
remove test file, if exists (rm -f uniq.txt) 0.323
Summary:
Average external command time: 0.307 ms
Overall average: 0.298 ms
(This serves to help me remember and for interested people to quickly understand structure)
After starting, the code operates as a continuous while loop.
First, fetching the line that the user types in. Then, does (a lot of) parsing, which is the majority of the code.
Since any shell implementation handles a set of internal commands (cd, exit, path...), and delegates external commands to external executables, the actual running of some program (such as ssh) is not done by the shell itself.
Therefore, the main job of the shell for each command, is to:
- Get input
- Initial parsing for checking.
- Check if the input is an internal command (something the shell itself should handle)
- Otherwise, (do more parsing), find the right external executable's directory, and pass it correct arguments.
- Rinse and repeat (unless error or exit)
Which is roughly the structure of this program in the while loop of the main function starting from Line 85.
There are a few more things that I found interesting and challenging.
At first to format the raw char array sequence, I made null_terminate_input and collapse_white_space_group. These serve to do preliminary formatting.
Then, as commands got more complex since there were redirection >, parallel execution &, and piping |.
Redirection operators: split_input_redir_operator
For parsing parallel and pipe commands: parse_operator_in_args
The program currently supports arbitrary configurations of parallel commands and piping of output (Something like ls >output.txt |wc &ls & ls | wc -l &ls). But does not support "nested" redirection like ls > output.txt > output2.txt.
Before all of this processing begins, the semi-processed input (like, ls -l|wc)is first put into a char parsed_input[MAXLINE] array, then, we segment the individual characters into separate strings in char **args.
char **args = malloc(MAXARGS * sizeof(char*));
char **args is the main memory block of strings that every other parsing operation operates on.
For example ls>filename.txt or ls >filename.txt or ls> filename.txt can all be possible. Something like ls&ls &ls& ls is also possible.
The additional challenge was that these operators were allowed to have no space between neighboring commands.
In split_input_redir_operator, we first add each space-segregated character sequence (as a string) into args while at the same time handling the redirection operator. (FUTURE: There could be a change in implementation here to allow for arbitrary amounts of > operators)
parse_operator_in_args(&args, an-operator-either-'&'-or-'|') then does an operation on each string argument of char **args to detect the existence of the operator symbol inside.
It first initalizes a char **new_args array to add items to.
Then, it uses a continuous parsing strategy to find the operator symbol inside a string argument. If there exists an operator, add to the new_args array, and continue searching inside the same string for more until there is non left. In addition, it adds the other relevant string contents, if they exist, after the operator symbol to the new_args array.
At end we get a new_args array with segregated content for a specific operator.
We can then run the parse_operator_in_args function again with another operator.
For more information on the algorithm: Line 321
// parse_operator_in_args(args, "|"): args {"ls|wc", ">", "output.txt"} -> {"ls", "|", "wc", ">", "output.txt"}
Using this parsing technique, we get an array of arguments with clearly segregated operators.
For example, this enables us to break commands up into parallel commands by the "&" operator. Which we run a for loop through.
The pipe implementation was the hardest engineering problem of the project, and it was beyond OSTEP's specifications.
Points to solve:
- Output passing between consecutive commands
- Handling > operator and | at the same time
Initially, I'd thought, ok, I'd create an array of pipes to pass information from one index to the next. While parsing each command, I could check for the existence of a redirection operator and then send the output of the program to both files.
It turns out that this solution was too naive. It's not possible to direct the process out to both an output file and a pipe at the same time (Unless using tee, which is a non-unix native command).
So, the idea was to implement a "personal pipe" for each of the separate commands. First, set my output for the process to the personal pipe. Then, read from the personal pipe into a buffer to pass it to the designated pipe which the next program reads from, AND, the output file for this specific command.
And I think it worked.
This was a pain in the 🍑.
As mentioned above, the main block of memory operated upon is the char **args block.
- My parsing strategy causes the modification of the total number of malloc'ed elements in
char **args, due to parsing for non-properly formatted operators.
For example, the args of "ls -l| wc" would have size 3 and then would have size 4 after parsed. It is set to a more extended array of strings when doing parse_operator_in_args.
- String constants in args are not free'd at the end of each loop because they are set to NULL for proper command formatting.
For example, for parallel commands, as execv needs null-terminated string arrays to be passed in for each command. e.g. "ls", "-l", "&", "ls" is turned into "ls", "-l", NULL, "ls". For each command's execution, I would pass in the pointer referring to the initial string of the first command. The first pointer would be "ls" and the second is also "ls".
(FUTURE: I could've just used string literals for operators when parsing them, then I don't need to free)
- Other miscellancious cases
Since args is the memory block that stores the information for all commands and their parsed form, we can devise a iterative mechanism to free args after command execution.
From Line 194 - 196, we free the args memory block using free_args_elements.
Ok, but just freeing it naively until the memory is a NULL element will not work due to Problem 1.
Ok, there may exist elements after NULL pointers, so maybe we can just check until MAXARGS length. But I realized that malloc'd memory space might still contain random non-NULL garbage values, which causes double-free errors.
So, this ultimately needed a tracking mechanism for the total number of arguments that are in the args to free. Therefore, I implemented a counter mechanism globally using number_of_args. And, in parse_operator_in_args, whenever we are incrementing the size of args, I'd count the new number of arguments. Using the old global size, I'd free the old args array accordingly. Then I'd update the global_size to the new args size.
This solves problem 1.
Problem 2 is an extension of problem 1. Although each part of args is freed at the end of each loop, some operators are set to NULL.
Therefore, before the explicit set to NULL of the operator, we free the operator first.
Examples:
Line 479 - We free "&" in configure_parallel before the set to NULL, which is meant to segregate processing by execv for each command.
Line 641 - We free "|" in execute_piped_commands before we set the pointer to it to NULL for proper execv formatting.
Line 666 - Free ">" before we set it to NULL for execv formatting. We didn't need to free the filename after it because the ending free loop in main will catch it.
By explicitly freeing every operator string before we set that pointer position to NULL for execv processing, we prevent the memory leak.
Problem 3 is the other cases:
- Processing and freeing global var paths
- Freeing input at the end
- Freeing function locally created strdup variables (e.g. the
file_namepart of the struct inexecute_piped_command)


