A high-performance TCP server that searches a text file and reports whether a query string exists as a complete line in that file.
- Accepts TCP connections and searches for strings in a specified file
- Responds with "STRING EXISTS" or "STRING NOT FOUND"
- Handles multiple concurrent connections using multithreading
- Configurable SSL authentication
- Detailed logging
- Can optionally re-read the file for each query (REREAD_ON_QUERY option)
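The request/response flow listed above can be sketched with the standard library's `socketserver` module. This is a minimal illustration, not the project's actual `server.py`: the names `SearchHandler`, `ThreadedServer`, `query`, and `LINES` are hypothetical, and the one-query-per-connection handling, null-byte stripping, and newline-terminated reply are assumptions.

```python
import socket
import socketserver
import threading

# Hypothetical in-memory data; a real server would load the lines of the
# file named by linuxpath in config.ini.
LINES = {"alpha", "beta", "gamma"}
MAX_PAYLOAD = 1024  # mirrors max_payload_size in config.ini


class SearchHandler(socketserver.BaseRequestHandler):
    """Answer a single query per connection (an assumption of this sketch)."""

    def handle(self):
        data = self.request.recv(MAX_PAYLOAD)
        # Assumed normalisation: drop trailing null bytes and whitespace.
        text = data.rstrip(b"\x00").decode("utf-8", errors="replace").strip()
        reply = "STRING EXISTS" if text in LINES else "STRING NOT FOUND"
        self.request.sendall((reply + "\n").encode("utf-8"))


class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """TCP server that handles each connection in its own thread."""

    allow_reuse_address = True
    daemon_threads = True


def query(host, port, text):
    """Send one search string and return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(text.encode("utf-8"))
        return sock.recv(1024).decode("utf-8").strip()
```

To try it, create `ThreadedServer(("127.0.0.1", 0), SearchHandler)`, run `serve_forever` in a background thread, and call `query` with the address from `server.server_address`. Only whole-line matches succeed, so `"bet"` is not found even though `"beta"` is present.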
- Python 3.7 or higher
- Required Python packages (see requirements.txt)
The easiest way to install the server is to use the provided installation script:
# From within the algoscience directory, run the installation script as root
sudo ./install.sh
The installation script will:
- Install essential dependencies
- Generate SSL certificates if they don't exist
- Update config.ini to use the correct path to 200k.txt
For a more lightweight setup during development, you can use the minimal setup script:
# Run the minimal setup script
./setup_minimal.sh
This will install essential dependencies and generate a small test file.
If you prefer to install the server manually, follow these steps:
pip3 install -r requirements.txt
Edit the config.ini file to set the file path, port, SSL options, etc.
[Server]
host = 0.0.0.0
port = 44444
max_payload_size = 1024
linuxpath=/path/to/your/file.txt
reread_on_query = False
enable_ssl = False
ssl_cert_file = certs/server.crt
ssl_key_file = certs/server.key
# Create certs directory
mkdir -p certs
# Generate a self-signed certificate
openssl req -x509 -newkey rsa:4096 -keyout certs/server.key -out certs/server.crt -days 365 -nodes -subj "/CN=localhost"
python3 server.py --config config.ini
# Search for a string
python3 client.py --host localhost --port 44444 "your search string"
# With SSL
python3 client.py --host localhost --port 44444 --enable-ssl "your search string"
# Batch mode (search multiple strings from a file)
python3 client.py --host localhost --port 44444 --batch-file strings.txt
# Run a benchmark test
python3 client.py --host localhost --port 44444 --benchmark --threads 10 --iterations 5 "test string"
# Run a load test
python3 client.py --host localhost --port 44444 --load-test --start-threads 1 --max-threads 50 --step 5 --duration 10 "test string"
The server can be configured using the config.ini file. Here's what each configuration option means:
- host: The host to bind to (default: 0.0.0.0)
- port: The port to bind to (default: 44444)
- max_payload_size: Maximum payload size in bytes (default: 1024)
- linuxpath: Path to the file to search in (REQUIRED)
- reread_on_query: Whether to re-read the file on every query (default: False)
- enable_ssl: Whether to enable SSL (default: False)
- ssl_cert_file: Path to the SSL certificate file
- ssl_key_file: Path to the SSL key file
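As an illustration of how these options and their defaults could be read, here is a minimal sketch using Python's `configparser`. The function name `load_settings` and the `SAMPLE` text are hypothetical, and the actual server may parse its configuration differently.

```python
import configparser

# Sample configuration text, using the option names documented above.
SAMPLE = """
[Server]
host = 0.0.0.0
port = 44444
max_payload_size = 1024
linuxpath=/path/to/your/file.txt
reread_on_query = False
enable_ssl = False
"""


def load_settings(text):
    """Parse config text and apply the documented defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["Server"]
    # linuxpath is the only option documented as required.
    if not section.get("linuxpath"):
        raise ValueError("linuxpath is required")
    return {
        "host": section.get("host", fallback="0.0.0.0"),
        "port": section.getint("port", fallback=44444),
        "max_payload_size": section.getint("max_payload_size", fallback=1024),
        "linuxpath": section["linuxpath"],
        "reread_on_query": section.getboolean("reread_on_query", fallback=False),
        "enable_ssl": section.getboolean("enable_ssl", fallback=False),
        "ssl_cert_file": section.get("ssl_cert_file"),
        "ssl_key_file": section.get("ssl_key_file"),
    }
```

Calling `load_settings(SAMPLE)` returns a plain dictionary with typed values, for example an integer `port` and a boolean `reread_on_query`. A configuration without `linuxpath` raises `ValueError`.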
# Run all the tests
pytest
# Run the tests with coverage
pytest --cov=.
# Run specific test files
pytest tests/test_server.py
# Run the speed tests
python3 run_speed_test.py
# Run with smaller file sizes for quicker testing
python3 run_speed_test.py --small
# Run all tests and generate a submission package
python3 run_all.py
The server is designed to work with files up to 250,000 rows with the following performance targets:
- 40ms execution time with REREAD_ON_QUERY=True
- 0.5ms execution time with REREAD_ON_QUERY=False
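One rough way to check a search function against targets like these is to time a batch of queries and average the result. The sketch below is an assumption about how such a measurement could look; it is not the project's `run_speed_test.py`, and `average_latency_ms` is a hypothetical helper.

```python
import time


def average_latency_ms(search, queries):
    """Return the mean time per call of search(query), in milliseconds."""
    start = time.perf_counter()
    for item in queries:
        search(item)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / len(queries)
```

For example, `average_latency_ms(lines.__contains__, sample_queries)` measures membership tests against an in-memory set named `lines`. Timings from a loop like this exclude network and connection overhead, so they are only a lower bound on what a client would observe.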
Based on our benchmarks, we've implemented the following search algorithms for optimal performance:
- When REREAD_ON_QUERY is False: Set-based search for average-case O(1) lookup time
- When REREAD_ON_QUERY is True: Line-by-line search for minimal memory usage
For detailed performance benchmarks, see the speed testing report in the benchmark_results directory.
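The two strategies described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the server's actual code: the function names `load_line_set` and `line_in_file` are hypothetical, and UTF-8 encoding is assumed.

```python
def load_line_set(path):
    """Read the file once and keep every line in a set.

    Suits reread_on_query = False: later lookups are set membership tests.
    """
    with open(path, encoding="utf-8") as handle:
        return {line.rstrip("\r\n") for line in handle}


def line_in_file(path, query):
    """Scan the file line by line on every call.

    Suits reread_on_query = True: the file is never held in memory as a whole,
    and changes made to it between queries are always seen.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.rstrip("\r\n") == query:
                return True
    return False
```

Both functions compare complete lines, so a query that is only a substring of a line does not match. The set trades memory for speed, while the scan keeps memory use small at the cost of reading the file on each query.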
The server implements several security features:
- Buffer overflow protection
- Configurable SSL authentication
- Input validation
- Robust error handling
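As a hedged example of what payload limiting and input validation might involve, the sketch below rejects oversized, undecodable, or empty payloads before any search runs. The specific checks, the null-byte stripping, and the name `parse_query` are assumptions for illustration; the server's real validation rules may differ.

```python
MAX_PAYLOAD_SIZE = 1024  # mirrors max_payload_size in config.ini


def parse_query(payload, max_size=MAX_PAYLOAD_SIZE):
    """Return the decoded query string, or raise ValueError if rejected."""
    # Refuse anything larger than the configured limit.
    if len(payload) > max_size:
        raise ValueError("payload exceeds max_payload_size")
    try:
        # Assumed normalisation: strip trailing null padding, then decode.
        text = payload.rstrip(b"\x00").decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("payload is not valid UTF-8") from exc
    query = text.rstrip("\r\n")
    if not query:
        raise ValueError("empty query")
    return query
```

A connection handler could call `parse_query` on the received bytes and answer with an error, or close the connection, when `ValueError` is raised, so that malformed input never reaches the search code.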
This project is proprietary and confidential. Do not share it or any part of it without written permission.