(WIP) Installation Guide
A quick start guide for installing from binaries or compiling from source.
Install and configure the database of choice. MADlib currently supports the following platforms:
- PostgreSQL
- Greenplum database
- Apache HAWQ (incubating)
This guide describes the installation steps for PostgreSQL and Greenplum. (HAWQ installation steps will be added at a later date.)
PostgreSQL platform notes:
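- Ensure that PostgreSQL is installed with the Python extension (plpython) enabled.
- Defining the connection environment variables (PGUSER, PGPASSWORD, PGHOST, PGDATABASE, PGPORT) saves typing, because madpack uses them when no connection string is given.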
- Download the MADlib binary
  - Postgres: Get either the OSX or Redhat/CentOS binary from the MADlib download page
  - Pivotal Greenplum Database: Download the .gppkg binary from Pivotal Network
- Install the package at the OS level.
  - Postgres:
    - on OSX, double-click the installer package
    - on Redhat / CentOS, run the following as root:
      yum install <madlib_package> --nogpgcheck
  - Pivotal Greenplum Database:
    - on Redhat / CentOS, run the following as gpadmin:
      gppkg install <madlib_package>
- Ensure that the environment is set up for your database deployment and that the database is up and running.
  - Ensure that psql, postgres, and pg_config are in your path:
    which psql
    which postgres
    which pg_config
  - Ensure that the database is started and running:
    psql -c 'select version()'
    The above may need user/port/password settings depending on how the database has been configured.
- Run the MADlib deployment utility to install MADlib into each database that you want to use it in:
  - Postgres, if the connection environment variables are defined:
    /usr/local/madlib/bin/madpack -s madlib -p postgres install
    Otherwise, use a fully defined connection string:
    /usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
  - Pivotal Greenplum Database:
    /usr/local/madlib/bin/madpack -p greenplum install
    The above may need user/port/password settings depending on how the database has been configured.
  - For more information on madpack:
    /usr/local/madlib/bin/madpack --help
- Run the install check to validate the installation:
  - Postgres:
    /usr/local/madlib/bin/madpack -s madlib -p postgres install-check
  - Pivotal Greenplum Database:
    /usr/local/madlib/bin/madpack -p greenplum install-check
    The above may need user/port/password settings depending on how the database has been configured.
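The path check from the environment step above can be scripted so that a missing tool is reported by name instead of silently printing nothing. The following is a minimal sketch; the function name `check_tools` is hypothetical and is not part of MADlib or PostgreSQL.

```shell
#!/bin/sh
# Report which of the given tools are missing from PATH.
# Prints one "missing: <tool>" line per missing tool and
# returns 1 if any tool is missing, 0 otherwise.
# Note: POSIX sh has no local variables, so "missing" and "tool" are global.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the tools this guide requires:
#   check_tools psql postgres pg_config
```

If the function reports a missing tool, add the PostgreSQL bin directory to your path before running madpack.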
Requirements for installing MADlib:
- gcc (For OSX, Clang will work for compiling the source, but not for documentation.)
- An installed version of Pivotal HAWQ, Pivotal Greenplum Database 4.2+ or PostgreSQL (64-bit) 9.2+ with plpython support enabled. Note: plpython may not be enabled in PostgreSQL by default.
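One way to check the PostgreSQL version requirement above is to parse the output of `pg_config --version`. The following is a minimal sketch, assuming the output starts with `PostgreSQL ` followed by a dotted version number. The helper name `pg_version_ok` is hypothetical, and the check covers only the 9.2+ version rule for PostgreSQL; it does not verify plpython support, 64-bit builds, Greenplum, or HAWQ.

```shell
#!/bin/sh
# Return 0 if a version string such as "PostgreSQL 9.6.3" is 9.2 or newer,
# 1 if it is older, and 2 if no version number could be parsed.
pg_version_ok() {
    version=${1#PostgreSQL }        # drop the leading product name, if present
    major=${version%%.*}            # text before the first dot
    case $version in
        *.*) rest=${version#*.}     # text after the first dot
             minor=${rest%%.*} ;;   # text before the next dot
        *)   minor=0 ;;             # no dot at all: treat the minor version as 0
    esac
    major=${major%%[!0-9]*}         # keep only the leading digits
    minor=${minor%%[!0-9]*}
    [ -n "$major" ] || return 2
    if [ "$major" -gt 9 ]; then
        return 0
    fi
    if [ "$major" -eq 9 ] && [ "${minor:-0}" -ge 2 ]; then
        return 0
    fi
    return 1
}

# Example call against the local installation:
#   pg_version_ok "$(pg_config --version)" && echo "version is supported"
```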
In the $MADLIB_ROOT (location of MADlib source) run the following commands
mkdir build
cd build
cmake ..
make
Above, we built the executables in the build folder. This can, however, be any user-named folder (henceforth called $BUILD_ROOT).
Deploy MADlib into the database with the MADlib package manager, madpack, located under $BUILD_ROOT/src/bin:
- to install, run `$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install`
- to make sure that the installation is successful, run `$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install-check`
- for more information on the usage of `madpack`, run `$BUILD_ROOT/src/bin/madpack --help`
The variables below are used automatically by the madpack installer if no connection string is provided.
- User: `PGUSER` or `USER` (defaults to OS username)
- Password: `PGPASSWORD` (defaults to empty)
- Host: `PGHOST` (defaults to 'localhost')
- Database: `PGDATABASE` (defaults to OS username)
- Port: `PGPORT` (defaults to 5432)
An example of deploying MADlib using the environment variables:
export PGPORT=5430
export PGHOST=127.0.0.1
export PGDATABASE=madlibtest
$BUILD_ROOT/src/bin/madpack -p postgres install
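The fallback rules for the connection variables can also be written out as a small shell function, which makes it possible to preview the connection that would be used before running madpack. This is a minimal sketch of the rules as listed in this guide, not madpack's own code. The function name `madlib_conn_string` is hypothetical, and the output follows the `user[/password]@host:port/database` form of the `-c` option.

```shell
#!/bin/sh
# Print a connection string built from the PG* environment variables,
# applying the defaults described in this guide.
# Note: POSIX sh has no local variables, so the names below are global.
madlib_conn_string() {
    os_user=$(id -un)                       # OS username, used as a fallback
    user=${PGUSER:-${USER:-$os_user}}       # PGUSER, then USER, then OS username
    password=${PGPASSWORD:-}                # empty by default
    host=${PGHOST:-localhost}
    port=${PGPORT:-5432}
    database=${PGDATABASE:-$os_user}        # defaults to the OS username
    if [ -n "$password" ]; then
        printf '%s/%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$database"
    else
        printf '%s@%s:%s/%s\n' "$user" "$host" "$port" "$database"
    fi
}

# Example: pass the result explicitly instead of relying on the environment:
#   $BUILD_ROOT/src/bin/madpack -p postgres -c "$(madlib_conn_string)" install
```

Printing the string first is a cheap way to catch a wrong port or database name before an install starts. Note that the output includes the password in plain text when `PGPASSWORD` is set.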