Erich Seifert edited this page Sep 21, 2017 · 2 revisions

Importing, filtering, and exporting data

GRAL provides a set of interfaces and classes to deal with data sources. The most important is de.erichseifert.gral.data.DataSource. This interface is used to feed plots with data and also as input for data filters.

DataSource objects can also be processed in various ways, and the processed data can then be exported to files.

Getting a DataSource

First, we will read some data from a tab-separated CSV file which resides on the local file system. To do this we create a new method called readData:

public static DataSource readData(String filename) throws IOException {
    FileInputStream dataStream = new FileInputStream(filename);
    DataReaderFactory factory = DataReaderFactory.getInstance();
    DataReader reader = factory.get("text/tab-separated-values");
    try {
        DataSource data = reader.read(dataStream, Double.class, Double.class);
        return data;
    } finally {
        dataStream.close();
    }
}

The method has one parameter, the filename, so we can reuse it to read similar data from different files. First, readData opens the file as a java.io.FileInputStream. Then it uses the instance of the class de.erichseifert.gral.io.data.DataReaderFactory to get a new de.erichseifert.gral.io.data.DataReader object. For CSV files with tabs as the separator character we have to use the MIME type text/tab-separated-values. The resulting reader object is responsible for actually doing the work, and it could also be used to change the settings that control how the data is imported. Next, we use the read method of the reader to import the data into a DataSource object. The read method expects a java.io.InputStream object and a list of the column data types, here two Double columns. Errors can occur during the import, so the finally block makes sure that the input stream is closed even when there is an error.

We have to take care of two types of exceptions that can occur in this piece of code: a java.io.FileNotFoundException may be thrown if the file doesn't exist, and a java.io.IOException may be thrown when there is an error while reading the CSV data. Because FileNotFoundException is a subclass of IOException, declaring throws IOException on readData covers both.
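To make the import step more concrete, here is a minimal plain-Java sketch of what reading two Double columns from a tab-separated stream involves. It does not use GRAL and is not how DataReader is implemented; the class TsvSketch and its parse method are hypothetical names chosen for illustration. It uses try-with-resources, which closes the stream on errors in the same way as the try/finally block above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of GRAL.
class TsvSketch {
    /** Parses lines of two tab-separated numbers; blank lines are skipped. */
    static List<double[]> parse(InputStream in) throws IOException {
        List<double[]> rows = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream)
        // even when parsing fails part-way through.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                String[] cells = line.split("\t");
                if (cells.length != 2) {
                    throw new IOException("Expected 2 columns but found " + cells.length);
                }
                try {
                    rows.add(new double[] {
                        Double.parseDouble(cells[0]),
                        Double.parseDouble(cells[1])
                    });
                } catch (NumberFormatException e) {
                    // Report malformed numbers as an I/O problem, like other read errors.
                    throw new IOException("Not a number in line: " + line, e);
                }
            }
        }
        return rows;
    }
}
```

Calling TsvSketch.parse(new FileInputStream("myData.csv")) would return one double[] of length 2 per data line, assuming the file contains only numeric columns and no header row.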

The following image shows a plot of the data imported from an example CSV file:

https://eseifert.github.io/gral/tutorial/data/unfiltered.png

Sorting

The data we just read from the CSV file can be very scattered and isn't necessarily ordered. However, in the next step we want to filter the data and for this the data has to be sorted.

DataSource is read-only as it has to support various types of input and not all of them may be changeable or sortable. That's why a DataSource object can't be sorted. To sort the data we have to make it writeable, e.g. put all the data in a DataTable object. To do this we just have to wrap a DataTable object around our DataSource:

DataSource source = readData("myData.csv");
DataTable data = new DataTable(source);

This creates a modifiable copy of the data we read from the CSV file.

Now we can sort the data by the first column in ascending order using the sort method:

data.sort(new Ascending(0));

The de.erichseifert.gral.data.comparators.Ascending object determines the ordering and takes the index of the column to sort by as its argument.
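The idea of copying read-only data into a modifiable container and then ordering the rows by one column can be sketched in plain Java as follows. This is only an illustration of the concept, not GRAL's implementation; SortSketch and sortedByColumn are hypothetical names, and rows are represented as double[] for simplicity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not part of GRAL.
class SortSketch {
    /**
     * Returns a new list with the rows ordered ascending by the given column.
     * The input list is not modified; the row arrays themselves are shared.
     */
    static List<double[]> sortedByColumn(List<double[]> rows, int column) {
        // Copy first, because the input may be unmodifiable.
        List<double[]> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparingDouble((double[] row) -> row[column]));
        return copy;
    }
}
```

Copying before sorting mirrors the step of creating a DataTable from a DataSource: the original stays untouched and only the copy is reordered.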

Filtering

GRAL can be used to change data in various ways: you can query statistics like arithmetic mean or median, rescale the data, or filter the data values using a convolution kernel.

To do convolution filtering we first have to define a filter kernel. For the data of this tutorial we create a uniform kernel that is 30 elements wide and centered (offset 15), and normalize it:

Kernel kernel = KernelUtils.getUniform(30, 15, 1.0).normalize();

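Next, we can use the kernel to create a new DataSource object that we can plot later on. The last constructor argument, 1, is the index of the column to filter, here the second column: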

DataSource filtered = new Convolution(data, kernel, Mode.REPEAT, 1);

The following image shows the result of the filtering. This time the data points are connected using lines:

https://eseifert.github.io/gral/tutorial/data/filtered.png
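To show what a uniform convolution does numerically, here is a small plain-Java sketch of a moving average whose window is positioned by an offset and whose border values are repeated at the edges. It is an assumption-laden illustration, not GRAL's Convolution class: ConvolutionSketch, uniformKernel, and filter are hypothetical names, and the exact index convention GRAL uses for kernel offsets may differ.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of GRAL.
class ConvolutionSketch {
    /** Creates a uniform kernel of the given size whose weights sum to 1. */
    static double[] uniformKernel(int size) {
        double[] kernel = new double[size];
        Arrays.fill(kernel, 1.0 / size);
        return kernel;
    }

    /**
     * Applies a weighted moving window to the values.
     * Kernel element j is applied to values[i + j - offset] (assumed convention);
     * indexes beyond either end reuse the nearest border value.
     */
    static double[] filter(double[] values, double[] kernel, int offset) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < kernel.length; j++) {
                int index = i + j - offset;
                // Clamp to the valid range so border values are repeated.
                int clamped = Math.max(0, Math.min(values.length - 1, index));
                sum += kernel[j] * values[clamped];
            }
            out[i] = sum;
        }
        return out;
    }
}
```

For example, filtering the values 0, 3, 6 with uniformKernel(3) and offset 1 yields approximately 1, 3, 5: the middle value is the plain average of all three, while the first and last values are pulled toward their repeated borders.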

Exporting

Exporting data works just like reading, so we can write a method writeData that takes a data source and a filename as parameters and exports the data to a tab-separated CSV file:

private static void writeData(DataSource data, String filename) throws IOException {
    FileOutputStream dataStream = new FileOutputStream(filename);
    DataWriterFactory factory = DataWriterFactory.getInstance();
    DataWriter writer = factory.get("text/tab-separated-values");
    try {
        writer.write(data, dataStream);
    } finally {
        dataStream.close();
    }
}
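For comparison, the following plain-Java sketch shows what writing rows as tab-separated text to an output stream amounts to. It does not use GRAL and is not how DataWriter is implemented; TsvWriteSketch and its write method are hypothetical names. As in the reading case, try-with-resources makes sure the stream is flushed and closed even if writing fails.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper for illustration only; not part of GRAL.
class TsvWriteSketch {
    /** Writes each row as one line, with values separated by tab characters. */
    static void write(List<double[]> rows, OutputStream out) throws IOException {
        // Closing the writer flushes buffered text and closes the wrapped stream.
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            for (double[] row : rows) {
                StringBuilder line = new StringBuilder();
                for (int col = 0; col < row.length; col++) {
                    if (col > 0) {
                        line.append('\t');
                    }
                    line.append(row[col]);
                }
                writer.write(line.toString());
                writer.write('\n');
            }
        }
    }
}
```

Passing a FileOutputStream writes the rows to a file; passing a ByteArrayOutputStream is handy for checking the generated text in a test.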

Plotting

Finally, we can plot the unfiltered data as points and the convolved data as a line in a nice x-y plot:

XYPlot plot = new XYPlot(data, filtered);

plot.setPointRenderers(filtered, null);
DefaultLineRenderer2D lineRenderer = new DefaultLineRenderer2D();
lineRenderer.setColor(Color.BLUE);
plot.setLineRenderers(filtered, lineRenderer);

The code first creates an instance of de.erichseifert.gral.plots.XYPlot and passes the data sources to its constructor. By default the data sources are displayed as points, so the code doesn't have to change anything for the unfiltered data. The filtered data should be shown as a line, so the code sets its point renderer to null, which causes no points to be displayed for it. Next, a de.erichseifert.gral.plots.lines.DefaultLineRenderer2D object is created and its color is changed using the setColor method. Finally, the data source and the line renderer are connected using the setLineRenderers method of the XYPlot instance.

This image shows the result of our little tutorial:

https://eseifert.github.io/gral/tutorial/data/final.png

Final code

This is what the final code looks like if we want to show the plot in a Java Swing window:

import java.awt.BorderLayout;
import java.awt.Color;
import java.io.*;

import javax.swing.JFrame;

import de.erichseifert.gral.data.*;
import de.erichseifert.gral.data.comparators.Ascending;
import de.erichseifert.gral.data.filters.*;
import de.erichseifert.gral.data.filters.Filter.Mode;
import de.erichseifert.gral.io.data.*;
import de.erichseifert.gral.plots.XYPlot;
import de.erichseifert.gral.plots.lines.DefaultLineRenderer2D;
import de.erichseifert.gral.ui.InteractivePanel;
import de.erichseifert.gral.util.Insets2D;

public class DataFiltering extends JFrame {

    public DataFiltering() throws IOException {
        DataSource source = readData("myData.csv");
        DataTable data = new DataTable(source);
        data.sort(new Ascending(0));

        Kernel kernel = KernelUtils.getUniform(30, 15, 1.0).normalize();
        DataSource filtered = new Convolution(data, kernel, Mode.REPEAT, 1);

        writeData(filtered, "myDataFiltered.csv");

        XYPlot plot = new XYPlot(data, filtered);

        plot.setPointRenderers(filtered, null);
        DefaultLineRenderer2D lineRenderer = new DefaultLineRenderer2D();
        lineRenderer.setColor(Color.BLUE);
        plot.setLineRenderers(filtered, lineRenderer);

        plot.setInsets(new Insets2D.Double(20.0, 50.0, 40.0, 20.0));
        getContentPane().add(new InteractivePanel(plot), BorderLayout.CENTER);

        setDefaultCloseOperation(EXIT_ON_CLOSE);
        setMinimumSize(getContentPane().getMinimumSize());
        setSize(800, 400);
    }

    private static DataSource readData(String filename) throws IOException {
        FileInputStream dataStream = new FileInputStream(filename);
        DataReaderFactory factory = DataReaderFactory.getInstance();
        DataReader reader = factory.get("text/tab-separated-values");
        try {
            DataSource data = reader.read(dataStream, Double.class, Double.class);
            return data;
        } finally {
            dataStream.close();
        }
    }

    private static void writeData(DataSource data, String filename) throws IOException {
        FileOutputStream dataStream = new FileOutputStream(filename);
        DataWriterFactory factory = DataWriterFactory.getInstance();
        DataWriter writer = factory.get("text/tab-separated-values");
        try {
            writer.write(data, dataStream);
        } finally {
            dataStream.close();
        }
    }

    public static void main(String[] args) throws IOException {
        DataFiltering df = new DataFiltering();
        df.setVisible(true);
    }
}
