data
GRAL provides a set of interfaces and classes to deal with data sources. The
most important is de.erichseifert.gral.data.DataSource. This interface is
used to feed plots with data and also as input for data filters.
DataSource objects can also be processed in various ways, and the processed data
can then be exported to files.
First, we will read some data from a tab-separated CSV file which resides on
the local file system. To do this we create a new method called readData:
public static DataSource readData(String filename) throws IOException {
    FileInputStream dataStream = new FileInputStream(filename);
    DataReaderFactory factory = DataReaderFactory.getInstance();
    DataReader reader = factory.get("text/tab-separated-values");
    try {
        DataSource data = reader.read(dataStream, Double.class, Double.class);
        return data;
    } finally {
        dataStream.close();
    }
}
The method has one parameter: the filename. This way we can reuse it to read
similar data from different files. First, readData opens the file as a
java.io.FileInputStream. Then it uses the instance of the class
de.erichseifert.gral.io.data.DataReaderFactory to get a new
de.erichseifert.gral.io.data.DataReader object. For CSV files with tabs
as the separator character we have to use the MIME type
text/tab-separated-values. The resulting object is responsible for
actually doing the work. We could also use the DataReader to change the
settings for how the data is imported. Next, we use the read method
of the reader to import the data into a DataSource object. Errors can
occur during the import, so we have to make sure the input stream is closed
even when there is an error. The read method expects a
java.io.InputStream object and a list of the column data types.
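The try/finally block in readData can also be written with try-with-resources (Java 7 and later), which closes the stream automatically. The following standalone sketch does not use GRAL at all; it only illustrates the idea of reading two numeric columns from tab-separated text. The names TsvSketch and parseTwoColumns are hypothetical and chosen for illustration only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of GRAL.
final class TsvSketch {
    /** Reads rows of two tab-separated numbers and closes the stream when done. */
    static List<double[]> parseTwoColumns(InputStream in) throws IOException {
        List<double[]> rows = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream)
        // even if readLine or parseDouble throws.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split("\t");
                if (cells.length != 2) {
                    throw new IOException("Expected 2 columns, got " + cells.length);
                }
                rows.add(new double[] {
                    Double.parseDouble(cells[0].trim()),
                    Double.parseDouble(cells[1].trim())
                });
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "1.0\t2.5\n3.0\t4.5\n".getBytes(StandardCharsets.UTF_8);
        List<double[]> rows = parseTwoColumns(new ByteArrayInputStream(sample));
        System.out.println(rows.size() + " rows, first y = " + rows.get(0)[1]);
    }
}
```

With GRAL itself, the same try-with-resources form could presumably wrap the FileInputStream passed to reader.read; the tutorial code keeps the explicit finally block.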
We have to take care of two types of exceptions which can occur in this piece
of code: a FileNotFoundException may be thrown if the file doesn't exist,
and a java.io.IOException may be thrown when there is an error while
reading the CSV data.
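How a caller might tell these two failure cases apart can be sketched with plain JDK classes. Because FileNotFoundException is a subclass of IOException, its catch clause has to come first. The names ReadErrorSketch and describeRead are hypothetical, and reading a single byte is only a stand-in for the real import.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of GRAL.
final class ReadErrorSketch {
    /** Tries to open and read a file and reports which kind of failure occurred. */
    static String describeRead(String filename) {
        try (InputStream in = new FileInputStream(filename)) {
            in.read(); // stand-in for the real import work
            return "ok";
        } catch (FileNotFoundException e) {
            // Thrown by the FileInputStream constructor, e.g. for a missing file.
            // Must be caught before IOException, its superclass.
            return "file not found";
        } catch (IOException e) {
            // Any other I/O failure while reading or closing the stream.
            return "read error";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeRead("no-such-file.csv"));
    }
}
```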
The following image shows the result of the data that was imported from a possible CSV file:
The data we just read from the CSV file can be very scattered and isn't necessarily ordered. However, in the next step we want to filter the data and for this the data has to be sorted.
DataSource is read-only as it has to support various types of input and
not all of them may be changeable or sortable. That's why a DataSource
object can't be sorted. To sort the data we have to make it writable, e.g. by
putting all the data in a DataTable object. To do this we just have to wrap
a DataTable object around our DataSource:
DataSource source = readData("myData.csv");
DataTable data = new DataTable(source);
This creates a modifiable copy of the data we read from the CSV file.
Now, we can sort the data by the first column in an ascending order using the
sort method:
data.sort(new Ascending(0));
The de.erichseifert.gral.data.comparators.Ascending object determines
the ordering and takes the index of the column that should be sorted.
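The copy-then-sort idea is not specific to GRAL. As a rough analogy using only the JDK, the sketch below treats an unmodifiable list as the read-only source, copies it into an ArrayList, and sorts the copy in ascending order by one column. SortSketch and sortedCopy are hypothetical names; GRAL's DataTable and Ascending classes are not used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of GRAL.
final class SortSketch {
    /** Returns a modifiable copy of the rows, sorted ascending by the given column. */
    static List<double[]> sortedCopy(List<double[]> source, int column) {
        // Shallow copy: the list is new, the row arrays are shared with the source.
        List<double[]> copy = new ArrayList<>(source);
        copy.sort(Comparator.comparingDouble((double[] row) -> row[column]));
        return copy;
    }

    public static void main(String[] args) {
        // The unmodifiable list plays the role of the read-only data source.
        List<double[]> source = Collections.unmodifiableList(Arrays.asList(
            new double[] {3.0, 30.0},
            new double[] {1.0, 10.0},
            new double[] {2.0, 20.0}));
        for (double[] row : sortedCopy(source, 0)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```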
GRAL can be used to change data in various ways: you can query statistics like arithmetic mean or median, rescale the data, or filter the data values using a convolution kernel.
To do convolution filtering we first have to define a filter kernel. For the data of this tutorial we create a uniform filter kernel that is 30 elements wide and centered (offset 15):
Kernel kernel = KernelUtils.getUniform(30, 15, 1.0).normalize();
Next, we can use the filter to create a new DataSource object we can plot
later on:
DataSource filtered = new Convolution(data, kernel, Mode.REPEAT, 1);
The following image shows the result of the filtering. This time the data points are connected using lines:
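A normalized uniform kernel amounts to a moving average. The following sketch shows that idea on a plain array, without GRAL. It rests on two assumptions that should be checked against the GRAL documentation: that the repeat mode extends the data by repeating the first and last value, and that the window for element i starts offset elements before i. UniformFilterSketch and smooth are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not GRAL's Convolution implementation.
final class UniformFilterSketch {
    /**
     * Moving average with a uniform window of the given size.
     * The window for element i covers indices i - offset .. i - offset + size - 1.
     */
    static double[] smooth(double[] values, int size, int offset) {
        double weight = 1.0 / size; // normalized uniform kernel: weights sum to 1
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            double sum = 0.0;
            for (int k = 0; k < size; k++) {
                int index = i + k - offset;
                // Assumed "repeat" edge handling: clamp to the nearest valid index,
                // which reuses the first or last value beyond the borders.
                index = Math.max(0, Math.min(values.length - 1, index));
                sum += weight * values[index];
            }
            result[i] = sum;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] noisy = {0.0, 0.0, 3.0, 0.0, 0.0};
        System.out.println(Arrays.toString(smooth(noisy, 3, 1)));
    }
}
```

A wider window, such as the 30 elements used in this tutorial, smooths more strongly but also flattens narrow peaks.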
Exporting data works exactly like reading so we can just write the method
writeData which takes a data source and a filename as parameters and
exports the filtered data to a CSV file:
private static void writeData(DataSource data, String filename) throws IOException {
    FileOutputStream dataStream = new FileOutputStream(filename);
    DataWriterFactory factory = DataWriterFactory.getInstance();
    DataWriter writer = factory.get("text/tab-separated-values");
    try {
        writer.write(data, dataStream);
    } finally {
        dataStream.close();
    }
}
Finally, we can plot the unfiltered data as points and the convolved data as a line in a nice x-y plot:
XYPlot plot = new XYPlot(data, filtered);
plot.setPointRenderers(filtered, null);
DefaultLineRenderer2D lineRenderer = new DefaultLineRenderer2D();
lineRenderer.setColor(Color.BLUE);
plot.setLineRenderers(filtered, lineRenderer);
The code first creates an instance of de.erichseifert.gral.plots.XYPlot
and passes the data sources to its constructor. By default the data sources are
displayed as points, so the code doesn't have to change anything for the
unfiltered data. The filtered data should be shown as a line, so the code
sets its point renderer to null, which means no points are drawn for it.
Next, a de.erichseifert.gral.plots.lines.DefaultLineRenderer2D object is
created. The color setting of the renderer is then changed using the
setColor method. Finally, the data source and the line renderer object
are connected using the method setLineRenderers of the XYPlot
instance.
This image shows the result of our little tutorial:
This is what the final code would look like if we want to show the plot in a Java Swing window:
import java.awt.BorderLayout;
import java.awt.Color;
import java.io.*;
import javax.swing.JFrame;
import de.erichseifert.gral.data.*;
import de.erichseifert.gral.data.comparators.Ascending;
import de.erichseifert.gral.data.filters.*;
import de.erichseifert.gral.data.filters.Filter.Mode;
import de.erichseifert.gral.io.data.*;
import de.erichseifert.gral.plots.XYPlot;
import de.erichseifert.gral.plots.lines.DefaultLineRenderer2D;
import de.erichseifert.gral.ui.InteractivePanel;
import de.erichseifert.gral.util.Insets2D;

public class DataFiltering extends JFrame {
    public DataFiltering() throws IOException {
        DataSource source = readData("myData.csv");
        DataTable data = new DataTable(source);
        data.sort(new Ascending(0));
        Kernel kernel = KernelUtils.getUniform(30, 15, 1.0).normalize();
        DataSource filtered = new Convolution(data, kernel, Mode.REPEAT, 1);
        writeData(filtered, "myDataFiltered.csv");
        XYPlot plot = new XYPlot(data, filtered);
        plot.setPointRenderers(filtered, null);
        DefaultLineRenderer2D lineRenderer = new DefaultLineRenderer2D();
        lineRenderer.setColor(Color.BLUE);
        plot.setLineRenderers(filtered, lineRenderer);
        plot.setInsets(new Insets2D.Double(20.0, 50.0, 40.0, 20.0));
        getContentPane().add(new InteractivePanel(plot), BorderLayout.CENTER);
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        setMinimumSize(getContentPane().getMinimumSize());
        setSize(800, 400);
    }

    private static DataSource readData(String filename) throws IOException {
        FileInputStream dataStream = new FileInputStream(filename);
        DataReaderFactory factory = DataReaderFactory.getInstance();
        DataReader reader = factory.get("text/tab-separated-values");
        try {
            DataSource data = reader.read(dataStream, Double.class, Double.class);
            return data;
        } finally {
            dataStream.close();
        }
    }

    private static void writeData(DataSource data, String filename) throws IOException {
        FileOutputStream dataStream = new FileOutputStream(filename);
        DataWriterFactory factory = DataWriterFactory.getInstance();
        DataWriter writer = factory.get("text/tab-separated-values");
        try {
            writer.write(data, dataStream);
        } finally {
            dataStream.close();
        }
    }

    public static void main(String[] args) throws IOException {
        DataFiltering df = new DataFiltering();
        df.setVisible(true);
    }
}