Tech Tips
Aggregate Records in IBM SPSS Modeler
To improve your experience using IBM SPSS Modeler, the Version 1 SPSS experts have created various Tech Tips. This Tech Tip shows how to aggregate records in IBM SPSS Modeler.
IBM SPSS Modeler is an extensive predictive analytics platform designed to bring predictive intelligence to decisions made by individuals, groups, systems, and the enterprise. Modeler has an easy-to-use drag-and-drop user interface with a complete set of tools for data access, examination, preparation, modelling, evaluation, and deployment.
IBM SPSS Modeler users have a complete toolset to build predictive models from start to finish. Modeler uses node-based, visual programming. Users pick nodes from palettes and place them on the stream canvas. Once nodes have been placed on the stream canvas and edited, they can be linked to form a stream. A stream represents a flow of data through several operations (nodes) to a destination that can be in the form of output (either text or chart), a model, or the export of data to another format (e.g., a database).
One of the most common tasks in data preparation is aggregation. Aggregation changes the unit of analysis. For example, suppose we want to analyse data by region. One region can have many rows of data, and aggregation summarises the information for each region in a single row.
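To make the change in the unit of analysis concrete, here is a minimal sketch in plain Python, outside Modeler. The sample rows, the `group_by_region` helper and the `TOTAL_INCOME` field are hypothetical and exist only for illustration; they are not part of Modeler.

```python
from collections import defaultdict

# Hypothetical sample data: one row per customer, several customers per region.
rows = [
    {"REGION": "North", "INCOME": 30000},
    {"REGION": "North", "INCOME": 50000},
    {"REGION": "South", "INCOME": 40000},
    {"REGION": "South", "INCOME": 20000},
    {"REGION": "South", "INCOME": 60000},
]


def group_by_region(records):
    """Collect the INCOME values of all rows that share the same REGION."""
    groups = defaultdict(list)
    for record in records:
        groups[record["REGION"]].append(record["INCOME"])
    return dict(groups)


groups = group_by_region(rows)

# After aggregation the unit of analysis is the region, not the customer:
# each region is summarised in exactly one row.
summary = [
    {"REGION": region, "TOTAL_INCOME": sum(incomes)}
    for region, incomes in groups.items()
]

print(len(rows), "input rows ->", len(summary), "aggregated rows")
for row in summary:
    print(row)
```

Five customer-level rows collapse into two region-level rows, which is the same reshaping that aggregation performs on a Modeler stream.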
The Aggregate node, located on the Record Ops palette, makes data aggregation quick and easy. To use it, go to the Record Ops palette, select the Aggregate node and drag it onto the stream canvas. You can also double-click the node in the palette to add it to the stream canvas.
Once the node is on the canvas, connect it to your stream and double-click it to open it. First, select the key field, that is, the field to aggregate the data by. In this example, the data will be aggregated by ‘REGION’. Next, choose the Aggregate fields using the field chooser button. We will select ‘INCOME’ and click OK. Then select the statistics to calculate for that field. We will tick the mean, the minimum and the maximum. In addition, the record count option adds a count of the records in each region. Click OK, and your data will be aggregated.
You can preview the data by clicking on the Preview button. The Aggregate node allows for quick aggregation and the ability to select summaries for data.
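For readers who want to check the result against a small data set, the settings used in this example (key field REGION, aggregate field INCOME, with mean, minimum, maximum and a record count) can be approximated in plain Python. This is only a sketch of the calculation, not Modeler's own implementation or scripting API. The `aggregate` function, the sample rows and the output field names (`INCOME_Mean`, `INCOME_Min`, `INCOME_Max`, `Record_Count`) are assumptions made for illustration; the names Modeler generates may differ depending on the node settings.

```python
from statistics import mean


def aggregate(records, key_field, aggregate_field):
    """Summarise aggregate_field for each distinct value of key_field.

    Returns one row per key with the mean, minimum, maximum and
    the number of input records that fell into that group.
    """
    grouped = {}
    for record in records:
        grouped.setdefault(record[key_field], []).append(record[aggregate_field])

    result = []
    for key in sorted(grouped):
        values = grouped[key]
        result.append({
            key_field: key,
            # Output field names are assumed for this sketch.
            f"{aggregate_field}_Mean": mean(values),
            f"{aggregate_field}_Min": min(values),
            f"{aggregate_field}_Max": max(values),
            "Record_Count": len(values),
        })
    return result


# Hypothetical sample data standing in for the stream's input records.
rows = [
    {"REGION": "North", "INCOME": 30000},
    {"REGION": "North", "INCOME": 50000},
    {"REGION": "South", "INCOME": 40000},
    {"REGION": "South", "INCOME": 20000},
    {"REGION": "South", "INCOME": 60000},
]

aggregated = aggregate(rows, "REGION", "INCOME")
for row in aggregated:
    print(row)
```

Each printed row corresponds to one region, which is roughly what the Preview button would show for the aggregated data.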