Tech Tips
Sampling Records in IBM SPSS Modeler
To improve your experience using IBM SPSS Modeler, the Version 1 SPSS experts have created various Tech Tips. This Tech Tip shows how to sample records in IBM SPSS Modeler.
IBM SPSS Modeler is an extensive predictive analytics platform designed to bring predictive intelligence to decisions made by individuals, groups, systems, and the enterprise. Modeler has an easy-to-use drag-and-drop user interface with a complete set of tools for data access, examination, preparation, modelling, evaluation, and deployment.
IBM SPSS Modeler users have a complete toolset to build predictive models from start to finish. Modeler uses node-based, visual programming. Users pick nodes from palettes and place them on the stream canvas. Once nodes have been placed on the stream canvas and edited, they can be linked to form a stream. A stream represents a flow of data through several operations (nodes) to a destination that can be in the form of output (either text or chart), a model, or the export of data to another format (e.g., a database).
Sampling records is an essential task in data understanding, examination, validation, and preparation. The Sample node, located on the Record Ops palette, makes sampling data easy. To sample data, go to the Record Ops palette, select the Sample node, and drag it onto the stream canvas.
You can also double-click the node on the palette to drop it onto the stream canvas. Once it is on the canvas, connect it to your stream, then double-click the node to open it and choose the sample percentage. For example, to take a small (5%) random sample from the data: under Sample, select Random %, enter the proportion value, and click OK. The stream now passes on a smaller random sample, and the Sample node shows the proportion taken on the node.
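To make the idea behind a random percentage sample concrete, here is a minimal sketch in plain Python. It is not Modeler's implementation and does not use Modeler's scripting API. The function name `random_percent_sample`, the seed value, and the 10,000-record list are assumptions made purely for illustration. The sketch also assumes each record is kept independently with the given probability, so the sample size comes out close to, but not necessarily exactly, 5% of the data.

```python
import random


def random_percent_sample(records, percent, seed=None):
    """Keep each record independently with probability percent/100.

    Illustrative only: a hypothetical helper, not part of IBM SPSS Modeler.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    threshold = percent / 100.0
    # Records are visited in order, so the sample preserves the original order.
    return [rec for rec in records if rng.random() < threshold]


# Hypothetical data: 10,000 numbered records standing in for a data source.
records = list(range(10_000))
sample = random_percent_sample(records, percent=5, seed=42)
print(len(sample))  # roughly 500; the exact count depends on the seed
```

Running it again with the same seed returns the same records, which is useful when a sample has to be reproduced for validation.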