Technical Requirements
This section lists high-level technical requirements.
1. Mock data generation.
-
1.1. Datatypes: Support for basic datatypes such as integer, double,
date, and string.
-
1.2. Collections: Support for populating collections with real
or mock values. There should be a simple way to configure the range of
values in a mock collection. It should be easy to read the values of a
collection more than once.
-
1.3. Iterators: Values can be accessed from a collection through a simple
and well-known interface (e.g. java.util.Iterator). It should be
possible to access the values from a collection repeatedly, without
creating a new iterator each time.
-
1.4. Containers: A variety of containers should be available for
grouping collections. The container defines the group's behavior.
For example, a simple container generates one value from each
collection in turn. A sequence container generates all values from the
first collection before moving on to the second collection, and so on. A
parallel container generates values from each collection concurrently.
-
1.5. Functions: Support for applying various operators to the values
generated from one or more iterators. For example, an addition operator
might add the values generated by three iterators. A function should
itself behave like an iterator, returning the result of its operator as
its next value.
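The collection, iterator, container, and function requirements above can be
illustrated with a minimal Java sketch. Every name in it (MockDataSketch,
ResettableIterator, range, sequence, roundRobin, sum) is hypothetical and
chosen only for illustration; the requirements do not prescribe an API. The
sketch assumes that "repeated access without a new iterator" is met by a
reset() method, and it leaves out the parallel container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: none of these names come from the requirements.
final class MockDataSketch {
    private MockDataSketch() {}

    /** Iterator that can be rewound, so one instance can be read repeatedly. */
    interface ResettableIterator<T> extends Iterator<T> {
        void reset();
    }

    /** Mock collection: consecutive integers in the configured range [low, high]. */
    static ResettableIterator<Integer> range(final int low, final int high) {
        return new ResettableIterator<Integer>() {
            private long next = low; // long avoids overflow at Integer.MAX_VALUE
            @Override public boolean hasNext() { return next <= high; }
            @Override public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return (int) next++;
            }
            @Override public void reset() { next = low; }
        };
    }

    /** Sequence container: drains the first source, then the second, and so on. */
    static <T> Iterator<T> sequence(final List<? extends Iterator<T>> sources) {
        return new Iterator<T>() {
            private int index = 0;
            @Override public boolean hasNext() {
                while (index < sources.size() && !sources.get(index).hasNext()) index++;
                return index < sources.size();
            }
            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return sources.get(index).next();
            }
        };
    }

    /** Simple container: one value from each source in turn, skipping exhausted ones. */
    static <T> Iterator<T> roundRobin(final List<? extends Iterator<T>> sources) {
        return new Iterator<T>() {
            private int index = 0;
            @Override public boolean hasNext() {
                for (Iterator<T> s : sources) if (s.hasNext()) return true;
                return false;
            }
            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                while (!sources.get(index).hasNext()) index = (index + 1) % sources.size();
                T value = sources.get(index).next();
                index = (index + 1) % sources.size();
                return value;
            }
        };
    }

    /** Addition function: looks like an iterator and yields the sum of one value per operand. */
    static Iterator<Integer> sum(final List<? extends Iterator<Integer>> operands) {
        return new Iterator<Integer>() {
            @Override public boolean hasNext() {
                if (operands.isEmpty()) return false;
                for (Iterator<Integer> it : operands) if (!it.hasNext()) return false;
                return true;
            }
            @Override public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                int total = 0;
                for (Iterator<Integer> it : operands) total += it.next();
                return total;
            }
        };
    }

    /** Helper for the demo: collects all remaining values into a list. */
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        List<ResettableIterator<Integer>> sources = Arrays.asList(range(1, 3), range(10, 11));
        System.out.println(drain(sequence(sources)));   // [1, 2, 3, 10, 11]
        for (ResettableIterator<Integer> it : sources) it.reset();
        System.out.println(drain(roundRobin(sources))); // [1, 10, 2, 11, 3]
        for (ResettableIterator<Integer> it : sources) it.reset();
        System.out.println(drain(sum(sources)));        // [11, 13]
    }
}
```

Because the containers and the function all return java.util.Iterator, they
can be nested freely, for example by passing a sequence container as an
operand of the addition function.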
2. Data presentation.
-
2.1. Source/Sink: Support for a variety of data sources and sinks, for
example: standard input/output/error, files, JDBC, and Java objects
through reflection.
-
2.2. Streaming: Support for input and output streams. For example, a
stream of stock quotes could be used as input, and a stream of mock CPU
load data could be generated as output.
-
2.3. Structure: Support for a variety of data structures, for example
tabular and tree structures.
-
2.4. Format: Support for a variety of formats, for example
tab-delimited, CSV, and XML.
-
2.5. Interaction with mock data: It should be easy to associate data
collections with I/O elements. It should be easy to read real data values
from a data source and mix them with mock data values.
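As one possible reading of the data presentation requirements, the Java
sketch below reads real values line by line from a source, pairs each with a
mock value from an iterator, and writes CSV rows to a sink. The names
MixSketch, mixToCsv, and escape are hypothetical, as is the sample data. The
sketch assumes a java.io.Reader stands in for any source (standard input, a
file) and a java.io.Writer for any sink, and that output stops as soon as
either input runs out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: none of these names come from the requirements.
final class MixSketch {
    private MixSketch() {}

    /**
     * Reads one real value per line from source, pairs it with the next mock
     * value, and writes "real,mock" rows to sink.
     */
    static void mixToCsv(Reader source, Iterator<?> mock, Writer sink) {
        BufferedReader in = new BufferedReader(source);
        PrintWriter out = new PrintWriter(sink);
        try {
            String line;
            // Check the mock side first so no real line is consumed without a partner.
            while (mock.hasNext() && (line = in.readLine()) != null) {
                out.print(escape(line) + "," + escape(String.valueOf(mock.next())) + "\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            out.flush();
        }
    }

    /** Minimal CSV quoting: wrap fields that contain commas, quotes, or line breaks. */
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        // Illustrative data only: three real names, two mock values.
        Reader realNames = new StringReader("IBM\nACME, Inc.\nXYZ\n");
        Iterator<Integer> mockPrices = Arrays.asList(101, 102).iterator();
        StringWriter sink = new StringWriter();
        mixToCsv(realNames, mockPrices, sink);
        System.out.print(sink);
        // IBM,101
        // "ACME, Inc.",102
    }
}
```

Supporting another format such as tab-delimited output would mean swapping
the row-writing and escaping steps while keeping the same source, iterator,
and sink.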
3. Configuration. It should be easy to configure datamixer elements for
access in a given language. For example, elements accessed in Java are
configured in XML. A configuration script defines element attributes and
the relationships between elements. For example, a script could create an
iterator from a particular collection and give it a name, so that a Java
program can access it by that name at runtime.
-
3.1. Variables. It should be possible to assign a name to a value
defined in a configuration, so that the value can be accessed later by
name at runtime.
-
3.2. Namespaces. It should be possible to define lexically scoped
regions in a configuration script, so that names defined in one region
can be the same as names in another region.
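The variable and namespace requirements can be sketched as a small scoped
registry that a configuration loader might populate. The class Namespace and
its methods define and lookup are hypothetical, as are the sample names and
values. Falling back to the enclosing scope when a name is not found locally
is an assumption about what "lexically scoped" should mean here; the
requirements only state that two regions may reuse the same name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch: a scope that maps names to configured values.
final class Namespace {
    private final Namespace parent; // null for the outermost scope
    private final Map<String, Object> values = new HashMap<>();

    Namespace(Namespace parent) {
        this.parent = parent;
    }

    /** Binds a name in this scope only; other scopes may reuse the same name. */
    void define(String name, Object value) {
        if (values.containsKey(name)) {
            throw new IllegalStateException("already defined in this scope: " + name);
        }
        values.put(name, value);
    }

    /** Resolves a name here first, then in enclosing scopes (assumed behavior). */
    Object lookup(String name) {
        for (Namespace scope = this; scope != null; scope = scope.parent) {
            if (scope.values.containsKey(name)) {
                return scope.values.get(name);
            }
        }
        throw new NoSuchElementException("undefined name: " + name);
    }

    public static void main(String[] args) {
        Namespace root = new Namespace(null);
        root.define("limit", 100);

        // Two sibling regions define the same name without conflict.
        Namespace quotes = new Namespace(root);
        Namespace load = new Namespace(root);
        quotes.define("source", "quotes.csv");
        load.define("source", "cpu.log");

        System.out.println(quotes.lookup("source")); // quotes.csv
        System.out.println(load.lookup("source"));   // cpu.log
        System.out.println(load.lookup("limit"));    // 100
    }
}
```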
4. Deployment.
-
4.1. Language support. It should be possible to write a datamixer
application in a variety of languages, for example Java, Perl, or
JavaScript.
-
4.2. Context. It should be easy to run a datamixer application
standalone (for example as a Perl or Java application) or embedded in
another application (for example, a JSP page).
5. Performance. It should be possible to generate large, complex datasets in
a reasonably short time.